Methods/Statistics
Statistical inference and effect measures in abstracts of major HIV and AIDS journals, 1987-2022: A systematic review
Andreas Stang*, Henning Schäfer, Ahmad Idrissi-Yaghir, Christoph M. Friedrich, Matthew P. Fox
Introduction: The emergence of HIV/AIDS journals makes it possible to examine how the reporting of statistical inference and effect measures in published abstracts developed from the very beginning of a new field. The aim of this study was to describe time trends in the reporting of statistical inference and effect measures in major HIV/AIDS journals.
Methods: We included 10 major HIV/AIDS journals as ranked by the Journal Citation Reports in 2022 and analyzed all available PubMed entries for the period 1987 through 2022. We applied rule-based text mining and machine learning methodology to detect the presence of confidence intervals, numerical p-values or comparisons of p-values with thresholds, language describing statistical significance, and effect measures for dichotomous outcomes.
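As an illustration only, the rule-based part of such a detection step could be sketched as follows. The regular expressions and the function name `detect_reporting` are hypothetical assumptions chosen for this example; they are not the rules used in the study, and the machine learning component is not shown.

```python
import re

# Hypothetical patterns for illustration; NOT the rules used in the study.
# Abbreviations (CI, OR, HR, RR) are matched case-sensitively so that the
# ordinary word "or" is not mistaken for an odds ratio.
PATTERNS = {
    "confidence_interval": re.compile(r"(?i:confidence\s+intervals?)|\bCI\b"),
    # Numerical p-values or threshold comparisons, e.g. "p = 0.03", "P<.001"
    "p_value": re.compile(
        r"\bp\s*(?:-?\s*values?\s*)?[=<>≤≥]\s*0?\.\d+", re.IGNORECASE
    ),
    # "significant", "significantly", "significance", "nonsignificant", ...
    "significance_language": re.compile(
        r"\b(?:non-?|in)?significan(?:t|tly|ce)\b", re.IGNORECASE
    ),
    # Effect measures for dichotomous outcomes
    "odds_ratio": re.compile(r"(?i:\bodds\s+ratios?\b)|\ba?OR\b"),
    "hazard_ratio": re.compile(r"(?i:\bhazard\s+ratios?\b)|\ba?HR\b"),
    "risk_ratio": re.compile(
        r"(?i:\brisk\s+ratios?\b|\brelative\s+risks?\b)|\ba?RR\b"
    ),
}


def detect_reporting(abstract: str) -> dict:
    """Return, for each feature, whether its pattern occurs in the abstract."""
    return {name: bool(rx.search(abstract)) for name, rx in PATTERNS.items()}
```

Calling `detect_reporting` on each abstract and aggregating the flags by publication year would give yearly proportions of the kind reported in the Results; a production rule set would need many more variants and validation against manually labeled abstracts.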
Results: Among 41,730 PubMed entries from the major HIV/AIDS journals (AIDS, AIDS Behav, AIDS Patient Care STDS, AIDS Res Ther, Curr HIV/AIDS Rep, Curr Opin HIV AIDS, HIV Med, J Acquir Immune Defic Syndr, J Int AIDS Soc, Lancet HIV), 31,665 contained an abstract. In the early years, most abstracts reporting statistical inference contained only significance terminology, without confidence intervals or p-values. From 1988 to 2005, 30% of all abstracts each year contained p-values without confidence intervals. Thereafter, this reporting style declined. The reporting of confidence intervals increased steadily from 1988 (11%) to 2022 (56%). Of the 17% of abstracts in 2017-2022 that included any effect measure, half reported odds ratios (51%), followed by hazard ratios (28%) and risk ratios (16%). Difference measures and number needed to treat or harm were very uncommon.
Discussion: Within the HIV/AIDS literature, there has been widespread use of confidence intervals. In most of the journals that we reviewed, reporting of statistical significance alone, without confidence intervals, decreased over time. The distribution of p-values shows little indication of p-hacking and looks very similar to the p-value distribution in the entire PubMed database. The HIV literature appears to rely more on confidence intervals than the previously reviewed literature.