Users' guide to detecting misleading claims in clinical research reports
BMJ (British Medical Journal)
1 Department of Medicine, Mayo Clinic College of Medicine, Rochester MN, USA, 2 Department of Medicine, McMaster University, Hamilton ON, Canada, 3 Department of Clinical Epidemiology and Biostatistics, McMaster University, 4 Polish Institute for Evidence Based Medicine, Jagiellonian University Medical School, Krakow, Poland
Correspondence to: G H Guyatt guyatt@mcmaster.ca
Plenty of advice is available to help readers identify studies with weak methods, but would you be able to identify misleading claims in a report of a well conducted study?
Introduction
The discussion section of research reports often offers inferences that differ from those a dispassionate reader would draw from the methods and results.4 The table gives details of two systematic reviews summarising a similar set of randomised trials assessing the effect of albumin for fluid resuscitation. The trials included in both reviews were small and methodologically weak, and their results were heterogeneous. Both reviews provide point estimates suggesting that albumin may increase mortality, and confidence intervals that include the possibility of a considerable increase in mortality. Nevertheless, one set of authors took a strong position that albumin is dangerous, the other that it is not. Each position was consistent with the interests of the funders of the respective review.5
Comparison of two systematic reviews of albumin for fluid resuscitation
This is not an idiosyncratic example. Systematic examinations of the relation between funding and conclusions have found that the odds of recommending an experimental drug as the treatment of choice increase fivefold with for-profit organisation funding (odds ratio 5.3, 95% confidence interval 2.0 to 14.4) compared with not-for-profit funding.6
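As a reminder of what such an odds ratio means, here is a minimal Python sketch computing one from a 2×2 table. The counts are invented for illustration only; they are not the data behind the cited estimate, although they happen to reproduce an odds ratio of 5.3.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table:
    a = exposed with outcome,     b = exposed without outcome,
    c = non-exposed with outcome, d = non-exposed without outcome.
    OR = (a/b) / (c/d) = (a*d) / (b*c)."""
    return (a * d) / (b * c)

# Hypothetical counts: of 60 for-profit funded trials, 40 recommend the
# experimental drug as treatment of choice (odds 40/20 = 2.0); of 73
# not-for-profit funded trials, 20 recommend it (odds 20/53, about 0.38).
or_estimate = odds_ratio(40, 20, 20, 53)
print(or_estimate)  # 5.3
```

An odds ratio above 1 means the outcome (here, a favourable recommendation) is more likely among the "exposed" group (here, for-profit funding); the confidence interval reported in the text conveys the precision of that estimate.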
If editors insisted that discussion sections of original articles included a systematic review of the relevant literature, this first pointer would no longer be relevant. However, few original trial reports include systematic reviews,7 and this may not change in the foreseeable future.
To follow this advice, readers must be able to make sense of the methods and results. Fortunately, clinicians can access many educational materials to acquire skills in interpreting studies' designs and their findings.2 3
Read the abstract reported in pre-appraised resources
Several systematic reviews have shown that industry funded studies typically yield larger treatment effects than not-for-profit funded studies.9 10 One likely explanation is choice of comparators.11 Researchers with an interest in a positive result may choose a placebo comparator rather than an alternative drug with proved effectiveness. For instance, in a study of 136 trials of new treatments for multiple myeloma, 60% of studies funded by for-profit organisations, but only 21% of trials funded by not-for-profit organisations, compared their new interventions against placebo or no treatment.12 Box A on bmj.com gives other examples.
When reading reports of randomised trials, clinicians should ask themselves: "Should the comparator have been another active agent rather than placebo? And if investigators chose an active comparator, were the dose, formulation, and administration regimen optimal?"
Beware composite end points
Increasingly, investigators are conducting very large trials to detect small treatment effects. Results suggest small treatment effects when either the point estimate is close to no effect (a relative or absolute risk reduction close to 0; a relative risk or odds ratio close to 1) or the confidence interval includes values close to no effect. In one large trial, investigators randomly allocated just over 6000 participants to receive angiotensin converting enzyme inhibitors or diuretics for hypertension and concluded "initiation of antihypertensive treatment involving ACE inhibitors in older subjects... seems to lead to better outcomes than treatment with diuretic agents."15 In absolute terms, the difference between the regimens was very small: there were 4.2 events per 100 patient years in the angiotensin converting enzyme inhibitor group and 4.6 events per 100 patient years in the diuretic group. The relative risk reduction corresponding to this absolute difference (11%) had an associated 95% confidence interval of −1% to 21%.
Here, we have two reasons to doubt the importance of the apparent difference between the two types of antihypertensive drug. Firstly, the point estimate suggests a very small absolute difference (0.4 events per 100 patient years) and, secondly, the confidence interval suggests it may have been even smaller—indeed, there may have been no true difference at all.
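The arithmetic behind these two numbers can be sketched as follows. Note that the trial's published 11% relative reduction comes from a time-to-event (hazard ratio) analysis, so a crude calculation from the raw event rates gives a slightly smaller figure:

```python
# Event rates per 100 patient years, as reported in the trial discussed above.
ace_rate = 4.2       # angiotensin converting enzyme inhibitor group
diuretic_rate = 4.6  # diuretic group

# Absolute risk reduction: the simple difference in event rates.
arr = diuretic_rate - ace_rate       # 0.4 events per 100 patient years

# Crude relative risk reduction: the difference as a fraction of the
# control (diuretic) rate. This gives about 9%; the published 11% figure
# derives from the trial's hazard ratio model, not from this ratio.
rrr = arr / diuretic_rate
```

The point is that the same data yield a small-looking absolute number (0.4 per 100 patient years) and a larger-looking relative number, and a report can choose which to emphasise.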
When the absolute risk of adverse events in untreated patients is low, the presentation may focus on relative risk reduction and de-emphasise or ignore absolute risk reduction. Other techniques for making treatment effects seem large include misleading graphical representations,16 and using different time frames to present harms and benefits (see box C on bmj.com).
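A short sketch with invented numbers shows how this framing works: when baseline risk is low, a large relative risk reduction can correspond to a negligible absolute benefit.

```python
# Hypothetical low-baseline-risk scenario (numbers invented for illustration).
baseline_risk = 0.002   # 0.2% of untreated patients have the event
treated_risk = 0.001    # 0.1% of treated patients have the event

# "50% relative risk reduction" sounds impressive...
rrr = (baseline_risk - treated_risk) / baseline_risk   # 0.5

# ...but the absolute risk reduction is only 0.1 percentage points,
arr = baseline_risk - treated_risk                     # 0.001

# so 1000 patients must be treated to prevent one event.
nnt = 1 / arr
```

Reporting only the 50% figure, without the absolute reduction or the number needed to treat, is exactly the presentation technique the paragraph above warns about.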
Beware subgroup analyses
We have presented six pointers to help clinicians protect themselves and their patients from potentially misleading presentations and interpretations of research findings. These strategies are unlikely to be foolproof. Decreasing the dependence of the research endeavour on for-profit funding, implementing a requirement for mandatory registration of clinical trials, and instituting more structured approaches to reviewing and reporting research18 19 may reduce biased reporting. At the same time, it is likely that potentially misleading reporting will always be with us, and the guide we have presented will help clinicians to stay armed.
Illustrative examples and references w1-w15 are on bmj.com
Contributors and sources: The authors are clinical epidemiologists who are involved in critical review of medical and surgical studies as part of their educational and editorial activities. In that context, they have collected some of the examples presented here. Other examples were collected by searching titles of records in PubMed using the textwords "misleading," "biased," and "spin" using the Related Articles feature for every pertinent hit, and by noting relevant references from retrieved papers. VMM and GHG wrote the first draft of the article. All authors contributed to the ideas represented in the article, made critical contributions and revisions to the first draft, and approved the final version. GHG is guarantor.
Funding: VMM is a Mayo Foundation scholar. PJD is supported by a Canadian Institutes of Health Research senior research fellowship award. MB was supported by a Detweiler fellowship, Royal College of Physicians and Surgeons of Canada and is currently supported by a Canada Research Chair.
Competing interests: VMM, RJ, HJS, PJD, and GHG are associate editors of the ACP Journal Club and Evidence-Based Medicine. JLB and RJ edit an evidence based medicine journal in Poland. MB is editor of the evidence based orthopaedic trauma section in the Journal of Orthopaedic Trauma.
References
Horton R. The rhetoric of research. BMJ 1995;310: 985-7.
Greenhalgh T. How to read a paper. London: BMJ Books, 2001.
Guyatt G, Rennie D, eds. Users' guides to the medical literature. A manual for evidence-based clinical practice. Chicago, IL: AMA Press, 2002.
Bero LA, Rennie D. Influences on the quality of published drug studies. Int J Technol Assess Health Care 1996;12: 209-37.
Cook D, Guyatt G. Colloid use for fluid resuscitation: evidence and spin. Ann Intern Med 2001;135: 205-8.
Als-Nielsen B, Chen W, Gluud C, Kjaergard LL. Association of funding and conclusions in randomized drug trials: a reflection of treatment effect or adverse events? JAMA 2003;290: 921-8.
Clarke M, Alderson P, Chalmers I. Discussion sections in reports of controlled trials published in general medical journals. JAMA 2002;287: 2799-801.
Devereaux PJ, Manns BJ, Ghali WA, Quan H, Guyatt GH. Reviewing the reviewers: the quality of reporting in three secondary journals. CMAJ 2001;164: 1573-6.
Bekelman JE, Li Y, Gross CP. Scope and impact of financial conflicts of interest in biomedical research: a systematic review. JAMA 2003;289: 454-65.
Lexchin J, Bero LA, Djulbegovic B, Clark O. Pharmaceutical industry sponsorship and research outcome and quality: systematic review. BMJ 2003;326: 1167-70.
Mann H, Djulbegovic B. Biases due to differences in the treatments selected for comparison (comparator bias). James Lind Library. www.jameslindlibrary.org/essays/bias/comparator-_bias.html (accessed 2 Oct 2004).
Djulbegovic B, Lacevic M, Cantor A, Fields KK, Bennett CL, Adams JR, et al. The uncertainty principle and industry-sponsored research. Lancet 2000;356: 635-8.
Lewis EJ, Hunsicker LG, Clarke WR, Berl T, Pohl MA, Lewis JB, et al. Renoprotective effect of the angiotensin-receptor antagonist irbesartan in patients with nephropathy due to type 2 diabetes. N Engl J Med 2001;345: 851-60.
Waksman R, Ajani AE, White RL, Chan RC, Satler LF, Kent KM, et al. Intravascular gamma radiation for in-stent restenosis in saphenous-vein bypass grafts. N Engl J Med 2002;346: 1194-9.
Wing LM, Reid CM, Ryan P, Beilin LJ, Brown MA, Jennings GL, et al. A comparison of outcomes with angiotensin-converting-enzyme inhibitors and diuretics for hypertension in the elderly. N Engl J Med 2003;348: 583-92.
Tufte E. The visual display of quantitative information. Cheshire, CT: Graphics Press, 1983.
Oxman A, Guyatt G, Green L, Craig J, Walter S, Cook D. When to believe a subgroup analysis. In: Guyatt G, Rennie D, eds. Users' guides to the medical literature. A manual for evidence-based clinical practice. Chicago, IL: AMA Press, 2002: 553-65.
Docherty M, Smith R. The case for structuring the discussion of scientific papers. BMJ 1999;318: 1224-5.
Moher D, Schulz KF, Altman DG. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet 2001;357: 1191-4.