Impact of the ICD-10 Primary Health Care(PHC) diagnostic and management guidelines for mental disorders on detection and outcome in primary careClust
Department of Psychiatry, University of Cambridge, UK
JONATHAN EVANS, MRCPsych and GLYNN HARRISON, FRCPsych
Division of Psychiatry, University of Bristol, UK
DEBORAH J. SHARP, FRCGP
Division of Primary Health Care, University of Bristol, UK
ELLEN WILKINSON, MRCPsych, GEMMA McCANN, BA, MATHEW SPENCE, BSc, CATHERINE CRILLY, MRCGP and LUCY BRINDLE, BSc
Division of Psychiatry, University of Bristol, UK
Correspondence: Tim Croudace, Department of Psychiatry, University of Cambridge, Box 189, Addenbrooke's Hospital, Cambridge CB2 2QQ, UK
Declaration of interest D.J.S. was involved in the development of the WHO guidelines.
† See editorial, pp. 1–2, this issue.
ABSTRACT
Background The World Health Organization (WHO) ICD-10 Primary Health Care (PHC) Guidelines for Diagnosis and Management of Mental Disorders (1996) have not been evaluated in a pragmatic randomised controlled trial (RCT).
Aims To evaluate the effect of local adaptation and dissemination of the guidelines.
Method Pragmatic, pair-matched, cluster RCT involving 30 practices.
Results Guideline practices were less sensitive but more specific in identifying morbidity, but these differences were not significant. Guideline patients did not differ from usual-care patients on 12-item General Health Questionnaire scores at 3-month follow-up or in the proportion who were still cases. There were no significant differences in secondary outcomes.
Conclusions Attempts to influence clinician behaviour through a process of adaptation and extension of guidelines are unlikely to change detection rates or outcomes.
INTRODUCTION
The majority of patients with mental health problems present to primary health care (PHC) services, yet general practitioners' (GPs') detection and management are often considered deficient. Improvement in the knowledge and skills of primary care practitioners (Gask et al) has been sought through the development of clinical guidelines, educational programmes, on-site mental health workers and shared care. Evidence for the effectiveness of such approaches is contradictory, with benefits observed in some settings but not others. Current emphasis focuses on educational interventions based on clinical practice guidelines. The World Health Organization (WHO) undertook a major review of Chapter V of ICD-10 (on mental and behavioural disorders) specifically for primary health care practitioners. The new PHC version (ICD-10 PHC) proposed both a general diagnostic classification for use in PHC and recommendations on management. This system was subjected to international field trials in which it was evaluated for acceptability and ease of application. No study has evaluated the impact of introducing such guidelines in a pragmatic randomised controlled trial (RCT). We developed a process for local adaptation and dissemination of the ICD-10 PHC (1996), intended to engender shared ownership between primary and secondary care practitioners. We evaluated this development of the guidelines in a pragmatic cluster RCT. Our hypotheses were that enabling GPs to adapt and extend the guidelines in conjunction with health care professionals from secondary services would improve practice detection rates of minor psychiatric morbidity, and patient outcomes at 3 months.
METHOD
Study area and eligibility of practices
The study was conducted in Bristol, UK (pre-intervention data collection: 9 October 1997 to 9 April 1998; post-intervention data collection: 2 September 1998 to 13 May 1999) in a mixed urban and rural area (population 178 000 aged 16-64). Mental Illness Needs Index social deprivation scores for electoral wards ranged from 83 to 118. All 43 general practices located within the catchment area of South Bristol Mental Health Services were eligible and invited to participate (by letter from G.H. and D.J.S.). Participating practices were reimbursed to cover costs of time spent in guideline adaptation meetings and administrative support for the study. Approval was obtained from local ethics committees.
Design and process of randomisation
We used a pair-matched, cluster RCT design. Practices were randomised in pairs after stratifying by social deprivation score. It was considered a priori that the socio-economic characteristics of patients and practice settings might influence outcomes. Using the rand function in Excel, 15 random numbers between 0 and 1 were generated (by T.C.). In each pair, the first practice was assigned to the intervention group if the number was ≤0.5, and the second if >0.5. Of the 43 practices invited, 30 (70%) consented to randomisation. Figure 1 summarises the trial design and the recruitment and retention of practices.
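The allocation rule described above can be sketched in a few lines. This is an illustration only: the trial used the rand function in Excel, whereas the sketch uses Python's `random` module, and the practice labels, the function name `allocate_pairs` and the seed are all hypothetical.

```python
import random


def allocate_pairs(pairs, rng):
    """Assign one practice in each matched pair to the guideline arm.

    pairs: list of (first_practice, second_practice) tuples, already
           matched on social deprivation score.
    rng:   a random.Random instance (stands in for Excel's rand function).
    """
    allocation = []
    for first, second in pairs:
        u = rng.random()  # one uniform draw per pair
        if u <= 0.5:
            guideline, usual_care = first, second
        else:
            guideline, usual_care = second, first
        allocation.append(
            {"guideline": guideline, "usual_care": usual_care, "draw": u}
        )
    return allocation


# Hypothetical labels for 15 matched pairs (30 practices).
pairs = [(f"P{2 * i + 1:02d}", f"P{2 * i + 2:02d}") for i in range(15)]
allocation = allocate_pairs(pairs, random.Random(1998))  # seed is arbitrary
```

One draw per pair guarantees that each stratum contributes exactly one guideline and one usual-care practice, which is what keeps the arms balanced on deprivation.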
Fig. 1 Trial design, recruitment and retention of practices. CMHT, community mental health team; GP, general practitioner; GHQ, General Health Questionnaire.
Sample size
Sample size calculations were based on patient-level outcomes at 3 months among those with General Health Questionnaire 12-item version (GHQ-12) scores >3 at the screen. We aimed to detect a mean difference of 1 point (standard deviation=3) in the GHQ-12 score at 3-month follow-up using a two-tailed test, alpha=0.05, beta=0.20. This required 143 patients (in each group), and therefore an initial screen of approximately 1000 surgery attenders (assuming 30% score >3 at the screen).
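As a rough check on these figures, the standard normal-approximation formula for comparing two means can be evaluated directly. The formula choice is an assumption, since the text does not state which method or software was used; it gives about 141 to 142 patients per group, slightly below the 143 reported, a gap that may reflect a small-sample correction. The function name `n_per_group` is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per arm for a two-sided comparison of two means
    (normal approximation; an assumed method, not stated in the paper)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2


n = n_per_group(delta=1.0, sd=3.0)  # about 141.3
n_rounded = math.ceil(n)            # 142 under this approximation

# Attenders to screen so that 2 x 143 patients score >3, if 30% do so.
screen_needed = 2 * 143 / 0.30      # about 953, i.e. roughly 1000
```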
Intracluster correlation
Baseline data (n=30 practices) were used to estimate the variance inflation factors, e.g. the intracluster correlation for the GHQ-12 scores from the screen was 0.012 (average cluster size, 37.04; design effect, 1.43). The intraclass correlation for change in GHQ-12 scores among those scoring >3 at the screen (during baseline) was 0.038 when clustered by general practice. The average cluster size was 8.4 patients per practice followed up. The design effect for patient outcomes at follow-up was therefore 1.3, requiring 186 patients in each group or 372 in total.
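The design effects quoted above follow from the usual variance inflation formula, 1 + (m - 1) x ICC, where m is the average cluster size. A minimal sketch reproducing the reported arithmetic (the function name is illustrative):

```python
import math


def design_effect(icc, mean_cluster_size):
    """Variance inflation factor for equal-sized clusters."""
    return 1 + (mean_cluster_size - 1) * icc


# Screening GHQ-12 scores: ICC 0.012, average cluster size 37.04.
deff_screen = design_effect(0.012, 37.04)    # about 1.43

# Change in GHQ-12 among screen positives: ICC 0.038, 8.4 per practice.
deff_followup = design_effect(0.038, 8.4)    # about 1.28, reported as 1.3

# Inflate the 143 per group from the unclustered calculation.
n_per_group = math.ceil(143 * round(deff_followup, 1))  # 186
n_total = 2 * n_per_group                               # 372
```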
Baseline screening and follow-up
During baseline and post-intervention periods we screened separate cross-sectional samples of consecutive attenders and followed them up by postal questionnaire at 3 months. Research workers visited each practice for at least two randomly selected surgeries to distribute copies of the GHQ-12 to all surgery attenders aged between 16 and 64 years who gave verbal consent. During these surgeries, GPs completed a Physician Encounter Form for each patient. Practitioners were asked to record reasons for consultation, presenting symptoms, severity of disorder and diagnoses selected from a list based on the ICD-10 PHC chapter headings. Where no disorder was present, they were asked to indicate ‘No diagnosis of psychological disorder’. This process was repeated post-intervention. All consecutive attenders who scored >3 on the GHQ-12 at initial screening were followed up at 3 months (regardless of GP detection). Outcomes were collected via postal administration of four self-report questionnaires, which were returned in the stamped, addressed envelopes provided. Non-responders were sent second and third reminders.
The intervention
The intervention comprised the local development and dissemination of the WHO ICD-10 PHC guidelines (1996 version, which was ‘current’ at that time). Acknowledging evidence that emphasised the need for ownership of guidelines and active participation in their development, we provided participating GPs with the opportunity to adapt the WHO guidelines in a shared-ownership model with colleagues from local psychiatric services. One GP from each intervention practice volunteered to become the guideline advocate, and took part in a series of guideline revision workshops based on a modified nominal group technique. During these workshops, attended by professionals from primary and secondary care (some jointly), the guidelines were:
revised to reflect the consensus of participating practitioners from primary and secondary services;
amended, e.g. to include recommendations concerning use of practice-based counsellors;
extended, to include thresholds for specialist referral and to incorporate a list of local statutory National Health Service (NHS) and non-statutory services to which referrals could be made or who offered specific help.
An editorial team comprising primary care and psychiatric representatives of the research team incorporated the changes into a final document (‘the purple book’). In addition to the (indirect) dissemination through guideline-advocate participation in the above, participating GPs received a personal, desktop copy of the guidelines. Educational meetings (approved for Post-Graduate Education Allowance accreditation) were then organised in each intervention practice, facilitated by the guideline advocate and attended by a GP (C.C.) and psychiatrist (E.W.) from the research team. At these meetings the process of adaptation was described, and the guidelines were introduced and discussed.
Outcomes
Primary outcomes were: detection of minor psychiatric morbidity (sensitivity) at practice level, the unit of randomisation; and 3-month clinical outcomes for GHQ-12 cases. The latter were measured by GHQ-12 score at follow-up and the proportion who were still cases, i.e. scoring >3. Secondary outcomes were quality of life (QoL), disability, satisfaction with care and the specificity of detection performance (at practice level).
Measures
A GHQ-12 score of >3 was used to define a case for the purpose of calculating the GP identification indices (sensitivity and specificity) for detection of morbidity. Repeat GHQ-12 was used to record 3-month clinical outcomes. Impact on role-functioning was recorded using the sum of questions 2 to 6 on the Brief Disability Questionnaire (BDQ). This comprises five items: limitation in daily activities; limitation in functioning; motivation for work; personal efficiency; and deterioration in social relations. These were rated on a 3-point scale: 1=no, not at all; 2=yes, sometimes or a little; 3=yes, moderately or definitely. Total score ranged from 5 to 15, high indicating worse disability.
Quality of life was recorded by the five-item European Quality of Life (EuroQol) instrument. Items were summed to give a total score (range 5 to 15; high indicating worse QoL). A single question assessed satisfaction with care received: ‘How satisfied are you overall with the care you have recently received from your doctor?’ Responses were rated on a 5-point scale: 1=terrible; 2=mostly dissatisfied; 3=mixed views; 4=mostly satisfied; 5=excellent.
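The GP identification indices defined at the start of this section can be illustrated with a short sketch. It assumes each attender contributes a GHQ-12 score and a yes/no flag for whether the GP recorded at least one named psychological disorder; the records below are made up for illustration and are not trial data.

```python
def detection_indices(records, threshold=3):
    """Sensitivity and specificity of GP identification for one practice.

    records: iterable of (ghq12_score, gp_identified) pairs.
    A patient is a case when the GHQ-12 score is above the threshold (>3).
    """
    tp = fn = tn = fp = 0
    for score, identified in records:
        is_case = score > threshold
        if is_case and identified:
            tp += 1      # case, identified by the GP
        elif is_case:
            fn += 1      # case, missed by the GP
        elif identified:
            fp += 1      # non-case, given a psychological diagnosis
        else:
            tn += 1      # non-case, no psychological diagnosis
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical attenders: (GHQ-12 score, GP recorded a disorder?)
records = [(5, True), (7, False), (4, True), (2, False),
           (0, False), (3, True), (1, False), (9, True)]
sensitivity, specificity = detection_indices(records)
```

Note that a score of exactly 3 counts as a non-case, because the case definition is a score strictly above 3.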
Analysis
Random effects meta-analysis was used to provide graphical and statistical summaries of all primary (sensitivity, repeat GHQ-12) and secondary (specificity, disability, satisfaction and QoL) outcomes. This procedure generates a weighted average intervention effect (with 95% confidence intervals) pooled over the practice pairs, which were stratified by social deprivation. It also produced a z-score and P-value for the test that the intervention effect was significantly different from zero. Analyses were performed using the metan meta-analysis procedure in Stata version 6 for PC. Since measures of baseline performance (practice sensitivity and specificity before the introduction of the guidelines) were recorded, these were entered as covariates in a regression extension of the random effects meta-analysis procedure. We used the meta-regression approach recommended by Ukoumunne & Thompson to correct for baseline imbalance in study outcomes (at the cluster level). Meta-regression analysis in a pair-matched cluster RCT provides an (analysis of covariance style) adjustment to the estimated risk difference that corrects for any baseline differences in outcomes that might have resulted from randomising a small number of experimental units (as is the case in cluster RCTs). To implement the adjusted analyses we used the metareg procedure in Stata with the additive between study variance (tau) estimated using the method of moments (option bs(mm)). To maximise sample size for the analysis of the outcomes at 3-month follow-up, patients were included even if the GP had not completed the Physician Encounter Form. No adjustment was made for patient-level covariates. All analyses were on an intention-to-treat basis.
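The unadjusted pooling step can be sketched as follows, treating each practice pair as one 'study'. This is not the Stata code used in the trial: it assumes a DerSimonian and Laird style method-of-moments estimate of the between-pair variance, which is an assumption about the estimator, and the counts are made up. The baseline-adjusted meta-regression is not reproduced here.

```python
import math
from statistics import NormalDist


def risk_difference(events_g, n_g, events_u, n_u):
    """Risk difference (guideline minus usual care) and its variance."""
    p_g, p_u = events_g / n_g, events_u / n_u
    var = p_g * (1 - p_g) / n_g + p_u * (1 - p_u) / n_u
    return p_g - p_u, var


def pool_random_effects(effects, variances):
    """Random effects pooled estimate (method-of-moments tau squared)."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-pair variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    z = pooled / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))         # two-sided P-value
    return {"pooled": pooled, "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "z": z, "p": p, "tau2": tau2}


# Hypothetical pairs: (cases detected, cases) for guideline, then usual care.
pairs_data = [(12, 25, 15, 27), (9, 20, 11, 21), (14, 30, 13, 24)]
per_pair = [risk_difference(*row) for row in pairs_data]
result = pool_random_effects([rd for rd, _ in per_pair],
                             [var for _, var in per_pair])
```

When the pairs agree closely, the estimated between-pair variance is truncated at zero and the result coincides with a fixed effect (inverse-variance) average.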
RESULTS
Figure 2 summarises the flow of patients and practices.
Fig. 2 Flow of patients through screening and follow-up. PEF, Physician Encounter Form; GHQ, General Health Questionnaire.
The administrative characteristics of the consenting practices (30/43) and those who declined to participate are summarised in Table 1.
Table 1 Practice characteristics; values are number (percentage) unless otherwise specified
The characteristics of the participating GPs and of the sample of consecutive attenders for whom a matching Physician Encounter Form was collected appeared to indicate a balanced outcome of (cluster-level) randomisation, after stratifying by (practice) social deprivation score.
Table 2 Practitioner characteristics; values are number (percentage) unless stated otherwise
Table 3 Patient characteristics (for sample with complete GHQ and Physician Encounter Form); values are number (percentage) unless otherwise specified
Figure 3 shows the (very similar) cumulative distribution of GHQ-12 scores in guideline and usual-care practices, for consecutive attenders during the post-intervention period.
Fig. 3 General Health Questionnaire 12-item version (GHQ-12) scores in guideline and usual-care practices.
Primary cluster-level outcome: GP detection (sensitivity)
Identification of disorder required GPs to have indicated on the Physician Encounter Form the presence of at least one named psychological disorder from the list of ICD-10 PHC diagnoses. After intervention, the crude detection rate (sensitivity) for GPs in the guideline practices was 47%, compared with 55% in the usual-care practices.
Table 4 Practice detection rates during baseline period
The pooled risk difference between guideline and usual care was -10.8% (95% CI -24.0% to 2.4%), which was not significant (z=1.61, P=0.11). The unadjusted analysis is summarised in Fig. 4, which shows the risk difference for each pair and contributions to the pooled effect size (random-effects meta-analysis). The confidence limits for the intervention effect suggest that the guideline practices were less successful in identifying GHQ morbidity. However, this trend was reduced and estimated more precisely (evidenced by the reduction in width of the confidence interval) when the adjustment for baseline outcomes was made: after adjustment for baseline sensitivity, the difference was -6.6% (95% CI -19.0% to 5.9%; z=1.03, P=0.304). The cluster-level correlation between baseline and post-intervention sensitivity was 0.45 (Pearson correlation, P=0.07). The significance of the baseline adjustment in the meta-regression analysis was P=0.03, which explains the slight increase in the precision of the estimated intervention effect from the meta-regression analysis. The estimated effect of the intervention was also reduced by almost half (from -10.8% to -6.6%).
Fig. 4 Random effects meta-analysis plot showing differences in practice detection by pair: (a) sensitivity and (b) specificity. Pooled estimates represent unadjusted and baseline-adjusted (meta-regression) weighted risk differences.
Secondary cluster-level outcome: GP detection (specificity)
After intervention, the crude specificities achieved by guideline and usual-care practices were 86% and 79%, respectively. The pooled risk difference between guideline and usual care, for the secondary cluster-level outcome practice specificity, was 5.3% (95% CI -5.0% to 15.7%), which was not significant (z=1.01, P=0.31).
Table 5 Practice detection rates during post-intervention period
After adjustment for baseline specificity, this difference increased slightly, to 6.2% (95% CI -4.4% to 16.8%; z=1.14, P=0.255). However, the baseline adjustment in the meta-regression analysis was not significant (P=0.416), explaining the decrease in the precision of the estimated intervention effect. The cluster-level correlation between baseline and post-intervention specificity was 0.21 (Pearson correlation, P=0.52). The baseline covariate was therefore not prognostic for intervention outcomes.
Postal questionnaire follow-up at 3 months
During both baseline and post-intervention periods, we followed up all consecutive attenders who scored >3 on the GHQ-12 screen. The response rate to the postal questionnaire follow-up during the post-intervention period was 61% for guideline and 62% for usual-care practices. Inspection of Tables 6 and 7 demonstrates that response rates were lower from practices in socially deprived areas. The response rate was unusually low for guideline practices during the pre-intervention baseline period (49%). The correlation between social deprivation score and response rate was greater than 0.3 (Spearman's rank correlation coefficient) during both baseline and post-intervention periods.
Table 6 Patient-level outcomes during baseline period; higher scores indicate worse outcomes (except for satisfaction)
Table 7 Patient-level primary (GHQ) outcomes during post-intervention period
Primary outcome measure, patient-level: repeat GHQ-12
There was no evidence for any impact of the intervention on our primary clinical outcome for patients, i.e. the repeat GHQ-12 score (difference in mean GHQ-12 at 3 months, guideline minus usual care (G-UC)=0.45, 95% CI -1.42 to 2.33; P=0.63), nor in the proportion of patients who were still scoring above the threshold for caseness (difference in proportion scoring >3 on GHQ-12 at 3 months, G-UC=4.3%, 95% CI -12.4% to 20.9%). Results indicated worse outcomes (higher GHQ-12 scores and more cases at 3-month follow-up) in the guideline practices than in usual care, although the confidence intervals were wide.
Secondary outcomes, patient-level: disability, satisfaction and QoL
There were no differences in satisfaction (difference in mean satisfaction, G-UC=0.20, 95% CI -0.05 to 0.45; P=0.12) or disability (difference in mean BDQ, G-UC=0.68, 95% CI -0.21 to 1.56; P=0.13) between patients managed by GPs who had received the guidelines and those in the usual-care group (Tables 6 and 7). The trend was for greater satisfaction among patients managed by guideline practices but worse disability (neither significant). The only difference to approach significance was for the EuroQol score (difference in mean EuroQol, G-UC=0.75, 95% CI -0.11 to 1.61; P=0.09), indicating worse QoL among patients managed by guideline practices (trend level P<0.10).
DISCUSSION
Many guidelines for the diagnosis and management of psychiatric morbidity in primary care have been developed, but they vary in scope and quality. Few have been evaluated in pragmatic RCTs. The WHO ICD-10 PHC guidelines have been widely disseminated. Upton and colleagues reported some benefits in a controlled before-and-after study.
Main findings
We evaluated a process of local adaptation and dissemination of the 1996 WHO guidelines to see whether there was any impact on clinician behaviour or clinical outcomes for patients. There were two main findings of this study. First, we found no evidence that implementing these guidelines, through our local process of adaptation and extension, which was intended to engender ‘shared ownership’, had an impact on practitioners' detection performance (sensitivity or specificity). Second, there was no effect on clinical outcomes for patients: repeat GHQ-12 scores (mean and proportion that remained cases), disability and satisfaction did not differ significantly between guideline and usual-care practices. Contrary to expectation, the guideline practices achieved higher average disability scores (indicating worse outcome), greater satisfaction with care received but worse quality of life. None of these comparisons was statistically significant and confidence intervals around estimated intervention effects were quite wide. The trend for worse QoL (one of the four secondary outcomes) may simply be a type 1 error.
Strengths
Our results are based on a sample of practices from three sectors of a large, urban mental health service, more than 2000 screened patients and more than 100 GPs detecting disorder. Over two-thirds of the practices approached participated, including single-handed and fundholding practices. The characteristics of our sample correspond well with what is already known of the epidemiology of psychological distress in PHC and its detection by GPs. We therefore expect that our negative findings are widely generalisable. We used a pair-matched design to ensure that the outcome of randomisation was balanced for social deprivation, which we thought would have an influence on our practice- and patient-level outcomes. Our decision to match a priori on social deprivation appears to be justified, since loss to follow-up at the 3-month postal questionnaire survey was correlated with deprivation.
Our criterion for evaluating GP detection performance was a score of >3 on the GHQ-12, not a clinical interview, and our outcomes were all self-report. These design considerations were pragmatic and made it possible to implement the study in a large number of practices.
Limitations
There were small differences in the baseline detection performance of the guideline and usual-care practices. Where possible (for practice detection outcomes) we applied a meta-regression approach that enabled us to adjust for baseline imbalance using cluster-level performance from the baseline detection phase as a covariate. This approach has been recommended for cluster randomised trials with repeated cross-sectional designs, where different patients are surveyed during pre-intervention and post-intervention periods. It is preferred over analyses of change from baseline (estimated using an interaction of intervention group by time-period) because baselines are usually measured with low precision. An additional factor is that the baseline outcome may not be prognostic, i.e. may not correlate with intervention outcomes. These design features can lead to bias in results and increase the noise, leading to a reduction in the precision of the estimated intervention effect. Ideally, enough clusters (practices) would be recruited to reduce the potential for a poor outcome of randomisation. In our case the baseline adjustment did not alter the conclusions, with baseline performance proving useful (prognostic) for only one of the two cluster-level outcomes (practice sensitivity). When the baselines are not prognostic, Ukoumunne & Thompson have argued that interpretation should focus on the unadjusted effect, since the adjusted analysis places too much weight on the baselines. Our baseline measures were based on small samples, which limited our potential to adjust for differences between the practices that arose as an outcome of randomisation. For one outcome (sensitivity), our meta-regression adjustment increased the precision of the estimated intervention effect. For the other (specificity), the approach simply added noise.
A low response rate to follow-up questionnaires in the guideline practices during the baseline period prohibited use of the meta-regression procedure for clinical outcomes. It might otherwise have been possible to aggregate these outcomes to cluster level and use them as covariates.
It is possible that our use of a categorical diagnostic approach reduced the fidelity of measurement of practitioner and patient variation. It is, nevertheless, an accepted tradition in primary care psychiatric research. We do not know to what extent the GPs made use of our guideline handbook, nor do we know the extent to which the guideline advocate was able to disseminate their contents to other primary care colleagues. We did not measure, but were not made aware of, any contamination between the guideline and usual-care practices. The study could not be blinded since the development of the intervention comprised participation in a local adaptation process and receipt of a personal copy of the guidelines.
Hampshire Depression Project
Our findings are consistent with those of the Hampshire Depression Project (HDP), a larger cluster RCT of educational intervention for GPs on the recognition, management and treatment of depression. The HDP, which involved 60 practices and a self-selected sample of over 150 physicians, evaluated a more intensive educational approach to dissemination of a clinical practice guideline for depression, using a continuing medical education model (with quality testing of the educational component). The HDP screened and followed up more patients and involved more practitioners, but their sample of GPs was self-selected within participating practices. The participation rate among invited practices was much higher in our study than in the HDP (70% v. 26%) and all practitioners within participating practices were monitored, which may improve generalisability. Response rates to postal questionnaires were similar in both studies. In the HDP, response rates at 6 weeks ranged from 48% to 70%, depending on stage of study.
The future
Over the past few years, studies on guideline dissemination have consistently failed to demonstrate significant effectiveness in changing clinician behaviour. Evaluations of more structured implementation strategies have produced some favourable results, however, and we therefore designed and evaluated an education-based implementation strategy. Because of practical limitations, we were unable to measure important process variables, and in attempting to interpret our negative result we cannot discriminate between several possible explanations. These include failure of the GPs to read the guidelines, failure to implement them and failures in the content of the guidelines themselves in terms of their evidence base or relevance. Although there can be no doubt that guidelines such as those examined here are an important source of reference and guidance for PHC physicians, their effectiveness in changing clinician behaviour will require more complex and evidence-based strategies, probably involving multi-faceted targeting of interventions.
Since this study was carried out, the 1996 WHO guidelines have been further adapted. The latest version (currently unevaluated) is available free from the WHO collaborating centre website.
Clinical Implications and Limitations
CLINICAL IMPLICATIONS
Participation in a process of adaptation and extension of the ICD-10 Primary Health Care Guidelines failed to change practitioner behaviour (detection rates: sensitivity and specificity) or influence patient outcomes (General Health Questionnaire, disability, satisfaction, quality of life). Only specificity and satisfaction favoured guideline practices.
These negative findings highlight limitations in the ability of guideline interventions to influence UK general practitioners' (GPs') management of psychiatric morbidity.
These results are consistent with other studies in the UK that have adopted an intensive approach to dissemination of guidelines, e.g. medical education models.
LIMITATIONS
We did not measure whether GPs used the guidelines, nor did we measure any contamination that may have influenced the performance of usual-care practices.
Despite randomisation (at cluster level), there were small differences in baseline detection performance between practices. Poor response to the 3-month, postal questionnaire follow-up for guideline practices during the baseline period limited adjustment for baseline to detection outcomes only, and not for patient outcomes.
Analysis did not take into account missing data from patients who did not respond to postal questionnaire follow-up, although stratification by social deprivation may have helped to reduce bias due to loss to follow-up (by ensuring balance). Power to detect small intervention effects for patient-level outcomes was low, and no adjustment was made for possible imbalance in patient-level covariates.
ACKNOWLEDGMENTS
Guideline development and revision was performed by a panel comprising: Dr C. Ree, Dr T. J. Gollin, Dr M. Byron, Dr N. M. Reynolds, Dr P. Bagshaw, Dr M. Jones, Dr S. C. Pietersen, Dr D. S. Kessler, Dr S. Meehan, Dr D. R. Darvill, Dr J. E. Jones, Dr T. M. Southwood, Dr J. P. S. Parrott, Dr J. S. Wright, Dr T. H. Frewin, Dr N. P. Ring, Dr W. Mitchell, Dr M. Evans, T. Hext, K. Macleod, P. Appleton, P. Evans, M. Hilton, S. Summerhayes, M. Sharman, S. Smith, J. Baker, Dr M. Mitcheson, A. Hampshire, J. L. Llewelyn, Dr J. Potokar, R. Jones, Dr N. Moore, F. Williams.
The editorial team were: C. Crilly (Primary Care), J. Evans (Psychiatry), G. Harrison (Psychiatry), G. McCann, D. Sharp (Primary Care), C. Smith and E. Wilkinson (Psychiatry).
B. Blair assisted with the guide to the Mental Health Act and M. Fairfoot provided assistance with the resources directory and guide to statutory services. J. Colyer provided administrative support.
The Evaluating Guideline Outcomes (EGO) study was funded by the Department of Health (grant reference JRI26/0634). T.C. was supported by a Medical Research Council Training Fellowship in Health Services Research.
REFERENCES
Borowsky, S. J., Rubenstein, L. V., Meredith, L. S., et al (2000) Who is at risk of nondetection of mental health problems in primary care? Journal of General Internal Medicine, 15, 381-388.
Bower, P. & Sibbald, B. (2000) Systematic review of the effect of on-site mental health professionals on the clinical behaviour of general practitioners. BMJ, 320, 614-617.
Cornwall, P. L. & Scott, J. (2000) Which clinical practice guidelines for depression? An overview for busy practitioners. British Journal of General Practice, 50, 908-911.
Gask, L., Goldberg, D., Lesser, A., et al (1988) Improving the psychiatric skills of the general practice trainee: an evaluation of a group training course. Medical Education, 22, 132-138.
Gask, L., Usherwood, T., Thompson, H., et al (1998) Evaluation of a training package in the assessment and management of depression in primary care. Medical Education, 32, 190-198.
Glover, G. R., Robin, E., Emami, J., et al (1998) A needs index for mental health care. Social Psychiatry and Psychiatric Epidemiology, 33, 89-96.
Goldberg, D., Sharp, D. & Nanayakkara, K. (1995) The field trial of the mental disorders section of ICD-10 designed for primary care (ICD-10-PHC) in England. Family Practice, 12, 466-473.
Goldberg, D., Gater, R., Sartorius, N., et al (1997) The validity of two versions of the GHQ in the WHO study of mental illness in general health care. Psychological Medicine, 27, 191-197.
Goldberg, D., Privett, M., Ustun, B., et al (1998) The effects of detection and treatment on the outcome of major depression in primary care: a naturalistic study in 15 cities. British Journal of General Practice, 48, 1840-1844.
Katon, W. & Schulberg, H. (1992) Epidemiology of depression in primary care. General Hospital Psychiatry, 14, 237-247.
Katon, W., Von Korff, M., Lin, E., et al (1997) Collaborative management to achieve depression treatment guidelines. Journal of Clinical Psychiatry, 58 (suppl. 1), 20-23.
Kind, P. (1996) The EuroQol instrument: an index of health-related quality of life. In Quality of Life and Pharmacoeconomics in Clinical Trials (ed. B. Spiker) (2nd edn), pp. 191-201. Philadelphia, PA: Lippincott-Raven.
Littlejohns, P., Cluzeau, F., Bale, R., et al (1999) The quantity and quality of clinical practice guidelines for the management of depression in primary care in the UK. British Journal of General Practice, 49, 205-210.
Morris, R., Gask, L., Ronalds, C., et al (1998) Cost-effectiveness of a new treatment for somatized mental disorder to GPs. Family Practice, 15, 119-125.
Paykel, E. S. & Priest, R. G. (1992) Recognition and management of depression in general practice: consensus statement. BMJ, 305, 1198-1202.
Simon, G. E. (1998) Can depression be managed appropriately in primary care? Journal of Clinical Psychiatry, 59 (suppl. 2), 3-8.
South Bristol General Practitioners and Specialist Mental Health Services Guideline Adaptation Group (1998) Primary Care Handbook for Mental Disorders: The Bristol Version of ICD-10 PHC Chapter V Guidelines for the Diagnosis and Management of Mental Disorders. Bristol: University of Bristol.
Stevens, L., Kinmonth, A. L., Peveler, R., et al (1997) The Hampshire Depression Project: development and piloting of clinical practice guidelines and education about depression in primary health care. Medical Education, 31, 375-379.
Thompson, C., Kinmonth, A. L., Stevens, L., et al (2000) Effects of a clinical-practice guideline and practice-based education on detection and outcome of depression in primary care: Hampshire Depression Project randomised controlled trial. Lancet, 355, 185-191.
Thompson, S. G., Pyke, S. D. & Hardy, R. J. (1997) The design and analysis of paired cluster randomized trials: an application of meta-analysis techniques. Statistics in Medicine, 16, 2063-2079.
Trickey, H., Harvey, I., Wilcock, G., et al (1998) Formal consensus and consultation: a qualitative method for development of a guideline for dementia. Quality in Health Care, 7, 192-199.
Ukoumunne, O. & Thompson, S. (2001) Analysis of cluster randomized trials with repeated cross-sectional binary measurements. Statistics in Medicine, 20, 417-433.
Ukoumunne, O., Gulliford, M., Chinn, S., et al (1999) Methods for evaluating area-wide and organisation-based interventions in health and health care: a systematic review. Health Technology Assessment, 3.
Upton, M. W., Evans, M., Goldberg, D. P., et al (1999) Evaluation of ICD-10 PHC mental health guidelines in detecting and managing depression within primary care. British Journal of Psychiatry, 175, 476-482.
Ustun, T. B. & Sartorius, N. (eds) (1995) Mental Illness in General Health Care: An International Study. Chichester: John Wiley & Sons.
Von Korff, M., Ustun, T. B., Ormel, J., et al (1996) Self-report disability in an international primary care study of psychological illness. Journal of Clinical Epidemiology, 49, 297-303.
Wang, P. S., Berglund, P. & Kessler, R. C. (2000) Recent care of common mental disorders in the United States: prevalence and conformance with evidence-based recommendations. Journal of General Internal Medicine, 15, 284-292.
World Health Organization (1996) Diagnostic and Management Guidelines for Mental Disorders in Primary Care: ICD-10 Chapter V Primary Care Version. Göttingen: Hogrefe & Huber.
METHOD
Study area and eligibility of practices
The study was conducted in Bristol, UK (pre-intervention data collection: 9 October 1997 to 9 April 1998; post-intervention data collection: 2 September 1998 to 13 May 1999) in a mixed urban and rural area (population 178 000 aged 16-64). Mental Illness Needs Index social deprivation scores for electoral wards ranged from 83 to 118. All 43 general practices located within the catchment area of South Bristol Mental Health Services were eligible and invited to participate (by letter from G.H. and D.J.S.). Participating practices were reimbursed to cover costs of time spent in guideline adaptation meetings and administrative support for the study. Approval was obtained from local ethics committees.
Design and process of randomisation
We used a pair-matched, cluster RCT design. Practices were randomised in pairs after stratifying by social deprivation score, because it was considered a priori that the socio-economic characteristics of patients and practice settings might influence outcomes. Of the 43 eligible practices, 30 (70%) consented to randomisation. Using the rand function in Excel, 15 random numbers between 0 and 1 were generated (by T.C.), one for each pair. In each pair, the first practice was assigned to the intervention group if the number was ≤0.5, and the second if it was >0.5. Figure 1 summarises the trial design and the recruitment and retention of practices.
Fig. 1 Trial design, recruitment and retention of practices. CMHT, community mental health team; GP, general practitioner; GHQ, General Health Questionnaire.
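The allocation rule described above can be illustrated with a short sketch. This is not the authors' procedure (which used the rand function in Excel); it is a Python analogue under stated assumptions: the practice labels are hypothetical, the seed is arbitrary, and `allocate_pairs` is a name introduced here for illustration only.

```python
import random


def allocate_pairs(pairs, seed=None):
    """Allocate one practice in each matched pair to the guideline arm."""
    rng = random.Random(seed)  # seeded generator so the allocation can be reproduced
    allocation = []
    for first, second in pairs:
        draw = rng.random()  # uniform draw in [0, 1), analogous to Excel's RAND()
        if draw <= 0.5:
            guideline, usual_care = first, second
        else:
            guideline, usual_care = second, first
        allocation.append(
            {"guideline": guideline, "usual_care": usual_care, "draw": draw}
        )
    return allocation


# Hypothetical labels: 15 pairs of practices already matched on deprivation score.
pairs = [(f"practice_{2 * i + 1}", f"practice_{2 * i + 2}") for i in range(15)]
allocation = allocate_pairs(pairs, seed=1)
```

Each pair contributes exactly one guideline practice and one usual-care practice, so the arms remain balanced on the matching variable whatever the draws turn out to be.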
Sample size
Sample size calculations were based on patient-level outcomes at 3 months among those with General Health Questionnaire 12-item version (GHQ-12) scores >3 at the screen. We aimed to detect a mean difference of 1 point (standard deviation=3) in the GHQ-12 score at 3-month follow-up using a two-tailed test, alpha=0.05, beta=0.20. This required 143 patients (in each group), and therefore an initial screen of approximately 1000 surgery attenders (assuming 30% score >3 at the screen).
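As a rough check on these figures, the standard normal-approximation formula for comparing two means can be evaluated directly. The sketch below is an assumption about the kind of calculation involved, not the authors' own method: it gives about 141 to 142 patients per group, close to the 143 reported (the small difference may reflect a correction not described in the text). The function name `n_per_group` is introduced here for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-tailed test
    z_beta = NormalDist().inv_cdf(power)           # quantile corresponding to 1 - beta
    return 2 * ((z_alpha + z_beta) * sd / delta) ** 2


# Inputs taken from the text: difference of 1 GHQ-12 point, SD of 3.
n = n_per_group(delta=1.0, sd=3.0)

# Screening requirement if 30% of attenders score >3 and 143 cases are needed per group.
cases_needed = 2 * 143
screen_size = math.ceil(cases_needed / 0.30)
```

Under these assumptions the screen would need roughly 950 attenders, consistent with the stated target of approximately 1000.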
Intracluster correlation
Baseline data (n=30 practices) were used to estimate the variance inflation factors, e.g. the intracluster correlation for the GHQ-12 scores from the screen was 0.012 (average cluster size, 37.04; design effect, 1.43). The intraclass correlation for change in GHQ-12 scores among those scoring >3 at the screen (during baseline) was 0.038 when clustered by general practice. The average cluster size was 8.4 patients per practice followed up. The design effect for patient outcomes at follow-up was therefore 1.3, requiring 186 patients in each group or 372 in total.
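The design effects quoted above are consistent with the usual variance inflation formula, 1 + (m - 1) × ICC, where m is the average cluster size. The following sketch reproduces the arithmetic from the values given in the text; the function name is illustrative, and rounding the follow-up design effect to 1.3 before inflating the sample size is an assumption about how the figure of 186 was reached.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Variance inflation factor for clustered data: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


# Values reported in the text.
deff_screen = design_effect(37.04, 0.012)   # GHQ-12 scores at the screen
deff_followup = design_effect(8.4, 0.038)   # change in GHQ-12 among screen-positive patients

# Inflate the unclustered requirement of 143 per group by the rounded design effect.
n_inflated = math.ceil(143 * round(deff_followup, 1))
total_required = 2 * n_inflated
```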
Baseline screening and follow-up
During baseline and post-intervention periods we screened separate cross-sectional samples of consecutive attenders and followed them up by postal questionnaire at 3 months. Research workers visited each practice for at least two randomly selected surgeries to distribute copies of the GHQ-12 to all surgery attenders aged between 16 and 64 years who gave verbal consent. During these surgeries, GPs completed a Physician Encounter Form for each patient. Practitioners were asked to record reasons for consultation, presenting symptoms, severity of disorder and diagnoses selected from a list based on the ICD-10 PHC chapter headings. Where no disorder was present, they were asked to indicate ‘No diagnosis of psychological disorder’. This process was repeated post-intervention. All consecutive attenders who scored >3 on the GHQ-12 at initial screening were followed up at 3 months (regardless of GP detection). Outcomes were collected via postal administration of four self-report questionnaires, which were returned in the stamped, addressed envelopes provided. Non-responders were sent second and third reminders.
The intervention
The intervention comprised the local development and dissemination of the WHO ICD-10 PHC guidelines (1996 version, which was ‘current’ at that time). Acknowledging evidence that emphasised the need for ownership of guidelines and active participation in their development, we provided participating GPs with the opportunity to adapt the WHO guidelines in a shared-ownership model with colleagues from local psychiatric services. One GP from each intervention practice volunteered to become the guideline advocate, and took part in a series of guideline revision workshops based on a modified nominal group technique. During these workshops, attended by professionals from primary and secondary care (some jointly), the guidelines were:
revised to reflect the consensus of participating practitioners from primary and secondary services;
amended, e.g. to include recommendations concerning use of practice-based counsellors;
extended, to include thresholds for specialist referral and to incorporate a list of local statutory National Health Service (NHS) and non-statutory services to which referrals could be made or who offered specific help.
An editorial team comprising primary care and psychiatric representatives of the research team incorporated the changes into a final document (‘the purple book’). In addition to the (indirect) dissemination through guideline-advocate participation in the above, participating GPs received a personal, desktop copy of the guidelines. Educational meetings (approved for Post-Graduate Education Allowance accreditation) were then organised in each intervention practice, facilitated by the guideline advocate and attended by a GP (C.C.) and psychiatrist (E.W.) from the research team. At these meetings the process of adaptation was described, and the guidelines were introduced and discussed.
Outcomes
Primary outcomes were: detection of minor psychiatric morbidity (sensitivity) at practice level, the unit of randomisation; and 3-month clinical outcomes for GHQ-12 cases. The latter were measured by GHQ-12 score at follow-up and the proportion who were still cases, i.e. scoring >3. Secondary outcomes were quality of life (QoL), disability, satisfaction with care and the specificity of detection performance (at practice level).
Measures
A GHQ-12 score of >3 was used to define a case for the purpose of calculating the GP identification indices (sensitivity and specificity) for detection of morbidity. Repeat GHQ-12 was used to record 3-month clinical outcomes. Impact on role-functioning was recorded using the sum of questions 2 to 6 on the Brief Disability Questionnaire (BDQ; Von Korff et al, 1996). This comprises five items: limitation in daily activities; limitation in functioning; motivation for work; personal efficiency; and deterioration in social relations. These were rated on a 3-point scale: 1=no, not at all; 2=yes, sometimes or a little; 3=yes, moderately or definitely. Total score ranged from 5 to 15, with high scores indicating worse disability.
Quality of life was recorded by the five-item European Quality of Life (EuroQol) instrument (Kind, 1996). Items were summed to give a total score (range 5 to 15; high indicating worse QoL). A single question assessed satisfaction with care received: ‘How satisfied are you overall with the care you have recently received from your doctor?’ Responses were rated on a 5-point scale: 1=terrible; 2=mostly dissatisfied; 3=mixed views; 4=mostly satisfied; 5=excellent.
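To make the detection indices concrete, the sketch below computes sensitivity and specificity for one practice from paired GHQ-12 scores and GP judgements, using the >3 case threshold described above. The records are invented for illustration and the function name is hypothetical; it is a minimal sketch of the definitions, not the study's analysis code.

```python
def detection_indices(records, threshold=3):
    """Return (sensitivity, specificity) of GP detection against GHQ-12 caseness.

    records: iterable of (ghq12_score, gp_diagnosed) pairs, where gp_diagnosed is
    True if the GP recorded at least one psychological disorder.
    """
    tp = fn = tn = fp = 0
    for score, gp_diagnosed in records:
        is_case = score > threshold  # GHQ-12 score >3 defines a case
        if is_case and gp_diagnosed:
            tp += 1
        elif is_case:
            fn += 1
        elif gp_diagnosed:
            fp += 1
        else:
            tn += 1
    # Indices are undefined (None) if a practice has no cases or no non-cases.
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Invented example: three GHQ-12 cases (two detected), three non-cases (one labelled).
records = [(5, True), (7, False), (4, True), (2, False), (0, False), (1, True)]
sensitivity, specificity = detection_indices(records)
```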
Analysis
Random effects meta-analysis (Thompson et al, 1997) was used to provide graphical and statistical summaries of all primary (sensitivity, repeat GHQ-12) and secondary (specificity, disability, satisfaction and QoL) outcomes. This procedure generates a weighted average intervention effect (with 95% confidence intervals) pooled over the practice pairs, which were stratified by social deprivation. It also produces a z-score and P-value for the test that the intervention effect is significantly different from zero. Analyses were performed using the metan meta-analysis procedure in Stata version 6 for PC. Since measures of baseline performance (practice sensitivity and specificity before the introduction of the guidelines) were recorded, these were entered as covariates in a regression extension of the random effects meta-analysis procedure. We used the meta-regression approach recommended by Ukoumunne & Thompson (2001) to correct for baseline imbalance in study outcomes (at the cluster level). Meta-regression analysis in a pair-matched cluster RCT provides an (analysis of covariance style) adjustment to the estimated risk difference that corrects for any baseline differences in outcomes that might have resulted from randomising a small number of experimental units (as is the case in cluster RCTs). To implement the adjusted analyses we used the metareg procedure in Stata with the additive between-study variance (tau) estimated using the method of moments (option bs(mm)). To maximise sample size for the analysis of the outcomes at 3-month follow-up, patients were included even if the GP had not completed the Physician Encounter Form. No adjustment was made for patient-level covariates. All analyses were on an intention-to-treat basis.
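The pooling step can be sketched as follows. This is not the Stata metan code used in the study; it is a minimal Python illustration of a random effects meta-analysis of risk differences across matched pairs, assuming the DerSimonian and Laird method-of-moments estimator for the between-pair variance. The counts are invented, the function name is hypothetical, and the sketch assumes no arm has a proportion of exactly 0 or 1 (which would give a zero variance).

```python
import math


def pooled_risk_difference(pairs):
    """Random effects pooling of risk differences (guideline minus usual care).

    pairs: list of (x_guideline, n_guideline, x_usual, n_usual) counts per matched pair.
    """
    rds, variances = [], []
    for x_g, n_g, x_uc, n_uc in pairs:
        p_g, p_uc = x_g / n_g, x_uc / n_uc
        rds.append(p_g - p_uc)
        variances.append(p_g * (1 - p_g) / n_g + p_uc * (1 - p_uc) / n_uc)

    # Fixed effect (inverse-variance) estimate, needed for the heterogeneity statistic Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * rd for wi, rd in zip(w, rds)) / sum(w)
    q = sum(wi * (rd - fixed) ** 2 for wi, rd in zip(w, rds))

    # Method-of-moments estimate of the between-pair variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(rds) - 1)) / c) if c > 0 else 0.0

    # Random effects weights add tau2 to each within-pair variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * rd for wi, rd in zip(w_re, rds)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "rd": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "z": pooled / se,
        "tau2": tau2,
    }


# Invented counts of detected cases / GHQ-12 cases for three matched pairs.
result = pooled_risk_difference([(4, 10, 5, 10), (8, 20, 10, 20), (12, 30, 15, 30)])
```

In this invented example every pair has the same risk difference, so the estimated between-pair variance is zero and the pooled estimate equals the common difference. The baseline-adjusted analysis would add a cluster-level covariate to this model, which is not shown here.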
RESULTS
Figure 2 summarises the flow of patients and practices.
Fig. 2 Flow of patients through screening and follow-up. PEF, Physician Encounter Form; GHQ, General Health Questionnaire.
The administrative characteristics of the consenting practices (30/43) and those who declined to participate are summarised in Table 1.
Table 1 Practice characteristics; values are number (percentage) unless otherwise specified
The characteristics of the participating GPs and of the sample of consecutive attenders for whom a matching Physician Encounter Form was collected appeared to indicate a balanced outcome of (cluster-level) randomisation, after stratifying by (practice) social deprivation score.
Table 2 Practitioner characteristics; values are number (percentage) unless stated otherwise
Table 3 Patient characteristics (for sample with complete GHQ and Physician Encounter Form); values are number (percentage) unless otherwise specified
Figure 3 shows the (very similar) cumulative distribution of GHQ-12 scores in guideline and usual-care practices, for consecutive attenders during the post-intervention period.
Fig. 3 General Health Questionnaire 12-item version (GHQ-12) scores in guideline and usual-care practices.
Primary cluster-level outcome: GP detection (sensitivity)
Identification of disorder required GPs to have indicated on the Physician Encounter Form the presence of at least one named psychological disorder from the list of ICD-10 PHC diagnoses. After intervention, the crude detection rate (sensitivity) for GPs in the guideline practices was 47%, compared with 55% in the usual-care practices.
Table 4 Practice detection rates during baseline period
The pooled risk difference between guideline and usual care was -10.8% (95% CI -24.0% to 2.4%), which was not significant (z=1.61, P=0.11). The unadjusted analysis is summarised in Fig. 4, which shows the risk difference for each pair and contributions to the pooled effect size (random-effects meta-analysis). The confidence limits for the intervention effect suggest that the guideline practices were less successful in identifying GHQ morbidity. However, this trend was reduced and estimated more precisely (evidenced by the reduction in width of the confidence interval) when the adjustment for baseline outcomes was made: after adjustment for baseline sensitivity, the difference was -6.6% (95% CI -19.0% to 5.9%; z=1.03, P=0.304). The cluster-level correlation between baseline and post-intervention sensitivity was 0.45 (Pearson correlation, P=0.07). The significance of the baseline adjustment in the meta-regression analysis was P=0.03, which explains the slight increase in the precision of the estimated intervention effect from the meta-regression analysis. The estimated effect of the intervention was also reduced by almost half (from -10.8% to -6.6%).
Fig. 4 Random effects meta-analysis plot showing differences in practice detection by pair: (a) sensitivity and (b) specificity. Pooled estimates represent unadjusted and baseline-adjusted (meta-regression) weighted risk differences.
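The reported test statistics can be checked against the confidence intervals, assuming the intervals are symmetric normal-theory intervals (an assumption, since the text does not state how they were constructed). The sketch below recovers the standard error from the interval width and recomputes z and the two-sided P-value for the unadjusted sensitivity difference; the function name is illustrative.

```python
from statistics import NormalDist


def z_and_p_from_ci(estimate, lower, upper, level=0.95):
    """Recover the standard error, z statistic and two-sided P-value from a symmetric CI."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = (upper - lower) / (2 * z_crit)       # interval width is 2 * z_crit * SE
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided P-value
    return se, z, p


# Unadjusted pooled risk difference in sensitivity, in percentage points, from the text.
se, z, p = z_and_p_from_ci(-10.8, -24.0, 2.4)
```

Under this assumption the recomputed values (z of about -1.60 in absolute size and P of about 0.11) agree closely with those reported, and the same check applied to the adjusted estimate gives a z statistic close to the reported 1.03.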
Secondary cluster-level outcome: GP detection (specificity)
After intervention, the crude specificities achieved by guideline and usual-care practices were 86% and 79%, respectively. The pooled risk difference between guideline and usual care, for the secondary cluster-level outcome practice specificity, was 5.3% (95% CI -5.0% to 15.7%), which was not significant (z=1.01, P=0.31).
Table 5 Practice detection rates during post-intervention period
After adjustment for baseline specificity, this difference increased slightly, to 6.2% (95% CI -4.4% to 16.8%; z=1.14, P=0.255). However, the baseline adjustment in the meta-regression analysis was not significant (P=0.416), explaining the decrease in the precision of the estimated intervention effect. The cluster-level correlation between baseline and post-intervention specificity was 0.21 (Pearson correlation, P=0.52). The baseline covariate was therefore not prognostic for intervention outcomes.
Postal questionnaire follow-up at 3 months
During both baseline and post-intervention periods, we followed up all consecutive attenders who scored >3 on the GHQ-12 screen. The response rate to the postal questionnaire follow-up during the post-intervention period was 61% for guideline and 62% for usual-care practices. Inspection of Tables 6 and 7 demonstrates that response rates were lower from practices in socially deprived areas. The response rate was unusually low for guideline practices during the pre-intervention baseline period (49%). The correlation between social deprivation score and response rate was greater than 0.3 (Spearman's rank correlation coefficient) during both baseline and post-intervention periods.
Table 6 Patient-level outcomes during baseline period; higher scores indicate worse outcomes (except for satisfaction)
Table 7 Patient-level primary (GHQ) outcomes during post-intervention period
Primary outcome measure, patient-level: repeat GHQ-12
There was no evidence for any impact of the intervention on our primary clinical outcome for patients, i.e. the repeat GHQ-12 score (difference in mean GHQ-12 at 3 months, guideline minus usual care (G-UC)=0.45, 95% CI -1.42 to 2.33; P=0.63), nor in the proportion of patients who were still scoring above the threshold for caseness (difference in proportion scoring >3 on GHQ-12 at 3 months, G-UC=4.3%, 95% CI -12.4% to 20.9%). Results indicated worse outcomes (higher GHQ-12 scores and more cases at 3-month follow-up) in the guideline practices than in usual care, although the confidence intervals were wide.
Secondary outcomes, patient-level: disability, satisfaction and QoL
There were no differences in satisfaction (difference in mean satisfaction, G-UC=0.20, 95% CI -0.05 to 0.45; P=0.12) or disability (difference in mean BDQ, G-UC=0.68, 95% CI -0.21 to 1.56; P=0.13) between patients managed by GPs who had received the guidelines and those in the usual-care group (Tables 6 and 7). The trend was for greater satisfaction among patients managed by guideline practices but worse disability (neither significant). The only difference to approach significance was for the EuroQol score (difference in mean EuroQol, G-UC=0.75, 95% CI -0.11 to 1.61; P=0.09), indicating worse QoL among patients managed by guideline practices (trend level P<0.10).
DISCUSSION
Many guidelines for the diagnosis and management of psychiatric morbidity in primary care have been developed, but they vary in scope and quality. Few have been evaluated in pragmatic RCTs. The WHO ICD-10 PHC guidelines have been widely disseminated. Upton and colleagues reported some benefits in a controlled before-and-after study (Upton et al, 1999).
Main findings
We evaluated a process of local adaptation and dissemination of the 1996 WHO guidelines to see whether there was any impact on clinician behaviour or clinical outcomes for patients. There were two main findings of this study. First, we found no evidence that implementing these guidelines, through our local process of adaptation and extension, which was intended to engender ‘shared ownership’, had an impact on practitioners' detection performance (sensitivity or specificity). Second, there was no effect on clinical outcomes for patients: repeat GHQ-12 scores (mean and proportion that remained cases), disability and satisfaction did not differ significantly between guideline and usual-care practices. Contrary to expectation, the guideline practices achieved higher average disability scores (indicating worse outcome), greater satisfaction with care received but worse quality of life. None of these comparisons was statistically significant and confidence intervals around estimated intervention effects were quite wide. The trend for worse QoL (one of the four secondary outcomes) may simply be a type 1 error.
Strengths
Our results are based on a sample of practices from three sectors of a large, urban mental health service, more than 2000 screened patients and more than 100 GPs detecting disorder. Over two-thirds of the practices approached participated, including single-handed and fundholding practices. The characteristics of our sample correspond well with what is already known of the epidemiology of psychological distress in PHC and its detection by GPs. We therefore expect that our negative findings are widely generalisable. We used a pair-matched design to ensure that the outcome of randomisation was balanced for social deprivation, which we thought would have an influence on our practice- and patient-level outcomes. Our decision to match a priori on social deprivation appears to be justified, since loss to follow-up at the 3-month postal questionnaire survey was correlated with deprivation.
Our criterion for evaluating GP detection performance was a score of >3 on the GHQ-12, not a clinical interview, and our outcomes were all self-report. These design considerations were pragmatic and made it possible to implement the study in a large number of practices.
Limitations
There were small differences in the baseline detection performance of the guideline and usual-care practices. Where possible (for practice detection outcomes) we applied a meta-regression approach that enabled us to adjust for baseline imbalance using cluster-level performance from the baseline detection phase as a covariate. This approach has been recommended for cluster randomised trials with repeated cross-sectional designs, where different patients are surveyed during pre-intervention and post-intervention periods. It is preferred over analyses of change from baseline (estimated using an interaction of intervention group by time-period) because baselines are usually measured with low precision. An additional factor is that the baseline outcome may not be prognostic, i.e. may not correlate with intervention outcomes. These design features can lead to bias in results and increase the noise, leading to a reduction in the precision of the estimated intervention effect. Ideally, enough clusters (practices) would be recruited to reduce the potential for a poor outcome of randomisation. In our case the baseline adjustment did not alter the conclusions, with baseline performance proving useful (prognostic) for only one of the two cluster-level outcomes (practice sensitivity). When the baselines are not prognostic, Ukoumunne & Thompson (2001) have argued that interpretation should focus on the unadjusted effect, since the adjusted analysis places too much weight on the baselines. Our baseline measures were based on small samples, which limited our potential to adjust for differences between the practices that arose as an outcome of randomisation. For one outcome (sensitivity), our meta-regression adjustment increased the precision of the estimated intervention effect. For the other (specificity), the approach simply added noise.
A low response rate to follow-up questionnaires in the guideline practices during the baseline period prohibited use of the meta-regression procedure for clinical outcomes. It might otherwise have been possible to aggregate these outcomes to cluster level and use them as covariates.
It is possible that our use of a categorical diagnostic approach reduced the fidelity of measurement of practitioner and patient variation. It is, nevertheless, an accepted tradition in primary care psychiatric research. We do not know to what extent the GPs made use of our guideline handbook, nor do we know the extent to which the guideline advocate was able to disseminate its contents to other primary care colleagues. We did not measure, but were not made aware of, any contamination between the guideline and usual-care practices. The study could not be blinded since the development of the intervention comprised participation in a local adaptation process and receipt of a personal copy of the guidelines.
Hampshire Depression Project
Our findings are consistent with those of the Hampshire Depression Project (HDP), a larger cluster RCT of educational intervention for GPs on the recognition, management and treatment of depression (Thompson et al, 2000). The HDP, which involved 60 practices and a self-selected sample of over 150 physicians, evaluated a more intensive educational approach to dissemination of a clinical practice guideline for depression, using a continuing medical education model (with quality testing of the educational component). The HDP screened and followed up more patients and involved more practitioners, but their sample of GPs was self-selected within participating practices. The participation rate among invited practices was much higher in our study than in the HDP (70% v. 26%) and all practitioners within participating practices were monitored, which may improve generalisability. Response rates to postal questionnaires were similar in both studies. In the HDP, response rates at 6 weeks ranged from 48% to 70%, depending on stage of study.
The future
Over the past few years, studies on guideline dissemination have consistently failed to demonstrate significant effectiveness in changing clinician behaviour. Evaluations of more structured implementation strategies have produced some favourable results, however, and we therefore designed and evaluated an education-based implementation strategy. Because of practical limitations, we were unable to measure important process variables, and in attempting to interpret our negative result we cannot discriminate between several possible explanations. These include failure of the GPs to read the guidelines, failure to implement them and failures in the content of the guidelines themselves in terms of their evidence base or relevance. Although there can be no doubt that guidelines such as those examined here are an important source of reference and guidance for PHC physicians, their effectiveness in changing clinician behaviour will require more complex and evidence-based strategies, probably involving multi-faceted targeting of interventions.
Since this study was carried out, the 1996 WHO guidelines have been further adapted. The latest version (currently unevaluated) is available free from the WHO collaborating centre website.
Clinical Implications and Limitations
CLINICAL IMPLICATIONS
Participation in a process of adaptation and extension of the ICD-10 Primary Health Care Guidelines failed to change practitioner behaviour (detection rates: sensitivity and specificity) or influence patient outcomes (General Health Questionnaire, disability, satisfaction, quality of life). Only specificity and satisfaction favoured guideline practices.
These negative findings highlight limitations in the ability of guideline interventions to influence UK general practitioners' (GPs') management of psychiatric morbidity.
These results are consistent with other studies in the UK that have adopted an intensive approach to dissemination of guidelines, e.g. medical education models.
LIMITATIONS
We did not measure whether GPs used the guidelines, nor did we measure any contamination that may have influenced the performance of usual-care practices.
Despite randomisation (at cluster level), there were small differences in baseline detection performance between practices. Poor response to the 3-month postal questionnaire follow-up for guideline practices during the baseline period limited adjustment for baseline to detection outcomes only, and not for patient outcomes.
Analysis did not take into account missing data from patients who did not respond to postal questionnaire follow-up, although stratification by social deprivation may have helped to reduce bias due to loss to follow-up (by ensuring balance). Power to detect small intervention effects for patient-level outcomes was low, and no adjustment was made for possible imbalance in patient-level covariates.
ACKNOWLEDGMENTS
Guideline development and revision was performed by a panel comprising: Dr C. Ree, Dr T. J. Gollin, Dr M. Byron, Dr N. M. Reynolds, Dr P. Bagshaw, Dr M. Jones, Dr S. C. Pietersen, Dr D. S. Kessler, Dr S. Meehan, Dr D. R. Darvill, Dr J. E. Jones, Dr T. M. Southwood, Dr J. P. S. Parrott, Dr J. S. Wright, Dr T. H. Frewin, Dr N. P. Ring, Dr W. Mitchell, Dr M. Evans, T. Hext, K. Macleod, P. Appleton, P. Evans, M. Hilton, S. Summerhayes, M. Sharman, S. Smith, J. Baker, Dr M. Mitcheson, A. Hampshire, J. L. Llewelyn, Dr J. Potokar, R. Jones, Dr N. Moore, F. Williams.
The editorial team were: C. Crilly (Primary Care), J. Evans (Psychiatry), G. Harrison (Psychiatry), G. McCann, D. Sharp (Primary Care), C. Smith and E. Wilkinson (Psychiatry).
B. Blair assisted with the guide to the Mental Health Act and M. Fairfoot provided assistance with the resources directory and guide to statutory services. J. Colyer provided administrative support.
The Evaluating Guideline Outcomes (EGO) study was funded by the Department of Health (grant reference JRI26/0634). T.C. was supported by a Medical Research Council Training Fellowship in Health Services Research.
REFERENCES
Borowsky, S. J., Rubenstein, L. V., Meredith, L. S., et al (2000) Who is at risk of nondetection of mental health problems in primary care? Journal of General Internal Medicine, 15, 381-388.
Bower, P. & Sibbald, B. (2000) Systematic review of the effect of on-site mental health professionals on the clinical behaviour of general practitioners. BMJ, 320, 614-617.
Cornwall, P. L. & Scott, J. (2000) Which clinical practice guidelines for depression? An overview for busy practitioners. British Journal of General Practice, 50, 908-911.
Gask, L., Goldberg, D., Lesser, A., et al (1988) Improving the psychiatric skills of the general practice trainee: an evaluation of a group training course. Medical Education, 22, 132-138.
Gask, L., Usherwood, T., Thompson, H., et al (1998) Evaluation of a training package in the assessment and management of depression in primary care. Medical Education, 32, 190-198.
Glover, G. R., Robin, E., Emami, J., et al (1998) A needs index for mental health care. Social Psychiatry and Psychiatric Epidemiology, 33, 89-96.
Goldberg, D., Sharp, D. & Nanayakkara, K. (1995) The field trial of the mental disorders section of ICD-10 designed for primary care (ICD-10-PHC) in England. Family Practice, 12, 466-473.
Goldberg, D., Gater, R., Sartorius, N., et al (1997) The validity of two versions of the GHQ in the WHO study of mental illness in general health care. Psychological Medicine, 27, 191-197.
Goldberg, D., Privett, M., Ustun, B., et al (1998) The effects of detection and treatment on the outcome of major depression in primary care: a naturalistic study in 15 cities. British Journal of General Practice, 48, 1840-1844.
Katon, W. & Schulberg, H. (1992) Epidemiology of depression in primary care. General Hospital Psychiatry, 14, 237-247.
Katon, W., Von Korff, M., Lin, E., et al (1997) Collaborative management to achieve depression treatment guidelines. Journal of Clinical Psychiatry, 58 (suppl. 1), 20-23.
Kind, P. (1996) The EuroQol instrument: an index of health-related quality of life. In Quality of Life and Pharmacoeconomics in Clinical Trials (ed. B. Spiker) (2nd edn), pp. 191-201. Philadelphia, PA: Lippincott-Raven.
Littlejohns, P., Cluzeau, F., Bale, R., et al (1999) The quantity and quality of clinical practice guidelines for the management of depression in primary care in the UK. British Journal of General Practice, 49, 205-210.
Morris, R., Gask, L., Ronalds, C., et al (1998) Cost-effectiveness of a new treatment for somatized mental disorder to GPs. Family Practice, 15, 119-125.
Paykel, E. S. & Priest, R. G. (1992) Recognition and management of depression in general practice: consensus statement. BMJ, 305, 1198-1202.
Simon, G. E. (1998) Can depression be managed appropriately in primary care? Journal of Clinical Psychiatry, 59 (suppl. 2), 3-8.
South Bristol General Practitioners and Specialist Mental Health Services Guideline Adaptation Group (1998) Primary Care Handbook for Mental Disorders: The Bristol Version of ICD-10 PHC Chapter V Guidelines for the Diagnosis and Management of Mental Disorders. Bristol: University of Bristol.
Stevens, L., Kinmonth, A. L., Peveler, R., et al (1997) The Hampshire Depression Project: development and piloting of clinical practice guidelines and education about depression in primary health care. Medical Education, 31, 375-379.
Thompson, C., Kinmonth, A. L., Stevens, L., et al (2000) Effects of a clinical-practice guideline and practice-based education on detection and outcome of depression in primary care: Hampshire Depression Project randomised controlled trial. Lancet, 355, 185-191.
Thompson, S. G., Pyke, S. D. & Hardy, R. J. (1997) The design and analysis of paired cluster randomized trials: an application of meta-analysis techniques. Statistics in Medicine, 16, 2063-2079.
Trickey, H., Harvey, I., Wilcock, G., et al (1998) Formal consensus and consultation: a qualitative method for development of a guideline for dementia. Quality in Health Care, 7, 192-199.
Ukoumunne, O. & Thompson, S. (2001) Analysis of cluster randomized trials with repeated cross-sectional binary measurements. Statistics in Medicine, 20, 417-433.
Ukoumunne, O., Gulliford, M., Chinn, S., et al (1999) Methods for evaluating area-wide and organisation-based interventions in health and health care: a systematic review. Health Technology Assessment, 3.
Upton, M. W., Evans, M., Goldberg, D. P., et al (1999) Evaluation of ICD-10 PHC mental health guidelines in detecting and managing depression within primary care. British Journal of Psychiatry, 175, 476-482.
Ustun, T. B. & Sartorius, N. (eds) (1995) Mental Illness in General Health Care: An International Study. Chichester: John Wiley & Sons.
Von Korff, M., Ustun, T. B., Ormel, J., et al (1996) Self-report disability in an international primary care study of psychological illness. Journal of Clinical Epidemiology, 49, 297-303.
Wang, P. S., Berglund, P. & Kessler, R. C. (2000) Recent care of common mental disorders in the United States: prevalence and conformance with evidence-based recommendations. Journal of General Internal Medicine, 15, 284-292.
World Health Organization (1996) Diagnostic and Management Guidelines for Mental Disorders in Primary Care: ICD-10 Chapter V Primary Care Version. Göttingen: Hogrefe & Huber.
(TIM CROUDACE PhD)