Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography

Abstract

Background Assessing quality and susceptibility to bias is essential when interpreting primary research and conducting systematic reviews and meta-analyses. Tools for assessing quality in clinical trials are well-described but much less attention has been given to similar tools for observational epidemiological studies.

Methods Tools were identified from a search of three electronic databases, bibliographies and an Internet search using Google®. Two reviewers extracted data using a pre-piloted extraction form and strict inclusion criteria. Tool content was evaluated for domains potentially related to bias and was informed by the STROBE guidelines for reporting observational epidemiological studies.

Results A total of 86 tools were reviewed, comprising 41 simple checklists, 12 checklists with additional summary judgements and 33 scales. The number of items ranged from 3 to 36 (mean 13.7). One-third of tools were designed for single use in a specific review and one-third for critical appraisal. Half of the tools provided development details, although most were proposed for future use in other contexts. Most tools included items for selection methods (92%), measurement of study variables (86%), design-specific sources of bias (86%), control of confounding (78%) and use of statistics (78%); only 4% addressed conflict of interest. The distribution and weighting of domains across tools was variable and inconsistent.

Conclusion A number of useful assessment tools have been identified by this report. Tools should be rigorously developed, evidence-based, valid, reliable and easy to use. There is a need to agree on critical elements for assessing susceptibility to bias in observational epidemiology and to develop appropriate evaluation tools.

Introduction

Systematic reviews identify, appraise and synthesize evidence from multiple studies of the same research question, and can be applied to diverse topics in medical research, including the effects of health-care interventions, the accuracy of diagnostic tests and the relationship between risk factors and disease. Meta-analyses, often contained within systematic reviews, offer a means of quantitatively summarizing the body of evidence identified. The strengths and limitations of systematic reviews and meta-analyses have been well established for randomized clinical trials, largely through the efforts of The Cochrane Collaboration. Although they have been used in parallel for observational epidemiological studies, such as cohort, case-control and cross-sectional studies, considerably less attention has been paid to their methodology in this area of application.

A systematic review should follow a protocol in order to minimize bias and ensure that the findings are reproducible. A key source of potential bias in a meta-analysis is bias due to limitations in the original studies contained within it. For example, a review of case-control studies of oral contraceptives and risk of rheumatoid arthritis found exaggerated effects in hospital-based control groups compared with population-based control groups1 whilst a review of case-control studies investigating the impact of sunlight exposure on skin cancer identified an important difference between study results when subjects or interviewers were blinded (or not) to skin cancer status.2 A large prospective study of the association between C-reactive protein and coronary heart disease obtained odds ratios varying from 2.13 to 3.46 with different degrees of adjustment for confounding variables.3

An important component of a thorough systematic review is therefore an evaluation of the methodological quality of the primary research. Numerous tools have been proposed for evaluation of methodological quality of observational epidemiological studies. A comprehensive study of tools for assessing non-randomized intervention studies in health care (excluding case-control studies) identified 193 tools, including several that could also be used for assessing non-intervention studies.4 A large-scale review of tools for grading the quality of research articles and rating the strength of bodies of evidence identified 17 tools for grading evidence from observational study designs,5 although it did not include some of the key tools identified in previous reviews. More recently, Katrak and colleagues6 reviewed 121 critical appraisal tools for allied health research, including physiotherapy, occupational and speech therapy and found a number of problems. All of these reviews have generally concluded that there is currently no agreed gold standard appraisal tool; that the majority of tools did not undergo a rigorous development process; and that there are many tools from which to choose. Consequently, to our knowledge, no tool has been adopted for widespread use within systematic reviews. In addition, none of these reviews sought to identify all tools for assessing observational epidemiological studies.

Quality is an amorphous concept. A convenient interpretation is susceptibility to bias, although it is not uncommon for aspects of study conduct that are not directly associated with bias to be included in a quality assessment. For example, study size, whether or not a power calculation was performed, and ethical approval might be considered aspects of quality, but are, in their own right, not potential causes of bias. Our main objective was to seek tools to assess susceptibility to bias, but we do not draw a clear distinction between quality and bias, reflecting the lack of such a distinction in much of the published literature.

It is important, however, to distinguish between quality of reporting and quality of what was actually done in the design, conduct and analysis of a study. A high-quality report ensures that all relevant information about a study is available to the reader, but does not necessarily reflect a low susceptibility to bias.1 Factors such as the peer-review process, editorial policy or journal space restrictions may preclude detailed reporting and so make it difficult to assess inherent biases. A number of consensus statements have encouraged higher quality of reporting, including recommendations for reporting systematic reviews (QUOROM),7 randomized trials (CONSORT),8 studies of diagnostic tests (STARD),9 meta-analyses of observational studies (MOOSE)10 and observational epidemiological studies (STROBE).11,12 These are aimed at authors of reports, not at those seeking to assess the validity of what they read.

This study provides an annotated bibliography of tools specifically designed to assess quality or susceptibility to bias in observational epidemiological studies, obtained from a comprehensive search of the published literature and of the Internet. It follows the approach of a previous review of tools to assess quality of randomized controlled trials,13 and attempts to identify whether there is an existing tool that could be recommended for widespread use.

Methods

Inclusion criteria

To be included in the review, a tool was defined as any structured instrument aimed at aiding the user to assess quality or susceptibility to bias in observational epidemiological studies (cohort, case-control and cross-sectional studies). Tools were placed in one of three categories: scales, simple checklists or checklists with a summary judgement. Scales result in a summary numerical score, typically derived as a sum of scores for several items. Simple checklists consist of a list of items only, whilst checklists with a summary judgement also yield an overall qualitative assessment of the study's quality, such as high, medium or low. These tools may have been developed for use in critical appraisal or in systematic reviews, and for general use or for use in a specific context. Articles that provided general narrative guidance only, without an explicit scale or checklist, were excluded.
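To make the distinction between the three categories concrete, the minimal Python sketch below (class names, item labels, weights and the summary-judgement rule are all hypothetical, introduced only for illustration) shows how a scale collapses weighted item scores into a single number, whereas a simple checklist retains the item-level judgements and a checklist with a summary judgement adds an overall qualitative verdict on top of them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Scale:
    """A scale: each item carries a weight and the output is a single numerical score."""
    weights: Dict[str, float]

    def assess(self, answers: Dict[str, bool]) -> float:
        # Sum the weights of the items the study satisfies.
        return sum(w for item, w in self.weights.items() if answers.get(item, False))


@dataclass
class Checklist:
    """A simple checklist: the output is the item-level judgements themselves."""
    items: List[str]

    def assess(self, answers: Dict[str, bool]) -> Dict[str, bool]:
        return {item: answers.get(item, False) for item in self.items}


class ChecklistWithJudgement(Checklist):
    """A checklist that adds an overall qualitative verdict (e.g. high/medium/low)."""

    def assess(self, answers: Dict[str, bool]):
        per_item = super().assess(answers)
        met = sum(per_item.values())
        # Hypothetical rule for the summary judgement; real tools define their own.
        if met == len(self.items):
            verdict = "high"
        elif met >= len(self.items) / 2:
            verdict = "medium"
        else:
            verdict = "low"
        return per_item, verdict


# Illustrative use with invented items.
answers = {"selection described": True, "exposure measured objectively": False}
print(Scale({"selection described": 2, "exposure measured objectively": 3}).assess(answers))       # 2
print(ChecklistWithJudgement(["selection described", "exposure measured objectively"]).assess(answers))
```

The contrast foreshadows the concern raised in the Discussion: the single number produced by a scale depends entirely on the weights chosen for its items.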

Search methods

Three electronic databases (MEDLINE, EMBASE and Dissertation Abstracts, up to March 2005) were searched using full-text and MeSH terms to identify articles discussing observational epidemiological study designs, including cohort studies, case-control studies, cross-sectional studies and follow-up studies. All terms were included as full text where possible, with truncation used to capture variation in terminology. The search was not limited to the English language, nor restricted by any other means.

In order to capture tools posted on Internet websites, we conducted an Internet search using the Google® search engine14 during March 2005. Searches were conducted using several combinations of the following search terms: tool, scale, checklist, validity, quality, critical appraisal, bias and confounding. The first 300 links identified by each separate search were investigated. Reference lists of published articles were examined to identify additional sources not identified in the database searches.

Selection of articles and websites

Articles or websites were included if they described a tool suitable for assessing quality of observational epidemiological studies. Abstracts were scrutinized for suitability before obtaining the full text of all relevant articles. Where more than one tool was published within the same article or website (for example, independent tools for assessing cohort and case-control study designs published within the same article or website), these were included as separate quality assessment tools. Published reports were used in preference to web sites for tools reported in both formats. Care was taken not to include the same tool twice.

Data extraction

A data extraction form was developed and piloted and included information about the type of study addressed by the tool, number of items, scoring system, description of the development process, whether the tool was developed for generic use in systematic reviews, single use in a specific systematic review or for critical appraisal, and whether the tool was proposed for future use. Data extraction was performed by two authors (SS and IT) with differences of opinion resolved by discussion or by the third author (JH). Items in tools were classified into domains that covered key potential sources of bias. The selection was strongly influenced by the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) guidelines for reporting observational epidemiological studies. These guidelines for reporting case-control, cohort and cross-sectional studies were developed by an international collaboration of epidemiologists, statisticians and journal editors. Although not a tool for assessing the quality of primary studies, they provide a useful indication of the essential information needed to appraise the conduct of such studies. Table 1 shows how the domains and criteria were used to evaluate tool content.

Table 1

Domains and criteria for evaluating each tool's content

Domain | Tool item must address
Methods for selecting study participants | Appropriate source population (cases, controls and cohorts) and inclusion or exclusion criteria
Methods for measuring exposure and outcome variables | Appropriate measurement methods for both exposure(s) and/or outcome(s)
Design-specific sources of bias (excluding confounding) | Appropriate methods outlined to deal with any design-specific issues such as recall bias, interviewer bias, biased loss to follow-up or blinding
Methods to control confounding | Appropriate design and/or analytical methods
Statistical methods (excluding control of confounding) | Appropriate use of statistics for primary analysis of effect
Conflict of interest | Declarations of conflict of interest or identification of funding sources

Wherever possible, we have attempted to demonstrate weighting within checklists and scales by including the total number of items for a checklist and the number of these items allocated to a particular quality domain. For scales, we have included the total maximum raw score for each scale and the possible total score by domain (although most scales do not address all of the domains in Table 1). A few of the tools use extremely complicated assessment and scoring systems, and for these we have reported the total raw score and the maximum item score by domain.
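As a rough sketch of the bookkeeping this implies (the example tool, its domains and its item scores are hypothetical and chosen only to illustrate the tallying), the following Python snippet counts the items allocated to each domain for a checklist and sums the maximum raw score available per domain for a scale.

```python
from collections import defaultdict

# Hypothetical extraction record: each tool item is assigned to one domain,
# with a maximum score (1 for checklist items, possibly more for scale items).
example_tool = {
    "type": "scale",
    "items": [
        ("selection of participants", 3),
        ("selection of participants", 2),
        ("measurement of variables", 4),
        ("design-specific bias", 2),
        ("control of confounding", 3),
        ("statistical methods", 1),
    ],
}


def summarize(tool):
    """Per-domain totals: item counts for checklists, maximum raw score for scales."""
    counts, scores = defaultdict(int), defaultdict(float)
    for domain, max_score in tool["items"]:
        counts[domain] += 1
        scores[domain] += max_score
    if tool["type"] == "scale":
        return dict(scores)   # maximum raw score available per domain
    return dict(counts)       # number of items allocated to each domain


print(summarize(example_tool))
# e.g. {'selection of participants': 5.0, 'measurement of variables': 4.0, ...}
```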

Results

A total of 86 tools were included in the review, 62 identified from the electronic database search (72%) and a further 24 from the Internet search (28%). An overall summary of the main tool characteristics is presented in Tables 2–4 and more detailed information in Tables 5–7.

Table 2

Summary results comparing identified tools by type

Tool characteristics | Simple checklists (n = 41) | Simple checklists with additional judgement (n = 12) | Scales (n = 33) | Total (n = 86)
Source
    Electronic database | 21 (51%)a | 9 (75%) | 32 (97%) | 62 (72%)
    Internet | 20 (49%) | 3 (25%) | 1 (3%) | 24 (28%)
    Total | 100% | 100% | 100% | 100%
Tool purpose
    Single use in a specific context | 3 (7%) | 4 (33%) | 22 (67%) | 29 (34%)
    Generic tool for systematic reviews | 8 (20%) | 3 (25%) | 2 (6%) | 13 (15%)
    Critical appraisal tool | 22 (54%) | 4 (33%) | 5 (15%) | 31 (36%)
    Ambiguous (unable to allocate to above categories) | 8 (20%) | 1 (8%) | 4 (12%) | 13 (15%)
    Total | 41 (100%) | 12 (100%) | 33 (100%) | 86 (100%)
Development
    Development described | 21 (51%) | 7 (58%) | 18 (55%) | 46 (53%)
Future use
    Proposed for future use | 38 (93%) | 8 (67%) | 14 (42%) | 60 (70%)
Table 3

Summary results comparing identified tools by content

Tool content | Simple checklists (n = 41) | Simple checklists with additional judgement (n = 12) | Scales (n = 33) | Total (n = 86)
Number of items
    Range | 3–36 | 4–32 | 4–35 |
    Mean | 13.4 | 15.2 | 12.6 |
Maximum raw score range (scales only) | NA | NA | 4–72 |
Appropriate methods for selecting study participants, n; % (range) | 39; 95%a (1–10) | 11; 92% (1–6) | 29; 88% (1–26.4) | 79 (92%)
Appropriate methods for measuring exposure and outcome variables, n; % (range) | 36; 88% (1–10) | 12; 100% (1–8) | 26; 79% (1–22) | 74 (86%)
Appropriate design-specific sources of bias (excluding confounding), n; % (range) | 36; 88% (1–6) | 11; 92% (1–10) | 27; 82% (1–8) | 74 (86%)
Appropriate methods to control confounding, n; % (range) | 34; 83% (1–5) | 12; 100% (1–3) | 21; 64% (1–12) | 67 (78%)
Appropriate statistical methods (primary analysis of effect but excluding confounding), n; % (range) | 34; 83% (1–8) | 8; 67% (1–3) | 24; 73% (1–20) | 66 (78%)
Conflict of interest, n; % (range) | 1; 2% (1) | 1; 8% (1) | 1; 3% (1) | 3 (4%)
Table 4

Distribution of tools by epidemiological study design addressed

Case-control | Cohort | Cross-sectional | Simple checklists n (%) | Simple checklists with a judgement n (%) | Scales n (%) | Total n (%)
Y | N | N | 9 (22) | 2 (17) | 5 (15) | 16 (19)
Y | Y | N | 15 (36) | 6 (50) | 7 (21) | 28 (32)
Y | Y | Y | 4 (10) | 1 (8) | 8 (24) | 13 (15)
N | Y | N | 11 (27) | 2 (17) | 10 (30) | 23 (27)
N | N | Y | 2 (5) | 1 (8) | 3 (9) | 6 (7)
Total | | | 41 | 12 | 33 | 86

Table 5

Simple checklists

Study/tool name/reference ID .Year .Source .Tool purpose .CC .Coh .CS .Items (n) .Development described .Future use .Participants .Variables measure .Other biases .Control confounding .Other statistics .Conflict of interest .
Avis151994EDCAYY24YY5N211N
Briggs16@WAMBYYY5NN11N11N
Cameron172000EDSUYY36NY43612N
Carneiro182002EDCAY8NY11111N
CASP CC19@WCAY7NY21111N
CASP Co19@WCAY8NY12221N
CenOccHealth20@WCAYYY23NY8102N2Y
CEBM Prog21@WCAY7YY1N211N
CEBM Diag21@WCAYYY3YY111NNN
DuRantCC221994EDCAY22NY64533N
DuRantCoh221994EDCAY24NY78223N
DuRantCS221994EDCAY18NY64223N
Elwood232002EDCAYY20YYY2111N
Esdaile241985EDSUYY6YNN211NN
Gardner251986EDAMBYY12NY1NNN7N
Hadorn261996EDAMBY24YY83218N
HEB Wales27@WCAYYY13YY21431N
Horwitz281979EDCAY12NY622NNN
Khan29@WSRY9NY22212N
Khan29@WSRY10YY21421N
Kilgore301981EDCAYY2YYNNNNNN
Levine311994EDCAYY7NY11112N
Lichtenstein321987EDCAY20YY42423N
London33@WCAYY30YY410553N
Margetts342002EDSRYY6NY221N2N
Montreal35@WCAYY8NY21111N
Mulrow361986EDSUY9YN22211N
Newc-Ott CC37@WSRY8NY4211NN
Newc-Ott Co37@WSRY8NY2321NN
QUADAS382003EDSRYY14YY233NNN
Campbell392003EDAMBY13NYYYNYYN
SIGN 50 CC40@WAMBY22YY61213N
SIGN 50 Co40@WAMBY25YY53413N
Solomon411997EDSRYY12NY13121N
STARD42@WAMBYY14NY34312N
Surgical tutor43@WCAYY18YYY4233N
UCW CC44@WCAY6YY12121N
UCW Co44@WCAY8YY1N411N
UCW Cross44@WCAY3YY1NN11N
Zaza452000EDSRYY15YY53312N
Zola461989EDAMBY11YY222N2N
Table 6

Checklists with an additional summary judgement

Study/tool name/ reference ID .Year .Source .Purpose .CC .Coh .CS .Items (n) .Development described .Future use .Participants .Variables measure .Other biases .Control confounding .Other statistics .Conflict of interest .
Bollini741992EDSUYY10YN3312NN
Ciliska751996EDSUYY6YNN1111N
Cowley761995EDSUYY13NN113121
Effective PH77@WCAYY13NY22233N
EPIQ CC78@WCAY30NY58833N
EPIQ Cohort78@WCAY32NY681033N
Fowkes791991EDCAYYY22NY63622N
GyorkosCC801994EDSRY5YY2111NN
GyorkosCoh801994EDSRY6YY1221NN
GyorkosCS801994EDSRY4YY12N1NN
Spitzer811990EDSUYY17YN44332N
Steinberg822000EDAMBYY24YY32523N

Table 7

Scales

Study/tool name/ reference ID .Year .Source .Purpose .CC .Coh .CS .Items (n) .Development described .Future use .Maximum raw score .Participants .Variables measure .Other biases .Control confounding .Other statistics .Conflict of interest .
Anders471996EDSUY6NN6N32NNN
AriensCC482000EDSUY18YY1839112N
AriensCoh482000EDSUY17YY1737112N
AriensCS482000EDSUY13YY1328112N
Berlin491990EDSUYY16YN32N2N2NN
Bhutta502002EDSUY6NY101NN2NN
Borghouts511998EDSUY13YN13311N2N
Campos521995EDSUYY7NN70N10NN10N
Carson531994EDAMBY10YY103N212N
Loney54@WCAYYY6NY8221N1N
Cho55,b1994EDCAYYY18YY36122842N
Corrao561999EDSUYY16NN30594N2N
Downs571998EDCAYY17YY2151338N
Garber681996EDSUYYY6NN18NNNNNN
Goodman591994EDAMBYY10YY5020N5515N
Jabbour601996EDSUY7NN71N1NNN
Kreulen61,c1998EDSUY16NN423126312N
Krogh621985EDCAYYY7NY42N1N1N
Littenberg63,d1998EDSUYYY15NN45NANANANANAN
LongneckerCC64,a1988EDSUY11NN53/58a(5)(5)(5)(5)NN
LongneckerCoh641988EDSUY4NN205555NN
Macfarlane652001EDAMBYYY6YN6212N1N
Manchikanti662002EDSUYY6YN622NN11
MargettsCC67,a1995EDSRY13YY46.426.410255N
MargettsCoh671995EDSRY19YY53.4422729N
Meijer682003EDSUY9YY913112N
Nguyen691999EDSUYYY14NN7241861220N
Rangel702003EDAMBY15YN17351N6N
Reisch71,a1989EDCAYY35 (min)NY% items fulfilled(1)(1)(1)N(1)N
Stock721991EDSUYYY7NN216633NN
WindtCC732000EDSUY20YN20311414N
WindtCoh732000EDSUY18YN1838114N
WindtCS732000EDSUY16YN1638114N

The largest group was simple checklists (41; 48%),15–46 followed by scales (33; 38%)47–73 and finally checklists with an additional summary judgement (12; 14%).74–82 Fifteen per cent of all tools were for generic use in systematic reviews, one-third for use in critical appraisal, one-third for single use in a specific systematic review and 15% had an ambiguous purpose. Among checklists, half were critical appraisal tools (22; 54%), whilst two-thirds of scales were review-specific (21; 64%). Over half of all tools (54%) described their development process in detail.

Just under three-quarters of all tools were proposed as being suitable for future use, including all of the critical appraisal tools and generic systematic review tools and six of the tools originally designed for use in a specific systematic review.

A number of tools were designed to address specific study design types: case-control studies alone (19%), cohort studies alone (27%) and cross-sectional studies alone (7%) (Table 4). Others addressed different combinations of these design types, with almost one-third addressing both case-control and cohort studies (32%) and 15% addressing all three. The number of items in all tools ranged from 3 to 36, with a mean of 13.7 (13.4 for simple checklists, 15.2 for simple checklists with a summary judgement and 12.6 for scales).

The majority of tools included items relating to methods for selecting study participants (92%). The proportion of tools including items about the measurement of study variables (exposure, outcome and/or confounding variables) was also high (86%). Assessment of other design-specific sources of bias (including recall bias, interviewer bias and biased loss to follow-up but excluding confounding) was included in 86%, around three-quarters assessed control of confounding (78%) and three-quarters included items concerning statistical methods (78%). Conflict of interest was included in only three tools (4%).

To address weighting, we recorded the number of items devoted to each of our key domains in both types of checklist, whilst for scales we recorded the total available raw score for each domain. As can be seen from Tables 5 to 7, there is little consistency among tools, with considerable variability in the number of items across domains and across tool types.

Discussion

Assessing the quality of evidence from observational epidemiological studies requires tools that are designed and developed with this specific purpose in mind. To our knowledge, this is the most comprehensive search to date of both the medical literature and the Internet for tools to assess such studies. We have identified 86 candidate tools, comprising checklists, summary judgement checklists and scales. The Internet search identified a further 24 tools that were not identified through searching electronic databases. Future search strategies may wish to employ similar methodologies to ensure the identification of all available tools, articles or studies. Despite the comprehensive nature of the search strategy employed, it is unlikely that all existing tools for assessing quality of observational epidemiological studies have been identified, since many are developed for specific systematic reviews, and it is very difficult to identify all of these through searching electronic databases.

A large number of the tools were scales that resulted in numerical summary scores. Whilst this approach has the appearance of simplicity, considerable concerns have been raised about such an approach to assessing quality.83 Summary scores involve inherent weighting of component items, some of which may not be directly related to the validity of a study's findings (such as sample size calculations). It is unclear how weights for different items should be determined, and different scales may reach different conclusions on the overall quality of an individual study.84 We have found that the weighting applied in scales to different study domains is variable and inconsistent. Similar considerations apply to summary judgement checklists, although qualitative rather than quantitative summaries may be less prone to inappropriate analysis. We prefer a more transparent checklist approach that concentrates on the few, principal, potential sources of bias in a study's findings.
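A toy example, with invented weights and item judgements that are not taken from any published scale, illustrates how two scales covering the same items but weighting them differently can rank the same pair of studies in opposite orders:

```python
# Two hypothetical scales scoring the same three items with different weights.
weights_a = {"selection": 5, "confounding": 1, "sample size": 4}
weights_b = {"selection": 2, "confounding": 6, "sample size": 1}

# Item-level judgements for two hypothetical studies (True = criterion met).
study_1 = {"selection": True, "confounding": False, "sample size": True}
study_2 = {"selection": False, "confounding": True, "sample size": True}


def score(weights, study):
    """Summary score: sum of the weights of the criteria the study meets."""
    return sum(w for item, w in weights.items() if study[item])


for name, w in [("Scale A", weights_a), ("Scale B", weights_b)]:
    s1, s2 = score(w, study_1), score(w, study_2)
    better = "study 1" if s1 > s2 else "study 2"
    print(f"{name}: study 1 = {s1}, study 2 = {s2} -> ranks {better} higher")

# Scale A: study 1 = 9, study 2 = 5 -> ranks study 1 higher
# Scale B: study 1 = 3, study 2 = 7 -> ranks study 2 higher
```

Note that the heavily weighted "sample size" item in Scale A is exactly the kind of item that may say little about validity, yet it drives the ranking.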

Tool components should, where possible, be based on empirical evidence of bias, although this may be difficult to obtain, and there is a need for more empirical research on relationships between specific quality items and findings from epidemiological studies. There was wide variation among tools in the number and nature of items, scoring ranges (where applicable) and levels of development. The specific components assessed by the tools differed across both study design and tool type. Although we have not implemented all tools, we would anticipate that different tools would indicate different degrees of quality when applied to the same study.

It is encouraging that most tools included items to assess methods for selecting study participants (92%) and to assess methods for measuring study variables and design-specific sources of bias (both 86%). Over three-quarters of tools assessed the appropriate use of statistics and the control of confounding (both 78%), but conflict of interest was included in only 4% of tools. Around one-third of the tools were designed for specific clinical or research topics, limiting their wider applicability; there was a marked difference between tool types in this respect, with the majority of checklists designed for critical appraisal and the majority of scales for single use in specific reviews. The ambiguity of purpose of some of the tools is a cause for concern, and more clarity is needed to differentiate assessments of the quality of reporting from the quality of what was actually done in the study.

A rigorous development process should be an important component of tool design, but only half of the tools provided a clear description of their design, development or the empirical basis for item inclusion or evaluation of the tool's validity and reliability. This is of particular concern as 70% of the tools were proposed as being suitable for future use in other contexts. Future tools should undergo a rigorous development process to ensure that they are evidence-based, easy to use and readily interpretable.

This review has highlighted the lack of a single obvious candidate tool for assessing quality of observational epidemiological studies. One might regard this review as the first stage towards development of a generic tool. In such an endeavour, one would need to reach a consensus on the critical domains that should be included. The development of the STROBE statement has involved extensive discussion among numerous experienced epidemiologists and statisticians. Despite targeting the reporting of studies, many items were no doubt selected due to presumed (or evidence of) association with susceptibility to bias. Thus the statement should provide a suitable starting point for development of a quality assessment tool, and we have been guided by it in our presentation of results.

Around half of the checklists included what we regard as the three most fundamental domains of appropriate selection of participants, appropriate measurement of variables and appropriate control of confounding; all were considered appropriate for future use. The majority of these tools also included items on potential design-specific biases. However, we are reluctant to recommend a specific tool without having implemented them all on multiple studies with a view to assessing their properties and ease of use. Our broad recommendations are that tools should (i) include a small number of key domains; (ii) be as specific as possible (with due consideration of the particular study design and topic area); (iii) be a simple checklist rather than a scale; and (iv) show evidence of careful development, and of their validity and reliability.

Search strategy

(1 or 2 or 3 or 4) AND (5 or 6 or 7) AND (8 or 9 or 10 or 11 or 12 or 13 or 14 or 15 or 16 or 17), where the numbers refer to the terms listed below; an illustrative sketch of the combined query follows the list.

  1. scale*
  2. checklist*
  3. critical apprais*
  4. tool*
  5. valid*
  6. quality
  7. (bias* OR confounding) AND (assess* OR measure* OR evaluat*)
  8. OBSERVATIONAL STUDIES (MeSH)
  9. observational stud*
  10. COHORT STUDIES (MeSH)
  11. cohort stud*
  12. CASE-CONTROL STUDIES (MeSH)
  13. case-control stud*
  14. CROSS-SECTIONAL STUDIES (MeSH)
  15. cross-sectional stud*
  16. FOLLOW-UP STUDIES (MeSH)
  17. follow-up stud*
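Assuming the Boolean logic stated above, the short Python sketch below shows how the numbered term lines expand into a single query string; actual field tags, MeSH handling and truncation syntax differ between MEDLINE, EMBASE and Dissertation Abstracts, so this is illustrative only.

```python
# Term lines 1-17 as listed above (MeSH headings kept in upper case).
terms = [
    "scale*", "checklist*", "critical apprais*", "tool*",             # 1-4
    "valid*", "quality",
    "(bias* OR confounding) AND (assess* OR measure* OR evaluat*)",   # 5-7
    "OBSERVATIONAL STUDIES", "observational stud*",
    "COHORT STUDIES", "cohort stud*",
    "CASE-CONTROL STUDIES", "case-control stud*",
    "CROSS-SECTIONAL STUDIES", "cross-sectional stud*",
    "FOLLOW-UP STUDIES", "follow-up stud*",                            # 8-17
]


def group(numbers):
    """OR together the terms at the given (1-based) line numbers."""
    return "(" + " OR ".join(terms[n - 1] for n in numbers) + ")"


# (1 or 2 or 3 or 4) AND (5 or 6 or 7) AND (8 or 9 or ... or 17)
query = " AND ".join([group(range(1, 5)), group(range(5, 8)), group(range(8, 18))])
print(query)
```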

Conflict of interest: None declared.

Key messages

  • Tools for assessing quality in clinical trials are well-described but much less attention has been given to similar tools for observational epidemiological studies.

  • Only about half of the identified tools described their development or evaluated their validity and reliability.

  • Tools for assessing quality should be rigorously developed, evidence-based, valid, reliable and easy to use and concentrate on assessing sources of bias.

  • There is a need to agree on critical elements for assessing susceptibility to bias in observational epidemiology and to develop appropriate evaluation tools.

References

1. Quality of reporting of randomized trials as a measure of methodologic quality.
2. An addition to the controversy on sunlight exposure and melanoma risk: a meta-analytical approach.
3. Low grade inflammation and coronary heart disease: prospective study and updated meta-analyses.
4. Controversy of oral contraceptives and risk of rheumatoid arthritis: meta-analysis of conflicting studies and review of conflicting meta-analyses with special emphasis on analysis of heterogeneity.
5. Systematic reviews in health care: assessing the quality of controlled clinical trials.
6. A systematic review of the content of critical appraisal tools.
7. Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Quality of reporting of meta-analyses.
8. Evaluating non-randomised intervention studies.
9. Systems to Rate the Strength of Evidence. Evidence Report/Technology Assessment No. 47. Agency for Healthcare Research and Quality.
10. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group.
11. The scandal of poor epidemiological research.
12. Strengthening the reporting of observational epidemiological studies. STROBE Statement: Checklist of Essential Items, Version 3.
13. Assessing the quality of randomized controlled trials: an annotated bibliography of scales and checklists.
15. Reading research critically. II. An introduction to appraisal: assessing the evidence.
16. The Joanna Briggs Institute. System for the Unified Management of the Review and Assessment of Information (SUMARI).
17. Geriatric rehabilitation following fractures in older people: a systematic review.
18. Critical appraisal of prognostic evidence: practical rules.
19. Critical Appraisal Skills Programme (CASP): appraisal tools. Public Health Resource Unit.
20. Centre for Occupational and Environmental Health, School of Epidemiology and Health Sciences, University of Manchester.
21. Centre for Evidence-Based Mental Health.
22. Checklist for the evaluation of research articles.
23. Forward projection: using critical appraisal in the design of studies.
24. Observational studies of cause-effect relationships: an analysis of methodologic problems as illustrated by the conflicting data for the role of oral contraceptives in the etiology of rheumatoid arthritis.
25. Use of check lists in assessing the statistical content of medical studies.
26. Rating the quality of evidence for clinical practice guidelines.
27. Health Evidence Bulletin, Wales. Questions to assist with the critical appraisal of an observational study, e.g. cohort, case-control, cross-sectional.
28. Methodologic standards and contradictory results in case-control research.
29. Undertaking systematic reviews of research effectiveness: CRD's guidance for those carrying out or commissioning reviews. 2nd edn. The University of York Centre for Reviews and Dissemination.
30. Department of Clinical Epidemiology and Biostatistics. How to read clinical journals: IV. To determine etiology or causation.
31. Users' guides to the medical literature. IV. How to use an article about harm. Evidence-Based Medicine Working Group.
32. Guidelines for reading case-control studies.
33. Federal Focus, Incorporated. The London Principles for Evaluating Epidemiologic Data in Regulatory Risk Assessment.
34. Evidence-based nutrition: review of nutritional epidemiological studies. South African J Clin Nutr.
35. Critical Appraisal Worksheet.
36. Blood glucose and diabetic retinopathy: a critical appraisal of new evidence.
37. Quality Assessment Scales for Observational Studies. Ottawa Health Research Institute.
38. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews.
39. Interpretation of genetic association studies in complex disease.
40. Scottish Intercollegiate Guidelines Network. SIGN 50: a guideline developers' handbook.
41. Costs, outcomes, and patient satisfaction by provider type for patients with rheumatic and musculoskeletal conditions: a critical review of the literature and proposed methodologic standards.
42. The STARD Initiative: Towards Complete and Accurate Reporting of Studies on Diagnostic Accuracy.
43. Critical appraisal: guidelines for the critical appraisal of a paper.
44. University of Wales College of Medicine.
45. Data collection instrument and procedure for systematic reviews in the Guide to Community Preventive Services. Task Force on Community Preventive Services.
46. Is the published literature a reliable guide for deciding between alternative treatments for patients with early cervical cancer? Int J Radiat Oncol Biol Phys.
47. Secondary failure rates of measles vaccines: a meta-analysis of published studies.
48. Physical risk factors for neck pain. Scand J Work Environ Health.
49. A meta-analysis of physical activity in the prevention of coronary heart disease.
50. Cognitive and behavioral outcomes of school-aged children who were born preterm: a meta-analysis.
51. The effects of medical school curricula, faculty role models, and biomedical research support on choice of generalist physician careers: a review and quality assessment of the literature.
52. The clinical course and prognostic factors of non-specific neck pain: a systematic review.
53. Quality of published reports of the prognosis of community-acquired pneumonia.
54. Critical appraisal of the health research literature: prevalence or incidence of a health problem.
55. Instruments for assessing the quality of drug studies published in the medical literature.
56. Exploring the dose-response relationship between alcohol consumption and the risk of several alcohol-related conditions: a meta-analysis.
57. The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. J Epidemiol Commun Health.
58. Adult respiratory distress syndrome: a systemic overview of incidence and risk factors.
59. Manuscript quality before and after peer review and editing at Annals of Internal Medicine.
60. Life support courses: are they effective?
61. Meta-analysis of anterior veneer restorations in clinical studies.
62. A checklist system for critical review of medical literature.
63. Closed fractures of the tibial shaft. A meta-analysis of three methods of treatment.
64. A meta-analysis of alcohol consumption in relation to risk of breast cancer.
65. Systematic review of population-based epidemiological studies of oro-facial pain.
66. Medial branch neurotomy in management of chronic spinal pain: systematic review of the evidence.
67. Development of a scoring system to judge the scientific quality of information from case-control and cohort studies of nutrition and disease.
68. Prognostic factors in the subacute phase after stroke for the future residence after six months to one year. A systematic review of the literature.
69. A systematic review of the relationship between overjet size and traumatic dental injuries.
70. Development of a quality assessment scale for retrospective clinical studies in pediatric surgery.
71. Aid to the evaluation of therapeutic studies.
72. Workplace ergonomic factors and the development of musculoskeletal disorders of the neck and upper limbs: a meta-analysis.
73. Occupational risk factors for shoulder pain: a systematic review.
74. The impact of research quality and study design on epidemiologic estimates of the effect of nonsteroidal anti-inflammatory drugs on upper gastrointestinal tract disease.
75. A systematic overview of the effectiveness of home visiting as a delivery strategy for public health nursing interventions.
76. Prostheses for primary total hip replacement. A critical appraisal of the literature. Int J Technol Assess Health Care.
77. Effective Public Health Practice Project. Quality Assessment Tool for Quantitative Studies.
78. EPIQ (Effective Practice, Informatics and Quality Improvement). School of Population Health, Faculty of Medical and Health Sciences, University of Auckland.
79. Critical appraisal of published research: introductory guidelines.
80. An approach to the development of practice guidelines for community health interventions.
81. Links between passive smoking and disease: a best-evidence synthesis. A report of the Working Group on Passive Smoking.
82. Methods used to evaluate the quality of evidence underlying the National Kidney Foundation-Dialysis Outcomes Quality Initiative Clinical Practice Guidelines: description, findings, and implications.
83. On the bias produced by quality scores in meta-analysis, and a hierarchical view of proposed solutions.
84. The hazards of scoring the quality of clinical trials for meta-analysis.

Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2007; all rights reserved.