Quality scoring of diagnostic articles for the development of evidence-based practice guidelines




Poster session 2 Thursday: Evidence synthesis - methods / improving conduct and reporting


Thursday 14 September 2017 - 12:30 to 14:00


All authors in correct order:

Ott U1, Hegmann KT1, Kristine H1, Thiese MS1, Harris J2
2 Kaiser Permanente, USA
Presenting author and contact person

Presenting author:

Ulrike Ott

Contact person:

Abstract text
Background: Well-designed diagnostic studies are necessary to establish the evidence supporting a diagnostic approach. Because study quality varies widely, a method is needed to separate higher-quality from lower-quality studies.

Objectives: To quantify the number of diagnostic guidelines listed in the National Guideline Clearinghouse (NGC) that utilised a rating system to determine the quality of evidence, and to present a quantitative method to assess the quality of diagnostic studies.

Methods: We reviewed the guideline matrix used by the NGC, which allowed quantification of the methods used to assess the quality of the evidence.

Results: Of the diagnostic guidelines (N=678) in the NGC, 81.7% (N=554) weight the evidence according to a stated rating scheme. However, 1.8% (N=12) use a rating scheme but provide no further details; 12.3% (N=84) use expert consensus or a subjective review; and 4.2% (N=28) provide no methods for assessing the quality of evidence. Assessment of diagnostic guidelines with a stated scheme published in 2016 that systematically reviewed the literature (N=34) showed that only 2.9% (N=1) actually used a quantitative scheme. A quantitative scoring method used by the American College of Occupational and Environmental Medicine emphasises the comparative test being studied; another criterion is the availability of data to calculate test sensitivity and specificity. Studies that compare the new test with an established gold standard test are evaluated first. However, many diagnostic studies have no established gold standard, so a variety of comparative tests are often used. The scoring metric comprises 11 criteria, each rated 0, 0.5, or 1.0. A study is considered low quality if the composite rating is ≤3.5, moderate if 4-7.5, and high if 8-11. This system yields a testable study score and reproducible guideline methods.
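The composite scoring described above can be sketched in code. This is a minimal illustration assuming only what the abstract states: 11 criteria, each rated 0, 0.5, or 1.0, with composite bands of ≤3.5 (low), 4-7.5 (moderate), and 8-11 (high). The function name and the example ratings are illustrative placeholders, not the actual ACOEM criteria or instrument.

```python
# Sketch of the composite quality score described in the abstract.
# Assumption: a study's quality band is determined solely by the sum
# of its 11 criterion ratings; the criterion names themselves are not
# reproduced here.

ALLOWED_RATINGS = {0.0, 0.5, 1.0}
NUM_CRITERIA = 11


def grade_study(ratings):
    """Return (composite score, quality band) for one diagnostic study."""
    if len(ratings) != NUM_CRITERIA:
        raise ValueError(f"expected {NUM_CRITERIA} criterion ratings")
    if any(r not in ALLOWED_RATINGS for r in ratings):
        raise ValueError("each criterion must be rated 0, 0.5, or 1.0")
    score = sum(ratings)
    # Bands from the abstract: <=3.5 low, 4-7.5 moderate, 8-11 high.
    # Ratings are multiples of 0.5, so no score falls between the bands.
    if score <= 3.5:
        band = "low"
    elif score <= 7.5:
        band = "moderate"
    else:
        band = "high"
    return score, band


# Hypothetical example: a study rated well on most criteria.
score, band = grade_study(
    [1.0, 1.0, 0.5, 1.0, 0.5, 1.0, 1.0, 0.5, 1.0, 1.0, 0.5]
)
```

Because each criterion contributes at most 1.0, the composite is bounded by 11, and the three bands partition the attainable scores without overlap.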

Conclusions: Properly grading study quality and rating the overall strength of evidence can improve confidence in the scientific evidence underlying diagnostic guidelines.