Use of machine learning to conduct systematic reviews of patient values and preferences in the context of guideline development




Short oral session 6: Evidence synthesis methods


Thursday 14 September 2017 - 11:00 to 12:30


All authors in correct order:

Zhang Y¹, Pérez Rada D², Etxeandia-Ikobaltzeta I¹, Rada G³, Vásquez J², Wiercioch W¹, Nieuwlaat R¹, Couban R⁴, Schünemann H¹
¹ Department of Health Research Methods, Evidence, and Impact, McMaster University, Canada
² Epistemonikos, Chile
³ Department of Internal Medicine and Evidence-Based Healthcare Program, Pontificia Universidad Católica de Chile, Santiago, Chile
⁴ McMaster University, Canada
Presenting author and contact person

Presenting author:

Yuan Zhang

Contact person:

Abstract text
Background: In the context of clinical practice guideline development, we conducted a systematic review of patient values and preferences (that is, how patients value healthcare outcomes), following the GRADE evidence-to-decision framework. Such reviews are challenging because a sensitive search strategy yields a large number of citations to screen, so alternative strategies that balance sensitivity and feasibility are needed.

Objectives: To describe our experience of using a machine-learning model to exclude citations from screening in the context of a large systematic review.

Methods: We ran a sensitive search strategy in MEDLINE and EMBASE. We used the Collaboratron™ platform for: duplicate screening of a training sample of the search results (records from 2014 to 2016); development of a machine-learning model to predict the probability that a reference would be included; and application of the model to the remaining records to be screened. For the machine-learning model, we arbitrarily chose a score of 0.01 (i.e. a 1% predicted probability of an article being relevant) as the cut-off for excluding irrelevant records.
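
The thresholding step described in the Methods can be illustrated with a minimal sketch. This is not the Collaboratron™ implementation, whose interface the abstract does not describe. The function name `split_by_threshold`, the record IDs, and the scores are all hypothetical, and the sketch assumes that records scoring strictly below the cut-off are excluded while those at or above it are kept for manual screening.

```python
from typing import Dict, List, Tuple

# Cut-off reported in the abstract: 0.01, i.e. a 1% predicted probability of relevance.
EXCLUSION_THRESHOLD = 0.01


def split_by_threshold(
    scores: Dict[str, float], threshold: float = EXCLUSION_THRESHOLD
) -> Tuple[List[str], List[str]]:
    """Split record IDs into (to_screen, excluded) by predicted inclusion probability.

    Assumption: a score below the threshold means the record is excluded;
    a score equal to or above it means the record is screened by reviewers.
    """
    to_screen = [rid for rid, p in scores.items() if p >= threshold]
    excluded = [rid for rid, p in scores.items() if p < threshold]
    return to_screen, excluded


# Hypothetical model scores for four records (illustration only, not study data).
example_scores = {
    "rec-001": 0.42,
    "rec-002": 0.004,
    "rec-003": 0.01,
    "rec-004": 0.0009,
}

to_screen, excluded = split_by_threshold(example_scores)
print("Screen manually:", to_screen)
print("Excluded by model:", excluded)
```

A low cut-off such as 0.01 favours sensitivity: only records the model considers very unlikely to be relevant are dropped, at the cost of leaving more records for reviewers to screen.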

Results: From 48 563 records, we screened 10 193 to create the training set. The predicted performance of the model was 87.5% sensitivity and 92.3% specificity; applying it left 2983 of the remaining 38 370 records to screen.
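
The workload figures in the Results can be checked with simple arithmetic. The sketch below uses only the counts reported in the abstract; the variable names are illustrative, and the derived totals and percentages are computed here rather than stated by the authors.

```python
# Counts reported in the Results.
total_records = 48_563      # records retrieved by the search
training_screened = 10_193  # records screened in duplicate to build the training set
left_to_screen = 2_983      # records the model retained for manual screening

# Derived quantities (computed here, not reported in the abstract).
remaining = total_records - training_screened      # records the model was applied to
excluded_by_model = remaining - left_to_screen     # records not screened manually
manual_total = training_screened + left_to_screen  # all records screened by reviewers

share_remaining_screened = left_to_screen / remaining
share_total_screened = manual_total / total_records

print(f"Records scored by the model: {remaining}")
print(f"Records excluded by the model: {excluded_by_model}")
print(f"Share of scored records still screened: {share_remaining_screened:.1%}")
print(f"Share of all records screened manually: {share_total_screened:.1%}")
```

On these counts, the model was applied to 38 370 records and excluded 35 387 of them, so reviewers screened roughly 27% of all retrieved records.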

Conclusions: The application of a machine-learning model substantially decreased the workload associated with the screening of a very large number of records. This approach might be useful when a small loss of relevant studies is acceptable.