The effect of incorporating RobotReviewer suggestions into risk-of-bias assessments conducted within Covidence


Poster session 2 Thursday: Evidence synthesis - methods / improving conduct and reporting


Thursday 14 September 2017 - 12:30 to 14:00


All authors in correct order:

Arno A1, Elliott J2, Thomas J3, Wallace B4, Marshall I5
1 Covidence, Ireland
2 School of Public Health and Preventive Medicine, Monash University, Australia
3 Institute of Education, University College London, United Kingdom
4 College of Computer and Information Science, Northeastern University, United States
5 Department of Primary Care and Public Health Sciences, King's College London, United Kingdom
Presenting author and contact person

Presenting author: Anneliese Arno
Abstract text
Background: Machine learning in health-evidence synthesis is moving forward rapidly. As these technologies mature and become more widely available, it is essential that their effect on accuracy and efficiency is rigorously assessed. Covidence is an online platform that streamlines completion of systematic review tasks, including title/abstract screening, full text review, quality assessment (Risk of Bias, RoB), and data extraction. RobotReviewer is a web-based tool which uses machine learning to semi-automate specific tasks in evidence synthesis, including RoB on user-uploaded PDFs.

Objectives: The purpose of this experiment was to determine the effect of incorporating the suggestions of RobotReviewer's machine learning algorithms into RoB assessments conducted within Covidence (experimental condition), compared with conventional, human-only RoB assessment (control condition).

Methods: We randomised studies (1:1) included within systematic reviews to semi-automated or human-only RoB assessment. In the experimental condition, one of two reviewers was presented with RobotReviewer suggestions (judgement and supporting text) and then asked to complete their assessment. In the control condition, two reviewers completed their assessments without RobotReviewer suggestions. Main outcomes were time to complete assessments (efficiency) and differences between semi-automated and human-only assessments (accuracy).
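The 1:1 allocation described above could be implemented along the following lines. This is a minimal sketch, not the study's actual randomisation procedure: the function name, condition labels, and shuffle-based allocation are illustrative assumptions.

```python
import random

def randomise_studies(study_ids, seed=None):
    """Allocate included studies 1:1 to semi-automated ('experimental')
    or human-only ('control') RoB assessment.

    Shuffling the full list and splitting it in half keeps the two arms
    balanced within each systematic review; a seed makes the allocation
    reproducible. Illustrative sketch only.
    """
    rng = random.Random(seed)
    ids = list(study_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    allocation = {sid: "experimental" for sid in ids[:half]}
    allocation.update({sid: "control" for sid in ids[half:]})
    return allocation
```

For example, `randomise_studies(["study1", "study2", "study3", "study4"], seed=42)` returns a dictionary assigning two studies to each condition; with an odd number of studies, the control arm receives the extra study.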

Results: We will present the results of the randomised study described above, including the main outcomes: the effect on time to complete assessments and on assessment accuracy.

Conclusions: The results of this study will contribute to our understanding of the potential benefits and disadvantages of RobotReviewer-generated RoB assessments and, more generally, of the use of machine learning in the extraction tasks of a systematic review. Rigorous assessment of new, semi-automated evidence-synthesis systems will form a foundation for the effective and appropriate use of emerging technologies.