In the TREC Total Recall Track (2015-2016), participating teams could employ either fully automatic or human-assisted (“semi-automatic”) methods to select documents for relevance assessment by a simulated human reviewer. According to the TREC 2016 evaluation, the fully automatic baseline method achieved a recall-precision breakeven (“R-precision”) score of 0.71, while the two semi-automatic efforts achieved scores of 0.67 and 0.51. In this work, we investigate the extent to which the observed effectiveness of the different methods may be confounded by chance, by inconsistent adherence to the Track guidelines, by selection bias in the evaluation method, or by discordant relevance assessments. We find no evidence that any of these factors could yield relative effectiveness scores inconsistent with the official TREC 2016 ranking.
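For context, R-precision can be computed directly from a ranked list once the total number of relevant documents for a topic is known: it is precision (equivalently, recall) at rank R, where R is that number of relevant documents. The sketch below is illustrative only; the function name and inputs are assumptions, not the Track's official evaluation tooling.

```python
def r_precision(ranked_doc_ids, relevant_doc_ids):
    """Precision at rank R, where R = number of relevant documents.

    At that cutoff precision equals recall, hence the name
    "recall-precision breakeven". Illustrative sketch only.
    """
    relevant = set(relevant_doc_ids)
    r = len(relevant)
    if r == 0:
        return 0.0
    top_r = ranked_doc_ids[:r]
    hits = sum(1 for doc_id in top_r if doc_id in relevant)
    return hits / r


# Example: 4 relevant documents, 3 of them retrieved in the top 4 -> 0.75
print(r_precision(["d1", "d9", "d3", "d8", "d2"], {"d1", "d3", "d7", "d8"}))
```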
Dancing with the AI Devil: Investigating the Partnership Between Lawyers and AI
As professional users interact with more AI-enabled tools, it has become increasingly important to understand how their work and behaviour are affected by such tools. In this paper, we present the insights that we have gleaned from a qualitative user study conducted with nine users of our software, all of whom are legal professionals. We find that as participants become more accustomed to the system, they begin to subtly alter their behaviours and interactions with it. From their shared experiences, we distill insights that may inform the design of similar systems.
A Reliable and Accurate Multiple Choice Question Answering System for Due Diligence
The problem of answering multiple choice questions based on the content of documents has been studied extensively in the machine learning literature. We pose the due diligence problem, in which lawyers study legal contracts and assess the risk in potential mergers and acquisitions, as a multiple choice question answering problem based on the text of the contract. Existing question answering frameworks are not suitable for this task because of the inherent scarcity and imbalance in the legal contract data available for training. We propose a question answering system that first identifies the excerpt in the contract that potentially contains the answer to a given question, and then builds a multi-class classifier to choose the answer based on the content of this excerpt. Unlike existing question answering systems, the proposed system explicitly handles the imbalance in the data by generating synthetic instances of the minority answer categories using the Synthetic Minority Oversampling Technique (SMOTE). This ensures that the number of instances in each class is roughly equal, leading to more accurate and reliable classification. We demonstrate that the proposed question answering system outperforms existing systems with a minimal amount of training data.
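The class-balancing step described above can be illustrated with the SMOTE implementation in the imbalanced-learn library. This is a minimal sketch of the general technique, not the authors' pipeline; the toy excerpts, labels, and classifier choice are all assumptions made for illustration.

```python
# Sketch: oversample minority answer categories with SMOTE, then train a
# multi-class classifier on contract excerpts. Toy data is illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from imblearn.over_sampling import SMOTE

# Hypothetical contract excerpts, each paired with a multiple-choice answer label.
excerpts = [
    "This agreement may be terminated by either party upon 30 days notice.",
    "Either party may terminate this agreement with written notice.",
    "Termination requires ninety days prior written notice to the other party.",
    "The agreement terminates automatically upon a change of control.",
    "This agreement renews automatically for successive one-year terms.",
    "The term of this agreement is five years from the effective date.",
    "Assignment of this agreement requires prior written consent.",
    "Neither party may assign this agreement without consent.",
]
answers = ["termination", "termination", "termination", "termination",
           "renewal", "renewal", "assignment", "assignment"]

# Turn excerpt text into feature vectors.
X = TfidfVectorizer().fit_transform(excerpts)

# Generate synthetic instances of the minority classes so that class sizes
# are roughly equal; k_neighbors is lowered only because this toy set is tiny.
X_bal, y_bal = SMOTE(k_neighbors=1, random_state=0).fit_resample(X, answers)

clf = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
print(clf.predict(X[:1]))
```

In the system described by the abstract, the excerpt fed to the classifier would itself be selected by an upstream component that locates the passage likely to contain the answer; the sketch covers only the oversampling and classification stage.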