Case study: Bill and Melinda Gates Foundation

As a global sponsor of clinical trials, the Bill and Melinda Gates Foundation reviews protocols that investigators submit for sponsorship. This is a time-consuming task: each protocol is reviewed manually for a number of key features, such as an adequate sample size or a statistical analysis plan.

The Foundation contacted Fast Data Science to ask whether it would be possible to develop natural language processing software to pull out risk factors, removing some of the subjective and time-consuming elements of the protocol review process.

We developed an open source tool focusing on HIV and tuberculosis trials running in low- and middle-income countries (LMICs). The first iteration was a simple web-based tool written in Python. It allowed a nontechnical user to drag and drop a trial protocol in PDF format, and it rated the protocol's risk level as a traffic light (red, amber/yellow, or green).
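To give a feel for the traffic-light idea, here is a minimal sketch in Python. It is not the tool's actual method: the published tool uses natural language processing, whereas this sketch swaps in plain keyword matching as a stand-in. The function names, the two features checked, and the rule for mapping them to a colour are all hypothetical and chosen only for illustration. It also assumes the protocol text has already been extracted from the PDF.

```python
import re

# Hypothetical keyword patterns, for illustration only.
# The real tool's features and models are not reproduced here.
FEATURE_PATTERNS = {
    "sample_size": re.compile(r"\bsample size\b", re.IGNORECASE),
    "statistical_analysis_plan": re.compile(
        r"\bstatistical analysis plan\b", re.IGNORECASE
    ),
}


def find_features(protocol_text):
    """Report which key features are mentioned in the protocol text."""
    return {
        name: bool(pattern.search(protocol_text))
        for name, pattern in FEATURE_PATTERNS.items()
    }


def traffic_light(features):
    """Map feature presence to a colour (an assumed rule, not the tool's)."""
    found = sum(features.values())
    if found == len(features):
        return "green"  # every key feature was mentioned
    if found == 0:
        return "red"  # none of the key features was mentioned
    return "amber"  # some, but not all, were mentioned


if __name__ == "__main__":
    text = "The sample size is 200. A statistical analysis plan is attached."
    print(traffic_light(find_features(text)))
```

A reviewer could use a rating like this as a first filter, then read the flagged protocols in full.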

The users at the Gates Foundation reported that there was a significant reduction in the number of hours required to review a protocol, and the Clinical Trial Risk Tool served as a first triage in the review process.

We have since extended the functionality of the tool to produce cost estimates. We have published an article in Gates Open Research[1] detailing the workings of the tool, and the source code is published on GitHub.

You can read more about the development of the tool in our article in Clinical Leader[2].

References

[1] Wood, Thomas A., and Douglas McNair. “Clinical Trial Risk Tool: software application using natural language processing to identify the risk of trial uninformativeness.” Gates Open Research 7.56 (2023): 56. https://gatesopenresearch.org/articles/7-56/v1
[2] Wood, Thomas A. “A Tool To Tackle The Risk Of Uninformative Trials.” Clinical Leader, 2025.