We have developed a tool which combines machine learning and rule-based natural language processing. A user can upload a trial protocol, and the tool categorises the protocol as being at high, medium or low risk of ending uninformatively. The tool is at app.clinicaltrialrisk.org and is open-sourced on GitHub. You can read an explanation of how the tool works here, and a description of how we validated its accuracy here.


There are several indicators of high risk of uninformativeness which can be identified in a protocol, such as a missing or inadequate statistical analysis plan, use of non-standard endpoints, or the use of cluster randomisation. One of the most common causes of a trial ending uninformatively is underpowering. Low-risk trials are often run by well-known institutions with external funding and an international or intercontinental array of sites. These indicators can be referred to as features or parameters.
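
To make the idea of features more concrete, here is a minimal sketch in Python of how a handful of such indicators could be combined into a high, medium or low category. This is not the tool's actual scoring method: the field names, the point weights and the thresholds are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProtocolFeatures:
    """Indicators read from a protocol. All field names are hypothetical."""
    has_statistical_analysis_plan: bool
    uses_standard_endpoints: bool
    cluster_randomised: bool
    adequately_powered: bool
    externally_funded: bool
    num_countries: int


def risk_points(f: ProtocolFeatures) -> int:
    """Add up illustrative points; the weights are assumptions, not the tool's."""
    points = 0
    if not f.has_statistical_analysis_plan:
        points += 2  # missing or inadequate statistical analysis plan
    if not f.uses_standard_endpoints:
        points += 1  # non-standard endpoints
    if f.cluster_randomised:
        points += 1  # cluster randomisation
    if not f.adequately_powered:
        points += 2  # underpowering
    if f.externally_funded:
        points -= 1  # external funding is associated with lower risk
    if f.num_countries > 1:
        points -= 1  # international array of sites
    return points


def risk_category(f: ProtocolFeatures) -> str:
    """Map points to a category using made-up thresholds."""
    points = risk_points(f)
    if points >= 3:
        return "high"
    if points >= 1:
        return "medium"
    return "low"


# Example: no analysis plan, underpowered, single country, no external funding.
weak_protocol = ProtocolFeatures(
    has_statistical_analysis_plan=False,
    uses_standard_endpoints=True,
    cluster_randomised=False,
    adequately_powered=False,
    externally_funded=False,
    num_countries=1,
)
print(risk_category(weak_protocol))
```

In a real system the hard part is the step before this one: reading each feature reliably out of the free text of a protocol, which is where natural language processing comes in.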

The proof of concept

This project is an initial Proof of Concept (POC) to showcase what is possible with natural language processing, with a view to moving towards a more comprehensive main project which may identify a more complete set of cost, complexity, or uninformativeness risk factors.

Benefits of the Clinical Trial Risk Tool for researchers and funders

  1. The future tool could assist a human in assessing the cost, complexity or risk of uninformativeness of a trial, and in understanding which factors contribute to each of these.
  2. Reviewers may be able to assess trials more rapidly.
  3. The tool may augment certain current processes.
  4. The tool could be used to inform stakeholders about the most impactful features for complexity, cost, and informativeness or risk of uninformativeness.
  5. The tool can assist reviewers in assessing trials more consistently.
  6. The tool may illustrate what we can expect to achieve from investment of further review time.

Improving the tool

The tool is designed with a feedback form so that inaccurate data extractions can be reported back to the developers.

In addition, the MIT licence means that you are free to add features or extend the scope of the tool.


I hope that researchers who are considering submitting a protocol of a trial to a prospective source of funding will be able to use the tool as a kind of checklist to ensure that their trial is designed to reduce risk and increase the prospects of being funded.
