Frequently Asked Questions

When a pharmaceutical company develops a drug, the drug must pass through several phases of clinical trials before it can be approved by regulators.

Before the clinical trial is run, the drug developer writes a document called a clinical trial protocol. This contains key information such as how long the trial will run, what the risks to participants are, and what kind of treatment is being investigated.

The problem is that a protocol can be up to 200 pages long and its structure can vary. There is no standardised way of recording the intervention, number of participants, locations, and so on, although many pharma companies have in-house standards.

The Clinical Trial Risk Tool helps funders and pharma companies identify risk factors in a trial protocol using natural language processing, and rates the trial as high, medium or low risk.

Wood TA and McNair D. Clinical Trial Risk Tool: software application using natural language processing to identify the risk of trial uninformativeness [version 1; peer review: awaiting peer review]. Gates Open Res 2023, 7:56 (https://doi.org/10.12688/gatesopenres.14416.1)

A BibTeX entry for LaTeX users is

@article{Wood_2023,
  doi = {10.12688/gatesopenres.14416.1},
  url = {https://doi.org/10.12688/gatesopenres.14416.1},
  year = 2023,
  month = {apr},
  publisher = {F1000 Research Ltd},
  volume = {7},
  pages = {56},
  author = {Thomas A Wood and Douglas McNair},
  title = {Clinical Trial Risk Tool: software application using natural language processing to identify the risk of trial uninformativeness},
  journal = {Gates Open Research}
}

If you upload a protocol, the Clinical Trial Risk Tool does not store or save it. You can read more on our Privacy Policy page.

If you choose to create a user account on app.clinicaltrialrisk.org or clinicaltrialrisk.org/tool, click Login and you will be directed to create an account and/or authenticate with the third-party authentication provider Auth0.com. Your email address is stored as your unique identifier in the app, because it is needed for optional user authentication. If you prefer to use the application anonymously, all functionality remains available without logging in, except that you will not be able to save profiles and retrieve them at a later date.

You can delete any configuration you have saved on the server using the Delete button in the application at app.clinicaltrialrisk.org or clinicaltrialrisk.org/tool. You can also delete your account with our third-party authentication provider Auth0.com by logging into that service. In accordance with the Right to Be Forgotten (please see our Privacy Policy), you can also send a message via the contact form to request complete deletion of your data.

The Clinical Trial Risk Tool allows a user to upload a trial protocol in PDF format. The tool processes the PDF into plain text and identifies features which indicate high or low risk of uninformativeness.

The tool uses a series of machine learning algorithms, such as Convolutional Neural Networks, combined with rule-based components, to identify key features of a protocol. You can download and run the source code on GitHub.
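To give a flavour of what a rule-based component can look like, here is a minimal sketch that uses a regular expression to pick out candidate sample sizes from protocol text. The pattern and the function name `find_sample_size_candidates` are assumptions made for illustration; they are not taken from the tool's source code, which may work quite differently.

```python
import re

# Hypothetical rule: a number followed by a word for trial participants.
# Accepts plain digits ("40") and thousands separators ("1,200").
SAMPLE_SIZE_PATTERN = re.compile(
    r"\b(\d{1,3}(?:,\d{3})+|\d+)\s+(?:participants|subjects|patients|volunteers)\b",
    re.IGNORECASE,
)


def find_sample_size_candidates(text):
    """Return every number that looks like a participant count, in order of appearance."""
    return [
        int(match.group(1).replace(",", ""))
        for match in SAMPLE_SIZE_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    example = "We will enrol 1,200 participants across 3 sites; 40 patients per arm."
    print(find_sample_size_candidates(example))
```

A rule like this only proposes candidates. Deciding which candidate is the true planned sample size is the kind of task where a rule may be combined with a machine learning model.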

Please see this blog post for a summary of the tool’s accuracy in the different areas.

The tool gives a broad sense of risk using a traffic light system (red/amber/green for high/medium/low risk respectively). For a finer-grained view of the risk, the tool internally scores protocols between 0 and 100. These scores are derived from a linear model. For example, a trial gains 20 points if it has a completed statistical analysis plan, 10 extra points if it has a large sample size, and so on. The points are then summed, which keeps the score easy to understand. You can adjust the weights (coefficients) of the different parameters extracted by the tool under the right-hand tab entitled “Configure thresholds and parameters”.
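The linear scoring idea can be sketched in a few lines of Python. Only the two weights quoted in the answer above (20 points for a completed statistical analysis plan, 10 for a large sample size) come from this page; the third weight, the feature names, the traffic light cutoffs and the assumption that a higher score means lower risk are all hypothetical and chosen purely for illustration.

```python
# Feature weights (coefficients). The first two values are quoted in the FAQ;
# "multiple_countries" is an invented example feature.
WEIGHTS = {
    "has_sap": 20,             # completed statistical analysis plan
    "large_sample_size": 10,   # large sample size
    "multiple_countries": 5,   # hypothetical, for illustration only
}


def score_protocol(features, weights=WEIGHTS):
    """Sum the weights of all features present, clamped to the range 0 to 100."""
    raw = sum(
        weights[name]
        for name, present in features.items()
        if present and name in weights
    )
    return max(0, min(100, raw))


def traffic_light(score, amber_from=40, green_from=60):
    """Map a score to red/amber/green. The cutoffs are assumed, not the tool's."""
    if score >= green_from:
        return "green"
    if score >= amber_from:
        return "amber"
    return "red"


if __name__ == "__main__":
    features = {"has_sap": True, "large_sample_size": True, "multiple_countries": False}
    score = score_protocol(features)
    print(score, traffic_light(score))
```

Because the model is a plain weighted sum, changing a weight changes the score by exactly that amount for every protocol with the feature, which is what makes the configurable coefficients easy to reason about.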

At this time the Clinical Trial Risk Tool does not give p-values. In future we hope to provide more statistical data to the Clinical Trial Risk Tool’s users.

The Python code of the Clinical Trial Risk Tool was written by Thomas Wood (Fast Data Science).

Our source code is on GitHub. If you would like to improve the tool, you are welcome to submit your changes as a pull request. Please contact us to discuss.

At present the tool is designed to handle single documents only, but a future improvement may allow batch processing of multiple PDFs.

You are welcome to upload a protocol for a different pathology even if it is not covered by the dropdowns. Please be aware that the thresholds for what counts as a small, medium or large trial may differ between disease areas, e.g. oncology vs HIV, so you may need to go into the options and define them for your disease area. Click to see a video on how to set a custom weight profile.
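The effect of disease-specific thresholds can be illustrated with a short sketch. The profile names and the numbers below are placeholders invented for this example; they are not the tool's defaults and are not recommendations for any real disease area.

```python
# Placeholder threshold profiles. Each profile says below which participant
# count a trial is "small" and from which count it is "large".
PROFILES = {
    "disease_a": {"small_below": 50, "large_from": 500},
    "disease_b": {"small_below": 500, "large_from": 5000},
}


def size_category(n_participants, small_below, large_from):
    """Classify a trial as small, medium or large under the given thresholds."""
    if n_participants < small_below:
        return "small"
    if n_participants >= large_from:
        return "large"
    return "medium"


if __name__ == "__main__":
    # The same trial size can fall into different categories per profile.
    for name, thresholds in PROFILES.items():
        print(name, size_category(300, **thresholds))
```

This is why a protocol from an area outside the dropdowns may need its own thresholds: the same number of participants can be medium under one profile and small under another.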

The tool is open-sourced under the MIT License. Functionality outside the disease areas of HIV, TB, COVID, Influenza, Malaria, enteric and diarrheal diseases, neglected tropical diseases, Polio, Diabetes and Pneumonia is closed-source and only available in the paid product. If there is a particular feature or pathology which you would like us to cover, we would like to hear from you.

The tool does not save personal data, except for your email address if you create an account, and is GDPR and HIPAA compliant. More information is available in our Privacy Policy.

What is a clinical trial protocol?

What does the Clinical Trial Risk Tool do?

How do I cite the Clinical Trial Risk Tool?

Does the Clinical Trial Risk Tool store my data?

I want you to delete all data you hold about me

How does the Clinical Trial Risk Tool work?

How reliable is the Clinical Trial Risk Tool?

What do the numbers mean?

Does the Clinical Trial Risk Tool give p-values?

Who made the Clinical Trial Risk Tool?

I would like to improve the tool

My protocol is several separate PDFs and the SAP is also a separate PDF, can the tool handle this?

Can the tool help with all disease areas?

What license does the tool have?

Is the Clinical Trial Risk Tool GDPR or HIPAA compliant?