Benchmarking Phase 1 trials is challenging because the “average” cost is often distorted by a small number of highly complex, expensive studies in areas like oncology or gene therapy.
While a standard safety trial in healthy volunteers may have a predictable price point, intensive protocols requiring specialized inpatient monitoring or rare patient populations can drive costs into the multi-millions.
To illustrate, the histogram below shows trial costs from a dataset of US-funded Phase 1 trials that we assembled. The Phase 1 cost data is particularly skewed, more so than in the other phases (see the table below for details).
The median cost of a Phase 1 trial was $984,000 and the mean cost was $2,419,000, about 2.5 times the median. The standard deviation was $4,035,000, roughly 1.7 times the mean.
Above: histogram of Phase 1 clinical trial costs.
Using the median is often more reliable than the mean for budgeting, because it reflects the typical trial and is barely affected by the extreme outliers that inflate both the mean and the standard deviation.
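To make the point concrete, here is a minimal sketch of how a single expensive study moves the mean far more than the median. The costs are invented for illustration and are not taken from our dataset; the helper name `summarise` is likewise just an assumption for this example.

```python
import statistics

# Hypothetical trial costs in USD, invented for illustration only.
costs = [400_000, 650_000, 800_000, 950_000, 1_100_000, 1_400_000, 2_000_000]
outlier = 25_000_000  # one hypothetical, very expensive study


def summarise(values):
    """Return the mean, median and sample standard deviation of a list of costs."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


without_outlier = summarise(costs)
with_outlier = summarise(costs + [outlier])

for label, stats in [("without outlier", without_outlier), ("with outlier", with_outlier)]:
    print(f"{label}: mean={stats['mean']:,.0f} median={stats['median']:,.0f} stdev={stats['stdev']:,.0f}")
```

In this toy sample, adding the outlier moves the median from $950,000 to $1,025,000, while the mean jumps from about $1.04 million to about $4.04 million.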
A log10 transform of the cost data is also good practice when analysing clinical trial costs. The log transform “squeezes” the outliers and spreads out the smaller values, which brings the data closer to a normal distribution and makes it much easier to model. Even so, you can see below that the result is far from a clean normal distribution: the log-transformed distribution appears somewhat bimodal and is still more skewed than that of the other phases.
Above: histogram of Phase 1 clinical trial costs using a log 10 scale. You can use the button to toggle through the other phases.
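As a rough sketch of the effect of the log transform, the snippet below draws synthetic costs from a lognormal distribution (an assumption chosen for illustration, not a claim about the real data) and compares the skewness before and after taking log10. The `skewness` helper is a plain moment-based estimate written for this example.

```python
import math
import random


def skewness(values):
    """Moment-based skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic, right-skewed "costs" centred around $1M (assumption: lognormal shape).
rng = random.Random(0)
costs = [rng.lognormvariate(math.log(1_000_000), 1.0) for _ in range(2000)]

# log10 compresses the long right tail and spreads out the small values.
log_costs = [math.log10(c) for c in costs]

raw_skew = skewness(costs)
log_skew = skewness(log_costs)
print(f"skew of raw costs:   {raw_skew:.2f}")
print(f"skew of log10 costs: {log_skew:.2f}")
```

Real trial costs will not be this well behaved. As noted above, the Phase 1 data stays skewed and somewhat bimodal even after the transform.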
At Fast Data Science, we are working on the next level of clinical trial cost benchmarking: reference class forecasting. For the trial that you want to benchmark, we can use transformers and vector embeddings to identify similar trials in our database and apply a correction for inflation.
The tool will display the estimated cost, a lower and upper bound, and the past trials that the cost estimate was based on.
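The general idea of reference class forecasting can be sketched as follows. This is not the actual implementation of the tool: the toy 3-dimensional embeddings, the past trial records, the price index values and the function names are all hypothetical, and a real system would obtain embeddings from a transformer model and use a published inflation index.

```python
import math
import statistics

# Hypothetical past trials: toy embeddings, costs and years are invented.
PAST_TRIALS = [
    {"id": "A", "embedding": [0.9, 0.1, 0.0], "cost": 900_000, "year": 2015},
    {"id": "B", "embedding": [0.8, 0.2, 0.1], "cost": 1_200_000, "year": 2018},
    {"id": "C", "embedding": [0.1, 0.9, 0.2], "cost": 6_000_000, "year": 2020},
    {"id": "D", "embedding": [0.85, 0.15, 0.05], "cost": 1_000_000, "year": 2020},
    {"id": "E", "embedding": [0.0, 0.2, 0.9], "cost": 3_500_000, "year": 2012},
]

# Hypothetical price index relative to the target year (not real inflation data).
PRICE_INDEX = {2012: 0.80, 2015: 0.85, 2018: 0.92, 2020: 0.95, 2024: 1.00}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def benchmark(query_embedding, k=3, target_year=2024):
    """Estimate a cost from the k most similar past trials, adjusted for inflation."""
    ranked = sorted(
        PAST_TRIALS,
        key=lambda t: cosine_similarity(query_embedding, t["embedding"]),
        reverse=True,
    )
    neighbours = ranked[:k]
    adjusted = [
        t["cost"] * PRICE_INDEX[target_year] / PRICE_INDEX[t["year"]]
        for t in neighbours
    ]
    return {
        "estimate": statistics.median(adjusted),
        "lower": min(adjusted),
        "upper": max(adjusted),
        "based_on": [t["id"] for t in neighbours],
    }


result = benchmark([0.9, 0.1, 0.05])
print(result)
```

The median of the neighbours is used here as the estimate because of the skew discussed earlier, and the minimum and maximum serve as crude bounds. Quantiles over a larger reference class would be a natural alternative.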
Across our complete dataset of trials, Phase 1 is trickier to model than the other phases because its cost distribution is more skewed. The log transformation helps, but Phase 1 trials remain a bigger headache than the other phases.
| Phase | N | Mean cost | Median cost | Standard deviation | Skew |
|---|---|---|---|---|---|
| Early Phase 1 | 134 | $1,351,000 | $816,000 | $1,597,000 | 4.14 |
| Phase 1 | 616 | $2,419,000 | $984,000 | $4,035,000 | 5.81 |
| Phase 1/Phase 2 | 298 | $2,733,000 | $1,135,000 | $4,529,000 | 3.33 |
| Phase 2 | 1045 | $2,205,000 | $1,459,000 | $3,035,000 | 5.71 |
| Phase 2/Phase 3 | 174 | $2,803,000 | $1,844,000 | $4,375,000 | 5.41 |
| Phase 3 | 524 | $4,328,000 | $2,515,000 | $6,204,000 | 3.45 |
| Phase 4 | 413 | $2,391,000 | $1,766,000 | $2,787,000 | 4.67 |
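One quick way to compare the phases is to compute, from the figures in the table, how far the mean sits above the median and how large the standard deviation is relative to the mean. The short sketch below does that arithmetic; the function name `spread_ratios` is just a label chosen for this example.

```python
# (phase, mean cost, median cost, standard deviation) in USD, copied from the table.
PHASE_STATS = [
    ("Early Phase 1", 1_351_000, 816_000, 1_597_000),
    ("Phase 1", 2_419_000, 984_000, 4_035_000),
    ("Phase 1/Phase 2", 2_733_000, 1_135_000, 4_529_000),
    ("Phase 2", 2_205_000, 1_459_000, 3_035_000),
    ("Phase 2/Phase 3", 2_803_000, 1_844_000, 4_375_000),
    ("Phase 3", 4_328_000, 2_515_000, 6_204_000),
    ("Phase 4", 2_391_000, 1_766_000, 2_787_000),
]


def spread_ratios(stats):
    """Return mean/median and standard deviation/mean for each phase."""
    return {
        phase: {
            "mean_to_median": round(mean / median, 2),
            "sd_to_mean": round(sd / mean, 2),
        }
        for phase, mean, median, sd in stats
    }


ratios = spread_ratios(PHASE_STATS)
most_distorted = max(ratios, key=lambda p: ratios[p]["mean_to_median"])

for phase, r in ratios.items():
    print(f"{phase:16s} mean/median={r['mean_to_median']:.2f} sd/mean={r['sd_to_mean']:.2f}")
print("Largest mean/median ratio:", most_distorted)
```

On these figures, Phase 1 has the largest mean-to-median ratio (about 2.46), which is consistent with it being the hardest phase to summarise with a single average.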
As an alternative to cost benchmarking, you can model the cost of a trial from the bottom up: identify all the activities associated with the trial (such as the assessments in the schedule of events) and sum them to create a budget. This is complex and time-consuming, but it produces an itemised budget. The Clinical Trial Risk Tool allows you to build a site budget directly from the protocol PDF.
Above: the Clinical Trial Risk Tool lets you upload a protocol and will automatically check the protocol design against its checklist, as well as generating a site budget for you.
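The bottom-up approach described above boils down to simple arithmetic over the schedule of events. The following is a minimal sketch with hypothetical assessments, unit costs, participant numbers and a hypothetical start-up fee. It does not reflect how the Clinical Trial Risk Tool computes its budgets.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    name: str
    unit_cost: float          # cost per occurrence, USD (hypothetical)
    times_per_participant: int


# Hypothetical schedule of events for a single site.
SCHEDULE = [
    Assessment("Physical exam", 150.0, 4),
    Assessment("ECG", 100.0, 3),
    Assessment("Blood panel", 80.0, 6),
]


def site_budget(schedule, participants, fixed_costs=0.0):
    """Sum per-participant assessment costs, scale by enrolment, add fixed costs."""
    per_participant = sum(a.unit_cost * a.times_per_participant for a in schedule)
    line_items = {
        a.name: a.unit_cost * a.times_per_participant * participants
        for a in schedule
    }
    total = per_participant * participants + fixed_costs
    return {"per_participant": per_participant, "line_items": line_items, "total": total}


budget = site_budget(SCHEDULE, participants=20, fixed_costs=10_000.0)
print(budget)
```

With these invented numbers, the per-participant cost is $1,380 and the site total is $37,600. A real budget would also need items such as staff time, overheads and pass-through costs.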