Population adjustment with limited individual patient data

My paper, "Methods for Population Adjustment with Limited Access to Individual Patient Data: A Review and Simulation Study", co-authored with my PhD supervisors Gianluca Baio and Anna Heath, is up on arXiv after the first round of peer review. Population adjustment methods are increasingly used in health technology assessments when access to patient-level data is limited and there are cross-trial differences in effect modifiers. Two popular methods are matching-adjusted indirect comparison (MAIC), based on propensity score weighting, and simulated treatment comparison (STC), a regression adjustment method. We evaluate these methods and the standard Bucher method in a comprehensive simulation study.
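For readers unfamiliar with the Bucher method, here is a minimal sketch of an anchored indirect comparison of treatments A and B through a common comparator C, on the log hazard ratio scale. The function name and the input values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
import math

def bucher_indirect(d_ac, se_ac, d_bc, se_bc):
    """Anchored indirect comparison of A vs. B via common comparator C.

    Inputs are relative effects (e.g. log hazard ratios) and their
    standard errors from two independent trials.
    """
    d_ab = d_ac - d_bc                      # relative effects subtract
    se_ab = math.sqrt(se_ac**2 + se_bc**2)  # variances of independent estimates add
    return d_ab, se_ab

# Hypothetical log hazard ratios and standard errors (illustration only).
d_ab, se_ab = bucher_indirect(d_ac=-0.5, se_ac=0.15, d_bc=-0.2, se_bc=0.20)
lower, upper = d_ab - 1.96 * se_ab, d_ab + 1.96 * se_ab
print(f"log HR (A vs. B): {d_ab:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

Nothing in this calculation accounts for differences in effect modifiers between the two trial populations, which is the gap that MAIC and STC try to fill.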

We investigate survival outcomes and continuous covariates across 162 scenarios, with the log hazard ratio as the measure of effect. Study sample sizes and the degree of covariate overlap reflect scenarios typically encountered in health technology appraisals of oncology drugs. MAIC yields unbiased treatment effect estimates and is the least biased and most accurate method in the scenarios considered. However, its standard errors underestimate variability when sample sizes are small, leading to undercoverage. The bias reduction of MAIC generally outweighs the loss in precision, even under poor covariate overlap.
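To give a feel for where the loss in precision comes from, below is a rough sketch of MAIC-style weighting with a single continuous covariate. It assumes the commonly used method-of-moments formulation, in which weights of the form exp(a * z) are chosen so that the weighted covariate mean in the individual patient data matches the mean published for the other trial. The simulated ages, the target mean of 64 and the function names are all hypothetical, and the damped Newton solver is just one convenient way to do the optimisation; this is not the code used in the paper.

```python
import math
import random

def maic_weights(x, target_mean, tol=1e-8, max_iter=100):
    """Method-of-moments weights exp(a * z) for one covariate.

    Minimises Q(a) = sum(exp(a * z_i)), with z_i = x_i - target_mean.
    At the minimum, the weighted mean of x equals target_mean.
    """
    z = [xi - target_mean for xi in x]  # centre on the target population mean

    def objective(a):
        return sum(math.exp(a * zi) for zi in z)

    a = 0.0
    for _ in range(max_iter):
        e = [math.exp(a * zi) for zi in z]
        grad = sum(zi * ei for zi, ei in zip(z, e))
        hess = sum(zi * zi * ei for zi, ei in zip(z, e))
        if abs(grad) < tol:
            break
        step = grad / hess
        # Halve the Newton step until the convex objective does not increase.
        while objective(a - step) > sum(e):
            step /= 2
        a -= step
    return [math.exp(a * zi) for zi in z]

def effective_sample_size(w):
    """Approximate effective sample size: (sum w)^2 / sum(w^2)."""
    return sum(w) ** 2 / sum(wi * wi for wi in w)

random.seed(1)
ages = [random.gauss(60, 8) for _ in range(200)]  # simulated patient-level covariate
weights = maic_weights(ages, target_mean=64.0)    # 64.0: hypothetical published mean

weighted_mean = sum(w * x for w, x in zip(weights, ages)) / sum(weights)
ess = effective_sample_size(weights)
print(f"weighted mean age: {weighted_mean:.2f}, ESS: {ess:.1f} of {len(ages)}")
```

The further the target mean sits from the bulk of the patient-level data, the more uneven the weights become and the smaller the effective sample size, which is the precision cost of poor covariate overlap.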

STC is systematically biased because it targets a conditional treatment effect rather than a marginal treatment effect. Because the (log) hazard ratio is non-collapsible, this conditional measure of effect is incompatible in the indirect treatment comparison. Measures of effect in most health technology assessment applications are non-collapsible. Hence, we discourage the use of STC in most cases. As expected, the Bucher method is biased and overprecise. Its bias increases with poorer overlap and stronger effect-modifying interactions.
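Non-collapsibility can be illustrated with a small analytic example. The setup below is a toy construction of my own for this post, not one of the simulation scenarios: a randomised trial with a binary prognostic covariate (half the patients in each arm), exponential survival within each covariate stratum, and the same conditional hazard ratio of 0.5 in both strata. There is no confounding and no effect modification, yet the ratio of the marginal (population-level) hazards does not equal the conditional hazard ratio once follow-up begins.

```python
import math

def marginal_hazard(t, rates):
    """Hazard at time t for an equal-weight mixture of exponential strata.

    Marginal survival is the average of exp(-rate * t) over strata, and the
    marginal hazard is minus its derivative divided by marginal survival.
    """
    surv = [math.exp(-r * t) for r in rates]
    return sum(r * s for r, s in zip(rates, surv)) / sum(surv)

conditional_hr = 0.5                 # same in both covariate strata (assumed)
control_rates = [0.1, 1.0]           # low-risk and high-risk strata (assumed)
treated_rates = [conditional_hr * r for r in control_rates]

def marginal_hr(t):
    """Ratio of marginal hazards, treated vs. control, at time t."""
    return marginal_hazard(t, treated_rates) / marginal_hazard(t, control_rates)

for t in (0.0, 1.0, 2.0, 5.0):
    print(f"t = {t:>3}: conditional HR = {conditional_hr}, marginal HR = {marginal_hr(t):.3f}")
```

At time zero the two coincide, but at the later time points shown the marginal ratio is pulled towards 1 because high-risk patients leave the risk set faster in the control arm. A covariate-adjusted regression estimates the conditional quantity, so comparing it with a marginal estimate from another trial mixes two different estimands.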

The simulation study has some limitations. Firstly, population adjustment methods make very strong assumptions, and more of them than standard indirect comparisons. Failures of these assumptions are not explored in this study, and future simulation studies should assess the methods under model misspecification. Secondly, the conclusions depend on the type of outcome and model. We have considered survival outcomes and Cox proportional hazards models. Bias-variance trade-offs may differ in less efficient setups, e.g. binary outcomes and logistic regression.

A huge thank you to the peer reviewers who helped improve the manuscript. Note that this is still a pre-print, yet to be fully certified by peer review.


Written on September 15, 2020