(Mild) panic

In my previous post I looked into how survival::survfit produces standard errors and confidence intervals for a survival curve based on a Cox proportional-hazards model. I discovered (I could also have just read it in the documentation) that when you ask for the standard error via fit_1$std.err after fit_1 <- survfit(...), you get not the standard error of the estimated survival probability, but the standard error of the estimated cumulative hazard.
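To see what this means in practice, here is a minimal sketch using the lung data that ships with the survival package (the model and variable names are illustrative choices of mine, not from the post). Since S(t) = exp(-H(t)), the delta method converts the cumulative-hazard standard error to the survival scale:

```r
library(survival)

# Illustrative Cox fit on the built-in lung data
cox_fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
fit_1 <- survfit(cox_fit)

# fit_1$std.err is the standard error of the cumulative hazard H(t),
# not of the survival probability S(t)
se_cumhaz <- fit_1$std.err

# Delta method: S(t) = exp(-H(t)), so se(S) is approximately S(t) * se(H)
se_surv <- fit_1$surv * se_cumhaz
```

Because S(t) is at most 1, the survival-scale standard error is never larger than the cumulative-hazard one, which is one quick sanity check that you have the scales the right way round.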
This post expresses some minor frustration with papers I’ve read recently that evaluate the performance of restricted mean survival time (RMST) as a summary measure in oncology studies.
I should say that I’m not a saint when it comes to designing simulation studies. Consciously and/or unconsciously, it’s tempting to give our favourite methods an easier ride.
Nevertheless, a couple of things bother me, and they’re related to each other.
The aim of this post is to demonstrate a landmark/milestone analysis of RCT time-to-event data with a Royston-Parmar flexible parametric survival model. The original reference is:
Royston P, Parmar M (2002). “Flexible Parametric Proportional-Hazards and Proportional-Odds Models for Censored Survival Data, with Application to Prognostic Modelling and Estimation of Treatment Effects.” Statistics in Medicine, 21(15), 2175–2197. doi:10.1002/sim.1203
This model has been expertly coded and documented by Chris Jackson in the R package flexsurv (https://www.
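A minimal sketch of such a fit, using the bc breast-cancer dataset that ships with flexsurv (the dataset, the two-knot spline, and the 5-year milestone time are my illustrative choices, not from the post):

```r
library(flexsurv)

# Royston-Parmar flexible parametric proportional-hazards model:
# a natural cubic spline with k = 2 internal knots on the log
# cumulative-hazard scale, fitted to the bc data shipped with flexsurv
rp_fit <- flexsurvspline(Surv(recyrs, censrec) ~ group, data = bc, k = 2)

# Milestone estimate: survival probability at t = 5 years, by group,
# with confidence intervals
s5 <- summary(rp_fit, t = 5, type = "survival", tidy = TRUE)
s5
```

The summary method returns the model-based survival probabilities at the requested time for each covariate pattern, which is exactly what a milestone analysis reports.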
I’ve written a lot recently about non-proportional hazards in immuno-oncology. One aspect that I have unfortunately overlooked is covariate adjustment. Perhaps this is because it’s so easy to work with data extracted from published Kaplan-Meier plots, where the covariate data are not available. But we know from theoretical and empirical work that covariate adjustment can lead to big increases in power, and this may be as important as, or even more important than, the power gains from using a weighted log-rank test matched to the anticipated non-proportional hazards.
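To make the comparison concrete, here is a sketch with the survival package's built-in lung data, treating sex as a stand-in "treatment arm" (the trials discussed above have no public covariate data, so everything here is illustrative):

```r
library(survival)

# Unadjusted vs covariate-adjusted Cox models; sex plays the role of
# the randomised treatment, age and ph.ecog the baseline covariates
unadj <- coxph(Surv(time, status) ~ sex, data = lung)
adj   <- coxph(Surv(time, status) ~ sex + age + ph.ecog, data = lung)

# In a Cox model, adjusting for prognostic covariates typically moves
# the log hazard ratio further from zero (non-collapsibility) by more
# than its standard error grows -- that is where the power gain lives
summary(unadj)$coefficients["sex", c("coef", "se(coef)")]
summary(adj)$coefficients["sex", c("coef", "se(coef)")]
```

The relevant comparison is not the standard errors alone but the z-statistics (coef divided by se), since adjustment tends to increase both the estimate and its standard error.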
In my opinion, many phase III trials in immuno-oncology are 10–20% larger than they need to be.
This is because the method we use for the primary analysis doesn’t match what we know about how these drugs work.
Fixing this doesn’t require anything fancy, just old-school stats from the 1960s.
In this new preprint I try to explain how I think it should be done.
I’ve spent a lot of time thinking about hypothesis tests in clinical trials recently. Periodically, it’s good to step back and question whether this is a good idea at all. I don’t think my views have changed. I was considering writing a blog post, but then I came across this article by the editors of the Clinical Trials journal, which pretty much sums it up. The article can also be found with DOI: 10.