Technical

flexsurv 2

Previously, I started discussing the flexsurv package. I used it to fit a Weibull model. This is implemented as an accelerated failure time model. It is also a proportional hazards model (although, as I found previously, converting between the two is not so straightforward, but it can be done by SurvRegCensCov). Now let’s compare Weibull regression with Cox regression. Firstly, Weibull regression: assumes proportional hazards; the number of parameters is equal to \(k + 2\), where \(k\) is the number of covariates; we can estimate things like the median, \(P(S>s^*)\), etc.

flexsurv

I’m going to write about some of my favourite R packages. I’ll start with flexsurv (https://github.com/chjackson/flexsurv-dev) by Chris Jackson, which can be used to fit all kinds of parametric models to survival data. It can really do a lot, but I’ll pick out just 2 cool things I like about it: Fit a standard survival model, but where it’s slightly easier to work out what the parameters mean. Fit a proportional hazards model, which is a lot like a Cox model, but where you also model the baseline hazard using a spline.

Longitudinal hurdle models 3

In the last post on longitudinal hurdle models, I had just taken samples from the marginal mean \[\begin{align} g(\theta; x) & = E(Y \mid \theta; x) \\ & = \int E(Y \mid u_{i'}, v_{i'}, \theta; x) f(u_{i'}, v_{i'} \mid \theta, \mathbf{y}) du_{i'}dv_{i'} \\ &\approx L^{-1}\sum_{l = 1}^{L}E(Y \mid u_{i'}^{(l)}, v_{i'}^{(l)}, \theta; x)\\ &= L^{-1}\sum_{l = 1}^{L}\left\lbrace 1 - \text{logit}^{-1} ( x^T \gamma + u_{i'}^{(l)}) \right\rbrace \exp(x^T\beta + v_{i'}^{(l)} + \frac{\sigma^2}{2}).

Longitudinal hurdle models 2

In a previous post I fit a longitudinal hurdle model using the brms package. library(brms) summary(fit_hurdle) ## Family: hurdle_lognormal ## Links: mu = identity; sigma = identity; hu = logit ## Formula: y ~ time + (1 | q | id) ## hu ~ time + (1 | q | id) ## Data: dat (Number of observations: 800) ## Samples: 4 chains, each with iter = 4000; warmup = 2000; thin = 1; ## total post-warmup samples = 8000 ## ## Group-Level Effects: ## ~id (Number of levels: 100) ## Estimate Est.

Longitudinal hurdle models

Data Recently, I have been modelling data that is longitudinal, contains excess zeros, and where the non-zero data is right-skewed and measured on a continuous scale, rather than being count data. I’ll simulate a semi-realistic example data set from a lognormal hurdle model. The “random effects” for the pr(zero) and non-zero parts of the model are negatively correlated. set.seed(180) ## 100 patients id <- 1:100 ## 2 timepoints time <- c("1", "2") ## random effects u <- rnorm(100, sd = 2) v <- rnorm(100, mean = -0.