How to calculate the log-rank statistic

Jun 3, 2019 2 min read

Suppose we have the following data.

df <- dplyr::tibble(patient_id = as.character(1:12), 
                    treatment = rep(c("C", "E"), each = 6),
                    survival_time = survival::Surv(time = c(2,6,8,11,17,24,7,9,13,22,23,25),
                                                   event = c(1,1,1,1,1,0,1,1,1,0,0,0)))

knitr::kable(df)

patient_id	treatment	survival_time
1	C	2
2	C	6
3	C	8
4	C	11
5	C	17
6	C	24+
7	E	7
8	E	9
9	E	13
10	E	22+
11	E	23+
12	E	25+

Let’s arrange the data in increasing order of survival time.

df_ordered <- dplyr::arrange(df, survival_time[,"time"])
knitr::kable(df_ordered)

patient_id	treatment	survival_time
1	C	2
2	C	6
7	E	7
3	C	8
8	E	9
4	C	11
9	E	13
5	C	17
10	E	22+
11	E	23+
6	C	24+
12	E	25+

Method 1: scores

The first step is to estimate the survival probability from the pooled data using the Nelson-Aalen estimator. Then, to get the log-rank scores, we add 1 to the logarithm of the pooled survival estimate for all observed events. For censored events we do not add 1.

pooled_fit <- survival::survfit(survival_time ~ 1, data = df_ordered)
df_logrank <- dplyr::mutate(df_ordered, 
                            pooled_s = exp(-cumsum(pooled_fit$n.event / pooled_fit$n.risk)),
                            logrank_score = log(pooled_s) + survival_time[,"status"])

knitr::kable(df_logrank, digits = 3)

patient_id	treatment	survival_time	pooled_s	logrank_score
1	C	2	0.920	0.917
2	C	6	0.840	0.826
7	E	7	0.760	0.726
3	C	8	0.680	0.615
8	E	9	0.600	0.490
4	C	11	0.520	0.347
9	E	13	0.440	0.180
5	C	17	0.361	-0.020
10	E	22+	0.361	-1.020
11	E	23+	0.361	-1.020
6	C	24+	0.361	-1.020
12	E	25+	0.361	-1.020

To calculate the logrank score statistic we add up all the scores on the control arm.

logrank_score <- dplyr::summarise(dplyr::group_by(df_logrank, treatment), sum(logrank_score))
logrank_score

## # A tibble: 2 x 2
##   treatment `sum(logrank_score)`
##   <chr>                    <dbl>
## 1 C                         1.66
## 2 E                        -1.66

Method 2: observed - expected

The score statistic is equivalent to the sum of “observed” - “expected” events at each event time. This is implemented in the survival package.

logrank_fit <- survival::survdiff(survival_time ~ treatment, data = df)
rbind(logrank_fit$n, logrank_fit$obs - logrank_fit$exp)[2,]

## treatment=C treatment=E 
##    1.664105   -1.664105

Dominic Magirr

Medical Statistician

Interested in the design, analysis and interpretation of clinical trials.