Causal interpretation of the hazard ratio in RCTs

Causal inference
Author

Dominic Magirr

Published

May 11, 2024

Motivation

A new set of articles on the “Causal interpretation of the hazard ratio in randomized clinical trials” (original article, commentary, response) has pushed me to write this blog post. It’s something I’ve been considering for a while, not because it’s the most important thing in the world, but because there seems to be persistent confusion, despite some great attempts to clarify.

Source of the confusion

The mistake (in my eyes) is to claim something like “the hazard ratio is not a causal effect” (in the context of an RCT) without first providing a precise definition of “causal effect”.

Definition A

One clean and precise definition of a causal effect is provided by Hernan & Robins, Technical Point 1.1:

In general, a population causal effect can be defined as a contrast of any functional of the marginal distributions of counterfactual outcomes under different actions or treatment values.

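The hazard ratio, \(hr(t)\), satisfies this definition, for any choice of \(t\): in a randomized trial, the hazard at time \(t\) in each arm is determined entirely by the marginal distribution of the counterfactual survival time under that arm. Therefore, if we follow this definition, then the hazard ratio is a causal effect. End of story. We can all move on.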
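To spell this out, here is a sketch in shorthand that is introduced here for illustration and not taken from the articles. Assume continuous event times, and write \(T^{a}\) for the counterfactual event time under arm \(a \in \{0, 1\}\), with marginal survival function \(S^{a}(t)\) and hazard \(h^{a}(t)\):

\[
S^{a}(t) = P(T^{a} > t), \qquad h^{a}(t) = -\frac{d}{dt}\log S^{a}(t), \qquad hr(t) = \frac{h^{1}(t)}{h^{0}(t)}.
\]

Each \(h^{a}(t)\) depends only on the marginal distribution of \(T^{a}\), so \(hr(t)\) is a contrast of functionals of the two marginal counterfactual distributions, which is all that Definition A asks for.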

Criticism of hazard ratios in loose language

“The hazards of hazard ratios” (Hernan, 2010) contains an example where the hazard ratio (hormone therapy vs control) after 5 years is less than 1 “even if hormone therapy has no truly preventive effect in any woman at any time.”

Interestingly, though, the fact that a hazard ratio can ostensibly point in favour of an experimental treatment despite no true benefit is not the grounds given for claiming that a hazard ratio lacks a causal interpretation. For example, the so-called “total effect” in a competing risks setting also has the property that it can be less than 1 (if it’s a ratio) despite no true benefit. Yet this is unequivocally a causal effect according to the authors.

Rather, the grounds given for claiming that the hazard ratio is not a causal effect are always stated in loose language, e.g., that it somehow does not compare equal groups, or that it has a so-called “in-built selection bias”. This is ambiguous. I would say that a hazard ratio at 5 years is a consequence of everything that happens to everyone in the first 5 years. It is not merely a comparison of the set of patients alive at 5 years, as if the patients who have died had never existed.
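To make this kind of example concrete, here is a minimal sketch of a toy population in which treatment is never beneficial for anyone, yet the hazard ratio at 5 years is below 1. It is not the example from “The hazards of hazard ratios”: the two frailty groups, the rates, and the 2-year harm window are all made-up assumptions, chosen only so that the arithmetic is easy to follow.

```python
import math

# All numbers below are made up for illustration.
BASE_RATE = 0.05    # hazard per year for a woman with frailty 1, untreated
HARM_FACTOR = 2.0   # treatment doubles each woman's hazard ...
HARM_UNTIL = 2.0    # ... during the first 2 years only; no effect afterwards
FRAILTIES = [(0.5, 0.8), (5.0, 0.2)]  # (frailty value, population share)


def cumulative_hazard(t, treated):
    """Cumulative hazard up to time t for a woman with frailty 1."""
    if not treated:
        return BASE_RATE * t
    early = HARM_FACTOR * BASE_RATE * min(t, HARM_UNTIL)
    late = BASE_RATE * max(t - HARM_UNTIL, 0.0)
    return early + late


def individual_multiplier(t, treated):
    """Factor applied to every woman's hazard at time t; it is never below 1."""
    return HARM_FACTOR if (treated and t < HARM_UNTIL) else 1.0


def marginal_hazard(t, treated):
    """Population hazard at t, i.e. density / survival averaged over frailty."""
    H = cumulative_hazard(t, treated)
    surv = sum(share * math.exp(-z * H) for z, share in FRAILTIES)
    dens = sum(share * z * math.exp(-z * H) for z, share in FRAILTIES)
    return BASE_RATE * individual_multiplier(t, treated) * dens / surv


def hazard_ratio(t):
    """hr(t): treated vs control, both computed on the full randomized arms."""
    return marginal_hazard(t, True) / marginal_hazard(t, False)


for year in (1, 5):
    print(f"hr({year}) = {hazard_ratio(year):.2f}")
```

With these numbers the hazard ratio is above 1 at year 1 and below 1 at year 5. The early harm removes more of the high-frailty women from the treated arm, so the later hazard in that arm is lower even though no individual hazard was ever reduced. Note that both arms' hazards are functions of the whole randomized population's survival distribution.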

Definition B

I’ve not seen a precise definition of a causal effect that would rule out the hazard ratio, so I’ll make one up here…

In general, a population causal effect can be defined as a contrast of any functional (except for a conditional expectation where the conditioning is on a post-baseline event) of the marginal distributions of counterfactual outcomes under different actions or treatment values.

If we follow this definition, then the hazard ratio is not (in general) a causal effect. End of story. We can all move on.

The reason for “in general” is that if the hazard ratio happens to be constant over time then its logarithm coincides with the difference in survival curves on the complementary log-log scale, which is a contrast of unconditional expectations (survival probabilities).
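A short derivation of that last point, assuming continuous event times and using shorthand introduced here for illustration: \(S^{a}(t) = P(T^{a} > t)\) is the marginal counterfactual survival function in arm \(a\), and \(h^{a}(t)\) the corresponding hazard. If \(h^{1}(t) = \theta\, h^{0}(t)\) for all \(t\), then

\[
-\log S^{1}(t) = \int_0^t h^{1}(s)\,ds = \theta \int_0^t h^{0}(s)\,ds = -\theta \log S^{0}(t),
\]

and taking logs of both sides gives

\[
\log\{-\log S^{1}(t)\} - \log\{-\log S^{0}(t)\} = \log \theta .
\]

Since \(S^{a}(t) = E[\mathbf{1}\{T^{a} > t\}]\) involves no conditioning on a post-baseline event, a constant hazard ratio would pass Definition B.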

Which definition is better?

Given that both the hazard ratio and total effect have the same property that they can be less than 1 despite no true benefit, I prefer Definition A to Definition B.
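For readers less familiar with the total effect, here is a minimal sketch of how it can fall below 1 with no benefit. I take the total effect to be the ratio of cumulative incidences of the event of interest, with competing events left to occur as they would. The constant cause-specific hazards are made-up numbers: treatment leaves the hazard of the event of interest untouched and only raises the hazard of the competing event.

```python
import math

# Made-up constant cause-specific hazards (per year).
EVENT_RATE = 0.10  # event of interest: identical in both arms
COMPETING_RATE = {"control": 0.05, "treated": 0.20}  # treatment raises only this


def cumulative_incidence(t, arm):
    """P(event of interest by time t) with constant cause-specific hazards."""
    total = EVENT_RATE + COMPETING_RATE[arm]
    return EVENT_RATE / total * (1.0 - math.exp(-total * t))


def total_effect_ratio(t):
    """Ratio of cumulative incidences, treated vs control."""
    return cumulative_incidence(t, "treated") / cumulative_incidence(t, "control")


print(f"total effect (risk ratio) at 5 years: {total_effect_ratio(5):.2f}")
```

The ratio comes out below 1 simply because more treated patients experience the competing event first, not because anyone was protected from the event of interest.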

Both the hazard ratio and the total effect require deep thought to understand. Neither has a straightforward interpretation. To use Definition B, and therefore make the distinction that the total effect is causal and the hazard ratio is not causal, is a pure technicality in my eyes.