The choice to define competing risk events as censoring events and implications for causal inference

Jessica G. Young (corresponding author: jessica_young@harvardpilgrim.org), Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, MA, USA

Eric J. Tchetgen Tchetgen, Department of Statistics, The Wharton School, University of Pennsylvania, PA, USA

Miguel A. Hernán, Departments of Epidemiology and Biostatistics, Harvard T.H. Chan School of Public Health, MA, USA

Abstract

In failure-time settings, a competing risk event is any event that makes it impossible for the event of interest to occur. Different analytical methods are available for estimating the effect of a treatment on a failure event of interest that is subject to competing events. The choice of method depends on whether or not competing events are defined as censoring events. Though such definition has key implications for the causal interpretation of a given estimate, explicit consideration of those implications has been rare in the statistical literature. As a result, confusion exists as to how to choose amongst available methods for analyzing data with competing events and how to interpret effect estimates. This confusion can be alleviated by understanding that the choice to define a competing event as a censoring event or not corresponds to a choice between different causal estimands. In this paper, we describe the assumptions required to identify those causal estimands and provide a mapping between such estimands and standard terminology from the statistical literature—in particular, the terms subdistribution function, subdistribution hazard and cause-specific hazard. We show that when the censoring process depends on measured time-varying risk factors, conventional statistical methods for competing events are not valid and alternative methods derived from Robins’s g-formula may recover the causal estimand of interest.

1 Introduction

In failure-time settings, a competing risk event is any event that makes it impossible for the event of interest to occur. For example, death from cancer is a competing event for stroke because an individual cannot have a stroke once they have died of cancer. In follow-up studies, competing events cannot be prevented by design and can occur even in randomized experiments with full adherence to the assigned treatment strategies and no losses to follow-up.

Different analytical methods are available for estimating the effect of a treatment on a failure event of interest that is subject to competing events, as discussed in reviews by previous authors (Gooley et al., 1999; Pintilie, 2007; Larouche et al., 2007; Wolbers and Koller, 2007; Andersen et al., 2012; Lau et al., 2015; Edwards et al., 2016). Several of these authors have noted that the choice of method depends on whether or not competing events are defined as censoring events. Though such definition has key implications for the causal interpretation of a given estimate, explicit consideration of those implications has been rare in the statistical literature. As a result, confusion exists as to how to choose amongst available methods for analyzing data with competing events and how to interpret effect estimates.

This confusion can be alleviated by understanding that the choice to define a competing event as a censoring event or not corresponds to a choice between different causal estimands. In this paper, we describe the assumptions required to identify those causal estimands and provide a mapping between such estimands and standard terminology from the statistical literature, in particular the terms subdistribution function, subdistribution hazard and cause-specific hazard. We show that when the censoring process depends on measured time-varying risk factors, conventional statistical methods for competing events are not valid and that, provided sufficient variables are measured, methods derived from Robins’s g-formula (Robins, 1986) may recover the causal estimand of interest.

2 Observed data structure

Consider a study in which each of $i=1,\ldots,n$ individuals is randomly assigned to either treatment $A=1$ or $A=0$ at baseline (individuals are assumed independent and identically distributed and thus we suppress an individual-specific subscript $i$). Let $k=0,1,\ldots,K+1$ denote equally spaced follow-up intervals (e.g. days, months) with interval $k=0$ corresponding to baseline and interval $k=K+1$ (e.g. 60 months post-baseline) corresponding to the administrative end of the study. Let $C_k$, $D_k$ and $Y_k$ denote indicators of loss to follow-up, a competing event (e.g. death from cancer) and the event of interest (e.g. stroke) by interval $k$, respectively. By definition $C_0 \equiv D_0 \equiv Y_0 \equiv 0$ (no individual has been lost to follow-up or has yet experienced the event of interest or the competing event at baseline).

Let $L_k$ denote a vector of time-varying individual characteristics measured in interval $k$ (e.g. smoking status, newly diagnosed disease). For simplicity, and with no loss of generality, we will restrict our discussion to a subset of individuals with the same vector of baseline covariate values (measured before $A$), e.g., males with no history of coronary heart disease and aged 60 at baseline. We assume the temporal ordering $(C_k, D_k, Y_k, L_k)$ within each follow-up interval.

We denote the history of a random variable using overbars; for example, $\bar{Y}_k=(Y_0,\ldots,Y_k)$ is the history of the event of interest through interval $k$. We denote the future of a random variable through the follow-up of interest using underbars; for example $\underline{Y}_k=(Y_k,\ldots,Y_{K+1})$. If an individual is lost to follow-up by $k$ ($C_k=1$) then all future indicators for both the event of interest (all components of $\underline{Y}_k$) and the competing event (all components of $\underline{D}_k$) will be unobserved. By contrast, if an individual is known to experience the competing event by interval $k$ without history of the event of interest ($Y_{k-1}=0$, $D_k=1$) then all future indicators for the event of interest $\underline{Y}_k$ will be observed and deterministically zero because, by definition, individuals who experience a competing event can never subsequently experience the event of interest.

3 Definitions of causal estimands

Suppose we are interested in the causal effect of assignment to treatment on the event of interest. This section considers two possible scenarios. First, we consider the special case in which the event of interest is not subject to competing events; i.e. , which will occur when the event of interest is death from any cause. Second, we consider the more general case in which competing events exist, which will occur when the event of interest is, for example, stroke and death from cancer is a competing event.

3.1 When competing events do not exist

To define the causal effect, we first need to define the counterfactual (or potential) outcome variables $Y_{k+1}^{a}$ for $a=0$ and $a=1$, with $k=0,\ldots,K$. For each individual, $Y_{k+1}^{a}$ is the indicator of the event by interval $k+1$ if the individual, possibly contrary to fact, had been assigned to $A=a$. In an experimental study without loss to follow-up, i.e., $C_{k+1}=0$ for all individuals, $Y_{k+1}^{a}$ will be observed for all individuals with $A=a$ and unobserved for individuals with $A\neq a$. We can define $\Pr[Y_{k+1}^{a}=1]$ as the probability of the event of interest by $k+1$ had all individuals in the population been assigned to $A=a$. We refer to this quantity as the counterfactual population risk of the event of interest by $k+1$ under assignment to $a$. Treatment assignment has a nonnull average causal effect on the risk of the event of interest by $k+1$ if and only if $\Pr[Y_{k+1}^{a=1}=1]\neq\Pr[Y_{k+1}^{a=0}=1]$.

In most studies, some individuals are lost to follow-up by a given time $k+1$. In a study with loss to follow-up, the counterfactual outcome of interest for time $k+1$ is more precisely understood as the indicator of the event by interval $k+1$ that would have been observed under $a$ in the absence of loss to follow-up, or $Y_{k+1}^{a,\bar{c}=0}$, which is indexed by both $a$ and $\bar{c}=0$, where $\bar{c}=0$ represents a hypothetical intervention that eliminates loss to follow-up (Hernán and Robins, 2018). We say that loss to follow-up is a censoring event because, for an individual who is lost to follow-up by $k+1$, we are prevented from observing $Y_{k+1}^{a,\bar{c}=0}$ for any level of $a$. Then the counterfactual risk by $k+1$ under $a$

$$\Pr[Y_{k+1}^{a,\bar{c}=0}=1] \tag{1}$$

is the risk that would have been observed if all individuals had been assigned to treatment $a$ and we had somehow eliminated loss to follow-up. Thus, in a study where some individuals are lost to follow-up, we can more precisely say that treatment has a nonnull average causal effect on the risk of the event of interest by $k+1$ if and only if $\Pr[Y_{k+1}^{a=1,\bar{c}=0}=1]\neq\Pr[Y_{k+1}^{a=0,\bar{c}=0}=1]$.

We can analogously define the discrete-time hazard of the event of interest in interval $k+1$ under $a$ and no loss to follow-up as

$$\Pr[Y_{k+1}^{a,\bar{c}=0}=1 \mid Y_{k}^{a,\bar{c}=0}=0] \tag{2}$$

Note, we will refer to (2) as a discrete-time hazard regardless of whether the underlying counterfactual failure time is discrete or continuous. That is, defining as the counterfactual time to failure from the event of interest under an intervention that sets to and eliminates loss to follow-up, we can equivalently write (2) as with interval defined by . This is a discrete-time hazard when is discrete with support at , and approximates a continuous-time hazard function when is continuous as intervals become increasingly small.

Unlike the risk (1), the hazard (2) is conditional on survival to , which may be affected by treatment . Therefore, does not necessarily imply that has a nonnull causal effect at . The hazards at may differ just because of differences in individuals who survive until under versus due to treatment effects before (Hernán et al., 2004; Hernán, 2010). See also Appendix A. For this reason, in contrast to counterfactual risk differences or risk ratios, we cannot always interpret counterfactual hazard ratios as causal effects even though they may be precisely defined contrasts of counterfactual quantities and may even be identifiable from the study data.
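
The selection mechanism described above can be illustrated numerically. The following Python sketch is not part of the original analysis: it assumes a hypothetical population split evenly between two unmeasured risk groups, with made-up hazards, in which treatment raises the first-interval hazard in the high-risk group only and leaves every second-interval hazard unchanged within each group.

```python
def population_hazards(group_weights, group_hazards):
    """Population-level discrete-time hazards implied by group-specific hazards.

    group_weights: dict mapping group name -> baseline proportion of the population
    group_hazards: dict mapping group name -> list of per-interval hazards
    (Both are hypothetical inputs chosen for illustration.)
    """
    n_intervals = len(next(iter(group_hazards.values())))
    surviving = dict(group_weights)  # proportion of the population still event-free, by group
    hazards = []
    for k in range(n_intervals):
        at_risk = sum(surviving.values())
        events = sum(surviving[g] * group_hazards[g][k] for g in surviving)
        hazards.append(events / at_risk)
        surviving = {g: surviving[g] * (1.0 - group_hazards[g][k]) for g in surviving}
    return hazards


weights = {"high_risk": 0.5, "low_risk": 0.5}
# Treatment affects only the first interval, and only in the high-risk group.
hazards_a0 = {"high_risk": [0.5, 0.5], "low_risk": [0.1, 0.1]}
hazards_a1 = {"high_risk": [0.9, 0.5], "low_risk": [0.1, 0.1]}

h_a0 = population_hazards(weights, hazards_a0)
h_a1 = population_hazards(weights, hazards_a1)
print("second-interval hazard, a=0:", round(h_a0[1], 3))
print("second-interval hazard, a=1:", round(h_a1[1], 3))
```

Under these assumed inputs the second-interval population hazard is about 0.243 without treatment and 0.14 with treatment, even though the second-interval hazard within each group is identical under both treatments. The difference arises only because treatment changes who survives the first interval.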

3.2 When competing events exist

When the event of interest (e.g. stroke) is subject to competing events (e.g. death from cancer), the causal effect of treatment assignment on the event of interest (i.e. the causal estimand) can be defined in different ways because the counterfactual outcomes can be defined in different ways. Let us consider two possibilities.

First, suppose we consider the counterfactual outcome where represents a hypothetical intervention that eliminates the competing event. Then the counterfactual risk by under

$$\Pr[Y_{k+1}^{a,\bar{c}=\bar{d}=0}=1] \tag{3}$$

is the risk that would have been observed if all individuals had been assigned to treatment and we had somehow eliminated both losses to follow-up and competing events. For this choice of counterfactual outcome, we say that competing events are censoring events because, for an individual who experiences a competing event by , we are prevented from observing for any level of . The corresponding counterfactual discrete-time hazard

$$\Pr[Y_{k+1}^{a,\bar{c}=\bar{d}=0}=1 \mid Y_{k}^{a,\bar{c}=\bar{d}=0}=0] \tag{4}$$

is the hazard of the event of interest at if all individuals had been assigned to treatment and we had somehow eliminated both loss to follow-up and competing events. For the counterfactual time to failure from the event of interest under an intervention that sets to and eliminates both loss to follow-up and competing events, we can equivalently write (4) as .

Second, suppose we consider the alternative counterfactual outcome , which does not entail any hypothetical intervention on competing events. Let be an indicator of the competing event by had, possibly contrary to fact, an individual received and was not lost to follow-up by with and . Any individual with has counterfactual outcome . Thus we are not precluded from observing this alternative choice of counterfactual outcome because an individual experiences a competing event by . Competing events are therefore not censoring events for this choice and the counterfactual risk by under

$$\Pr[Y_{k+1}^{a,\bar{c}=0}=1] \tag{5}$$

is as in (1) of Section 3.1; the risk if all individuals had been assigned to treatment and if we had somehow eliminated losses to follow-up with no intervention on competing events. However, when competing events exist there will be two different types of individuals with : those who ultimately survive both the event of interest and the competing event (e.g. stroke and death from other causes) and those who ultimately experience a competing event (e.g. death from another cause) by .

Those familiar with conventional statistical notation can see that the risk (5) can be alternatively represented as where, in the setting where competing events exist, we more generally define as the counterfactual time to failure from either the event of interest () or a competing event (), whichever comes first. Equivalence between and the risk (5) follows by , with the indicator function. In the statistical literature, this risk has been called either the subdistribution function or the cumulative incidence function (Kalbfleisch and Prentice, 1980; Fine and Gray, 1999) evaluated at . Here we denote this function explicitly by a counterfactual intervention where is set to and loss to follow-up is eliminated.

Analogously, the corresponding discrete-time hazard for this counterfactual

$$\Pr[Y_{k+1}^{a,\bar{c}=0}=1 \mid Y_{k}^{a,\bar{c}=0}=0] \tag{6}$$

is as in (2) of Section 3.1; the hazard of the event of interest at if all individuals had been assigned to treatment and we had somehow eliminated losses to follow-up but not competing events. The “risk set” of individuals at , i.e., those with , in this case comprises (i) those who have experienced neither the event of interest nor the competing event, and (ii) those who have not experienced the event of interest but have experienced the competing event by .

When competing events exist, expression (6) can alternatively be represented as . This alternative representation of (6) follows by and

and coincides with the subdistribution hazard for cause of the statistical literature in this counterfactual world.

We briefly note that two additional causal estimands are common in the competing risks literature. One common estimand is the hazard of the event of interest at among those who have not previously experienced the competing event. Under a counterfactual world where is set to and loss to follow-up is eliminated, this hazard can be written as

$$\Pr[Y_{k+1}^{a,\bar{c}=0}=1 \mid D_{k+1}^{a,\bar{c}=0}=Y_{k}^{a,\bar{c}=0}=0] \tag{7}$$

Alternatively written as , this quantity coincides with the cause-specific hazard for cause of the statistical literature. Similarly, relative to this same counterfactual world, the hazard of the competing event itself at among those who have not previously experienced the event of interest coincides with the cause-specific hazard for cause or

$$\Pr[D_{k+1}^{a,\bar{c}=0}=1 \mid D_{k}^{a,\bar{c}=0}=Y_{k}^{a,\bar{c}=0}=0] \tag{8}$$

which is alternatively written .
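
To make the difference between the two risk sets concrete, the following Python sketch computes empirical versions of the subdistribution hazard and the cause-specific hazard of the event of interest in a small data set with no loss to follow-up. The ten records are hypothetical, and the function name and record layout are assumptions made for illustration. Within each interval, the competing event is taken to be ascertained before the event of interest.

```python
def discrete_hazards(records, n_intervals):
    """Empirical discrete-time hazards of the event of interest.

    records: list of (interval, kind) pairs, one per individual, where kind is
             "Y" (event of interest), "D" (competing event) or None (neither
             event by the end of follow-up, with interval also None).
    Returns (subdistribution_hazards, cause_specific_hazards), one value per
    interval 1..n_intervals. Assumes every risk set is non-empty.
    """
    n = len(records)
    subdistribution, cause_specific = [], []
    for k in range(1, n_intervals + 1):
        y_before = sum(1 for t, e in records if e == "Y" and t < k)
        y_now = sum(1 for t, e in records if e == "Y" and t == k)
        d_by_now = sum(1 for t, e in records if e == "D" and t <= k)
        # Subdistribution risk set: no prior event of interest; individuals
        # who already had the competing event stay in the risk set.
        subdistribution.append(y_now / (n - y_before))
        # Cause-specific risk set: additionally free of the competing event.
        cause_specific.append(y_now / (n - y_before - d_by_now))
    return subdistribution, cause_specific


records = [
    (1, "Y"), (1, "D"), (1, "D"),
    (2, "Y"), (2, "Y"), (2, "D"),
    (3, "Y"),
    (None, None), (None, None), (None, None),
]
sub_h, cs_h = discrete_hazards(records, n_intervals=3)

# The cumulative incidence by interval 3 is recovered from the subdistribution hazards.
cumulative_incidence = 1.0
for h in sub_h:
    cumulative_incidence *= (1.0 - h)
cumulative_incidence = 1.0 - cumulative_incidence
```

In this toy data set the subdistribution hazards are 1/10, 2/9 and 1/7, the cause-specific hazards are 1/8, 1/3 and 1/4, and the cumulative incidence of the event of interest by the third interval is 4/10, the simple proportion of individuals who experienced it.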

When the estimand of interest is (7), it is now clear that competing events are not censoring events; they do not render the counterfactual outcome of interest unobserved. As before, it is known that for any individual previously experiencing a competing event without previously experiencing the event of interest. In the case of the estimand (7), competing events constitute conditioning events. Following arguments above, as competing events are post-treatment variables, a contrast in (7) under versus may be nonnull if treatment has a nonnull effect on the competing event before even if treatment has a null effect on the event of interest at all times. This compounds the previously identified problem with interpreting hazard ratios as causal effects (Hernán et al., 2004; Hernán, 2010). Also see Appendix A.

Finally, the other common estimand considered in the competing risks literature is the result of redefining as a composite outcome; i.e., an indicator of experiencing either the event of interest or the competing event by . This estimand, often considered in randomized trials, effectively eliminates competing events. However, it may profoundly alter the nature of the effect that is being estimated (Hernán et al., 2014).

3.3 Choosing between causal estimands when competing events exist

As we have just seen, the choice to consider a competing event as a censoring event or not is inherently linked to the choice of causal estimand. Competing events will be censoring events for counterfactual outcomes indexed by interventions on competing events. Competing events will not be censoring events for counterfactual outcomes that involve no intervention on competing events. The choice between these counterfactuals depends on both the scientific question of interest and the untestable assumptions we are willing to make. To fix ideas, say we were interested in the causal effect of treatment initiation on the risk of the event of interest (e.g. stroke) by on a difference scale.

First suppose we choose to define this risk as in (3). Then the risk difference

$$\Pr[Y_{k+1}^{a=1,\bar{c}=\bar{d}=0}=1]-\Pr[Y_{k+1}^{a=0,\bar{c}=\bar{d}=0}=1] \tag{9}$$

is a special case of a controlled direct effect (Robins and Greenland, 1992; VanderWeele, 2015). This estimand is only well-defined if there exist sufficiently well-defined interventions to eliminate losses to follow-up and competing events (e.g. death from cancer). While we might imagine a study in which we could eliminate loss to follow-up (e.g. by investing more financial resources for follow-up), it is difficult to imagine a study in which we could eliminate competing events (e.g. by preventing death from cancer). The fact that interventions are not well-defined affects the plausibility of the assumptions required to identify estimands like (3) or (9) in studies with censoring events. These assumptions are discussed in the next section.

Second, suppose we alternatively choose to define risk as in (5). Then the risk difference

$$\Pr[Y_{k+1}^{a=1,\bar{c}=0}=1]-\Pr[Y_{k+1}^{a=0,\bar{c}=0}=1] \tag{10}$$

is defined in a world, closer to ours, where competing events (e.g. death from cancer) may occur. Therefore we do not need to consider the existence of hypothetical interventions to prevent competing events. However, this causal effect must be interpreted with caution. One concern is that the magnitude of (10) depends on the distribution of the competing events in the study population. Thus, even if (10) were perfectly estimated (or even known) in one population, it may fail to apply to another (Hernán and VanderWeele, 2011); e.g. if there are more deaths from cancer in one population compared to the other.

A second concern is that misleading values of (10) may occur due to the fact that (10) captures the effect of treatment initiation on the event of interest (e.g. stroke) through all pathways, including possibly through the effect of on the competing event (e.g. death from another cause such as cancer). Consider the following extreme case. Suppose that, for all individuals in the study population, ; that is, all individuals would die immediately from another cause if, possibly contrary to fact, they received treatment . As it is impossible to have a stroke after death, it follows that none of these individuals would have a stroke if they received such that . Suppose further that, for all individuals, ; that is, no individual would die from another cause at any time during the study period if, possibly contrary to fact, they received . Therefore, some individuals may have a stroke under . If at least one individual would have a stroke then . It follows that, under this scenario, (10) will be negative.

However, this apparently protective effect of versus on risk of stroke, by this measure of risk, is explained by the fact that all individuals would immediately die from another cause after taking while no individual would die from another cause after taking . Simultaneous reporting of estimates of the effect of treatment on the competing event may help illuminate such problems. In our example, knowing , the causal risk difference for death from another cause under versus and elimination of loss to follow-up is positive, in conjunction with the negative value of (10) is essential before recommending treatment with versus to a patient in this population.
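
The extreme case above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the five-interval follow-up and the stroke hazard of 0.02 per interval are hypothetical numbers, and the hazards are taken to be discrete-time cause-specific hazards with the competing event ascertained before the event of interest within each interval.

```python
def risks_without_elimination(h_y, h_d):
    """Risks by the end of follow-up when competing events are allowed to occur.

    h_y[k]: hazard of the event of interest in interval k+1 among those free of
            both events so far and free of the competing event in interval k+1.
    h_d[k]: hazard of the competing event in interval k+1 among those free of
            both events so far.
    Returns (risk of the event of interest, risk of the competing event).
    """
    event_free = 1.0
    risk_y = 0.0
    risk_d = 0.0
    for hy, hd in zip(h_y, h_d):
        risk_d += event_free * hd
        risk_y += event_free * (1.0 - hd) * hy
        event_free *= (1.0 - hd) * (1.0 - hy)
    return risk_y, risk_d


n_intervals = 5
stroke_hazard = [0.02] * n_intervals  # hypothetical, identical under both treatments

# Treated: everyone dies from another cause in the first interval.
risk_y_a1, risk_d_a1 = risks_without_elimination(stroke_hazard, [1.0] + [0.0] * (n_intervals - 1))
# Untreated: no one dies from another cause during follow-up.
risk_y_a0, risk_d_a0 = risks_without_elimination(stroke_hazard, [0.0] * n_intervals)

stroke_risk_difference = risk_y_a1 - risk_y_a0  # negative
death_risk_difference = risk_d_a1 - risk_d_a0   # equal to 1
```

With these assumed inputs the stroke risk difference is negative although the stroke hazard among survivors is the same under both treatments, while the risk difference for death from another cause equals 1. Reporting both contrasts together exposes the mechanism behind the apparently protective effect.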

4 Identification of the causal estimands

Both counterfactual risks (3) and (5) are defined in terms of interventions that eliminate censoring events. Under definition (3), both loss to follow-up and competing events are censoring events. Under definition (5), only loss to follow-up is a censoring event. We will now give untestable assumptions under which we may identify these two versions of the counterfactual risk in data with both competing events and loss to follow-up. We begin with the case where both events are censoring events.

4.1 When loss to follow-up and competing events are censoring events

To identify the counterfactual risk (3) using only observed variables, we must make untestable assumptions. Specifically, for each , consider the following three identifying assumptions:

  1. Exchangeability 1:

    (11)

    where is some realization of . This assumption requires that, in addition to the baseline observed treatment, at each follow-up time, all forms of censoring are independent of future counterfactual outcomes had everyone followed and censoring were eliminated given the measured past. Because loss to follow-up and competing events cannot be randomly assigned by an investigator in practice, these assumptions will not hold by design, even in an experiment where is randomized.

  2. Positivity 1:

    (12)

    This assumption requires that, for any possibly observed level of treatment and covariate history amongst those remaining uncensored (here free of competing events and loss to follow-up) and free of the event of interest through , some individuals continue to remain uncensored through .

  3. Consistency 1:

    (13)

    This assumption requires that, if an individual has data consistent with the interventions indexing the counterfactual outcome of interest through , then her observed outcomes and covariates through equal her counterfactual outcomes and covariates under that intervention. The consistency assumption is generally unrealistic when the counterfactual outcome of interest is indexed by an ill-defined intervention such as (Robins and Greenland, 2000; VanderWeele, 2009). Further, as discussed by Hernán (2016), failure of consistency typically brings into question the plausibility of corresponding exchangeability and positivity assumptions.

Assumption (11) is represented by the causal directed acyclic graph (DAG) (Pearl, 1995) in Figure 1 for two arbitrary follow-up times. The assumption (11) holds in Figure 1 by (i) the unconditional absence of any unblocked backdoor paths between and as well as (ii) the absence of any such paths between and conditional only on ; and conditional only on , and ; and and conditional only on , , and . In Figure 1, (ii) is guaranteed by the absence of arrows from , an unmeasured risk factor for the event of interest, into , and . Note that we have omitted other arrows on the graph (e.g. an arrow from to ) to avoid clutter as adding any missing arrows from past into future measured variables will still preserve (i) and (ii).

Figure 1: A causal DAG representing observed data generating assumptions under which a causal effect on the scale of the counterfactual risk (3) may be identified.

Given assumptions (11), (12) and (13), the counterfactual risk (3) is identified by the following function of the observed data:

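$$\sum_{\bar{l}_K}\sum_{k=0}^{K}\Pr[Y_{k+1}=1\mid \bar{L}_k=\bar{l}_k,\bar{C}_{k+1}=\bar{D}_{k+1}=\bar{Y}_k=0,A=a]\prod_{j=0}^{k}\Big\{\Pr[Y_{j}=0\mid \bar{L}_{j-1}=\bar{l}_{j-1},\bar{C}_{j}=\bar{D}_{j}=\bar{Y}_{j-1}=0,A=a]\,f(l_j\mid \bar{l}_{j-1},\bar{C}_{j}=\bar{D}_{j}=\bar{Y}_{j}=0,a)\Big\} \tag{14}$$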

Expression (14) is the g-formula for the counterfactual risk (3) where is the discrete-time hazard of the event of interest in interval conditional on treatment and measured covariate history as well as remaining free of loss to follow-up and competing events. Similarly is the corresponding conditional density of . The proof of equivalence between (14) and (3), which follows directly from results given by Robins (1986, 1997), is reviewed in Appendix B.

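The g-formula (14) has several algebraically equivalent representations. For example, we can equivalently write (14) using the following inverse probability weighted (IPW) representation

$$\sum_{k=0}^{K}h_k^{1}(a)\prod_{j=0}^{k-1}\left[1-h_j^{1}(a)\right] \tag{15}$$

where

$$h_k^{1}(a)=\frac{E\left[Y_{k+1}(1-Y_k)W_k^{C}W_k^{D}\mid A=a\right]}{E\left[(1-Y_k)W_k^{C}W_k^{D}\mid A=a\right]} \tag{16}$$

and with

$$W_k^{C}=\prod_{j=0}^{k}\frac{I(C_{j+1}=0)}{\Pr[C_{j+1}=0\mid \bar{L}_j,\bar{C}_j=\bar{D}_j=\bar{Y}_j=0,A]}$$

and

$$W_k^{D}=\prod_{j=0}^{k}\frac{I(D_{j+1}=0)}{\Pr[D_{j+1}=0\mid \bar{L}_j,\bar{C}_{j+1}=\bar{D}_j=\bar{Y}_j=0,A]}$$

where $E[\cdot]$ denotes expectation and the denominators of the weights $W_k^{C}$ and $W_k^{D}$ denote the probabilities of remaining free of each type of censoring (loss to follow-up and competing events, respectively) by $k+1$ conditional on measured history. Note that, given the identifying assumptions (11), (12) and (13), the observed data function (16) itself identifies the counterfactual hazard (4). See Appendix B for proof.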

Our ability to represent the g-formula in different yet algebraically equivalent ways has implications for choices in estimating this function in high-dimensional settings. We will discuss this in Section 5.

4.2 When only loss to follow-up is a censoring event

Untestable assumptions are also required to identify the counterfactual risk (5) as a function of only observed variables. However, because only loss to follow-up (and not competing events) is a censoring event for causal effects defined on the scale of the counterfactual risk (5), assumptions required to identify such effects in data with competing events are generally weaker than those required to identify effects on the scale of the alternative counterfactual risk (3).

Specifically, for each , consider the following alternative versions of exchangeability, positivity and consistency:

  1. Exchangeability 2:

    (17)

    where may be viewed as a covariate like those in . This assumption requires that, in addition to the baseline observed treatment, at each follow-up time, censoring (here, only loss to follow-up) is independent of future counterfactual outcomes had everyone followed and censoring were eliminated given the measured past.

  2. Positivity 2:

    (18)

    Note that for any such that , this assumption holds by definition because, in this case, the probability of remaining uncensored by is 1 (individuals who fail from the competing event by are, by definition, not lost to follow-up).

  3. Consistency 2:

    (19)

    Note that no intervention is now defined on competing events.

The causal DAG in Figure 2 represents a possible observed data generating assumption for an arbitrary subset of the follow-up that is in line with these alternative identifying assumptions. The only difference between Figure 2 and Figure 1 is the former allows for the presence of arrows from the unmeasured risk factor for the event of interest into the competing event at each time ( and ). As discussed in the previous section, the presence of these arrows would violate assumption (11) rendering causal effects on the scale of the risk (3) unidentified. By contrast, the alternative assumption (17) is not precluded by the presence of these arrows. Figure 2 is consistent with assumption (17) by (i) the unconditional absence of any unblocked backdoor paths between and as well as (ii) the absence of such paths between and , conditional only on , and . The latter is guaranteed by the lack of an arrow from into , even when there are arrows from into competing events.

Figure 2: A causal DAG representing observed data generating assumptions under which a causal effect on the scale of the counterfactual risk (5) may be identified while an effect on the scale of the counterfactual risk (3) is not identified.

Given assumptions (17), (18) and (19), the counterfactual risk (5) is identified by the following alternative g-formula:

(20)

The proof of equivalence between the risk (5) and (20) given (17), (18), and (19) also follows from earlier results by Robins (1986, 1997) by defining as an implicit component of . See Appendix B.

Unlike , has a deterministic relationship with the event of interest. As a result, (20) is algebraically equivalent to the following somewhat simplified expression

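$$\sum_{\bar{l}_K}\sum_{k=0}^{K}\Pr[Y_{k+1}=1\mid \bar{L}_k=\bar{l}_k,\bar{C}_{k+1}=\bar{D}_{k+1}=\bar{Y}_k=0,A=a]\prod_{j=0}^{k}\Big\{\Pr[D_{j+1}=0\mid \bar{L}_j=\bar{l}_j,\bar{C}_{j+1}=\bar{D}_j=\bar{Y}_j=0,A=a]\,\Pr[Y_{j}=0\mid \bar{L}_{j-1}=\bar{l}_{j-1},\bar{C}_{j}=\bar{D}_{j}=\bar{Y}_{j-1}=0,A=a]\,f(l_j\mid \bar{l}_{j-1},\bar{C}_{j}=\bar{D}_{j}=\bar{Y}_{j}=0,a)\Big\} \tag{21}$$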

where is the discrete-time hazard of the competing event in interval conditional on treatment and covariate history as well as remaining free of loss to follow-up and the event of interest. This simplification follows because, given for any it will be the case that . Thus expression (21) makes explicit that all terms in the sum (20) over are zero when for any . Taubman et al. (2009) first gave the expression (21), generalized to the setting where interest is in the causal effects of sustaining different treatment strategies over time, rather than simply being randomized to different treatments at baseline.

Analogously, there are multiple different algebraically equivalent IPW representations of (21). One is:

(22)

where

(23)

and

Note that, given (17), (18) and (19), the IPW function (23) itself identifies the counterfactual hazard (6). See Appendix B.

A second algebraically equivalent IPW representation of (21) is:

(24)

where

(25)
(26)

and is defined as in Section 4.1. See Appendix B.

Following our discussion in Section 3.3, when using the counterfactual risk (5) to quantify the causal effect of treatment, it is important to also quantify the treatment effect on the competing event itself, which requires additional untestable assumptions given in Appendix B along with the corresponding g-formula for . Appendix B further shows that, given exchangeability, positivity and consistency assumptions for identification of along with the identifying assumptions (17), (18) and (19), the IP weighted functions (25) and (26) themselves identify the counterfactual cause-specific hazards (7) and (8), respectively.

Note, the data generating assumptions of Figure 2 are consistent with these assumptions; that is, consistent with assumptions under which causal treatment effects on the scale of both the counterfactual risk of the event of interest (5) and the risk of the competing event may be identified by (i) the unconditional absence of any unblocked backdoor paths between and ; and ; and and and (ii) the absence of such paths between and , as well as and , conditional only on , and .

5 Implications for choosing an analytic method

The results summarized in Sections 3 and 4 have immediate implications for choosing an appropriate analytic method when faced with competing events data. These results make explicit that our choice should be restricted to methods that can consistently estimate the identifying function that results from (i) our choice of counterfactual estimand and (ii) an explicit set of untestable assumptions linking that estimand to the variables we have measured.

In realistic high-dimensional settings where nonparametric estimation of the resulting identifying function is infeasible, several consistent estimators may be available for that function. These estimators differ only in how they handle the curse of dimensionality, via parametric models or other means of smoothing over areas where there is little or no data support. The various algebraically equivalent representations of the same identifying function that we considered in Section 4 can be used to motivate such estimation choices. However, we stress that these are last-stage choices that follow from first choosing an estimand and a set of identifying assumptions. In this section, we briefly relate existing analytic methods to the different identifying functions considered in Section 4, possibly under additional parametric constraints to deal with the curse of dimensionality.

5.1 G-methods

As above, the functions (14) and (21), respectively, correspond to two versions of the g-formula for risk by some follow-up had been set to and censoring been eliminated. Our results thus far have made clear that the choice between these functions will depend entirely on whether our causal estimand of interest is an effect on the scale of the counterfactual risk (3) or the counterfactual risk (5). Under the former, competing events are censoring events and under the latter they are not. Whichever function we choose via such substantive considerations, there still remains the problem of how to proceed with estimating such functions in a sample of data. Estimation proves challenging in realistic settings where contains many components and/or is large. When even one component of is a continuous variable, sums in these expressions must be replaced by integrals.

Methods that can be used to estimate the g-formula and associated contrasts in such practical settings are together called g-methods. The parametric g-formula (Robins, 1986) (interchangeably called g-computation or the plug-in g-formula (Hernán and Robins, 2018)) is one such method. To deal with the curse of dimensionality, this approach directly estimates, under parametric model constraints, the components of the g-formula expression such as that in (14) or (21), then uses Monte Carlo simulation based on the estimated conditional densities to approximate the high-dimensional sum/integral over all risk factor histories. A key distinction between expressions (14) and (21) is that the latter depends on knowledge of the discrete-time hazard of the competing event for each time index , while the former does not. Thus, a parametric g-formula estimator for the latter will rely on an estimate of this quantity while the former will not. See Logan et al. (2016) for a description of the parametric g-formula algorithm for estimating (14) versus (21) and associated contrasts (risk differences/risk ratios) along with SAS code examples.
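
The Monte Carlo step of the parametric g-formula can be sketched as follows. This Python sketch is not the algorithm of Logan et al. (2016) and is not taken from any package: it assumes a single hypothetical binary time-varying covariate, and the logistic coefficients below stand in for estimates that would in practice come from models fit to the study data. Covariate histories are simulated forward under a fixed treatment value and the risk is accumulated analytically along each simulated history. The only difference between the two versions of the g-formula in this sketch is whether the competing-event hazard enters the product.

```python
import math
import random


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def gformula_risk(a, n_intervals=5, n_sim=5000, eliminate_competing=False, seed=0):
    """Monte Carlo approximation of a g-formula risk under treatment value a.

    eliminate_competing=True : competing events treated as censoring events.
    eliminate_competing=False: no intervention on competing events.
    All coefficients are hypothetical placeholders for fitted model estimates.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        l_prev = 0        # covariate value in the previous interval
        event_free = 1.0  # probability of remaining free of both events so far
        risk = 0.0
        for _k in range(n_intervals):
            # Draw the covariate from its (assumed) conditional distribution.
            p_l = expit(-2.0 + 1.5 * l_prev - 0.5 * a)
            l_k = 1 if rng.random() < p_l else 0
            # Conditional hazards of the competing event and the event of interest.
            h_d = 0.0 if eliminate_competing else expit(-3.0 + 1.0 * l_k + 0.3 * a)
            h_y = expit(-3.5 + 0.8 * l_k - 0.4 * a)
            risk += event_free * (1.0 - h_d) * h_y
            event_free *= (1.0 - h_d) * (1.0 - h_y)
            l_prev = l_k
        total += risk
    return total / n_sim


risk_elim_a1 = gformula_risk(1, eliminate_competing=True)
risk_elim_a0 = gformula_risk(0, eliminate_competing=True)
risk_total_a1 = gformula_risk(1, eliminate_competing=False)
risk_total_a0 = gformula_risk(0, eliminate_competing=False)

print("risk difference, competing events eliminated:", risk_elim_a1 - risk_elim_a0)
print("risk difference, no intervention on competing events:", risk_total_a1 - risk_total_a0)
```

Because the same seed produces the same simulated covariate histories in both calls, the risk without intervention on competing events can never exceed the risk under elimination for a given treatment value in this sketch. A full analysis would also need models fit to data and a bootstrap or similar procedure for interval estimates.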

The parametric g-formula may require especially strong model assumptions in practice, particularly when the vector of measured risk factors contains continuous components or is otherwise high-dimensional. In Section 4, we considered algebraically equivalent IPW representations of both (14) and (21), which motivate alternative semiparametric estimators. These generally rely on different constraints for dealing with the curse of dimensionality.

Specifically, when the goal is to estimate causal treatment effects on the scale of the counterfactual risk (3), expression (15) suggests that an alternative to the parametric g-formula for estimating (14) in high-dimensional settings is an IPW estimator, with weights estimated via parametric assumptions or some other form of data dimension reduction on the denominators of the weights in that expression. In practice, an additional model constraint may be placed on the function (16), which may be necessary when treatment takes many levels and/or there are many measurement intervals. Such a model corresponds to a special case of a marginal structural model or MSM (Robins, 2000); e.g., the proportional hazards MSM:

(27)

Following arguments in Section 4, because (16) identifies the counterfactual hazard (4) under (11), (12), and (13), an IPW estimator (or any other consistent estimator) of the parameter of model (27) may be interpreted as an estimate of the counterfactual hazard ratio defined by (4) comparing two levels of treatment, provided these untestable assumptions hold and model (27) is correctly specified. In Appendix C, we describe an IPW algorithm for estimating effects on the scale of the counterfactual risk (3) via expression (15), possibly under an MSM constraint for (16). This approach follows previously described estimating equation methodology, with the competing event algorithmically treated like loss to follow-up (Hernán et al., 2000). In the special case where all models used to construct a parametric g-formula estimate and an IPW estimate of the g-formula (14) fit the data perfectly, the two approaches will give equivalent results.
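The idea of treating the competing event like loss to follow-up can be sketched in Python as a weighted discrete-time hazard calculation. This is a hedged illustration, not the Appendix C algorithm: it assumes the inverse probability weights have already been estimated elsewhere, imposes no MSM constraint, and uses a hypothetical input format in which each record is one person-interval from a single treatment arm, contributed only while the individual is still at risk and has had neither loss to follow-up nor a competing event.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def ipw_risk(
    person_intervals: Iterable[Tuple[int, int, float]], n_intervals: int
) -> float:
    """Weighted estimate of risk by the end of n_intervals intervals.

    Each record is (k, y, w): interval index k (0-based), indicator y of the
    event of interest in interval k, and a precomputed weight w (assumed to
    be supplied by the analyst). Person-intervals at or after censoring,
    including censoring by the competing event, are simply not supplied.
    """
    weighted_events = defaultdict(float)
    weighted_at_risk = defaultdict(float)
    for k, y, w in person_intervals:
        weighted_events[k] += w * y
        weighted_at_risk[k] += w

    survival = 1.0
    for k in range(n_intervals):
        if weighted_at_risk[k] == 0.0:
            raise ValueError(f"no weighted person-time at risk in interval {k}")
        hazard_k = weighted_events[k] / weighted_at_risk[k]  # weighted hazard
        survival *= 1.0 - hazard_k
    return 1.0 - survival
```

When every weight equals 1, the function reduces to the complement of a discrete-time product-limit estimate, which is one way to check an implementation before plugging in estimated weights.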

When the goal is to estimate causal treatment effects on the scale of the counterfactual risk (5), we considered two algebraically equivalent IPW representations of the g-formula (21), suggesting two different IPW estimators for the risk (5): one based on expression (22) and another based on expression (24). IPW estimators derived from (22) rely on a correctly specified model for the denominator of the weights in that expression along with, possibly, an MSM constraint for (23); e.g., the alternative proportional hazards MSM:

(28)

Following arguments in Section 4, an IPW estimator (or any other consistent estimator) of the parameter of model (28) may consistently estimate the counterfactual hazard ratio defined by (6) comparing different levels of treatment under the untestable assumptions (17), (18), and (19), provided model (28) is correctly specified. Following previous work by Bekaert et al. (2010), in Appendix C we describe an IPW algorithm for estimating treatment effects on the scale of the counterfactual risk (5) via expression (22), possibly under an MSM constraint for (23).

Alternative IPW estimators of the counterfactual risk (5) derived from the second IPW representation of the g-formula, (24), rely on a correctly specified model for the denominator of the weights in that expression along with, possibly, MSM constraints on the functions (25) and (26). IPW estimators of the parameters of MSMs for (25) and (26) have been described by Moodie et al. (2014); see also Appendix D of Young et al. (2018). In the special case where all models used to construct a parametric g-formula estimator and both of these IPW estimators of the g-formula (21) fit the data perfectly, the three approaches will give equivalent results. Finally, note that alternative estimators of expressions (14) and (21), including doubly robust estimators, may be derived by considering algebraically equivalent representations of the g-formula in terms of a series of nested conditional means (Robins and Rotnitzky, 1992; Scharfstein et al., 1999; van der Laan and Robins, 2003; Bang and Robins, 2005; van der Laan and Rose, 2011; Schnitzer et al., 2015).

5.2 Classical methods

Under the restricted special case where the set of measured risk factors is empty, well-known methods for survival analysis can be used in place of g-methods to estimate the functions considered in Section 4 (as we have implicitly restricted the population to one level of the baseline risk factors). For example, under this special case, the g-formula (14) reduces to

where

(29)

This restricted version of (14) can be estimated by the complement of the popular product-limit survival estimator (Kaplan and Meier, 1958). Following the identification arguments of Section 4, provided the untestable assumptions (11), (12), and (13) hold in this restricted case, a contrast of such estimators for different levels of treatment is a consistent estimator of the causal effect of treatment on the scale of the counterfactual risk (3). Such assumptions are consistent with the causal DAG of Figure 1 further restricted such that more arrows are removed; for example the arrows and .
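As a minimal Python sketch of the classical approach just described, the function below computes the complement of the product-limit estimate within one treatment arm, after competing events have been recoded as censoring. The function name, the event-type coding in the usage note, and the tie convention (individuals censored at a time are counted as at risk at that time) are assumptions made for illustration.

```python
from typing import Sequence


def km_risk(times: Sequence[float], status: Sequence[int], horizon: float) -> float:
    """Complement of the Kaplan-Meier survival estimate at `horizon`.

    times[i] is the follow-up time of individual i; status[i] is 1 if the
    event of interest occurred at that time and 0 if follow-up ended for any
    other reason, including a competing event (treated here as censoring).
    """
    if len(times) != len(status):
        raise ValueError("times and status must have the same length")
    event_times = sorted(
        {t for t, s in zip(times, status) if s == 1 and t <= horizon}
    )
    survival = 1.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        events = sum(1 for u, s in zip(times, status) if u == t and s == 1)
        survival *= 1.0 - events / at_risk
    return 1.0 - survival
```

For example, with a hypothetical coding of 1 for the event of interest, 2 for the competing event and 0 for loss to follow-up, the recoding is `status = [1 if e == 1 else 0 for e in event_type]`. A contrast of `km_risk` between treatment arms then corresponds to the contrast of estimators described above.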

Also in this special case, the IP weighted function (16) reduces to (29). Well-known partial likelihood methods (Cox, 1975) can be used to estimate the parameters of a model for a ratio defined in terms of the function in (29) for different levels of treatment, as a function of treatment level and possibly time; e.g., the proportional hazards model

(30)

Following our identification arguments, provided model (30) is correctly specified and given underlying identifying assumptions (11), (12), and (13) hold for the special case of , then any consistent estimator of