See Magirr and Burman (2019) and Magirr (2021) for details about the weighted log-rank tests, and in particular the modestly weighted log-rank test. This vignette works through an example of using the package to simulate data and perform weighted log-rank tests. A summary of the formulas used within this package is presented.
This package can be used to simulate a dataset for a two-arm RCT with
delayed separation of survival curves by using the
sim_events_delay
function.
There are two parts to simulating the event times and statuses: the
event model (parameters defined in event_model
) and the
recruitment model (parameters defined in
recruitment_model
).
Firstly, looking at the event model. The function
sim_events_delay
assumes that the survival times on the
control and exponential arm follow a piecewise exponential distribution.
Given rate parameter λ, the
exponential distribution has the form:
f(t) = λexp (−λt)
The rate parameters are set using the argument lambda_c
for the control arm and lambda_e
for the experimental arm.
To use the piecewise version, set this argument a vector with a value
for each piece. The duration of each piece is set using parameter
duration_c
and duration_e
.
Secondly, looking at the recruitment model. The recruitment can be
modeled using either a power model or a piecewise constant model. See
help(sim_events_delay)
more details about these models.
Additionally, the sim_events_delay
function censors all
observations at the calendar time max_cal_t
.
Here we create a simulated dataset with 5 individuals on each arm.
Assume that one unit of time is equal to one month. From entering the
study until 6 months both arms have the same λ parameter, with a median event
time of 9 months. From 6 months, the experimental arm has a lower hazard
rate, with a median event time of 18 months. Setting
rec_period = 12
and rec_power = 1
means that
individuals are recruited at a uniform rate over 12 months.
library(nphRCT)
set.seed(1)
sim_data <- sim_events_delay(
event_model=list(
duration_c = 36,
duration_e = c(12 ,24),
lambda_c = log(2)/9,
lambda_e = c(log(2)/9,log(2)/18)),
recruitment_model=list(
rec_model="power",
rec_period=12,
rec_power=1),
n_c=5,
n_e=5,
max_cal_t = 36
)
sim_data
#> event_time event_status group
#> 1 18.06 1 control
#> 2 9.89 1 control
#> 3 16.07 1 control
#> 4 28.07 0 control
#> 5 13.69 1 control
#> 6 25.22 0 experimental
#> 7 24.66 0 experimental
#> 8 8.50 1 experimental
#> 9 4.37 1 experimental
#> 10 7.64 1 experimental
Now that we have simulated a dataset, we will look at performing the weighted log-rank tests.
Consider the ordered, distinct event times t1, …, tk. Let d0, j and d1, j be the number of events at event time tj on each of the arms respectively, and let dj be equal to the sum of these two values. Similarly, let n0, j and n1, j be the number at risk at event time tj on each of the arms respectively, and let nj be equal to the sum of these two values.
The function find_at_risk
can be used to calculate these
values for a dataset.
find_at_risk(formula=Surv(event_time,event_status)~group,
data=sim_data,
include_cens=FALSE)
#> t_j n_event_control n_event_experimental n_event n_risk_control
#> 1 4.37 0 1 1 5
#> 2 7.64 0 1 1 5
#> 3 8.50 0 1 1 5
#> 4 9.89 1 0 1 5
#> 5 13.69 1 0 1 4
#> 6 16.07 1 0 1 3
#> 7 18.06 1 0 1 2
#> n_risk_experimental n_risk
#> 1 5 10
#> 2 4 9
#> 3 3 8
#> 4 2 7
#> 5 2 6
#> 6 2 5
#> 7 2 4
Here, each row relates to the distinct event times tj, which are
specified in column t_j
. The value d0, j relates to
the column n_event_control
, d1, j to
n_event_experimental
, and dj to
n_event
. Similarly, n0, j relates to
column n_risk_control
, n1, j to
n_risk_experimental
, and nj to
n_risk
.
To calculate the test statistics for a weighted log-rank test, we need to evaluate the observed number of events on one arm, e.g. d0, j, and the expected number of events on the same arm, e.g. $d_j \frac{n_{0,j}}{n_j}$ at each tj. The test statistic UW is then a weighted sum (using weights wj) of the difference of these values:
$$ U^W = \sum_{j=1}^k w_j \left(d_{0,j} - d_j \frac{n_{0,j}}{n_j}\right) $$
The weights wj that are used depend on the type of weighted log-rank test, these are described next.
Three types of weighted log rank test are available in this package.
The values of the weights in the log-rank test can be calculated
using the function find_weights
with argument
method="lr"
. In the case of the standard log-rank test, the
weights are clearly very simple.
find_weights(formula=Surv(event_time,event_status)~group,
data=sim_data,
method="lr",
include_cens = FALSE)
#> [1] 1 1 1 1 1 1 1
Again the weights can be calculated using the
find_weights
function and setting method="fh"
,
along with arguments rho
and gamma
.
find_weights(formula=Surv(event_time,event_status)~group,
data=sim_data,
method="fh",
rho = 0,
gamma= 1,
include_cens = FALSE)
#> [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6
Under the null hypothesis that the survival curves of the two treatment arms are equal, the distribution of UW is
UW ∼ N(0, VW)
where the variance, VW, is equal to $$ \sum_{j=1}^k w_j^2\frac{n_{0,j}n_{1,j} d_j (n_j - d_j)}{n_j^2(n_j-1)} $$
The Z-statistic is then simply calculated in the usual way by dividing the test statistic U by the square root of its variance.
To perform the full weighted log-rank test, use the function
wlrt
. This outputs the test statistic, its variance, the
Z-statistic and the name of the treatment group the test corresponds
to.
Leton and Zuluaga (2001) showed that every weighted log-rank test can be written as either an observed-minus-expected test (as described above), or as a permutation test.
The weights can be reformulated as scores for a permutation test using the following formula for the censoring scores and event scores respectively:
$$ C_j=-\sum_{i=1}w_i\frac{d_i}{n_i} $$
cj = Cj + wj
These scores can be calculated using the function
find_scores
in the following way. Plotting these scores
against the rank of the event times provides an intuitive explanation of
the issues of using the Fleming-Harrington test as it makes sense that
the scores for the events are decreasind with time, see Magirr
(2021).
The issue of stratification when performing weighted log-rank tests is discussed in Magirr and Jiménez (2022). They explore various approaches to combining the results of stratified analyses. In particular they recommend combining on the Z-statistic scale, i.e. for the case of two strata, first express the stratified log-rank test as a linear combination of standardized Z-statistics, $\sqrt{V_1}Z_1+\sqrt{V_2}Z_2 \sim N(0,V_1+V_2)$. V1 and V2 are the variances for the log-rank test statistic on the first and second stratum respectively, and Z1 and Z2 are the Z-statistics for the log-rank test statistic on the first and second stratum respectively. Secondly, the Z-statistics Z1 and Z2 are replaced by the Z-statistics from the weighted log-rank test.
$$ \tilde{U}^W=\sqrt{V_1}\left( \frac{U_1^W}{\sqrt{V_1^W}}\right)+\sqrt{V_2}\left( \frac{U_2^W}{\sqrt{V_2^W}}\right) $$
Here we introduce a strata ecog
that has different λ parameters, and demonstrate that
it is simple to perform the described stratified weighted log-rank
test.
sim_data_0 <- sim_data
sim_data_0$ecog=0
sim_data_1 <- sim_events_delay(
event_model=list(
duration_c = 36,
duration_e = c(6,30),
lambda_c = log(2)/6,
lambda_e = c(log(2)/6,log(2)/12)),
recruitment_model=list(
rec_model="power",
rec_period=12,
rec_power=1),
n_c=5,
n_e=5,
max_cal_t = 36
)
sim_data_1$ecog=1
sim_data_strata<-rbind(sim_data_0,sim_data_1)
wlrt(formula=Surv(event_time,event_status)~group+strata(ecog),
data=sim_data_strata,
method="mw",
t_star = 4
)
#> $by_strata
#> strata u v_u z trt_group
#> 1 ecog0 0.1615079 1.647592 0.1258256 experimental
#> 2 ecog1 -2.2293871 2.386703 -1.4430662 experimental
#>
#> $combined
#> u v z trt_group
#> 1 -1.70296 3.316904 -0.9350569 experimental
Leton, E. and Zuluaga, P. (2001) Equivalence between score and weighted tests for survival curves. Commun Stat., 30(4), 591-608.
Magirr, D. (2021). Non-proportional hazards in immuno-oncology: Is an old perspective needed?. Pharmaceutical Statistics, 20(3), 512-527.
Magirr, D. and Burman, C.F., (2019). Modestly weighted logrank tests. Statistics in medicine, 38(20), 3782-3790.
Magirr, D. and Jiménez, J. (2022) Stratified modestly-weighted log-rank tests in settings with an anticipated delayed separation of survival curves PREPRINT at https://arxiv.org/abs/2201.10445