sunicate() performs inference about biomarkers' capacity to modify treatment effects in randomized controlled trials with right-censored, time-to-event outcomes. The strength of each biomarker's treatment effect modification is captured by an unknown variable importance parameter, defined as the slope of the univariate conditional average treatment effect's linear approximation. In all but pathological cases, the larger the absolute value of this parameter, the greater the treatment effect modification. sunicate() implements assumption-lean, cross-validated inference procedures about these variable importance parameters based on semiparametric theory. Assuming that the biomarkers have non-zero variance and that either the conditional survival or the conditional censoring function is well-estimated by its SuperLearner, estimates are generated by an unbiased and consistent estimator. Tests assessing whether these slope parameters are significantly different from zero are also performed; they are valid under the same non-zero variance condition, provided that the censoring mechanism is consistently estimated.
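
As a rough illustration of the parameter described above: the slope of the best (least-squares) linear approximation of a function of a biomarker equals their covariance divided by the biomarker's variance. The sketch below uses simulated biomarker values and a made-up univariate CATE, toy_cate(), standing in for a difference in survival probabilities at a fixed time. Both are assumptions for illustration; this is not how sunicate() estimates the parameter.

```r
set.seed(1)
n <- 10000
biomarker <- rnorm(n)

# Hypothetical univariate CATE: difference in survival probability
# (treatment minus control) as a nonlinear function of the biomarker.
toy_cate <- function(b) 0.3 * plogis(2 * b) - 0.1
cate_values <- toy_cate(biomarker)

# Slope of the best linear approximation of the CATE in the biomarker.
slope <- cov(biomarker, cate_values) / var(biomarker)

# The same quantity obtained from an ordinary least-squares fit.
ls_slope <- unname(coef(lm(cate_values ~ biomarker))[2])

print(c(slope = slope, ls_slope = ls_slope))
```

A slope further from zero indicates that the toy CATE changes more steeply with the biomarker, which is the sense in which the parameter measures treatment effect modification.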

sunicate(
  data,
  event,
  censor,
  relative_time,
  treatment,
  covariates,
  biomarkers,
  time_cutoff = NULL,
  cond_surv_haz_super_learner = NULL,
  cond_censor_haz_super_learner = NULL,
  propensity_score_ls,
  v_folds = 5L,
  parallel = FALSE
)

Arguments

data

A "wide" data.frame or tibble object containing the event indicator (status), the censoring indicator, the relative time of the event or censoring, the treatment indicator, and the covariates. The biomarkers must be a subset of the covariates, and there should be exactly one row per observation.

event

A character defining the name of the binary variable in the data argument that indicates whether an event occurred. Observations can have an event or be censored, but not both.

censor

A character defining the name of the binary variable in the data argument that indicates a right-censoring event. Observations can have an event or be censored, but not both.

relative_time

A character providing the name of the time variable in data.

treatment

A character indicating the name of the binary treatment variable in data.

covariates

A character vector listing the covariates in data.

biomarkers

A character vector listing the biomarkers of interest in data. biomarkers must be a subset of covariates.

time_cutoff

A numeric representing the time at which to assess the biomarkers' importance with respect to the outcome. If not specified, this value is set to the median value of the data argument's relative_time variable.

cond_surv_haz_super_learner

A Lrnr_sl object used to estimate the conditional event hazard model. If set to NULL, the default, an elastic net regression is used instead. It is best to use this default behaviour when analyzing small datasets.

cond_censor_haz_super_learner

A Lrnr_sl object used to estimate the conditional censoring hazard model. If set to NULL, the default, an elastic net regression is used instead. It is best to use this default behaviour when analyzing small datasets.

propensity_score_ls

A named numeric list providing the propensity scores for the treatment conditions. The first element of the list should correspond to the "treatment" condition, and the second to the "control" condition, whatever their names may be.

v_folds

A numeric indicating the number of folds used for V-fold cross-validation. Defaults to 5.

parallel

A logical determining whether to use origami's built-in parallelized cross-validation routines. This parallelization framework is built upon the future suite. Defaults to FALSE.
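
The requirements stated in the arguments above can be checked before calling sunicate(). The following is a minimal sketch in base R and is not part of the package: the toy data, the column names, the character coding of the treatment variable, and the helper check_sunicate_inputs() are all hypothetical and chosen only for illustration.

```r
# Toy "wide" data with one row per observation (hypothetical values).
toy_data <- data.frame(
  event = c(1, 0, 1, 0, 1, 0),
  censor = c(0, 1, 0, 1, 0, 1),
  relative_time = c(2.5, 4.0, 1.2, 3.3, 5.1, 2.8),
  treatment = c("treatment", "control", "treatment",
                "control", "treatment", "control"),
  age = c(61, 54, 70, 48, 66, 59),
  marker_a = c(0.3, -1.1, 0.8, 0.1, -0.4, 1.2),
  marker_b = c(1.4, 0.2, -0.5, 0.9, 0.0, -1.3),
  stringsAsFactors = FALSE
)
covariates <- c("age", "marker_a", "marker_b")
biomarkers <- c("marker_a", "marker_b")

# Hypothetical helper encoding the documented requirements.
check_sunicate_inputs <- function(data, event, censor, covariates, biomarkers) {
  stopifnot(
    all(c(event, censor, covariates) %in% names(data)),
    all(biomarkers %in% covariates),          # biomarkers are covariates
    all(data[[event]] %in% c(0, 1)),          # binary event indicator
    all(data[[censor]] %in% c(0, 1)),         # binary censoring indicator
    !any(data[[event]] == 1 & data[[censor]] == 1),  # never both
    all(vapply(data[biomarkers], stats::var, numeric(1)) > 0)  # non-zero variance
  )
  invisible(TRUE)
}
check_sunicate_inputs(toy_data, "event", "censor", covariates, biomarkers)

# Documented default for time_cutoff: the median of relative_time.
time_cutoff <- median(toy_data$relative_time)

# Propensity scores: "treatment" condition first, "control" second.
propensity_score_ls <- list(treatment = 0.5, control = 0.5)
stopifnot(
  setequal(names(propensity_score_ls), unique(toy_data$treatment)),
  abs(sum(unlist(propensity_score_ls)) - 1) < 1e-8
)

# With the package installed and a realistically sized dataset, these
# objects would be passed along as (not run here):
# sunicate(
#   data = toy_data, event = "event", censor = "censor",
#   relative_time = "relative_time", treatment = "treatment",
#   covariates = covariates, biomarkers = biomarkers,
#   time_cutoff = time_cutoff, propensity_score_ls = propensity_score_ls
# )
```

Six observations are far too few for a real analysis; the toy data only make the checks easy to follow.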

Value

A tibble with one row per specified biomarker. Each row contains an estimate of the treatment-modification variable importance parameter, its standard error, its z-score, and the nominal and adjusted p-values of the accompanying test. False discovery rate (FDR) and family-wise error rate (FWER) adjustments are performed using the Benjamini-Hochberg method and Holm's procedure, respectively. The biomarkers are ordered by significance.
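
To make the relationship between these columns concrete, the sketch below rebuilds a table of the same general shape from made-up estimates and standard errors. The column names and the two-sided z-test are assumptions for illustration, not the package's confirmed output; only the use of the Benjamini-Hochberg and Holm adjustments comes from the description above.

```r
# Hypothetical estimates and standard errors for three biomarkers.
results <- data.frame(
  biomarker = c("marker_a", "marker_b", "marker_c"),
  coef = c(0.30, -0.05, 0.12),
  se = c(0.10, 0.08, 0.06),
  stringsAsFactors = FALSE
)

# z-score and nominal two-sided p-value (assumed form of the test).
results$z <- results$coef / results$se
results$p_value <- 2 * pnorm(-abs(results$z))

# FDR control via Benjamini-Hochberg, FWER control via Holm.
results$p_value_bh <- p.adjust(results$p_value, method = "BH")
results$p_value_holm <- p.adjust(results$p_value, method = "holm")

# Order biomarkers by significance (smallest nominal p-value first).
results <- results[order(results$p_value), ]
print(results)
```

Holm's adjusted p-values are never smaller than the Benjamini-Hochberg ones, reflecting the stricter error criterion that FWER control imposes.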