Introduction to Chevron

Introduction

The chevron R package provides functions to produce the standard tables, listings and graphs (TLGs) used to analyze and report clinical trial data. The set of functions used to produce a particular output is stored in an S4 object of class chevron_tlg, also called a pipeline. Each standard output is associated with one pipeline. A pipeline contains the following objects:

* A main function, also referred to as a TLG-function.
* A preprocess function.
* A postprocess function.
* An adam_dataset character vector with the names of the ADaM datasets required to create the output.
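To make the structure concrete, here is a minimal sketch of an S4 object holding three functions and a character vector, together with a helper that calls them in order. The class toy_tlg, the helper toy_run and the plain-list toy_db are hypothetical stand-ins written in base R; they are not chevron's actual definitions, and the execution order (preprocess, then main, then postprocess) is an assumption based on the roles described above.

```r
library(methods)

# Hypothetical class for illustration only; not chevron's actual definition.
setClass(
  "toy_tlg",
  slots = c(
    main = "function",
    preprocess = "function",
    postprocess = "function",
    adam_datasets = "character"
  )
)

toy <- new(
  "toy_tlg",
  main = function(adam_db, ...) nrow(adam_db$adsl),
  preprocess = function(adam_db, ...) adam_db,
  postprocess = function(tlg, ...) tlg,
  adam_datasets = "adsl"
)

# Assumed order: preprocess, then main, then postprocess.
toy_run <- function(object, adam_db) {
  db <- object@preprocess(adam_db)
  out <- object@main(db)
  object@postprocess(out)
}

# A plain list of data frames stands in for the dm object.
toy_db <- list(adsl = data.frame(USUBJID = c("P1", "P2", "P3")))
toy_result <- toy_run(toy, toy_db)
```

Here toy_result is simply the number of rows in the toy adsl table; a real pipeline would return a table, listing or graph.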

TLG-functions

The TLG-functions in chevron use other packages to produce the final outputs, for example rtables and tern are used to create listings and tables, and ggplot2, lattice, and grid are used to create graphs.

TLG-functions in chevron such as dmt01_1_main, aet02_1_main, aet02_2_main have the following properties:

* They produce a narrowly defined output (currently standards in Roche GDS). Note that the naming convention <gds template id>_<i>_main indicates that a Roche GDS defined standard may have different implementations. Alternatively, a GDS template id can be regarded as a guideline and the function name in chevron as a standard.
* They have very few arguments to modify the standard. Generally, arguments may change the structure of the table (arm variable, which variables are summarized) but do not parameterize the cell content (e.g. the alpha level for a p-value).
* They always have adam_db as their first argument, which is the collection of ADaM datasets (ADSL, ADAE, ADRS, etc.). Please read the "The adam_db Argument" vignette in this package for more details.
* They have a .study argument; read the "The .study argument" vignette for more details.
* They have the ... argument to facilitate their incorporation in a pipeline.
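The argument conventions listed above can be sketched with a small stand-alone function. The function toy_main and the plain-list toy_db are hypothetical and use only base R; the pattern of defaulting armvar to .study$actualarm mirrors the signature of aet02_1_main printed later in this document, but the body is purely illustrative.

```r
# Hypothetical TLG-function; the name and body are illustrative only.
toy_main <- function(adam_db,
                     armvar = .study$actualarm,
                     .study = list(actualarm = "ACTARM"),
                     ...) {
  # Count subjects per arm; a real TLG-function would build a table layout.
  table(adam_db$adsl[[armvar]])
}

# A plain list of data frames stands in for the dm object.
toy_db <- list(
  adsl = data.frame(
    USUBJID = c("P1", "P2", "P3"),
    ACTARM = c("A", "A", "B"),
    ARM = c("A", "B", "B")
  )
)

default_counts <- toy_main(toy_db)                 # uses .study$actualarm
planned_counts <- toy_main(toy_db, armvar = "ARM") # explicit argument
study_counts <- toy_main(toy_db, .study = list(actualarm = "ARM"))
```

Because R evaluates default arguments lazily, passing a different .study list changes the default of armvar without touching the function body, while an explicit armvar still takes precedence.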

pre-processing

The pre-process functions in chevron use the dm and dunlin packages to process dm objects and turn them into suitable input for TLG-functions. The pre-processing step typically includes checks that ensure that the dm input can later be processed by the TLG-functions.

Pre-process functions in chevron such as dmt01_1_pre, aet02_1_pre, aet02_2_pre have the following properties:

1. They return a dm object amenable to processing by a TLG-function, or quickly return an understandable error message.
2. They have very few arguments to modify the standard.
3. They always have adam_db as their first argument, which is the collection of ADaM datasets (ADSL, ADAE, ADRS, etc.). Please read the "The adam_db Argument" vignette in this package for more details.
4. They can have the .study argument and other arguments of the corresponding TLG-function, as these may inform the pre-process function about the columns that will be required and facilitate the checking process.
5. They have the ... argument to facilitate their incorporation in a pipeline.
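As an illustration of the "return the data or fail early with a clear message" behavior, the following base R sketch checks that the columns needed downstream are present. The function toy_pre, its armvar argument and the plain-list inputs are hypothetical; chevron's own pre-process functions operate on dm objects and may perform different checks.

```r
# Hypothetical pre-process function; names and checks are illustrative only.
toy_pre <- function(adam_db, armvar = "ACTARM", ...) {
  required <- c("USUBJID", armvar)
  missing_cols <- setdiff(required, names(adam_db$adsl))
  if (length(missing_cols) > 0) {
    stop("adsl is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Return the (unchanged) data so the main function can consume it.
  adam_db
}

# Plain lists of data frames stand in for dm objects.
good_db <- list(adsl = data.frame(USUBJID = "P1", ACTARM = "A"))
bad_db <- list(adsl = data.frame(USUBJID = "P1"))

checked_db <- toy_pre(good_db)
error_message <- tryCatch(
  toy_pre(bad_db),
  error = function(e) conditionMessage(e)
)
```

Sharing the armvar argument with the main function lets the check target exactly the column the table will later use.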

post-processing

Post-processing functions are not provided but can be created by the user.

`adam_dataset`

The adam_dataset stores the name(s) of the datasets in the ADaM dm object that will be used in the process. This information is important when the pipeline is interfaced with other processes from the chevron ecosystem of packages such as citril.

Example `AET02`

For example, the GDS template aet02 is implemented in chevron by the pipelines named aet02_*. The object documentation, accessible with the help function (e.g. help('aet02_1')), describes what is particular to the _1 implementation.

We first define the data and put it in a dm object. You can find more details about this in the adam_db vignette.

library(dm)
#> 
#> Attaching package: 'dm'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(scda)
#> 

syn_data <- synthetic_cdisc_data("latest")

adam_study_data <- dm(adsl = syn_data$adsl, adae = syn_data$adae) %>%
  dm_add_pk(adsl, c("USUBJID", "STUDYID")) %>%
  dm_add_fk(adae, c("USUBJID", "STUDYID"), ref_table = "adsl") %>%
  dm_add_pk(adae, c("USUBJID", "STUDYID", "ASTDTM", "AETERM", "AESEQ"))

# validate_dm() was deprecated in dm 0.3.0; dm_validate() is its replacement.
dm_validate(adam_study_data)

The aet02_1 output is then created as follows:

library(chevron)
#> Registered S3 method overwritten by 'tern':
#>   method   from 
#>   tidy.glm broom
run(aet02_1, adam_study_data)
#> MedDRA System Organ Class                                     A: Drug X    B: Placebo   C: Combination
#>   MedDRA Preferred Term                                        (N=134)      (N=134)        (N=132)    
#> ——————————————————————————————————————————————————————————————————————————————————————————————————————
#> Total number of patients with at least one adverse event     100 (74.6%)   98 (73.1%)     103 (78%)   
#> Overall total number of events                                   502          480            604      
#> cl A.1                                                                                                
#>   Total number of patients with at least one adverse event   68 (50.7%)    58 (43.3%)     76 (57.6%)  
#>   Total number of events                                         115           99            137      
#>   dcd A.1.1.1.1                                              45 (33.6%)    31 (23.1%)     52 (39.4%)  
#>   dcd A.1.1.1.2                                              41 (30.6%)    39 (29.1%)     42 (31.8%)  
#> cl B.2                                                                                                
#>   Total number of patients with at least one adverse event   62 (46.3%)    56 (41.8%)     74 (56.1%)  
#>   Total number of events                                         102          106            127      
#>   dcd B.2.2.3.1                                              38 (28.4%)    40 (29.9%)     45 (34.1%)  
#>   dcd B.2.1.2.1                                              39 (29.1%)    34 (25.4%)     46 (34.8%)  
#> cl D.1                                                                                                
#>   Total number of patients with at least one adverse event   64 (47.8%)    54 (40.3%)     68 (51.5%)  
#>   Total number of events                                         106           84            114      
#>   dcd D.1.1.1.1                                              42 (31.3%)    32 (23.9%)     46 (34.8%)  
#>   dcd D.1.1.4.2                                              38 (28.4%)    34 (25.4%)     40 (30.3%)  
#> cl D.2                                                                                                
#>   Total number of patients with at least one adverse event   37 (27.6%)    46 (34.3%)     50 (37.9%)  
#>   Total number of events                                         49            57             65      
#>   dcd D.2.1.5.3                                              37 (27.6%)    46 (34.3%)     50 (37.9%)  
#> cl C.2                                                                                                
#>   Total number of patients with at least one adverse event   28 (20.9%)    36 (26.9%)     48 (36.4%)  
#>   Total number of events                                         39            40             57      
#>   dcd C.2.1.2.1                                              28 (20.9%)    36 (26.9%)     48 (36.4%)  
#> cl B.1                                                                                                
#>   Total number of patients with at least one adverse event   38 (28.4%)    37 (27.6%)     36 (27.3%)  
#>   Total number of events                                         44            43             50      
#>   dcd B.1.1.1.1                                              38 (28.4%)    37 (27.6%)     36 (27.3%)  
#> cl C.1                                                                                                
#>   Total number of patients with at least one adverse event   36 (26.9%)    34 (25.4%)     36 (27.3%)  
#>   Total number of events                                         47            51             54      
#>   dcd C.1.1.1.3                                              36 (26.9%)    34 (25.4%)     36 (27.3%)

The function associated with a particular slot can be retrieved with the corresponding method: get_main, get_preprocess and get_postprocess.

get_main(aet02_1)
#> function (adam_db, armvar = .study$actualarm, lbl_overall = .study$lbl_overall, 
#>     prune_0 = TRUE, deco = std_deco("AET02"), .study = list(actualarm = "ACTARM", 
#>         lbl_overall = NULL)) 
#> {
#>     dbsel <- get_db_data(adam_db, "adsl", "adae")
#>     lyt <- aet02_1_lyt(armvar = armvar, lbl_overall = lbl_overall, 
#>         deco = deco)
#>     tbl <- build_table(lyt, dbsel$adae, alt_counts_df = dbsel$adsl)
#>     if (prune_0) {
#>         tbl <- smart_prune(tbl)
#>     }
#>     tbl_sorted <- tbl %>% sort_at_path(path = c("AEBODSYS"), 
#>         scorefun = cont_n_allcols) %>% sort_at_path(path = c("AEBODSYS", 
#>         "*", "AEDECOD"), scorefun = score_occurrences)
#>     tbl_sorted
#> }
#> <bytecode: 0x556acaa4c1a0>
#> <environment: namespace:chevron>

These are standard functions that can be used on their own:

get_preprocess(aet02_1)(adam_study_data)
#> ── Metadata ────────────────────────────────────────────────────────────────────
#> Tables: `adsl`, `adae`
#> Columns: 148
#> Primary keys: 2
#> Foreign keys: 1

# or
foo <- aet02_1@preprocess
foo(adam_study_data)
#> ── Metadata ────────────────────────────────────────────────────────────────────
#> Tables: `adsl`, `adae`
#> Columns: 148
#> Primary keys: 2
#> Foreign keys: 1

Pipeline customization

In some instances it is useful to customize the pipeline, for instance by changing the pre-processing function. Be aware that you have to think carefully about argument names and compatibility with downstream functions.

aet02_1@preprocess <- function(adam_db) adam_db
get_preprocess(aet02_1)
#> function(adam_db) adam_db

Note that this operation creates a local version of the pipeline. The package version of the pipeline (accessible with chevron::aet02_1) remains unchanged.
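The copy behavior follows from how R handles S4 objects: assigning to a slot modifies a copy bound to the name in the current environment and leaves the original object untouched. The sketch below shows this with a hypothetical class toy_copy_tlg in base R; it is an analogy for the package object and your local copy, not chevron code.

```r
library(methods)

# Hypothetical class for illustration only; not chevron's actual definition.
setClass("toy_copy_tlg", slots = c(preprocess = "function"))

# Stands in for the object exported by the package.
original <- new(
  "toy_copy_tlg",
  preprocess = function(adam_db, ...) "original"
)

# Stands in for the local version created by the slot assignment.
modified <- original
modified@preprocess <- function(adam_db) "modified"

original_value <- original@preprocess(NULL)
modified_value <- modified@preprocess(NULL)
```

After the assignment, the two objects return different values, which confirms that only the copy was changed.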

Custom Pipeline creation

To create a pipeline from scratch, use the provided constructor:


my_pipeline <- chevron_tlg(
  main = aet02_1_main,
  preprocess = aet02_1_pre,
  postprocess = function(tlg, ...) {
    print(paste("Finished at", Sys.time()))
    tlg
  },
  adam_datasets = c("adsl", "adae")
)

run(my_pipeline, adam_study_data)
#> [1] "Finished at 2022-10-14 01:37:42"
#> MedDRA System Organ Class                                     A: Drug X    B: Placebo   C: Combination
#>   MedDRA Preferred Term                                        (N=134)      (N=134)        (N=132)    
#> ——————————————————————————————————————————————————————————————————————————————————————————————————————
#> Total number of patients with at least one adverse event     100 (74.6%)   98 (73.1%)     103 (78%)   
#> Overall total number of events                                   502          480            604      
#> cl A.1                                                                                                
#>   Total number of patients with at least one adverse event   68 (50.7%)    58 (43.3%)     76 (57.6%)  
#>   Total number of events                                         115           99            137      
#>   dcd A.1.1.1.1                                              45 (33.6%)    31 (23.1%)     52 (39.4%)  
#>   dcd A.1.1.1.2                                              41 (30.6%)    39 (29.1%)     42 (31.8%)  
#> cl B.2                                                                                                
#>   Total number of patients with at least one adverse event   62 (46.3%)    56 (41.8%)     74 (56.1%)  
#>   Total number of events                                         102          106            127      
#>   dcd B.2.2.3.1                                              38 (28.4%)    40 (29.9%)     45 (34.1%)  
#>   dcd B.2.1.2.1                                              39 (29.1%)    34 (25.4%)     46 (34.8%)  
#> cl D.1                                                                                                
#>   Total number of patients with at least one adverse event   64 (47.8%)    54 (40.3%)     68 (51.5%)  
#>   Total number of events                                         106           84            114      
#>   dcd D.1.1.1.1                                              42 (31.3%)    32 (23.9%)     46 (34.8%)  
#>   dcd D.1.1.4.2                                              38 (28.4%)    34 (25.4%)     40 (30.3%)  
#> cl D.2                                                                                                
#>   Total number of patients with at least one adverse event   37 (27.6%)    46 (34.3%)     50 (37.9%)  
#>   Total number of events                                         49            57             65      
#>   dcd D.2.1.5.3                                              37 (27.6%)    46 (34.3%)     50 (37.9%)  
#> cl C.2                                                                                                
#>   Total number of patients with at least one adverse event   28 (20.9%)    36 (26.9%)     48 (36.4%)  
#>   Total number of events                                         39            40             57      
#>   dcd C.2.1.2.1                                              28 (20.9%)    36 (26.9%)     48 (36.4%)  
#> cl B.1                                                                                                
#>   Total number of patients with at least one adverse event   38 (28.4%)    37 (27.6%)     36 (27.3%)  
#>   Total number of events                                         44            43             50      
#>   dcd B.1.1.1.1                                              38 (28.4%)    37 (27.6%)     36 (27.3%)  
#> cl C.1                                                                                                
#>   Total number of patients with at least one adverse event   36 (26.9%)    34 (25.4%)     36 (27.3%)  
#>   Total number of events                                         47            51             54      
#>   dcd C.1.1.1.3                                              36 (26.9%)    34 (25.4%)     36 (27.3%)

Note that to ensure the correct execution of the run function, the name of the first argument of the main function must be adam_db, the input dm object used to create the TLG output. The name of the first argument of the preprocess function must also be adam_db, the input dm object to pre-process. Finally, the name of the first argument of the postprocess function must be tlg, the input TableTree object to post-process. Validation criteria enforce these rules upon creation of a pipeline.
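A check of this kind can be written with formals(), which returns the formal arguments of a function. The helper first_arg_is below is hypothetical and only sketches the idea; chevron's own validation may be implemented differently. The argument name tlg is taken from the postprocess function used in the constructor example above.

```r
# Hypothetical helper; chevron's own validation may be implemented differently.
first_arg_is <- function(fun, expected) {
  arg_names <- names(formals(fun))
  length(arg_names) > 0 && identical(arg_names[1], expected)
}

ok_main <- first_arg_is(function(adam_db, ...) adam_db, "adam_db")
ok_post <- first_arg_is(function(tlg, ...) tlg, "tlg")
bad_post <- first_arg_is(function(x, ...) x, "tlg")
```

In an S4 class, a check like this would typically live in the validity function so that it runs whenever an object is created.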

Adrian Waddell

2022-10-14