Skip to contents

Introduction

In this vignette we create a

  • demographic table
  • adverse event table
  • response table
  • time-to-event analysis table

using the rtables layout facility. That is, we demonstrate how the layout based tabulation framework can specify the structure and relations that are commonly found when analyzing clinical trials data.

Note that all the data is created using random number generators. All ex_* data which is currently attached to the rtables package is provided by the formatters package and was created using the publicly available random.cdisc.data R package.

The packages used in this vignette are:

Demographic Table

Demographic tables summarize the variables content for different population subsets (encoded in the columns).

One feature of analyze() that we have not introduced in the previous vignette is that the analysis function afun can specify multiple rows with the in_rows() function:

ADSL <- ex_adsl # Example ADSL dataset

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = "AGE", afun = function(x) {
    in_rows(
      "Mean (sd)" = rcell(c(mean(x), sd(x)), format = "xx.xx (xx.xx)"),
      "Range" = rcell(range(x), format = "xx.xx - xx.xx")
    )
  })

tbl <- build_table(lyt, ADSL)
tbl
#               A: Drug X      B: Placebo     C: Combination
# ——————————————————————————————————————————————————————————
# Mean (sd)   33.77 (6.55)    35.43 (7.90)     35.43 (7.72) 
# Range       21.00 - 50.00   21.00 - 62.00   20.00 - 69.00

Multiple variables can be analyzed in one analyze() call:

lyt2 <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(vars = c("AGE", "BMRKR1"), afun = function(x) {
    in_rows(
      "Mean (sd)" = rcell(c(mean(x), sd(x)), format = "xx.xx (xx.xx)"),
      "Range" = rcell(range(x), format = "xx.xx - xx.xx")
    )
  })

tbl2 <- build_table(lyt2, ADSL)
tbl2
#                 A: Drug X      B: Placebo     C: Combination
# ————————————————————————————————————————————————————————————
# AGE                                                         
#   Mean (sd)   33.77 (6.55)    35.43 (7.90)     35.43 (7.72) 
#   Range       21.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
# BMRKR1                                                      
#   Mean (sd)    5.97 (3.55)     5.70 (3.31)     5.62 (3.49)  
#   Range       0.41 - 17.67    0.65 - 14.24     0.17 - 21.39

Hence, if afun can process different data vector types (i.e. variables selected from the data) then we are fairly close to a standard demographic table. Here is a function that either creates a count table or some number summary if the argument x is a factor or numeric, respectively:

s_summary <- function(x) {
  if (is.numeric(x)) {
    in_rows(
      "n" = rcell(sum(!is.na(x)), format = "xx"),
      "Mean (sd)" = rcell(c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE)), format = "xx.xx (xx.xx)"),
      "IQR" = rcell(IQR(x, na.rm = TRUE), format = "xx.xx"),
      "min - max" = rcell(range(x, na.rm = TRUE), format = "xx.xx - xx.xx")
    )
  } else if (is.factor(x)) {
    vs <- as.list(table(x))
    do.call(in_rows, lapply(vs, rcell, format = "xx"))
  } else {
    stop("type not supported")
  }
}

Note we use rcell to wrap the results in order to add formatting instructions for rtables. We can use s_summary outside the context of tabulation:

s_summary(ADSL$AGE)
# RowsVerticalSection (in_rows) object print method:
# ----------------------------
#    row_name formatted_cell indent_mod row_label
# 1         n            400          0         n
# 2 Mean (sd)   34.88 (7.44)          0 Mean (sd)
# 3       IQR          10.00          0       IQR
# 4 min - max  20.00 - 69.00          0 min - max

and

s_summary(ADSL$SEX)
# RowsVerticalSection (in_rows) object print method:
# ----------------------------
#           row_name formatted_cell indent_mod        row_label
# 1                F            222          0                F
# 2                M            166          0                M
# 3                U              9          0                U
# 4 UNDIFFERENTIATED              3          0 UNDIFFERENTIATED

We can now create a commonly used variant of the demographic table:

summary_lyt <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  analyze(c("AGE", "SEX"), afun = s_summary)

summary_tbl <- build_table(summary_lyt, ADSL)
summary_tbl
#                        A: Drug X      B: Placebo     C: Combination
# ———————————————————————————————————————————————————————————————————
# AGE                                                                
#   n                       134             134             132      
#   Mean (sd)          33.77 (6.55)    35.43 (7.90)     35.43 (7.72) 
#   IQR                    11.00           10.00           10.00     
#   min - max          21.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
# SEX                                                                
#   F                       79              77               66      
#   M                       51              55               60      
#   U                        3               2               4       
#   UNDIFFERENTIATED         1               0               2

Note that analyze() can also be called multiple times in sequence:

summary_lyt2 <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  analyze("AGE", s_summary) %>%
  analyze("SEX", s_summary)

summary_tbl2 <- build_table(summary_lyt2, ADSL)
summary_tbl2
#                        A: Drug X      B: Placebo     C: Combination
# ———————————————————————————————————————————————————————————————————
# AGE                                                                
#   n                       134             134             132      
#   Mean (sd)          33.77 (6.55)    35.43 (7.90)     35.43 (7.72) 
#   IQR                    11.00           10.00           10.00     
#   min - max          21.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
# SEX                                                                
#   F                       79              77               66      
#   M                       51              55               60      
#   U                        3               2               4       
#   UNDIFFERENTIATED         1               0               2

which leads to the table identical to summary_tbl:

identical(summary_tbl, summary_tbl2)
# [1] TRUE

In clinical trials analyses the number of patients per column is often referred to as N (rather than the overall population which outside of clinical trials is commonly referred to as N). Column Ns are added by setting the show_colcounts argument in basic_table() to TRUE:

summary_lyt3 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARMCD") %>%
  analyze(c("AGE", "SEX"), s_summary)

summary_tbl3 <- build_table(summary_lyt3, ADSL)
summary_tbl3
#                          ARM A           ARM B           ARM C    
#                         (N=134)         (N=134)         (N=132)   
# ——————————————————————————————————————————————————————————————————
# AGE                                                               
#   n                       134             134             132     
#   Mean (sd)          33.77 (6.55)    35.43 (7.90)    35.43 (7.72) 
#   IQR                    11.00           10.00           10.00    
#   min - max          21.00 - 50.00   21.00 - 62.00   20.00 - 69.00
# SEX                                                               
#   F                       79              77              66      
#   M                       51              55              60      
#   U                        3               2               4      
#   UNDIFFERENTIATED         1               0               2

Variations on the Demographic Table

We will now show a couple of variations of the demographic table that we developed above. These variations are in structure and not in analysis, hence they don’t require a modification to the s_summary function.

We will start with a standard table analyzing the variables AGE and BMRKR2 variables:

lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  analyze(c("AGE", "BMRKR2"), s_summary)

tbl <- build_table(lyt, ADSL)
tbl
#                 A: Drug X      B: Placebo     C: Combination
#                  (N=134)         (N=134)         (N=132)    
# ————————————————————————————————————————————————————————————
# AGE                                                         
#   n                134             134             132      
#   Mean (sd)   33.77 (6.55)    35.43 (7.90)     35.43 (7.72) 
#   IQR             11.00           10.00           10.00     
#   min - max   21.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
# BMRKR2                                                      
#   LOW              50              45               40      
#   MEDIUM           37              56               42      
#   HIGH             47              33               50

Assume we would like to have this analysis carried out per gender encoded in the row space:

lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX") %>%
  analyze(c("AGE", "BMRKR2"), s_summary)

tbl <- build_table(lyt, ADSL)
tbl
#                      A: Drug X      B: Placebo     C: Combination
#                       (N=134)         (N=134)         (N=132)    
# —————————————————————————————————————————————————————————————————
# F                                                                
#   AGE                                                            
#     n                   79              77               66      
#     Mean (sd)      32.76 (6.09)    34.12 (7.06)     35.20 (7.43) 
#     IQR                9.00            8.00             6.75     
#     min - max      21.00 - 47.00   23.00 - 58.00   21.00 - 64.00 
#   BMRKR2                                                         
#     LOW                 26              21               26      
#     MEDIUM              21              38               17      
#     HIGH                32              18               23      
# M                                                                
#   AGE                                                            
#     n                   51              55               60      
#     Mean (sd)      35.57 (7.08)    37.44 (8.69)     35.38 (8.24) 
#     IQR                11.00           9.00            11.00     
#     min - max      23.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
#   BMRKR2                                                         
#     LOW                 21              23               11      
#     MEDIUM              15              18               23      
#     HIGH                15              14               26      
# U                                                                
#   AGE                                                            
#     n                    3               2               4       
#     Mean (sd)      31.67 (3.21)    31.00 (5.66)     35.25 (3.10) 
#     IQR                3.00            4.00             3.25     
#     min - max      28.00 - 34.00   27.00 - 35.00   31.00 - 38.00 
#   BMRKR2                                                         
#     LOW                  2               1               1       
#     MEDIUM               1               0               2       
#     HIGH                 0               1               1       
# UNDIFFERENTIATED                                                 
#   AGE                                                            
#     n                    1               0               2       
#     Mean (sd)       28.00 (NA)          NA          45.00 (1.41) 
#     IQR                0.00             NA              1.00     
#     min - max      28.00 - 28.00    Inf - -Inf     44.00 - 46.00 
#   BMRKR2                                                         
#     LOW                  1               0               2       
#     MEDIUM               0               0               0       
#     HIGH                 0               0               0

We will now subset ADSL to include only males and females in the analysis in order to reduce the number of rows in the table:

ADSL_M_F <- filter(ADSL, SEX %in% c("M", "F"))

lyt2 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX") %>%
  analyze(c("AGE", "BMRKR2"), s_summary)

tbl2 <- build_table(lyt2, ADSL_M_F)
tbl2
#                      A: Drug X      B: Placebo     C: Combination
#                       (N=130)         (N=132)         (N=126)    
# —————————————————————————————————————————————————————————————————
# F                                                                
#   AGE                                                            
#     n                   79              77               66      
#     Mean (sd)      32.76 (6.09)    34.12 (7.06)     35.20 (7.43) 
#     IQR                9.00            8.00             6.75     
#     min - max      21.00 - 47.00   23.00 - 58.00   21.00 - 64.00 
#   BMRKR2                                                         
#     LOW                 26              21               26      
#     MEDIUM              21              38               17      
#     HIGH                32              18               23      
# M                                                                
#   AGE                                                            
#     n                   51              55               60      
#     Mean (sd)      35.57 (7.08)    37.44 (8.69)     35.38 (8.24) 
#     IQR                11.00           9.00            11.00     
#     min - max      23.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
#   BMRKR2                                                         
#     LOW                 21              23               11      
#     MEDIUM              15              18               23      
#     HIGH                15              14               26      
# U                                                                
#   AGE                                                            
#     n                    0               0               0       
#     Mean (sd)           NA              NA               NA      
#     IQR                 NA              NA               NA      
#     min - max       Inf - -Inf      Inf - -Inf       Inf - -Inf  
#   BMRKR2                                                         
#     LOW                  0               0               0       
#     MEDIUM               0               0               0       
#     HIGH                 0               0               0       
# UNDIFFERENTIATED                                                 
#   AGE                                                            
#     n                    0               0               0       
#     Mean (sd)           NA              NA               NA      
#     IQR                 NA              NA               NA      
#     min - max       Inf - -Inf      Inf - -Inf       Inf - -Inf  
#   BMRKR2                                                         
#     LOW                  0               0               0       
#     MEDIUM               0               0               0       
#     HIGH                 0               0               0

Note that the UNDIFFERENTIATED and U levels still show up in the table. This is because tabulation respects the factor levels and level order, exactly as the split and table function do. If empty levels should be dropped then rtables needs to know that at splitting time via the split_fun argument in split_rows_by(). There are a number of predefined functions. For this example drop_split_levels() is required to drop the empty levels at splitting time. Splitting is a big topic and will be eventually addressed in a specific package vignette.

lyt3 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX", split_fun = drop_split_levels, child_labels = "visible") %>%
  analyze(c("AGE", "BMRKR2"), s_summary)

tbl3 <- build_table(lyt3, ADSL_M_F)
tbl3
#                   A: Drug X      B: Placebo     C: Combination
#                    (N=130)         (N=132)         (N=126)    
# ——————————————————————————————————————————————————————————————
# F                                                             
#   AGE                                                         
#     n                79              77               66      
#     Mean (sd)   32.76 (6.09)    34.12 (7.06)     35.20 (7.43) 
#     IQR             9.00            8.00             6.75     
#     min - max   21.00 - 47.00   23.00 - 58.00   21.00 - 64.00 
#   BMRKR2                                                      
#     LOW              26              21               26      
#     MEDIUM           21              38               17      
#     HIGH             32              18               23      
# M                                                             
#   AGE                                                         
#     n                51              55               60      
#     Mean (sd)   35.57 (7.08)    37.44 (8.69)     35.38 (8.24) 
#     IQR             11.00           9.00            11.00     
#     min - max   23.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
#   BMRKR2                                                      
#     LOW              21              23               11      
#     MEDIUM           15              18               23      
#     HIGH             15              14               26

In the table above the labels M and F are not very descriptive. You can add the full labels as follows:

ADSL_M_F_l <- ADSL_M_F %>%
  mutate(lbl_sex = case_when(
    SEX == "M" ~ "Male",
    SEX == "F" ~ "Female",
    SEX == "U" ~ "Unknown",
    SEX == "UNDIFFERENTIATED" ~ "Undifferentiated"
  ))

lyt4 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "visible") %>%
  analyze(c("AGE", "BMRKR2"), s_summary)

tbl4 <- build_table(lyt4, ADSL_M_F_l)
tbl4
#                   A: Drug X      B: Placebo     C: Combination
#                    (N=130)         (N=132)         (N=126)    
# ——————————————————————————————————————————————————————————————
# Female                                                        
#   AGE                                                         
#     n                79              77               66      
#     Mean (sd)   32.76 (6.09)    34.12 (7.06)     35.20 (7.43) 
#     IQR             9.00            8.00             6.75     
#     min - max   21.00 - 47.00   23.00 - 58.00   21.00 - 64.00 
#   BMRKR2                                                      
#     LOW              26              21               26      
#     MEDIUM           21              38               17      
#     HIGH             32              18               23      
# Male                                                          
#   AGE                                                         
#     n                51              55               60      
#     Mean (sd)   35.57 (7.08)    37.44 (8.69)     35.38 (8.24) 
#     IQR             11.00           9.00            11.00     
#     min - max   23.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
#   BMRKR2                                                      
#     LOW              21              23               11      
#     MEDIUM           15              18               23      
#     HIGH             15              14               26

For the next table variation we only stratify by gender for the AGE analysis. To do this the nested argument has to be set to FALSE in analyze() call:

lyt5 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "visible") %>%
  analyze("AGE", s_summary, show_labels = "visible") %>%
  analyze("BMRKR2", s_summary, nested = FALSE, show_labels = "visible")

tbl5 <- build_table(lyt5, ADSL_M_F_l)
tbl5
#                   A: Drug X      B: Placebo     C: Combination
#                    (N=130)         (N=132)         (N=126)    
# ——————————————————————————————————————————————————————————————
# Female                                                        
#   AGE                                                         
#     n                79              77               66      
#     Mean (sd)   32.76 (6.09)    34.12 (7.06)     35.20 (7.43) 
#     IQR             9.00            8.00             6.75     
#     min - max   21.00 - 47.00   23.00 - 58.00   21.00 - 64.00 
# Male                                                          
#   AGE                                                         
#     n                51              55               60      
#     Mean (sd)   35.57 (7.08)    37.44 (8.69)     35.38 (8.24) 
#     IQR             11.00           9.00            11.00     
#     min - max   23.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
# BMRKR2                                                        
#   LOW                47              44               37      
#   MEDIUM             36              56               40      
#   HIGH               47              32               49

Once we split the rows into groups (Male and Female here) one might want to summarize groups: usually by showing count and column percentages. This is especially important if we have missing data. For example, if we create the above table but add missing data to the AGE variable:

insert_NAs <- function(x) {
  x[sample(c(TRUE, FALSE), length(x), TRUE, prob = c(0.2, 0.8))] <- NA
  x
}

set.seed(1)
ADSL_NA <- ADSL_M_F_l %>%
  mutate(AGE = insert_NAs(AGE))

lyt6 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(
    "SEX",
    labels_var = "lbl_sex",
    split_fun = drop_split_levels,
    child_labels = "visible"
  ) %>%
  analyze("AGE", s_summary) %>%
  analyze("BMRKR2", s_summary, nested = FALSE, show_labels = "visible")

tbl6 <- build_table(lyt6, filter(ADSL_NA, SEX %in% c("M", "F")))
tbl6
#                 A: Drug X      B: Placebo     C: Combination
#                  (N=130)         (N=132)         (N=126)    
# ————————————————————————————————————————————————————————————
# Female                                                      
#   n                65              61               54      
#   Mean (sd)   32.71 (6.07)    34.33 (7.31)     34.61 (6.78) 
#   IQR             9.00            10.00            6.75     
#   min - max   21.00 - 47.00   23.00 - 58.00   21.00 - 54.00 
# Male                                                        
#   n                44              44               50      
#   Mean (sd)   35.66 (6.78)    36.93 (8.18)     35.64 (8.42) 
#   IQR             10.50           8.25            10.75     
#   min - max   24.00 - 48.00   21.00 - 58.00   20.00 - 69.00 
# BMRKR2                                                      
#   LOW              47              44               37      
#   MEDIUM           36              56               40      
#   HIGH             47              32               49

Here it is not easy to see how many females and males there are in each arm as n represents the number of non-missing data elements in the variables. Groups within rows that are defined by splitting can be summarized with summarize_row_groups(), for example:

lyt7 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels) %>%
  summarize_row_groups() %>%
  analyze("AGE", s_summary) %>%
  analyze("BMRKR2", afun = s_summary, nested = FALSE, show_labels = "visible")

tbl7 <- build_table(lyt7, filter(ADSL_NA, SEX %in% c("M", "F")))
tbl7
#                 A: Drug X      B: Placebo     C: Combination
#                  (N=130)         (N=132)         (N=126)    
# ————————————————————————————————————————————————————————————
# Female         79 (60.8%)      77 (58.3%)       66 (52.4%)  
#   n                65              61               54      
#   Mean (sd)   32.71 (6.07)    34.33 (7.31)     34.61 (6.78) 
#   IQR             9.00            10.00            6.75     
#   min - max   21.00 - 47.00   23.00 - 58.00   21.00 - 54.00 
# Male           51 (39.2%)      55 (41.7%)       60 (47.6%)  
#   n                44              44               50      
#   Mean (sd)   35.66 (6.78)    36.93 (8.18)     35.64 (8.42) 
#   IQR             10.50           8.25            10.75     
#   min - max   24.00 - 48.00   21.00 - 58.00   20.00 - 69.00 
# BMRKR2                                                      
#   LOW              47              44               37      
#   MEDIUM           36              56               40      
#   HIGH             47              32               49

There are a couple of things to note here:

  • Group summaries produce “content” rows. Visually, it’s impossible to distinguish data rows from content rows. Their difference is justified (and it’s an important design decision) because when we paginate tables the content rows are by default repeated if a group gets divided via pagination.
  • Conceptually the content rows summarize the patient population which is analyzed and hence are often the count & group percentages (default behavior of summarize_row_groups()).

We can recreate this default behavior (count percentage) by defining a cfun for illustrative purposes here as it results in the same table as above:

lyt8 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels) %>%
  summarize_row_groups(cfun = function(df, labelstr, .N_col, ...) {
    in_rows(
      rcell(nrow(df) * c(1, 1 / .N_col), format = "xx (xx.xx%)"),
      .labels = labelstr
    )
  }) %>%
  analyze("AGE", s_summary) %>%
  analyze("BEP01FL", afun = s_summary, nested = FALSE, show_labels = "visible")

tbl8 <- build_table(lyt8, filter(ADSL_NA, SEX %in% c("M", "F")))
tbl8
#                 A: Drug X      B: Placebo     C: Combination
#                  (N=130)         (N=132)         (N=126)    
# ————————————————————————————————————————————————————————————
# Female         79 (60.77%)     77 (58.33%)     66 (52.38%)  
#   n                65              61               54      
#   Mean (sd)   32.71 (6.07)    34.33 (7.31)     34.61 (6.78) 
#   IQR             9.00            10.00            6.75     
#   min - max   21.00 - 47.00   23.00 - 58.00   21.00 - 54.00 
# Male           51 (39.23%)     55 (41.67%)     60 (47.62%)  
#   n                44              44               50      
#   Mean (sd)   35.66 (6.78)    36.93 (8.18)     35.64 (8.42) 
#   IQR             10.50           8.25            10.75     
#   min - max   24.00 - 48.00   21.00 - 58.00   20.00 - 69.00 
# BEP01FL                                                     
#   Y                67              63               65      
#   N                63              69               61

Note that cfun, like afun (which is used in analyze()), can operate on either variables, passed via the x argument, or data.frames or tibbles, which are passed via the df argument (afun can optionally request df too). Unlike afun, cfun must accept labelstr as the second argument which gives the default group label (factor level from splitting) and hence it could be modified:

lyt9 <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "hidden") %>%
  summarize_row_groups(cfun = function(df, labelstr, .N_col, ...) {
    in_rows(
      rcell(nrow(df) * c(1, 1 / .N_col), format = "xx (xx.xx%)"),
      .labels = paste0(labelstr, ": count (perc.)")
    )
  }) %>%
  analyze("AGE", s_summary) %>%
  analyze("BEP01FL", s_summary, nested = FALSE, show_labels = "visible")

tbl9 <- build_table(lyt9, filter(ADSL_NA, SEX %in% c("M", "F")))
tbl9
#                           A: Drug X      B: Placebo     C: Combination
# ——————————————————————————————————————————————————————————————————————
# Female: count (perc.)    79 (60.77%)     77 (58.33%)     66 (52.38%)  
#   n                          65              61               54      
#   Mean (sd)             32.71 (6.07)    34.33 (7.31)     34.61 (6.78) 
#   IQR                       9.00            10.00            6.75     
#   min - max             21.00 - 47.00   23.00 - 58.00   21.00 - 54.00 
# Male: count (perc.)      51 (39.23%)     55 (41.67%)     60 (47.62%)  
#   n                          44              44               50      
#   Mean (sd)             35.66 (6.78)    36.93 (8.18)     35.64 (8.42) 
#   IQR                       10.50           8.25            10.75     
#   min - max             24.00 - 48.00   21.00 - 58.00   20.00 - 69.00 
# BEP01FL                                                               
#   Y                          67              63               65      
#   N                          63              69               61

Using Layouts

Layouts have a couple of advantages over tabulating the tables directly:

  • the creation of layouts requires the analyst to describe the problem in an abstract way
    • i.e. they separate the analyses description from the actual data
  • referencing variable names happens via strings (no non-standard evaluation (NSE) is needed, though this is arguably either a feature or a shortcoming)
  • layouts can be reused

Here is an example that demonstrates the reusability of layouts:

adsl_lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARM") %>%
  analyze(c("AGE", "SEX"), afun = s_summary)

adsl_lyt
# A Pre-data Table Layout
# 
# Column-Split Structure:
# ARM (lvls) 
# 
# Row-Split Structure:
# AGE:SEX (** multivar analysis **)

We can now build a table for ADSL

adsl_tbl <- build_table(adsl_lyt, ADSL)
adsl_tbl
#                        A: Drug X      B: Placebo     C: Combination
#                         (N=134)         (N=134)         (N=132)    
# ———————————————————————————————————————————————————————————————————
# AGE                                                                
#   n                       134             134             132      
#   Mean (sd)          33.77 (6.55)    35.43 (7.90)     35.43 (7.72) 
#   IQR                    11.00           10.00           10.00     
#   min - max          21.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
# SEX                                                                
#   F                       79              77               66      
#   M                       51              55               60      
#   U                        3               2               4       
#   UNDIFFERENTIATED         1               0               2

or for all patients that are older than 18:

adsl_f_tbl <- build_table(lyt, ADSL %>% filter(AGE > 18))
# Warning in min(x): no non-missing arguments to min; returning Inf
# Warning in max(x): no non-missing arguments to max; returning -Inf
adsl_f_tbl
#                      A: Drug X      B: Placebo     C: Combination
#                       (N=134)         (N=134)         (N=132)    
# —————————————————————————————————————————————————————————————————
# F                                                                
#   AGE                                                            
#     n                   79              77               66      
#     Mean (sd)      32.76 (6.09)    34.12 (7.06)     35.20 (7.43) 
#     IQR                9.00            8.00             6.75     
#     min - max      21.00 - 47.00   23.00 - 58.00   21.00 - 64.00 
#   BMRKR2                                                         
#     LOW                 26              21               26      
#     MEDIUM              21              38               17      
#     HIGH                32              18               23      
# M                                                                
#   AGE                                                            
#     n                   51              55               60      
#     Mean (sd)      35.57 (7.08)    37.44 (8.69)     35.38 (8.24) 
#     IQR                11.00           9.00            11.00     
#     min - max      23.00 - 50.00   21.00 - 62.00   20.00 - 69.00 
#   BMRKR2                                                         
#     LOW                 21              23               11      
#     MEDIUM              15              18               23      
#     HIGH                15              14               26      
# U                                                                
#   AGE                                                            
#     n                    3               2               4       
#     Mean (sd)      31.67 (3.21)    31.00 (5.66)     35.25 (3.10) 
#     IQR                3.00            4.00             3.25     
#     min - max      28.00 - 34.00   27.00 - 35.00   31.00 - 38.00 
#   BMRKR2                                                         
#     LOW                  2               1               1       
#     MEDIUM               1               0               2       
#     HIGH                 0               1               1       
# UNDIFFERENTIATED                                                 
#   AGE                                                            
#     n                    1               0               2       
#     Mean (sd)       28.00 (NA)          NA          45.00 (1.41) 
#     IQR                0.00             NA              1.00     
#     min - max      28.00 - 28.00    Inf - -Inf     44.00 - 46.00 
#   BMRKR2                                                         
#     LOW                  1               0               2       
#     MEDIUM               0               0               0       
#     HIGH                 0               0               0

Adverse Events

There are a number of different adverse event tables. We will now present two tables that show adverse events by ID and then by grade and by ID.

This time we won’t use the ADAE dataset from random.cdisc.data but rather generate a dataset on the fly (see Adrian’s 2016 Phuse paper):

set.seed(1)

lookup <- tribble(
  ~AEDECOD,                          ~AEBODSYS,                                         ~AETOXGR,
  "HEADACHE",                        "NERVOUS SYSTEM DISORDERS",                        "5",
  "BACK PAIN",                       "MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS", "2",
  "GINGIVAL BLEEDING",               "GASTROINTESTINAL DISORDERS",                      "1",
  "HYPOTENSION",                     "VASCULAR DISORDERS",                              "3",
  "FAECES SOFT",                     "GASTROINTESTINAL DISORDERS",                      "2",
  "ABDOMINAL DISCOMFORT",            "GASTROINTESTINAL DISORDERS",                      "1",
  "DIARRHEA",                        "GASTROINTESTINAL DISORDERS",                      "1",
  "ABDOMINAL FULLNESS DUE TO GAS",   "GASTROINTESTINAL DISORDERS",                      "1",
  "NAUSEA (INTERMITTENT)",           "GASTROINTESTINAL DISORDERS",                      "2",
  "WEAKNESS",                        "MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS", "3",
  "ORTHOSTATIC HYPOTENSION",         "VASCULAR DISORDERS",                              "4"
)

normalize <- function(x) x / sum(x)
weightsA <- normalize(c(0.1, dlnorm(seq(0, 5, length.out = 25), meanlog = 3)))
weightsB <- normalize(c(0.2, dlnorm(seq(0, 5, length.out = 25))))

N_pop <- 300
ADSL2 <- data.frame(
  USUBJID = seq(1, N_pop, by = 1),
  ARM = sample(c("ARM A", "ARM B"), N_pop, TRUE),
  SEX = sample(c("F", "M"), N_pop, TRUE),
  AGE = 20 + rbinom(N_pop, size = 40, prob = 0.7)
)

l.adae <- mapply(
  ADSL2$USUBJID,
  ADSL2$ARM,
  ADSL2$SEX,
  ADSL2$AGE,
  FUN = function(id, arm, sex, age) {
    n_ae <- sample(0:25, 1, prob = if (arm == "ARM A") weightsA else weightsB)
    i <- sample(seq_len(nrow(lookup)), size = n_ae, replace = TRUE, prob = c(6, rep(1, 10)) / 16)
    lookup[i, ] %>%
      mutate(
        AESEQ = seq_len(n()),
        USUBJID = id, ARM = arm, SEX = sex, AGE = age
      )
  },
  SIMPLIFY = FALSE
)

ADAE2 <- do.call(rbind, l.adae)
ADAE2 <- ADAE2 %>%
  mutate(
    ARM = factor(ARM, levels = c("ARM A", "ARM B")),
    AEDECOD = as.factor(AEDECOD),
    AEBODSYS = as.factor(AEBODSYS),
    AETOXGR = factor(AETOXGR, levels = as.character(1:5))
  ) %>%
  select(USUBJID, ARM, AGE, SEX, AESEQ, AEDECOD, AEBODSYS, AETOXGR)

ADAE2
# # A tibble: 3,118 × 8
#    USUBJID ARM     AGE SEX   AESEQ AEDECOD               AEBODSYS        AETOXGR
#      <dbl> <fct> <dbl> <chr> <int> <fct>                 <fct>           <fct>  
#  1       1 ARM A    45 F         1 NAUSEA (INTERMITTENT) GASTROINTESTIN… 2      
#  2       1 ARM A    45 F         2 HEADACHE              NERVOUS SYSTEM… 5      
#  3       1 ARM A    45 F         3 HEADACHE              NERVOUS SYSTEM… 5      
#  4       1 ARM A    45 F         4 HEADACHE              NERVOUS SYSTEM… 5      
#  5       1 ARM A    45 F         5 HEADACHE              NERVOUS SYSTEM… 5      
#  6       1 ARM A    45 F         6 HEADACHE              NERVOUS SYSTEM… 5      
#  7       1 ARM A    45 F         7 HEADACHE              NERVOUS SYSTEM… 5      
#  8       1 ARM A    45 F         8 HEADACHE              NERVOUS SYSTEM… 5      
#  9       1 ARM A    45 F         9 HEADACHE              NERVOUS SYSTEM… 5      
# 10       1 ARM A    45 F        10 FAECES SOFT           GASTROINTESTIN… 2      
# # ℹ 3,108 more rows

Adverse Events By ID

We start by defining an events summary function:

s_events_patients <- function(x, labelstr, .N_col) {
  in_rows(
    "Total number of patients with at least one event" =
      rcell(length(unique(x)) * c(1, 1 / .N_col), format = "xx (xx.xx%)"),
    "Total number of events" = rcell(length(x), format = "xx")
  )
}

So, for a population of 5 patients where

  • one patient has 2 AEs
  • one patient has 1 AE
  • three patients have no AEs

we would get the following summary:

s_events_patients(x = c("id 1", "id 1", "id 2"), .N_col = 5)
# RowsVerticalSection (in_rows) object print method:
# ----------------------------
#                                           row_name formatted_cell indent_mod
# 1 Total number of patients with at least one event     2 (40.00%)          0
# 2                           Total number of events              3          0
#                                          row_label
# 1 Total number of patients with at least one event
# 2                           Total number of events

The .N_col argument is a special keyword argument by which build_table() passes the population size for each respective column. For a list of keyword arguments for the functions passed to afun in analyze(), refer to the documentation with ?analyze.

We now use the s_events_patients summary function in a tabulation:

adae_lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARM") %>%
  analyze("USUBJID", s_events_patients)

adae_tbl <- build_table(adae_lyt, ADAE2)
adae_tbl
#                                                       ARM A         ARM B    
#                                                     (N=2060)       (N=1058)  
# —————————————————————————————————————————————————————————————————————————————
# Total number of patients with at least one event   114 (5.53%)   150 (14.18%)
# Total number of events                                2060           1058

Note that the column Ns are wrong as by default they are set to the number of rows per group (i.e. number of AEs per arm here). This also affects the percentages. For this table we are interested in the number of patients per column/arm which is usually taken from ADSL (var ADSL2 here).

rtables handles this by allowing us to override how the column counts are computed. We can specify an alt_counts_df in build_table(). When we do this, rtables calculates the column counts by applying the same column faceting to alt_counts_df as it does to the primary data during tabulation:

adae_adsl_tbl <- build_table(adae_lyt, ADAE2, alt_counts_df = ADSL2)
adae_adsl_tbl
#                                                       ARM A          ARM B    
#                                                      (N=146)        (N=154)   
# ——————————————————————————————————————————————————————————————————————————————
# Total number of patients with at least one event   114 (78.08%)   150 (97.40%)
# Total number of events                                 2060           1058

Alternatively, if the desired column counts are already calculated, they can be specified directly via the col_counts argument to build_table(), though specifying an alt_counts_df is the preferred mechanism (the number of rows will be used, but no duplicate checking!!!).

We next calculate this information per system organ class:

adae_soc_lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARM") %>%
  analyze("USUBJID", s_events_patients) %>%
  split_rows_by("AEBODSYS", child_labels = "visible", nested = FALSE) %>%
  summarize_row_groups("USUBJID", cfun = s_events_patients)

adae_soc_tbl <- build_table(adae_soc_lyt, ADAE2, alt_counts_df = ADSL2)
adae_soc_tbl
#                                                         ARM A          ARM B    
#                                                        (N=146)        (N=154)   
# ————————————————————————————————————————————————————————————————————————————————
# Total number of patients with at least one event     114 (78.08%)   150 (97.40%)
# Total number of events                                   2060           1058    
# GASTROINTESTINAL DISORDERS                                                      
#   Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
#   Total number of events                                 760            374     
# MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
#   Total number of patients with at least one event   98 (67.12%)    81 (52.60%) 
#   Total number of events                                 273            142     
# NERVOUS SYSTEM DISORDERS                                                        
#   Total number of patients with at least one event   113 (77.40%)   133 (86.36%)
#   Total number of events                                 787            420     
# VASCULAR DISORDERS                                                              
#   Total number of patients with at least one event   93 (63.70%)    75 (48.70%) 
#   Total number of events                                 240            122

We now have to add a count table of AEDECOD for each AEBODSYS. The default analyze() behavior for a factor is to create the count table per level (using rtab_inner):

adae_soc_lyt2 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARM") %>%
  split_rows_by("AEBODSYS", child_labels = "visible", indent_mod = 1) %>%
  summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
  analyze("AEDECOD", indent_mod = -1)

adae_soc_tbl2 <- build_table(adae_soc_lyt2, ADAE2, alt_counts_df = ADSL2)
adae_soc_tbl2
#                                                           ARM A          ARM B    
#                                                          (N=146)        (N=154)   
# ——————————————————————————————————————————————————————————————————————————————————
#   GASTROINTESTINAL DISORDERS                                                      
#     Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
#     Total number of events                                 760            374     
#     ABDOMINAL DISCOMFORT                                   113             65     
#     ABDOMINAL FULLNESS DUE TO GAS                          119             65     
#     BACK PAIN                                               0              0      
#     DIARRHEA                                               107             53     
#     FAECES SOFT                                            122             58     
#     GINGIVAL BLEEDING                                      147             71     
#     HEADACHE                                                0              0      
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                  152             62     
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                                0              0      
#   MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
#     Total number of patients with at least one event   98 (67.12%)    81 (52.60%) 
#     Total number of events                                 273            142     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                              135             75     
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                                0              0      
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                               138             67     
#   NERVOUS SYSTEM DISORDERS                                                        
#     Total number of patients with at least one event   113 (77.40%)   133 (86.36%)
#     Total number of events                                 787            420     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                               0              0      
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                               787            420     
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                                0              0      
#   VASCULAR DISORDERS                                                              
#     Total number of patients with at least one event   93 (63.70%)    75 (48.70%) 
#     Total number of events                                 240            122     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                               0              0      
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                                0              0      
#     HYPOTENSION                                            104             58     
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                136             64     
#     WEAKNESS                                                0              0

The indent_mod argument enables relative indenting changes if the tree structure of the table does not result in the desired indentation by default.

This table so far is however not the usual adverse event table as it counts the total number of events and not the number of subjects for one or more events for a particular term. To get the correct table we need to write a custom analysis function:

table_count_once_per_id <- function(df, termvar = "AEDECOD", idvar = "USUBJID") {
  x <- df[[termvar]]
  id <- df[[idvar]]

  counts <- table(x[!duplicated(id)])

  in_rows(
    .list = as.vector(counts),
    .labels = names(counts)
  )
}

table_count_once_per_id(ADAE2)
# RowsVerticalSection (in_rows) object print method:
# ----------------------------
#                         row_name formatted_cell indent_mod
# 1           ABDOMINAL DISCOMFORT             23          0
# 2  ABDOMINAL FULLNESS DUE TO GAS             21          0
# 3                      BACK PAIN             20          0
# 4                       DIARRHEA              7          0
# 5                    FAECES SOFT             11          0
# 6              GINGIVAL BLEEDING             15          0
# 7                       HEADACHE            100          0
# 8                    HYPOTENSION             16          0
# 9          NAUSEA (INTERMITTENT)             21          0
# 10       ORTHOSTATIC HYPOTENSION             14          0
# 11                      WEAKNESS             16          0
#                        row_label
# 1           ABDOMINAL DISCOMFORT
# 2  ABDOMINAL FULLNESS DUE TO GAS
# 3                      BACK PAIN
# 4                       DIARRHEA
# 5                    FAECES SOFT
# 6              GINGIVAL BLEEDING
# 7                       HEADACHE
# 8                    HYPOTENSION
# 9          NAUSEA (INTERMITTENT)
# 10       ORTHOSTATIC HYPOTENSION
# 11                      WEAKNESS

So the desired AE table is:

adae_soc_lyt3 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARM") %>%
  split_rows_by("AEBODSYS", child_labels = "visible", indent_mod = 1) %>%
  summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
  analyze("AEDECOD", afun = table_count_once_per_id, show_labels = "hidden", indent_mod = -1)

adae_soc_tbl3 <- build_table(adae_soc_lyt3, ADAE2, alt_counts_df = ADSL2)
adae_soc_tbl3
#                                                           ARM A          ARM B    
#                                                          (N=146)        (N=154)   
# ——————————————————————————————————————————————————————————————————————————————————
#   GASTROINTESTINAL DISORDERS                                                      
#     Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
#     Total number of events                                 760            374     
#     ABDOMINAL DISCOMFORT                                    24             28     
#     ABDOMINAL FULLNESS DUE TO GAS                           18             26     
#     BACK PAIN                                               0              0      
#     DIARRHEA                                                17             17     
#     FAECES SOFT                                             17             14     
#     GINGIVAL BLEEDING                                       18             25     
#     HEADACHE                                                0              0      
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                   20             20     
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                                0              0      
#   MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
#     Total number of patients with at least one event   98 (67.12%)    81 (52.60%) 
#     Total number of events                                 273            142     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                               58             45     
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                                0              0      
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                                40             36     
#   NERVOUS SYSTEM DISORDERS                                                        
#     Total number of patients with at least one event   113 (77.40%)   133 (86.36%)
#     Total number of events                                 787            420     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                               0              0      
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                               113            133     
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                                0              0      
#   VASCULAR DISORDERS                                                              
#     Total number of patients with at least one event   93 (63.70%)    75 (48.70%) 
#     Total number of events                                 240            122     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                               0              0      
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                                0              0      
#     HYPOTENSION                                             44             31     
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                 49             44     
#     WEAKNESS                                                0              0

Note that we are missing the overall summary in the first two rows. This can be added with an initial analyze() call.

adae_soc_lyt4 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARM") %>%
  analyze("USUBJID", afun = s_events_patients) %>%
  split_rows_by("AEBODSYS", child_labels = "visible", indent_mod = 1, section_div = "") %>%
  summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
  analyze("AEDECOD", table_count_once_per_id, show_labels = "hidden", indent_mod = -1)

adae_soc_tbl4 <- build_table(adae_soc_lyt4, ADAE2, alt_counts_df = ADSL2)
adae_soc_tbl4
#                                                           ARM A          ARM B    
#                                                          (N=146)        (N=154)   
# ——————————————————————————————————————————————————————————————————————————————————
# Total number of patients with at least one event       114 (78.08%)   150 (97.40%)
# Total number of events                                     2060           1058    
#   GASTROINTESTINAL DISORDERS                                                      
#     Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
#     Total number of events                                 760            374     
#     ABDOMINAL DISCOMFORT                                    24             28     
#     ABDOMINAL FULLNESS DUE TO GAS                           18             26     
#     BACK PAIN                                               0              0      
#     DIARRHEA                                                17             17     
#     FAECES SOFT                                             17             14     
#     GINGIVAL BLEEDING                                       18             25     
#     HEADACHE                                                0              0      
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                   20             20     
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                                0              0      
# 
#   MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
#     Total number of patients with at least one event   98 (67.12%)    81 (52.60%) 
#     Total number of events                                 273            142     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                               58             45     
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                                0              0      
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                                40             36     
# 
#   NERVOUS SYSTEM DISORDERS                                                        
#     Total number of patients with at least one event   113 (77.40%)   133 (86.36%)
#     Total number of events                                 787            420     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                               0              0      
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                               113            133     
#     HYPOTENSION                                             0              0      
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                 0              0      
#     WEAKNESS                                                0              0      
# 
#   VASCULAR DISORDERS                                                              
#     Total number of patients with at least one event   93 (63.70%)    75 (48.70%) 
#     Total number of events                                 240            122     
#     ABDOMINAL DISCOMFORT                                    0              0      
#     ABDOMINAL FULLNESS DUE TO GAS                           0              0      
#     BACK PAIN                                               0              0      
#     DIARRHEA                                                0              0      
#     FAECES SOFT                                             0              0      
#     GINGIVAL BLEEDING                                       0              0      
#     HEADACHE                                                0              0      
#     HYPOTENSION                                             44             31     
#     NAUSEA (INTERMITTENT)                                   0              0      
#     ORTHOSTATIC HYPOTENSION                                 49             44     
#     WEAKNESS                                                0              0

Finally, if we wanted to prune the 0 count rows we can do that with the trim_rows() function:

trim_rows(adae_soc_tbl4)
#                                                           ARM A          ARM B    
#                                                          (N=146)        (N=154)   
# ——————————————————————————————————————————————————————————————————————————————————
# Total number of patients with at least one event       114 (78.08%)   150 (97.40%)
# Total number of events                                     2060           1058    
#   GASTROINTESTINAL DISORDERS                                                      
#     Total number of patients with at least one event   114 (78.08%)   130 (84.42%)
#     Total number of events                                 760            374     
#     ABDOMINAL DISCOMFORT                                    24             28     
#     ABDOMINAL FULLNESS DUE TO GAS                           18             26     
#     DIARRHEA                                                17             17     
#     FAECES SOFT                                             17             14     
#     GINGIVAL BLEEDING                                       18             25     
#     NAUSEA (INTERMITTENT)                                   20             20     
# 
#   MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                                 
#     Total number of patients with at least one event   98 (67.12%)    81 (52.60%) 
#     Total number of events                                 273            142     
#     BACK PAIN                                               58             45     
#     WEAKNESS                                                40             36     
# 
#   NERVOUS SYSTEM DISORDERS                                                        
#     Total number of patients with at least one event   113 (77.40%)   133 (86.36%)
#     Total number of events                                 787            420     
#     HEADACHE                                               113            133     
# 
#   VASCULAR DISORDERS                                                              
#     Total number of patients with at least one event   93 (63.70%)    75 (48.70%) 
#     Total number of events                                 240            122     
#     HYPOTENSION                                             44             31     
#     ORTHOSTATIC HYPOTENSION                                 49             44

Pruning is a larger topic with a separate rtables package vignette.

Adverse Events By ID and By Grade

The adverse events table by ID and by grade shows how many patients had at least one adverse event per grade for different subsets of the data (e.g. defined by system organ class).

For this table we do not show the zero count grades. Note that we add the “overall” groups with a custom split function.

table_count_grade_once_per_id <- function(df,
                                          labelstr = "",
                                          gradevar = "AETOXGR",
                                          idvar = "USUBJID",
                                          grade_levels = NULL) {
  id <- df[[idvar]]
  grade <- df[[gradevar]]

  if (!is.null(grade_levels)) {
    stopifnot(all(grade %in% grade_levels))
    grade <- factor(grade, levels = grade_levels)
  }

  id_sel <- !duplicated(id)

  in_rows(
    "--Any Grade--" = sum(id_sel),
    .list = as.list(table(grade[id_sel]))
  )
}

table_count_grade_once_per_id(ex_adae, grade_levels = 1:5)
# RowsVerticalSection (in_rows) object print method:
# ----------------------------
#        row_name formatted_cell indent_mod     row_label
# 1 --Any Grade--            365          0 --Any Grade--
# 2             1            131          0             1
# 3             2             70          0             2
# 4             3             74          0             3
# 5             4             25          0             4
# 6             5             65          0             5

All of the layouting concepts needed to create this table have already been introduced so far:

adae_grade_lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARM") %>%
  analyze(
    "AETOXGR",
    afun = table_count_grade_once_per_id,
    extra_args = list(grade_levels = 1:5),
    var_labels = "- Any adverse events -",
    show_labels = "visible"
  ) %>%
  split_rows_by("AEBODSYS", child_labels = "visible", indent_mod = 1) %>%
  summarize_row_groups(cfun = table_count_grade_once_per_id, format = "xx", indent_mod = 1) %>%
  split_rows_by("AEDECOD", child_labels = "visible", indent_mod = -2) %>%
  analyze(
    "AETOXGR",
    afun = table_count_grade_once_per_id,
    extra_args = list(grade_levels = 1:5),
    show_labels = "hidden"
  )

adae_grade_tbl <- build_table(adae_grade_lyt, ADAE2, alt_counts_df = ADSL2)
adae_grade_tbl
#                                                      ARM A     ARM B 
#                                                     (N=146)   (N=154)
# —————————————————————————————————————————————————————————————————————
# - Any adverse events -                                               
#   --Any Grade--                                       114       150  
#   1                                                   32        34   
#   2                                                   22        30   
#   3                                                   11        21   
#   4                                                    8         6   
#   5                                                   41        59   
#   GASTROINTESTINAL DISORDERS                                         
#         --Any Grade--                                 114       130  
#         1                                             77        96   
#         2                                             37        34   
#         3                                              0         0   
#         4                                              0         0   
#         5                                              0         0   
#     ABDOMINAL DISCOMFORT                                             
#       --Any Grade--                                   68        49   
#       1                                               68        49   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     ABDOMINAL FULLNESS DUE TO GAS                                    
#       --Any Grade--                                   73        51   
#       1                                               73        51   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     BACK PAIN                                                        
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     DIARRHEA                                                         
#       --Any Grade--                                   68        40   
#       1                                               68        40   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     FAECES SOFT                                                      
#       --Any Grade--                                   76        44   
#       1                                                0         0   
#       2                                               76        44   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     GINGIVAL BLEEDING                                                
#       --Any Grade--                                   80        52   
#       1                                               80        52   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     HEADACHE                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     HYPOTENSION                                                      
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     NAUSEA (INTERMITTENT)                                            
#       --Any Grade--                                   83        50   
#       1                                                0         0   
#       2                                               83        50   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     ORTHOSTATIC HYPOTENSION                                          
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     WEAKNESS                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#   MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS                    
#         --Any Grade--                                 98        81   
#         1                                              0         0   
#         2                                             58        45   
#         3                                             40        36   
#         4                                              0         0   
#         5                                              0         0   
#     ABDOMINAL DISCOMFORT                                             
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     ABDOMINAL FULLNESS DUE TO GAS                                    
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     BACK PAIN                                                        
#       --Any Grade--                                   79        62   
#       1                                                0         0   
#       2                                               79        62   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     DIARRHEA                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     FAECES SOFT                                                      
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     GINGIVAL BLEEDING                                                
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     HEADACHE                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     HYPOTENSION                                                      
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     NAUSEA (INTERMITTENT)                                            
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     ORTHOSTATIC HYPOTENSION                                          
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     WEAKNESS                                                         
#       --Any Grade--                                   73        43   
#       1                                                0         0   
#       2                                                0         0   
#       3                                               73        43   
#       4                                                0         0   
#       5                                                0         0   
#   NERVOUS SYSTEM DISORDERS                                           
#         --Any Grade--                                 113       133  
#         1                                              0         0   
#         2                                              0         0   
#         3                                              0         0   
#         4                                              0         0   
#         5                                             113       133  
#     ABDOMINAL DISCOMFORT                                             
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     ABDOMINAL FULLNESS DUE TO GAS                                    
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     BACK PAIN                                                        
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     DIARRHEA                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     FAECES SOFT                                                      
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     GINGIVAL BLEEDING                                                
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     HEADACHE                                                         
#       --Any Grade--                                   113       133  
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                               113       133  
#     HYPOTENSION                                                      
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     NAUSEA (INTERMITTENT)                                            
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     ORTHOSTATIC HYPOTENSION                                          
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     WEAKNESS                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#   VASCULAR DISORDERS                                                 
#         --Any Grade--                                 93        75   
#         1                                              0         0   
#         2                                              0         0   
#         3                                             44        31   
#         4                                             49        44   
#         5                                              0         0   
#     ABDOMINAL DISCOMFORT                                             
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     ABDOMINAL FULLNESS DUE TO GAS                                    
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     BACK PAIN                                                        
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     DIARRHEA                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     FAECES SOFT                                                      
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     GINGIVAL BLEEDING                                                
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     HEADACHE                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     HYPOTENSION                                                      
#       --Any Grade--                                   66        43   
#       1                                                0         0   
#       2                                                0         0   
#       3                                               66        43   
#       4                                                0         0   
#       5                                                0         0   
#     NAUSEA (INTERMITTENT)                                            
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0   
#     ORTHOSTATIC HYPOTENSION                                          
#       --Any Grade--                                   70        54   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                               70        54   
#       5                                                0         0   
#     WEAKNESS                                                         
#       --Any Grade--                                    0         0   
#       1                                                0         0   
#       2                                                0         0   
#       3                                                0         0   
#       4                                                0         0   
#       5                                                0         0

Response Table

The response table that we will create here is composed of 3 parts:

  1. Binary response table
  2. Unstratified analysis comparison vs. control group
  3. Multinomial response table

Let’s start with the first part which is fairly simple to derive:

ADRS_BESRSPI <- ex_adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  mutate(
    rsp = factor(AVALC %in% c("CR", "PR"), levels = c(TRUE, FALSE), labels = c("Responders", "Non-Responders")),
    is_rsp = (rsp == "Responders")
  )

s_proportion <- function(x, .N_col) {
  in_rows(
    .list = lapply(
      as.list(table(x)),
      function(xi) rcell(xi * c(1, 1 / .N_col), format = "xx.xx (xx.xx%)")
    )
  )
}

rsp_lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARMCD", ref_group = "ARM A") %>%
  analyze("rsp", s_proportion, show_labels = "hidden")

rsp_tbl <- build_table(rsp_lyt, ADRS_BESRSPI)
rsp_tbl
#                       ARM A            ARM B             ARM C     
#                      (N=134)          (N=134)           (N=132)    
# ———————————————————————————————————————————————————————————————————
# Responders       114.00 (85.07%)   90.00 (67.16%)   120.00 (90.91%)
# Non-Responders   20.00 (14.93%)    44.00 (32.84%)    12.00 (9.09%)

Note that we did set the ref_group argument in split_cols_by() which for the current table had no effect as we only use the cell data for the responder and non-responder counts. The ref_group argument is needed for the part 2 and 3 of the table.

We will now look the implementation of part 2: unstratified analysis comparison vs. control group. Let’s start with the analysis function:

s_unstrat_resp <- function(x, .ref_group, .in_ref_col) {
  if (.in_ref_col) {
    return(in_rows(
      "Difference in Response Rates (%)" = rcell(numeric(0)),
      "95% CI (Wald, with correction)" = rcell(numeric(0)),
      "p-value (Chi-Squared Test)" = rcell(numeric(0)),
      "Odds Ratio (95% CI)" = rcell(numeric(0))
    ))
  }

  fit <- stats::prop.test(
    x = c(sum(x), sum(.ref_group)),
    n = c(length(x), length(.ref_group)),
    correct = FALSE
  )

  fit_glm <- stats::glm(
    formula = rsp ~ group,
    data = data.frame(
      rsp = c(.ref_group, x),
      group = factor(rep(c("ref", "x"), times = c(length(.ref_group), length(x))), levels = c("ref", "x"))
    ),
    family = binomial(link = "logit")
  )

  in_rows(
    "Difference in Response Rates (%)" = non_ref_rcell(
      (mean(x) - mean(.ref_group)) * 100,
      .in_ref_col,
      format = "xx.xx"
    ),
    "95% CI (Wald, with correction)" = non_ref_rcell(
      fit$conf.int * 100,
      .in_ref_col,
      format = "(xx.xx, xx.xx)"
    ),
    "p-value (Chi-Squared Test)" = non_ref_rcell(
      fit$p.value,
      .in_ref_col,
      format = "x.xxxx | (<0.0001)"
    ),
    "Odds Ratio (95% CI)" = non_ref_rcell(
      c(
        exp(stats::coef(fit_glm)[-1]),
        exp(stats::confint.default(fit_glm, level = .95)[-1, , drop = FALSE])
      ),
      .in_ref_col,
      format = "xx.xx (xx.xx - xx.xx)"
    )
  )
}

s_unstrat_resp(
  x = ADRS_BESRSPI %>% filter(ARM == "A: Drug X") %>% pull(is_rsp),
  .ref_group = ADRS_BESRSPI %>% filter(ARM == "B: Placebo") %>% pull(is_rsp),
  .in_ref_col = FALSE
)
# RowsVerticalSection (in_rows) object print method:
# ----------------------------
#                           row_name     formatted_cell indent_mod
# 1 Difference in Response Rates (%)              17.91          0
# 2   95% CI (Wald, with correction)      (7.93, 27.89)          0
# 3       p-value (Chi-Squared Test)             0.0006          0
# 4              Odds Ratio (95% CI) 2.79 (1.53 - 5.06)          0
#                          row_label
# 1 Difference in Response Rates (%)
# 2   95% CI (Wald, with correction)
# 3       p-value (Chi-Squared Test)
# 4              Odds Ratio (95% CI)

Hence we can now add the next vignette to the table:

rsp_lyt2 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARMCD", ref_group = "ARM A") %>%
  analyze("rsp", s_proportion, show_labels = "hidden") %>%
  analyze(
    "is_rsp", s_unstrat_resp,
    show_labels = "visible",
    var_labels = "Unstratified Response Analysis"
  )

rsp_tbl2 <- build_table(rsp_lyt2, ADRS_BESRSPI)
rsp_tbl2
#                                           ARM A              ARM B                ARM C       
#                                          (N=134)            (N=134)              (N=132)      
# ——————————————————————————————————————————————————————————————————————————————————————————————
# Responders                           114.00 (85.07%)     90.00 (67.16%)      120.00 (90.91%)  
# Non-Responders                       20.00 (14.93%)      44.00 (32.84%)       12.00 (9.09%)   
# Unstratified Response Analysis                                                                
#   Difference in Response Rates (%)                           -17.91                5.83       
#   95% CI (Wald, with correction)                        (-27.89, -7.93)       (-1.94, 13.61)  
#   p-value (Chi-Squared Test)                                 0.0006               0.1436      
#   Odds Ratio (95% CI)                                  0.36 (0.20 - 0.65)   1.75 (0.82 - 3.75)

Next we will add part 3: the multinomial response table. To do so, we are adding a row-split by response level, and then doing the same thing as we did for the binary response table above.

s_prop <- function(df, .N_col) {
  in_rows(
    "95% CI (Wald, with correction)" = rcell(binom.test(nrow(df), .N_col)$conf.int * 100, format = "(xx.xx, xx.xx)")
  )
}

s_prop(
  df = ADRS_BESRSPI %>% filter(ARM == "A: Drug X", AVALC == "CR"),
  .N_col = sum(ADRS_BESRSPI$ARM == "A: Drug X")
)
# RowsVerticalSection (in_rows) object print method:
# ----------------------------
#                         row_name formatted_cell indent_mod
# 1 95% CI (Wald, with correction) (49.38, 66.67)          0
#                        row_label
# 1 95% CI (Wald, with correction)

We can now create the final response table with all three parts:

rsp_lyt3 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARMCD", ref_group = "ARM A") %>%
  analyze("rsp", s_proportion, show_labels = "hidden") %>%
  analyze(
    "is_rsp", s_unstrat_resp,
    show_labels = "visible", var_labels = "Unstratified Response Analysis"
  ) %>%
  split_rows_by(
    var = "AVALC",
    split_fun = reorder_split_levels(neworder = c("CR", "PR", "SD", "NON CR/PD", "PD", "NE"), drlevels = TRUE),
    nested = FALSE
  ) %>%
  summarize_row_groups() %>%
  analyze("AVALC", afun = s_prop)

rsp_tbl3 <- build_table(rsp_lyt3, ADRS_BESRSPI)
rsp_tbl3
#                                           ARM A              ARM B                ARM C       
#                                          (N=134)            (N=134)              (N=132)      
# ——————————————————————————————————————————————————————————————————————————————————————————————
# Responders                           114.00 (85.07%)     90.00 (67.16%)      120.00 (90.91%)  
# Non-Responders                       20.00 (14.93%)      44.00 (32.84%)       12.00 (9.09%)   
# Unstratified Response Analysis                                                                
#   Difference in Response Rates (%)                           -17.91                5.83       
#   95% CI (Wald, with correction)                        (-27.89, -7.93)       (-1.94, 13.61)  
#   p-value (Chi-Squared Test)                                 0.0006               0.1436      
#   Odds Ratio (95% CI)                                  0.36 (0.20 - 0.65)   1.75 (0.82 - 3.75)
# CR                                     78 (58.2%)          55 (41.0%)           97 (73.5%)    
#   95% CI (Wald, with correction)     (49.38, 66.67)      (32.63, 49.87)       (65.10, 80.79)  
# PR                                     36 (26.9%)          35 (26.1%)           23 (17.4%)    
#   95% CI (Wald, with correction)     (19.58, 35.20)      (18.92, 34.41)       (11.38, 24.99)  
# SD                                     20 (14.9%)          44 (32.8%)           12 (9.1%)     
#   95% CI (Wald, with correction)      (9.36, 22.11)      (24.97, 41.47)       (4.79, 15.34)

In the case that we wanted to rename the levels of AVALC and remove the CI for NE we could do that as follows:

rsp_label <- function(x) {
  rsp_full_label <- c(
    CR = "Complete Response (CR)",
    PR = "Partial Response (PR)",
    SD = "Stable Disease (SD)",
    `NON CR/PD` = "Non-CR or Non-PD (NON CR/PD)",
    PD = "Progressive Disease (PD)",
    NE = "Not Evaluable (NE)",
    Missing = "Missing",
    `NE/Missing` = "Missing or unevaluable"
  )
  stopifnot(all(x %in% names(rsp_full_label)))
  rsp_full_label[x]
}


rsp_lyt4 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARMCD", ref_group = "ARM A") %>%
  analyze("rsp", s_proportion, show_labels = "hidden") %>%
  analyze(
    "is_rsp", s_unstrat_resp,
    show_labels = "visible", var_labels = "Unstratified Response Analysis"
  ) %>%
  split_rows_by(
    var = "AVALC",
    split_fun = keep_split_levels(c("CR", "PR", "SD", "PD"), reorder = TRUE),
    nested = FALSE
  ) %>%
  summarize_row_groups(cfun = function(df, labelstr, .N_col) {
    in_rows(nrow(df) * c(1, 1 / .N_col), .formats = "xx (xx.xx%)", .labels = rsp_label(labelstr))
  }) %>%
  analyze("AVALC", afun = s_prop) %>%
  analyze("AVALC", afun = function(x, .N_col) {
    in_rows(rcell(sum(x == "NE") * c(1, 1 / .N_col), format = "xx.xx (xx.xx%)"), .labels = rsp_label("NE"))
  }, nested = FALSE)

rsp_tbl4 <- build_table(rsp_lyt4, ADRS_BESRSPI)
rsp_tbl4
#                                           ARM A              ARM B                ARM C       
#                                          (N=134)            (N=134)              (N=132)      
# ——————————————————————————————————————————————————————————————————————————————————————————————
# Responders                           114.00 (85.07%)     90.00 (67.16%)      120.00 (90.91%)  
# Non-Responders                       20.00 (14.93%)      44.00 (32.84%)       12.00 (9.09%)   
# Unstratified Response Analysis                                                                
#   Difference in Response Rates (%)                           -17.91                5.83       
#   95% CI (Wald, with correction)                        (-27.89, -7.93)       (-1.94, 13.61)  
#   p-value (Chi-Squared Test)                                 0.0006               0.1436      
#   Odds Ratio (95% CI)                                  0.36 (0.20 - 0.65)   1.75 (0.82 - 3.75)
# Complete Response (CR)                 78 (58.21%)        55 (41.04%)          97 (73.48%)    
#   95% CI (Wald, with correction)     (49.38, 66.67)      (32.63, 49.87)       (65.10, 80.79)  
# Partial Response (PR)                  36 (26.87%)        35 (26.12%)          23 (17.42%)    
#   95% CI (Wald, with correction)     (19.58, 35.20)      (18.92, 34.41)       (11.38, 24.99)  
# Stable Disease (SD)                    20 (14.93%)        44 (32.84%)           12 (9.09%)    
#   95% CI (Wald, with correction)      (9.36, 22.11)      (24.97, 41.47)       (4.79, 15.34)   
# Progressive Disease (PD)                0 (0.00%)          0 (0.00%)            0 (0.00%)     
#   95% CI (Wald, with correction)      (0.00, 2.72)        (0.00, 2.72)         (0.00, 2.76)   
# Not Evaluable (NE)                    0.00 (0.00%)        0.00 (0.00%)         0.00 (0.00%)

Note that the table is missing the rows gaps to make it more readable. The row spacing feature is on the rtables roadmap and will be implemented in future.

Time to Event Analysis Table

The time to event analysis table that will be constructed consists of four parts:

  1. Overall subject counts
  2. Censored subjects summary
  3. Cox proportional-hazards analysis
  4. Time-to-event analysis

The table is constructed by sequential use of the analyze() function, with four custom analysis functions corresponding to each of the four parts listed above. In addition the table includes referential footnotes relevant to the table contents. The table will be faceted column-wise by arm.

First we will start by loading the necessary packages and preparing the data to be used in the construction of this table.

library(survival)

adtte <- ex_adaette %>%
  dplyr::filter(PARAMCD == "AETTE2", SAFFL == "Y")

# Add censoring to data for example
adtte[adtte$AVAL > 1.0, ] <- adtte[adtte$AVAL > 1.0, ] %>% mutate(AVAL = 1.0, CNSR = 1)

adtte2 <- adtte %>%
  mutate(CNSDTDSC = ifelse(CNSDTDSC == "", "__none__", CNSDTDSC))

The adtte dataset will be used in preparing the models while the adtte2 dataset handles missing values in the “Censor Date Description” column and will be used to produce the final table. We add censoring into the data for example purposes.

Next we create a basic analysis function, a_count_subjs which prints the overall unique subject counts and percentages within the data.

a_count_subjs <- function(x, .N_col) {
  in_rows(
    "Subjects with Adverse Events n (%)" = rcell(length(unique(x)) * c(1, 1 / .N_col), format = "xx (xx.xx%)")
  )
}

Then an analysis function is created to generate the counts of censored subjects for each level of a factor variable in the dataset. In this case the cnsr_counter function will be applied with the CNSDTDSC variable which contains a censor date description for each censored subject.

cnsr_counter <- function(df, .var, .N_col) {
  x <- df[!duplicated(df$USUBJID), .var]
  x <- x[x != "__none__"]
  lapply(table(x), function(xi) rcell(xi * c(1, 1 / .N_col), format = "xx (xx.xx%)"))
}

This function generates counts and fractions of unique subjects corresponding to each factor level, excluding missing values (uncensored patients).

A Cox proportional-hazards (Cox P-H) analysis is generated next with a third custom analysis function, a_cph. Prior to creating the analysis function, the Cox P-H model is fit to our data using the coxph() and Surv() functions from the survival package. Then this model is used as input to the a_cph analysis function which returns hazard ratios, 95% confidence intervals, and p-values comparing against the reference group - in this case the leftmost column.

cph <- coxph(Surv(AVAL, CNSR == 0) ~ ACTARM + STRATA1, ties = "exact", data = adtte)

a_cph <- function(df, .var, .in_ref_col, .ref_full, full_cox_fit) {
  if (.in_ref_col) {
    ret <- replicate(3, list(rcell(NULL)))
  } else {
    curtrt <- df[[.var]][1]
    coefs <- coef(full_cox_fit)
    sel_pos <- grep(curtrt, names(coefs), fixed = TRUE)
    hrval <- exp(coefs[sel_pos])
    sdf <- survdiff(Surv(AVAL, CNSR == 0) ~ ACTARM + STRATA1, data = rbind(df, .ref_full))
    pval <- (1 - pchisq(sdf$chisq, length(sdf$n) - 1)) / 2
    ci_val <- exp(unlist(confint(full_cox_fit)[sel_pos, ]))
    ret <- list(
      rcell(hrval, format = "xx.x"),
      rcell(ci_val, format = "(xx.x, xx.x)"),
      rcell(pval, format = "x.xxxx | (<0.0001)")
    )
  }
  in_rows(
    .list = ret,
    .names = c("Hazard ratio", "95% confidence interval", "p-value (one-sided stratified log rank)")
  )
}

The fourth and final analysis function, a_tte, generates a time to first adverse event table with three rows corresponding to Median, 95% Confidence Interval, and Min Max respectively. First a survival table is constructed from the summary table of a survival model using the survfit() and Surv() functions from the survival package. This table is then given as input to a_tte which produces the table of time to first adverse event consisting of the previously mentioned summary statistics.

surv_tbl <- as.data.frame(
  summary(survfit(Surv(AVAL, CNSR == 0) ~ ACTARM, data = adtte, conf.type = "log-log"))$table
) %>%
  dplyr::mutate(
    ACTARM = factor(gsub("ACTARM=", "", row.names(.)), levels = levels(adtte$ACTARM)),
    ind = FALSE
  )

a_tte <- function(df, .var, kp_table) {
  ind <- grep(df[[.var]][1], row.names(kp_table), fixed = TRUE)
  minmax <- range(df[["AVAL"]])
  mm_val_str <- format_value(minmax, format = "xx.x, xx.x")
  rowfn <- list()
  if (all(df$CNSR[df$AVAL == minmax[2]])) {
    mm_val_str <- paste0(mm_val_str, "*")
    rowfn <- "* indicates censoring"
  }
  in_rows(
    Median = kp_table[ind, "median", drop = TRUE],
    "95% confidence interval" = unlist(kp_table[ind, c("0.95LCL", "0.95UCL")]),
    "Min Max" = mm_val_str,
    .formats = c("xx.xx", "xx.xx - xx.xx", "xx"),
    .row_footnotes = list(NULL, NULL, rowfn)
  )
}

Additionally, the a_tte function creates a referential footnote within the table to indicate where censoring occurred in the data.

Now we are able to use these four analysis functions to build our time to event analysis table.

lyt <- basic_table(show_colcounts = TRUE) %>%
  ## Column faceting
  split_cols_by("ARM", ref_group = "A: Drug X") %>%
  ## Overall count
  analyze("USUBJID", a_count_subjs, show_labels = "hidden") %>%
  ## Censored subjects summary
  analyze("CNSDTDSC", cnsr_counter, var_labels = "Censored Subjects", show_labels = "visible") %>%
  ## Cox P-H analysis
  analyze("ARM", a_cph, extra_args = list(full_cox_fit = cph), show_labels = "hidden") %>%
  ## Time-to-event analysis
  analyze(
    "ARM", a_tte,
    var_labels = "Time to first adverse event", show_labels = "visible",
    extra_args = list(kp_table = surv_tbl),
    table_names = "kapmeier"
  )

tbl_tte <- build_table(lyt, adtte2)

We set the show_colcounts argument of basic_table() to TRUE to first print the total subject counts for each column. Next we use split_cols_by() to split the table into three columns corresponding to the three different levels of ARM, and specify that the first arm, "A: Drug X" should act as the reference group to be compared against - this reference group is used for the Cox P-H analysis. Then we call analyze() sequentially using each of the four custom analysis functions as argument afun and specifying additional arguments where necessary. Then we use build_table() to construct our rtable using the adtte2 dataset.

Finally, we annotate the table using the fnotes_at_path() function to specify that product-limit estimates are used to calculate the statistics listed under the “Time to first adverse event” heading within the table. The referential footnote created earlier in the time-to-event analysis function (a_tte) is also displayed.

fnotes_at_path(
  tbl_tte,
  c("ma_USUBJID_CNSDTDSC_ARM_kapmeier", "kapmeier")
) <- "Product-limit (Kaplan-Meier) estimates."

tbl_tte
#                                             A: Drug X      B: Placebo     C: Combination
#                                              (N=134)         (N=134)         (N=132)    
# ————————————————————————————————————————————————————————————————————————————————————————
# Subjects with Adverse Events n (%)        134 (100.00%)   134 (100.00%)   132 (100.00%) 
# Censored Subjects                                                                       
#   Clinical Cut Off                          6 (4.48%)       3 (2.24%)      14 (10.61%)  
#   Completion or Discontinuation             9 (6.72%)       5 (3.73%)       9 (6.82%)   
#   End of AE Reporting Period               14 (10.45%)      7 (5.22%)      14 (10.61%)  
#   Preferred Term                           11 (8.21%)       5 (3.73%)       13 (9.85%)  
# Hazard ratio                                                   0.7             1.0      
# 95% confidence interval                                    (0.5, 0.9)       (0.8, 1.4)  
# p-value (one-sided stratified log rank)                      0.1070           0.4880    
# Time to first adverse event {1}                                                         
#   Median                                      0.23            0.39             0.29     
#   95% confidence interval                  0.18 - 0.33     0.29 - 0.49     0.22 - 0.35  
#   Min Max {2}                               0.0, 1.0*       0.0, 1.0*       0.0, 1.0*   
# ————————————————————————————————————————————————————————————————————————————————————————
# 
# {1} - Product-limit (Kaplan-Meier) estimates.
# {2} - * indicates censoring
# ————————————————————————————————————————————————————————————————————————————————————————