Split on static or dynamic cuts of the data — split_cols_by

Create column or row splits by cutting the values of var into intervals, either at fixed cut points or at cut points derived from the data (such as quartiles).

Usage

split_cols_by_cuts(
  lyt,
  var,
  cuts,
  cutlabels = NULL,
  split_label = var,
  nested = TRUE,
  cumulative = FALSE,
  show_colcounts = FALSE,
  colcount_format = NULL
)

split_rows_by_cuts(
  lyt,
  var,
  cuts,
  cutlabels = NULL,
  split_label = var,
  parent_name = var,
  format = NULL,
  na_str = NA_character_,
  nested = TRUE,
  cumulative = FALSE,
  label_pos = "hidden",
  section_div = NA_character_
)

split_cols_by_cutfun(
  lyt,
  var,
  cutfun = qtile_cuts,
  cutlabelfun = function(x) NULL,
  split_label = var,
  nested = TRUE,
  extra_args = list(),
  cumulative = FALSE,
  show_colcounts = FALSE,
  colcount_format = NULL
)

split_cols_by_quartiles(
  lyt,
  var,
  split_label = var,
  nested = TRUE,
  extra_args = list(),
  cumulative = FALSE,
  show_colcounts = FALSE,
  colcount_format = NULL
)

split_rows_by_quartiles(
  lyt,
  var,
  split_label = var,
  parent_name = var,
  format = NULL,
  na_str = NA_character_,
  nested = TRUE,
  child_labels = c("default", "visible", "hidden"),
  extra_args = list(),
  cumulative = FALSE,
  indent_mod = 0L,
  label_pos = "hidden",
  section_div = NA_character_
)

split_rows_by_cutfun(
  lyt,
  var,
  cutfun = qtile_cuts,
  cutlabelfun = function(x) NULL,
  split_label = var,
  parent_name = var,
  format = NULL,
  na_str = NA_character_,
  nested = TRUE,
  child_labels = c("default", "visible", "hidden"),
  extra_args = list(),
  cumulative = FALSE,
  indent_mod = 0L,
  label_pos = "hidden",
  section_div = NA_character_
)

Arguments

lyt: (PreDataTableLayouts)
layout object pre-data used for tabulation.
var: (string)
variable name.
cuts: (numeric)
cuts to use.
cutlabels: (character or NULL)
labels for the cuts.
split_label: (string)
label to be associated with the table generated by the split. Not to be confused with labels assigned to each child (which are based on the data and type of split during tabulation).
nested: (logical)
whether this layout instruction should be applied within the existing layout structure if possible (TRUE, the default) or as a new top-level element (FALSE). Ignored if it would nest a split underneath analyses, which is not allowed.
cumulative: (flag)
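whether the cuts should be treated as cumulative, so that each facet also includes the observations from all lower intervals rather than only those in its own interval. Defaults to FALSE.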
show_colcounts: (logical(1))
whether column counts should be displayed at the level of the facets created by this split. Defaults to FALSE.
colcount_format: (character(1))
if show_colcounts is TRUE, the format which should be used to display column counts for facets generated by this split. Defaults to "(N=xx)".
parent_name: (character(1))
Name to assign to the table corresponding to the split or group of sibling analyses, for split_rows_by* and analyze* when analyzing more than one variable, respectively. Ignored when analyzing a single variable.
format: (string, function, or list)
format associated with this split. Formats can be declared via strings ("xx.x") or functions. In cases such as analyze calls, they can be character vectors or lists of functions. See formatters::list_valid_format_labels() for a list of all available format strings.
na_str: (string)
string that should be displayed when the value of x is missing. Defaults to "NA".
label_pos: (string)
location where the variable label should be displayed. Accepts "hidden" (default for non-analyze row splits), "visible", "topleft", and "default" (for analyze splits only). For analyze calls, "default" indicates that the variable should be visible if and only if multiple variables are analyzed at the same level of nesting.
section_div: (string)
string which should be repeated as a section divider after each group defined by this split instruction, or NA_character_ (the default) for no section divider.
cutfun: (function)
function which accepts the full vector of var values and returns cut points to be used (via cut) when splitting data during tabulation.
cutlabelfun: (function)
function which returns either labels for the cuts or NULL when passed the return value of cutfun.
extra_args: (list)
extra arguments to be passed to the tabulation function. Element position in the list corresponds to the children of this split. Named elements in the child-specific lists are ignored if they do not match a formal argument of the tabulation function.
child_labels: (string)
the display behavior for the labels (i.e. label rows) of the children of this split. Accepts "default", "visible", and "hidden". Defaults to "default" which flags the label row as visible only if the child has 0 content rows.
indent_mod: (numeric)
modifier for the default indent position for the structure created by this function (subtable, content table, or row) and all of that structure's children. Defaults to 0, which corresponds to the unmodified default behavior.
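
How cuts, cutlabels, and cumulative relate can be sketched in base R. This is only an illustration with made-up ages: it assumes the binning behaves like cut() with right-closed intervals and the lowest cut point included, and it is not the package's internal code. The names age, bins, facets, and cum_facets are hypothetical.

```r
# Hypothetical ages, not taken from ex_adsl
age <- c(22, 25, 31, 35, 48, 60)
cuts <- c(0, 25, 35, 1000)
cutlabels <- c("young", "medium", "old")

# Static cuts: n cut points define n - 1 intervals, so cutlabels is one shorter
# than cuts (assumption: right-closed intervals, lowest cut point included)
bins <- cut(age, breaks = cuts, labels = cutlabels, include.lowest = TRUE)

# Non-cumulative facets: every observation lands in exactly one interval
facets <- split(age, bins)

# Cumulative facets (assumed meaning of cumulative = TRUE):
# facet i holds every observation from intervals 1 through i
cum_facets <- lapply(
  seq_along(cutlabels),
  function(i) age[as.integer(bins) <= i]
)
names(cum_facets) <- cutlabels

lengths(facets)
lengths(cum_facets)
```

In this sketch the non-cumulative facets partition the data, while the last cumulative facet contains all observations.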

Value

A PreDataTableLayouts object suitable for passing to further layouting functions, and to build_table().

Details

For dynamic cuts, build_table() first converts the cut into a static one by computing the cut points on the full dataset, and only then proceeds with tabulation. As a result, even when the split is nested within another split in column or row space, its facets reflect the overall values (e.g., quartiles) in the dataset, NOT the values for the subset it is nested under.
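
The consequence of computing cut points on the full dataset can be sketched in base R. The data frame below is invented for illustration, and quantile() with cut() stands in for the dynamic cut as an assumption; the names df, overall_cuts, overall_bins, and tab are hypothetical.

```r
# Hypothetical data: two arms with clearly different age ranges
df <- data.frame(
  ARM = rep(c("A", "B"), each = 4),
  AGE = c(20, 22, 24, 26, 50, 52, 54, 56)
)

# Cut points are computed once, from the full AGE vector
overall_cuts <- quantile(df$AGE, probs = c(0, 0.25, 0.5, 0.75, 1))

# The same cut points are then applied to every row, whatever its arm
overall_bins <- cut(
  df$AGE,
  breaks = overall_cuts,
  include.lowest = TRUE,
  labels = c("[min, Q1]", "(Q1, Q2]", "(Q2, Q3]", "(Q3, max]")
)

# Cross-tabulate arm against overall quartile
tab <- table(df$ARM, overall_bins)
tab
```

Because the quartiles come from all eight rows, arm A only populates the two lower intervals and arm B only the two upper ones. Quartiles computed within each arm would instead spread each arm evenly across all four intervals.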

Author

Gabriel Becker

Examples

library(rtables) # layout functions and the ex_adsl example data
library(dplyr) # provides the %>% pipe

# split_cols_by_cuts
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_cols_by_cuts("AGE",
    split_label = "Age",
    cuts = c(0, 25, 35, 1000),
    cutlabels = c("young", "medium", "old")
  ) %>%
  analyze(c("BMRKR2", "STRATA2")) %>%
  append_topleft("counts")

tbl <- build_table(lyt, ex_adsl)
tbl
#>                 A: Drug X              B: Placebo           C: Combination   
#> counts     young   medium   old   young   medium   old   young   medium   old
#> —————————————————————————————————————————————————————————————————————————————
#> BMRKR2                                                                       
#>   LOW        4       30     16      4       17     24      5       19     16 
#>   MEDIUM     6       12     19      2       28     26      4       25     13 
#>   HIGH       4       24     19      2       17     14      1       21     28 
#> STRATA2                                                                      
#>   S1         8       33     32      2       27     38      5       25     26 
#>   S2         6       33     22      6       35     26      5       40     31 

# split_rows_by_cuts
lyt2 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by_cuts("AGE",
    split_label = "Age",
    cuts = c(0, 25, 35, 1000),
    cutlabels = c("young", "medium", "old")
  ) %>%
  analyze(c("BMRKR2", "STRATA2")) %>%
  append_topleft("counts")


tbl2 <- build_table(lyt2, ex_adsl)
tbl2
#> counts       A: Drug X   B: Placebo   C: Combination
#> ————————————————————————————————————————————————————
#> young                                               
#>   BMRKR2                                            
#>     LOW          4           4              5       
#>     MEDIUM       6           2              4       
#>     HIGH         4           2              1       
#>   STRATA2                                           
#>     S1           8           2              5       
#>     S2           6           6              5       
#> medium                                              
#>   BMRKR2                                            
#>     LOW         30           17             19      
#>     MEDIUM      12           28             25      
#>     HIGH        24           17             21      
#>   STRATA2                                           
#>     S1          33           27             25      
#>     S2          33           35             40      
#> old                                                 
#>   BMRKR2                                            
#>     LOW         16           24             16      
#>     MEDIUM      19           26             13      
#>     HIGH        19           14             28      
#>   STRATA2                                           
#>     S1          32           38             26      
#>     S2          22           26             31      

# split_cols_by_quartiles

lyt3 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_cols_by_quartiles("AGE", split_label = "Age") %>%
  analyze(c("BMRKR2", "STRATA2")) %>%
  append_topleft("counts")

tbl3 <- build_table(lyt3, ex_adsl)
tbl3
#>                             A: Drug X                                    B: Placebo                                  C: Combination               
#> counts     [min, Q1]   (Q1, Q2]   (Q2, Q3]   (Q3, max]   [min, Q1]   (Q1, Q2]   (Q2, Q3]   (Q3, max]   [min, Q1]   (Q1, Q2]   (Q2, Q3]   (Q3, max]
#> ——————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> BMRKR2                                                                                                                                            
#>   LOW         18          16         7           9          12          8          10         15           8          11         13          8    
#>   MEDIUM      11          7          9          10          14          15         14         13          12          13         7          10    
#>   HIGH        14          11         14          8           6          10         9           8           7          12         13         18    
#> STRATA2                                                                                                                                           
#>   S1          22          18         18         15          15          11         22         19          11          14         12         19    
#>   S2          21          16         12         12          17          22         11         17          16          22         21         17    

# split_rows_by_quartiles
lyt4 <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARM") %>%
  split_rows_by_quartiles("AGE", split_label = "Age") %>%
  analyze("BMRKR2") %>%
  append_topleft(c("Age Quartiles", " Counts BMRKR2"))

tbl4 <- build_table(lyt4, ex_adsl)
tbl4
#> Age Quartiles    A: Drug X   B: Placebo   C: Combination
#>  Counts BMRKR2    (N=134)     (N=134)        (N=132)    
#> ————————————————————————————————————————————————————————
#> Age                                                     
#>   [min, Q1]                                             
#>     LOW             18           12             8       
#>     MEDIUM          11           14             12      
#>     HIGH            14           6              7       
#>   (Q1, Q2]                                              
#>     LOW             16           8              11      
#>     MEDIUM           7           15             13      
#>     HIGH            11           10             12      
#>   (Q2, Q3]                                              
#>     LOW              7           10             13      
#>     MEDIUM           9           14             7       
#>     HIGH            14           9              13      
#>   (Q3, max]                                             
#>     LOW              9           15             8       
#>     MEDIUM          10           13             10      
#>     HIGH             8           8              18      

# split_cols_by_cutfun
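# The cut function receives the full vector of var values and returns the cut points
cutfun <- function(x) {
  cutpoints <- c(
    min(x),
    mean(x),
    max(x)
  )

  # Names label the resulting facets; the first cut point only opens the
  # first interval, so its name is left empty
  names(cutpoints) <- c("", "Younger", "Older")
  cutpoints
}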

lyt5 <- basic_table() %>%
  split_cols_by_cutfun("AGE", cutfun = cutfun) %>%
  analyze("SEX")

tbl5 <- build_table(lyt5, ex_adsl)
tbl5
#>                    Younger   Older
#> ——————————————————————————————————
#> F                    124      98  
#> M                    75       91  
#> U                     5        4  
#> UNDIFFERENTIATED      1        2  

# split_rows_by_cutfun
lyt6 <- basic_table() %>%
  split_cols_by("SEX") %>%
  split_rows_by_cutfun("AGE", cutfun = cutfun) %>%
  analyze("BMRKR2")

tbl6 <- build_table(lyt6, ex_adsl)
tbl6
#>              F    M    U   UNDIFFERENTIATED
#> ———————————————————————————————————————————
#> AGE                                        
#>   Younger                                  
#>     LOW      43   26   3          1        
#>     MEDIUM   47   23   2          0        
#>     HIGH     34   26   0          0        
#>   Older                                    
#>     LOW      30   29   1          2        
#>     MEDIUM   29   33   1          0        
#>     HIGH     39   29   2          0