
Generates children for each subset of a categorical variable.

Usage

split_cols_by(
  lyt,
  var,
  labels_var = var,
  split_label = var,
  split_fun = NULL,
  format = NULL,
  nested = TRUE,
  child_labels = c("default", "visible", "hidden"),
  extra_args = list(),
  ref_group = NULL
)

Arguments

lyt

layout object. Pre-data layout used for tabulation.

var

string, variable name

labels_var

string, name of variable containing labels to be displayed for the values of var

split_label

string. Label string to be associated with the table generated by the split. Not to be confused with labels assigned to each child (which are based on the data and type of split during tabulation).

split_fun

function/NULL. Custom splitting function. See custom_split_funs.

format

FormatSpec. Format associated with this split. Formats can be declared via strings ("xx.x") or functions. In cases such as analyze calls, they can be character vectors or lists of functions.

nested

boolean. Should this layout instruction be applied within the existing layout structure if possible (TRUE, the default) or as a new top-level element (FALSE). Ignored if it would nest a split underneath analyses, which is not allowed.

child_labels

string. One of "default", "visible", "hidden". What should the display behavior be for the labels (i.e. label rows) of the children of this split. Defaults to "default" which flags the label row as visible only if the child has 0 content rows.

extra_args

list. Extra arguments to be passed to the tabulation function. Element position in the list corresponds to the children of this split. Named elements in the child-specific lists are ignored if they do not match a formal argument of the tabulation function.

ref_group

character(1) or NULL. Level of var that should be treated as the reference group.
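
To make labels_var and ref_group more concrete, here is a minimal sketch. The columns ARM_CODE and ARM_LABEL are hypothetical and are derived here only for illustration; the analysis function assumes that rtables supplies the reference column's data through an argument named .ref_group when a reference group has been declared.

```r
library(rtables)

# Hypothetical derived columns (not in ex_adsl): a short arm code to split on
# and a character column holding the labels to display for each code.
adsl <- ex_adsl
adsl$ARM_CODE <- factor(sub(":.*$", "", as.character(adsl$ARM))) # "A", "B", "C"
adsl$ARM_LABEL <- as.character(adsl$ARM)

lyt <- basic_table() %>%
  # Split columns on the code, display the long label, use "B" as reference
  split_cols_by("ARM_CODE", labels_var = "ARM_LABEL", ref_group = "B") %>%
  analyze("AGE", afun = function(x, .ref_group) {
    # Difference between this column's mean and the reference column's mean
    in_rows(
      "Mean difference vs. reference" =
        rcell(mean(x) - mean(.ref_group), format = "xx.xx")
    )
  })

tbl <- build_table(lyt, adsl)
tbl
```

Note that ref_group is given as a level of var (here "B"), not as one of the display labels taken from labels_var.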

Value

A PreDataTableLayouts object suitable for passing to further layouting functions, and to build_table.

Custom Splitting Function Details

User-defined custom split functions can perform any type of computation on the incoming data provided that they meet the contract for generating 'splits' of the incoming data 'based on' the split object.

Split functions are functions that accept:

df

data.frame of incoming data to be split

spl

A Split object. This is largely an internal detail that custom functions will not need to worry about, but obj_name(spl), for example, will give the name of the split as it will appear in paths in the resulting table.

vals

Any pre-calculated values. If given non-null values, the values returned should match these. Should be NULL in most cases and can usually be ignored.

labels

Any pre-calculated value labels. As with vals, if these are non-null, the labels returned should match them.

trim

If TRUE, resulting splits that are empty should be removed.

(Optional) .spl_context

A data.frame describing previously performed splits which collectively arrived at df.

The function must then output a named list with the following elements:

values

The vector of all values corresponding to the splits of df.

datasplit

A list of data.frames representing the groupings of the actual observations from df.

labels

A character vector giving a string label for each value listed in the values element above.

(Optional) extras

If present, extra arguments to be passed to summary and analysis functions whenever they are executed on the corresponding element of datasplit or a subset thereof.

One way to generate custom splitting functions is to wrap existing split functions and modify either the incoming data before they are called or their outputs.
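
The contract above can also be met by a hand-written function. The following is a minimal sketch in base R only; keep_levels_split is a hypothetical helper, not part of rtables, and it takes the variable name as an argument rather than reading it from spl. It is called directly on a toy data.frame to show the shape of the returned list.

```r
# Hypothetical factory (not part of rtables): builds a split function that
# keeps only the levels listed in `keep` for the variable `var`.
keep_levels_split <- function(var, keep) {
  function(df, spl, vals = NULL, labels = NULL, trim = FALSE) {
    # Use pre-calculated values and labels when they are supplied
    if (is.null(vals)) vals <- keep
    if (is.null(labels)) labels <- as.character(vals)

    # One data.frame of observations per value
    datasplit <- lapply(vals, function(v) {
      df[which(as.character(df[[var]]) == v), , drop = FALSE]
    })

    # Remove empty splits when trimming is requested
    if (trim) {
      nonempty <- vapply(datasplit, nrow, integer(1)) > 0
      vals <- vals[nonempty]
      labels <- labels[nonempty]
      datasplit <- datasplit[nonempty]
    }

    names(datasplit) <- as.character(vals)
    list(values = vals, datasplit = datasplit, labels = labels)
  }
}

# Call the function directly on a toy data.frame; `spl` is unused in this sketch
toy <- data.frame(ARM = factor(c("A", "B", "C", "A")), AGE = c(30, 40, 50, 35))
res <- keep_levels_split("ARM", keep = c("A", "C"))(toy, spl = NULL)
res$values
sapply(res$datasplit, nrow)
```

A function built this way could then be supplied as split_fun, for example split_fun = keep_levels_split("ARM", c("A", "C")) on a split whose variable is "ARM". For common cases, wrapping an existing split function such as drop_split_levels is likely to be simpler than writing the whole split by hand.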

Author

Gabriel Becker

Examples


lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  analyze(c("AGE", "BMRKR2"))

tbl <- build_table(lyt, ex_adsl)
tbl
#>            A: Drug X   B: Placebo   C: Combination
#> ——————————————————————————————————————————————————
#> AGE                                               
#>   Mean       33.77       35.43          35.43     
#> BMRKR2                                            
#>   LOW         50           45             40      
#>   MEDIUM      37           56             42      
#>   HIGH        47           33             50      

# Let's look at the splits in more detail

lyt1 <- basic_table() %>% split_cols_by("ARM")
lyt1
#> A Pre-data Table Layout
#> 
#> Column-Split Structure:
#> ARM (lvls) 
#> 
#> Row-Split Structure:
#>  () 
#> 

# add an analysis (summary)
lyt2 <- lyt1 %>%
  analyze(c("AGE", "COUNTRY"),
    afun = list_wrap_x(summary),
    format = "xx.xx"
  )
lyt2
#> A Pre-data Table Layout
#> 
#> Column-Split Structure:
#> ARM (lvls) 
#> 
#> Row-Split Structure:
#> AGE:COUNTRY (** multivar analysis **) 
#> 

tbl2 <- build_table(lyt2, DM)
tbl2
#>             A: Drug X   B: Placebo   C: Combination
#> ———————————————————————————————————————————————————
#> AGE                                                
#>   Min.        20.00       21.00          22.00     
#>   1st Qu.     29.00       29.00          30.00     
#>   Median      33.00       32.00          33.00     
#>   Mean        34.91       33.02          34.57     
#>   3rd Qu.     39.00       37.00          38.00     
#>   Max.        60.00       55.00          53.00     
#> COUNTRY                                            
#>   CHN         62.00       48.00          69.00     
#>   USA         13.00       14.00          17.00     
#>   BRA         9.00        13.00           7.00     
#>   PAK         8.00         8.00          12.00     
#>   NGA         10.00        5.00           9.00     
#>   RUS         9.00         5.00           6.00     
#>   JPN         5.00         8.00           5.00     
#>   GBR         2.00         3.00           2.00     
#>   CAN         3.00         2.00           2.00     
#>   CHE         0.00         0.00           0.00     

# By default sequentially adding layouts results in nesting
library(dplyr)
DM_MF <- DM %>%
  filter(SEX %in% c("M", "F")) %>%
  mutate(SEX = droplevels(SEX))

lyt3 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_cols_by("SEX") %>%
  analyze(c("AGE", "COUNTRY"),
    afun = list_wrap_x(summary),
    format = "xx.xx"
  )
lyt3
#> A Pre-data Table Layout
#> 
#> Column-Split Structure:
#> ARM (lvls) -> SEX (lvls) 
#> 
#> Row-Split Structure:
#> AGE:COUNTRY (** multivar analysis **) 
#> 

tbl3 <- build_table(lyt3, DM_MF)
tbl3
#>               A: Drug X      B: Placebo      C: Combination  
#>               F       M       F       M        F         M   
#> —————————————————————————————————————————————————————————————
#> AGE                                                          
#>   Min.      20.00   24.00   21.00   21.00    22.00     25.00 
#>   1st Qu.   29.00   31.00   29.00   28.00    30.00     29.00 
#>   Median    32.00   35.00   33.00   31.00    35.00     32.00 
#>   Mean      33.71   36.55   33.84   32.10    34.89     34.28 
#>   3rd Qu.   38.00   41.50   38.00   35.75    39.00     38.00 
#>   Max.      58.00   60.00   55.00   47.00    53.00     53.00 
#> COUNTRY                                                      
#>   CHN       34.00   28.00   29.00   19.00    31.00     38.00 
#>   USA       8.00    5.00    6.00    8.00     10.00     7.00  
#>   BRA       6.00    3.00    6.00    7.00     3.00      4.00  
#>   PAK       2.00    6.00    5.00    3.00     5.00      7.00  
#>   NGA       6.00    4.00    2.00    3.00     5.00      4.00  
#>   RUS       7.00    2.00    1.00    4.00     2.00      4.00  
#>   JPN       2.00    3.00    3.00    5.00     4.00      1.00  
#>   GBR       2.00    0.00    3.00    0.00     1.00      1.00  
#>   CAN       3.00    0.00    1.00    1.00     0.00      2.00  
#>   CHE       0.00    0.00    0.00    0.00     0.00      0.00  

# nested = TRUE (the default) vs nested = FALSE
lyt4 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX", split_fun = drop_split_levels) %>%
  split_rows_by("RACE", split_fun = drop_split_levels) %>%
  analyze("AGE")
lyt4
#> A Pre-data Table Layout
#> 
#> Column-Split Structure:
#> ARM (lvls) 
#> 
#> Row-Split Structure:
#> SEX (lvls) -> RACE (lvls) -> AGE (** analysis **) 
#> 

tbl4 <- build_table(lyt4, DM)
tbl4
#>                               A: Drug X   B: Placebo   C: Combination
#> —————————————————————————————————————————————————————————————————————
#> F                                                                    
#>   ASIAN                                                              
#>     Mean                        33.55       34.00          34.90     
#>   BLACK OR AFRICAN AMERICAN                                          
#>     Mean                        33.17       30.58          33.85     
#>   WHITE                                                              
#>     Mean                        35.88       38.57          36.50     
#> M                                                                    
#>   ASIAN                                                              
#>     Mean                        35.03       31.10          34.39     
#>   BLACK OR AFRICAN AMERICAN                                          
#>     Mean                        37.40       32.83          34.14     
#>   WHITE                                                              
#>     Mean                        44.00       35.29          34.00     

lyt5 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX", split_fun = drop_split_levels) %>%
  analyze("AGE") %>%
  split_rows_by("RACE", nested = FALSE, split_fun = drop_split_levels) %>%
  analyze("AGE")
lyt5
#> A Pre-data Table Layout
#> 
#> Column-Split Structure:
#> ARM (lvls) 
#> 
#> Row-Split Structure:
#> SEX (lvls) -> AGE (** analysis **) 
#> RACE (lvls) -> AGE (** analysis **) 
#> 

tbl5 <- build_table(lyt5, DM)
tbl5
#>                             A: Drug X   B: Placebo   C: Combination
#> ———————————————————————————————————————————————————————————————————
#> F                                                                  
#>   Mean                        33.71       33.84          34.89     
#> M                                                                  
#>   Mean                        36.55       32.10          34.28     
#> ASIAN                                                              
#>   Mean                        34.20       32.68          34.63     
#> BLACK OR AFRICAN AMERICAN                                          
#>   Mean                        34.68       31.71          34.00     
#> WHITE                                                              
#>   Mean                        39.36       36.93          35.11