Split functions — split_funcs • rtables

Split functions

Usage

remove_split_levels(excl)

keep_split_levels(only, reorder = TRUE)

drop_split_levels(df, spl, vals = NULL, labels = NULL, trim = FALSE)

drop_and_remove_levels(excl)

reorder_split_levels(neworder, newlabels = neworder, drlevels = TRUE)

trim_levels_in_group(innervar, drop_outlevs = TRUE)

Arguments

excl: character. Levels to be excluded (they will not be reflected in the resulting table structure regardless of presence in the data).
only: character. Levels to retain (all others will be dropped).
reorder: logical(1). Should the order of only be used as the order of the children of the split? Defaults to TRUE.
df: dataset (data.frame or tibble)
spl: A Split object defining a partitioning or analysis/tabulation of the data.
vals: ANY. For internal use only.
labels: character. Labels to use for the remaining levels instead of the existing ones.
trim: logical(1). Should splits corresponding with 0 observations be removed when tabulating? Defaults to FALSE.
neworder: character. New order of factor levels.
newlabels: character. Labels for (new order of) factor levels. Defaults to neworder.
drlevels: logical(1). Should levels in the data which do not appear in neworder be dropped? Defaults to TRUE.
innervar: character(1). Variable whose factor levels should be trimmed (e.g., empty levels dropped) separately within each grouping defined at this point in the structure.
drop_outlevs: logical(1). Should empty levels in the variable being split on (i.e. the 'outer' variable, not innervar) be dropped? Defaults to TRUE.

Value

A closure suitable for use as a splitting function (splfun) when creating a table layout. drop_split_levels is itself a splitting function and is passed to split_fun without being called.

Custom Splitting Function Details

User-defined custom split functions can perform any type of computation on the incoming data provided that they meet the contract for generating 'splits' of the incoming data 'based on' the split object.

Split functions are functions that accept:

df: data.frame of incoming data to be split
spl: a Split object. This is largely an internal detail that custom functions will not need to worry about, but obj_name(spl), for example, will give the name of the split as it will appear in paths in the resulting table
vals: Any pre-calculated values. If given non-null values, the values returned should match these. Should be NULL in most cases and can likely be ignored
labels: Any pre-calculated value labels. Same as above for values
trim: If TRUE, resulting splits that are empty should be removed
(Optional) .spl_context: a data.frame describing previously performed splits which collectively arrived at df

The function must then output a named list with the following elements:

values: The vector of all values corresponding to the splits of df
datasplit: a list of data.frames representing the groupings of the actual observations from df.
labels: a character vector giving a string label for each value listed in the values element above
(Optional) extras: If present, extra arguments are to be passed to summary and analysis functions whenever they are executed on the corresponding element of datasplit or a subset thereof
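
As a minimal sketch of this contract, the factory below builds a split function that partitions the rows by whether a numeric column falls below a cutoff. The names cutoff_split, var and cutoff and the value names are hypothetical and not part of rtables. The sketch ignores vals and labels, leaves rows with a missing value out of both groups, and assumes the DM example data used elsewhere on this page.

```r
library(rtables)

# Hypothetical factory, not part of rtables: returns a split function that
# partitions df by whether the numeric column `var` is below `cutoff`.
cutoff_split <- function(var, cutoff) {
  function(df, spl, vals = NULL, labels = NULL, trim = FALSE) {
    x <- df[[var]]
    # Rows where `var` is NA fall into neither group
    below <- !is.na(x) & x < cutoff
    above <- !is.na(x) & x >= cutoff
    ret <- list(
      values = c("below", "at_or_above"),
      datasplit = list(df[below, , drop = FALSE], df[above, , drop = FALSE]),
      labels = c(paste("<", cutoff), paste(">=", cutoff))
    )
    if (trim) {
      # Honor trim = TRUE by removing groups with no observations
      keep <- vapply(ret$datasplit, nrow, integer(1)) > 0
      ret <- lapply(ret, function(el) el[keep])
    }
    ret
  }
}

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("AGE", split_fun = cutoff_split("AGE", 35)) %>%
  analyze("AGE")

tbl <- build_table(lyt, DM)
tbl
```

Because the returned function never inspects spl, it can also be called directly on a data.frame to check the three required elements before using it in a layout.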

One way to generate custom splitting functions is to wrap existing split functions and modify either the incoming data before they are called or their outputs.
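
The wrapping approach might look like the following sketch, which modifies the output of an existing split function by appending the number of observations in each group to its label. The name label_with_n is hypothetical and not part of rtables, and the sketch assumes the inner function returns the values, datasplit and labels elements described above.

```r
library(rtables)

# Hypothetical wrapper, not part of rtables: calls an existing split function
# and appends the number of observations in each group to its label.
label_with_n <- function(inner_fun) {
  function(df, spl, vals = NULL, labels = NULL, trim = FALSE) {
    ret <- inner_fun(df, spl, vals = vals, labels = labels, trim = trim)
    if (length(ret$datasplit) > 0) {
      n <- vapply(ret$datasplit, nrow, integer(1))
      # Keep any names the inner function attached to its labels
      ret$labels <- setNames(
        paste0(ret$labels, " (n = ", n, ")"),
        names(ret$labels)
      )
    }
    ret
  }
}

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX", split_fun = label_with_n(drop_split_levels)) %>%
  analyze("AGE")

tbl <- build_table(lyt, DM)
tbl
```

The same pattern works in the other direction: filter or recode df first, then pass the modified data to the inner function.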

Examples

library(rtables)

# Exclude four countries from the row split
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("COUNTRY",
                split_fun = remove_split_levels(c("USA", "CAN",
                                                  "CHE", "BRA"))) %>%
  analyze("AGE")

tbl <- build_table(lyt, DM)
tbl
#>          A: Drug X   B: Placebo   C: Combination
#> ————————————————————————————————————————————————
#> CHN                                             
#>   Mean     36.08       34.12          33.71     
#> PAK                                             
#>   Mean     35.38       33.12          36.75     
#> NGA                                             
#>   Mean     31.20       31.40          35.78     
#> RUS                                             
#>   Mean     33.33       34.20          33.00     
#> JPN                                             
#>   Mean     31.20       32.50          36.20     
#> GBR                                             
#>   Mean     32.00       29.00          30.00     

# Keep only the listed countries, in the order given (reorder = TRUE)
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("COUNTRY",
                split_fun = keep_split_levels(c("USA", "CAN", "BRA"))) %>%
  analyze("AGE")

tbl <- build_table(lyt, DM)
tbl
#>          A: Drug X   B: Placebo   C: Combination
#> ————————————————————————————————————————————————
#> USA                                             
#>   Mean     36.77       32.57          36.41     
#> CAN                                             
#>   Mean     36.00       34.00          29.50     
#> BRA                                             
#>   Mean     31.78       30.62          36.14     

# Drop the levels of SEX that have no observations in the data
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX", split_fun = drop_split_levels) %>%
  analyze("AGE")

tbl <- build_table(lyt, DM)
tbl
#>          A: Drug X   B: Placebo   C: Combination
#> ————————————————————————————————————————————————
#> F                                               
#>   Mean     33.71       33.84          34.89     
#> M                                               
#>   Mean     36.55       32.10          34.28     

# Exclude the listed levels and drop any remaining unobserved levels
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX", split_fun = drop_and_remove_levels(c("M", "U"))) %>%
  analyze("AGE")

tbl <- build_table(lyt, DM)
tbl
#>          A: Drug X   B: Placebo   C: Combination
#> ————————————————————————————————————————————————
#> F                                               
#>   Mean     33.71       33.84          34.89
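
The examples above do not cover reorder_split_levels or trim_levels_in_group. The sketch below shows one plausible use of each. The small data frame dat is made up for illustration and is not a shipped dataset. Output is omitted.

```r
library(rtables)

# reorder_split_levels: show M before F and relabel both levels
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX",
                split_fun = reorder_split_levels(neworder = c("M", "F"),
                                                 newlabels = c("Male", "Female"))) %>%
  analyze("AGE")

tbl <- build_table(lyt, DM)
tbl

# trim_levels_in_group: illustrative data in which level "N" of col1 and
# level "x" of col2 are never observed
dat <- data.frame(
  col1 = factor(c("A", "B", "C"), levels = c("A", "B", "C", "N")),
  col2 = factor(c("a", "b", "c"), levels = c("a", "b", "c", "x"))
)

# Within each level of col1, only the observed levels of col2 are kept;
# the empty outer level "N" is dropped because drop_outlevs defaults to TRUE
lyt2 <- basic_table() %>%
  split_rows_by("col1", split_fun = trim_levels_in_group("col2")) %>%
  analyze("col2")

tbl2 <- build_table(lyt2, dat)
tbl2
```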