This is a collection of useful default split functions that help you divide the data, and hence the
table rows or columns, into different parts or groups (splits). You can also create your own split function if you
need a more specific, custom division; in that case, please consider reading custom_split_funs.
Beyond this list of functions, you can also use add_overall_level() and add_combo_levels()
to add or modify levels, and trim_levels_to_map() to provide the possible level combinations to filter the split
with.
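As a brief illustration of the level-adding helpers mentioned above, the following sketch uses add_overall_level() to append an overall column to a column split. It assumes the DM example dataset shipped with rtables and its ARM variable; the level name "All Patients" is chosen here only for illustration.

```r
library(rtables)

# Column split on ARM, with an extra "All Patients" column placed last
lyt <- basic_table() %>%
  split_cols_by("ARM", split_fun = add_overall_level("All Patients", first = FALSE)) %>%
  analyze("AGE")

tbl <- build_table(lyt, DM)
tbl
```

The same pattern applies to row splits by passing the split function to split_rows_by() instead.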
Usage
keep_split_levels(only, reorder = TRUE)
remove_split_levels(excl)
drop_split_levels(df, spl, vals = NULL, labels = NULL, trim = FALSE)
drop_and_remove_levels(excl)
reorder_split_levels(neworder, newlabels = neworder, drlevels = TRUE)
trim_levels_in_group(innervar, drop_outlevs = TRUE)
Arguments
- only
  (character)
  levels to retain (all others will be dropped). If none of the levels is present, an empty table is returned.
- reorder
  (flag)
  whether the order of only should be used as the order of the children of the split. Defaults to TRUE.
- excl
  (character)
  levels to be excluded (they will not be reflected in the resulting table structure regardless of presence in the data).
- df
  (data.frame or tibble)
  dataset.
- spl
  (Split)
  a Split object defining a partitioning or analysis/tabulation of the data.
- vals
  (ANY)
  for internal use only.
- labels
  (character)
  labels to use for the remaining levels instead of the existing ones.
- trim
  (flag)
  whether splits corresponding to 0 observations should be kept when tabulating.
- neworder
  (character)
  new order of factor levels. All of them need to be present in the data. To add empty levels, rely on pre-processing or create your own split function (see custom_split_funs).
- newlabels
  (character)
  labels for (new order of) factor levels. If named, the levels are matched. Otherwise, the order of neworder is used.
- drlevels
  (flag)
  whether levels that are not in neworder should be dropped. Default is TRUE. Note: drlevels = TRUE does not drop levels that are not originally in the data. Rely on pre-processing or use a combination of split functions with make_split_fun() to also drop unused levels.
- innervar
  (string)
  variable whose factor levels should be trimmed (e.g. empty levels dropped) separately within each grouping defined at this point in the structure.
- drop_outlevs
  (flag)
  whether empty levels in the variable being split on (i.e. the "outer" variable, not innervar) should be dropped. Defaults to TRUE.
Value
A function that can be used to split the data accordingly. The actual function signature is similar to the one you can define when creating a fully custom one. For more details see custom_split_funs.
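A minimal sketch of what this means in practice: the constructor-style helpers (for example keep_split_levels()) are called with their own arguments and return a split function, whereas drop_split_levels is already a split function and is passed without being called, as in the Examples. The variable name split_fun is used here only for illustration.

```r
library(rtables)

# Calling the helper returns a function that rtables invokes during tabulation
split_fun <- keep_split_levels(c("USA", "CAN"))
is.function(split_fun)

# Inspect the argument names of the returned split function
names(formals(split_fun))

# drop_split_levels is itself a split function, so it is passed uncalled
is.function(drop_split_levels)
```

The argument names printed by formals() should correspond to the default split function signature described in the Note of this page.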
Functions
- keep_split_levels(): Keeps only the specified levels (only) in the split variable. If any of the specified levels is not present, an error is returned. reorder = TRUE (the default) orders the split levels according to the order of only.
- remove_split_levels(): Removes the specified levels (excl) from the split variable. Nothing is done for levels that are not in the data.
- drop_split_levels(): Drops levels that have no representation in the data.
- drop_and_remove_levels(): Removes the specified levels (excl) and drops all levels that are not in the data.
- reorder_split_levels(): Reorders split levels following neworder, which needs to be of the same size as the levels in the data.
- trim_levels_in_group(): Takes the split groups and removes levels of innervar if they are not present in those split groups. If you want to specify a filter of possible combinations, please consider using trim_levels_to_map().
Note
The following parameters are also documented here, but they only make up the default
signature of a split function: df (data to be split), spl (split object), and vals = NULL,
labels = NULL, trim = FALSE (the last three are for internal use only). See custom_split_funs for more details
and make_split_fun() for a more advanced API.
Examples
# keep_split_levels keeps specified levels (reorder = TRUE by default)
lyt <- basic_table() %>%
  split_rows_by("COUNTRY",
    split_fun = keep_split_levels(c("USA", "CAN", "BRA"))
  ) %>%
  analyze("AGE")
tbl <- build_table(lyt, DM)
tbl
#>          all obs
#> ————————————————
#> USA             
#>   Mean    35.30 
#> CAN             
#>   Mean    33.57 
#> BRA             
#>   Mean    32.31 
# remove_split_levels removes specified split levels
lyt <- basic_table() %>%
  split_rows_by("COUNTRY",
    split_fun = remove_split_levels(c(
      "USA", "CAN",
      "CHE", "BRA"
    ))
  ) %>%
  analyze("AGE")
tbl <- build_table(lyt, DM)
tbl
#>          all obs
#> ————————————————
#> CHN             
#>   Mean    34.64 
#> PAK             
#>   Mean    35.32 
#> NGA             
#>   Mean    32.96 
#> RUS             
#>   Mean    33.45 
#> JPN             
#>   Mean    33.17 
#> GBR             
#>   Mean    30.14 
# drop_split_levels drops levels that are not present in the data
lyt <- basic_table() %>%
  split_rows_by("SEX", split_fun = drop_split_levels) %>%
  analyze("AGE")
tbl <- build_table(lyt, DM)
tbl
#>          all obs
#> ————————————————
#> F               
#>   Mean    34.13 
#> M               
#>   Mean    34.32 
# Removing "M" and "U" directly, then "UNDIFFERENTIATED" because not in data
lyt <- basic_table() %>%
  split_rows_by("SEX", split_fun = drop_and_remove_levels(c("M", "U"))) %>%
  analyze("AGE")
tbl <- build_table(lyt, DM)
tbl
#>          all obs
#> ————————————————
#> F               
#>   Mean    34.13 
# Reordering levels in split variable
lyt <- basic_table() %>%
  split_rows_by(
    "SEX",
    split_fun = reorder_split_levels(
      neworder = c("U", "F"),
      newlabels = c(U = "Uu", `F` = "Female")
    )
  ) %>%
  analyze("AGE")
tbl <- build_table(lyt, DM)
tbl
#>          all obs
#> ————————————————
#> Uu              
#>   Mean     NA   
#> Female          
#>   Mean    34.13 
# Reordering levels in split variable but keeping all the levels
lyt <- basic_table() %>%
  split_rows_by(
    "SEX",
    split_fun = reorder_split_levels(
      neworder = c("U", "F"),
      newlabels = c("Uu", "Female"),
      drlevels = FALSE
    )
  ) %>%
  analyze("AGE")
tbl <- build_table(lyt, DM)
tbl
#>                    all obs
#> ——————————————————————————
#> Uu                        
#>   Mean               NA   
#> Female                    
#>   Mean              34.13 
#> M                         
#>   Mean              34.32 
#> UNDIFFERENTIATED          
#>   Mean               NA   
# trim_levels_in_group() trims levels within each group defined by the split variable
dat <- data.frame(
  col1 = factor(c("A", "B", "C"), levels = c("A", "B", "C", "N")),
  col2 = factor(c("a", "b", "c"), levels = c("a", "b", "c", "x"))
) # N is removed if drop_outlevs = TRUE, x is removed always
tbl <- basic_table() %>%
  split_rows_by("col1", split_fun = trim_levels_in_group("col2")) %>%
  analyze("col2") %>%
  build_table(dat)
tbl
#>       all obs
#> —————————————
#> A            
#>   a      1   
#> B            
#>   b      1   
#> C            
#>   c      1