Add Rows according to levels of a variable
Usage
split_rows_by(
  lyt,
  var,
  labels_var = var,
  split_label = var,
  split_fun = NULL,
  format = NULL,
  na_str = NA_character_,
  nested = TRUE,
  child_labels = c("default", "visible", "hidden"),
  label_pos = "hidden",
  indent_mod = 0L,
  page_by = FALSE,
  page_prefix = split_label,
  section_div = NA_character_
)
Arguments
- lyt
layout object pre-data used for tabulation
- var
string, variable name
- labels_var
string, name of variable containing labels to be displayed for the values of var
- split_label
string. Label string to be associated with the table generated by the split. Not to be confused with labels assigned to each child (which are based on the data and type of split during tabulation).
- split_fun
function/NULL. Custom splitting function. See custom_split_funs.
- format
FormatSpec. Format associated with this split. Formats can be declared via strings ("xx.x") or functions. In cases such as analyze calls, they can be character vectors or lists of functions.
- na_str
character(1). String that should be displayed when the value of x is missing. Defaults to "NA".
- nested
boolean. Should this layout instruction be applied within the existing layout structure if possible (TRUE, the default) or as a new top-level element (FALSE). Ignored if it would nest a split underneath analyses, which is not allowed.
- child_labels
string. One of "default", "visible", "hidden". What should the display behavior be for the labels (i.e. label rows) of the children of this split. Defaults to "default", which flags the label row as visible only if the child has 0 content rows.
- label_pos
character(1). Location where the variable label should be displayed. Accepts "hidden" (default for non-analyze row splits), "visible", "topleft", and, for analyze splits only, "default". For analyze calls, "default" indicates that the variable should be visible if and only if multiple variables are analyzed at the same level of nesting.
- indent_mod
numeric. Modifier for the default indent position for the structure created by this function (subtable, content table, or row) and all of that structure's children. Defaults to 0, which corresponds to the unmodified default behavior.
- page_by
logical(1). Should pagination be forced between different children resulting from this split. An error is raised if the selected split does not contain at least one value that is not NA.
- page_prefix
character(1). Prefix, to be appended with the split value, when forcing pagination between the children of this split/table
- section_div
character(1). String which should be repeated as a section divider after each group defined by this split instruction, or NA_character_ (the default) for no section divider.
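As a rough illustration of how some of these arguments combine, the sketch below sets split_label, label_pos and section_div on a single row split. It assumes the DM example data used in the Examples section is available once rtables is attached; the object names lyt_args and tbl_args are made up for this sketch.

```r
library(rtables)

# Sketch only: RACE, ARM and AGE are columns of the DM example data
lyt_args <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by(
    "RACE",
    split_fun = drop_split_levels, # drop RACE levels with no observations
    split_label = "Race",          # label for the split itself
    label_pos = "topleft",         # display that label in the top-left area
    section_div = " "              # blank divider line after each RACE group
  ) %>%
  analyze("AGE", mean, format = "xx.xx")

tbl_args <- build_table(lyt_args, DM)
tbl_args
```

Switching label_pos to "visible" should instead place the split label in a row of its own above the children, as described for that argument.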
Value
A PreDataTableLayouts object suitable for passing to further layouting functions, and to build_table.
Note
If var is a factor with empty unobserved levels and labels_var is specified, it must also be a factor with the same number of levels as var. Currently the error that occurs when this is not the case is not very informative, but that will change in the future.
Custom Splitting Function Details
User-defined custom split functions can perform any type of computation on the incoming data provided that they meet the contract for generating 'splits' of the incoming data 'based on' the split object.
Split functions are functions that accept:
- df
data.frame of incoming data to be split
- spl
a Split object. This is largely an internal detail that custom functions will not need to worry about, but obj_name(spl), for example, will give the name of the split as it will appear in paths in the resulting table.
- vals
Any pre-calculated values. If given non-null values, the values returned should match these. Should be NULL in most cases and can likely be ignored.
- labels
Any pre-calculated value labels. Same as above for values.
- trim
If TRUE, resulting splits that are empty should be removed.
- (Optional) .spl_context
a data.frame describing previously performed splits which collectively arrived at df.
The function must then output a named list with the following elements:
- values
The vector of all values corresponding to the splits of df.
- datasplit
a list of data.frames representing the groupings of the actual observations from df.
- labels
a character vector giving a string label for each value listed in the values element above.
- (Optional) extras
If present, extra arguments are to be passed to summary and analysis functions whenever they are executed on the corresponding element of datasplit or a subset thereof.
One way to generate custom splitting functions is to wrap existing split functions and modify either the incoming data before they are called or their outputs.
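As a rough illustration of the contract described above, the sketch below writes a split function by hand instead of wrapping an existing one. The factory name observed_values_split and the toy data frame are hypothetical and not part of rtables. The function is called directly here, outside of any layout, so this only shows the shape of the inputs and of the returned list; how the list is processed during tabulation is assumed to follow the contract above.

```r
# Hypothetical factory: returns a split function that creates one child
# per observed (non-missing) value of the column `var`.
observed_values_split <- function(var) {
  function(df, spl, vals = NULL, labels = NULL, trim = FALSE) {
    x <- as.character(df[[var]])
    # Use pre-calculated values when supplied, otherwise the observed ones
    values <- if (is.null(vals)) sort(unique(x[!is.na(x)])) else as.character(vals)
    # One data.frame of matching rows per value
    datasplit <- lapply(values, function(v) df[!is.na(x) & x == v, , drop = FALSE])
    names(datasplit) <- values
    if (is.null(labels)) labels <- values
    if (trim) {
      # Remove splits that ended up with no rows
      keep <- vapply(datasplit, nrow, integer(1)) > 0
      values <- values[keep]
      datasplit <- datasplit[keep]
      labels <- labels[keep]
    }
    list(values = values, datasplit = datasplit, labels = labels)
  }
}

# Toy data with an unobserved factor level "c"
toy <- data.frame(
  grp = factor(c("a", "b", "a"), levels = c("a", "b", "c")),
  y = 1:3
)
res <- observed_values_split("grp")(toy, spl = NULL)
str(res, max.level = 1)
```

In a layout, such a function would presumably be supplied as split_fun, for example split_rows_by("RACE", split_fun = observed_values_split("RACE")). That usage is an untested assumption here.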
Examples
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("RACE", split_fun = drop_split_levels) %>%
  analyze("AGE", mean, var_labels = "Age", format = "xx.xx")
tbl <- build_table(lyt, DM)
tbl
#>                             A: Drug X   B: Placebo   C: Combination
#> ———————————————————————————————————————————————————————————————————
#> ASIAN
#>   mean                        34.20       32.68          34.63
#> BLACK OR AFRICAN AMERICAN
#>   mean                        34.68       31.71          34.00
#> WHITE
#>   mean                        39.36       36.93          35.11
lyt2 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("RACE") %>%
  analyze("AGE", mean, var_labels = "Age", format = "xx.xx")
tbl2 <- build_table(lyt2, DM)
tbl2
#>                                             A: Drug X   B: Placebo   C: Combination
#> ———————————————————————————————————————————————————————————————————————————————————
#> ASIAN
#>   mean                                        34.20       32.68          34.63
#> BLACK OR AFRICAN AMERICAN
#>   mean                                        34.68       31.71          34.00
#> WHITE
#>   mean                                        39.36       36.93          35.11
#> AMERICAN INDIAN OR ALASKA NATIVE
#>   mean                                         NA           NA             NA
#> MULTIPLE
#>   mean                                         NA           NA             NA
#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
#>   mean                                         NA           NA             NA
#> OTHER
#>   mean                                         NA           NA             NA
#> UNKNOWN
#>   mean                                         NA           NA             NA
lyt3 <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_cols_by("SEX") %>%
  summarize_row_groups(label_fstr = "Overall (N)") %>%
  split_rows_by("RACE", split_label = "Ethnicity", labels_var = "ethn_lab",
                split_fun = drop_split_levels) %>%
  summarize_row_groups("RACE", label_fstr = "%s (n)") %>%
  analyze("AGE", var_labels = "Age", afun = mean, format = "xx.xx")
lyt3
#> A Pre-data Table Layout
#>
#> Column-Split Structure:
#> ARM (lvls) -> SEX (lvls)
#>
#> Row-Split Structure:
#> RACE (lvls) -> AGE (** analysis **)
#>
library(dplyr)
DM2 <- DM %>%
  filter(SEX %in% c("M", "F")) %>%
  mutate(
    SEX = droplevels(SEX),
    gender_lab = c("F" = "Female", "M" = "Male",
                   "U" = "Unknown",
                   "UNDIFFERENTIATED" = "Undifferentiated")[SEX],
    ethn_lab = c(
      "ASIAN" = "Asian",
      "BLACK OR AFRICAN AMERICAN" = "Black or African American",
      "WHITE" = "White",
      "AMERICAN INDIAN OR ALASKA NATIVE" = "American Indian or Alaska Native",
      "MULTIPLE" = "Multiple",
      "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER" =
        "Native Hawaiian or Other Pacific Islander",
      "OTHER" = "Other", "UNKNOWN" = "Unknown"
    )[RACE]
  )
tbl3 <- build_table(lyt3, DM2)
tbl3
#>                                           A: Drug X                  B: Placebo                C: Combination
#>                                        F             M             F             M             F             M
#> ———————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> Overall (N)                       70 (100.0%)   51 (100.0%)   56 (100.0%)   50 (100.0%)   61 (100.0%)   68 (100.0%)
#>   Asian (n)                       44 (62.9%)    35 (68.6%)    37 (66.1%)    31 (62.0%)    40 (65.6%)    44 (64.7%)
#>     mean                             33.55         35.03         34.00         31.10         34.90         34.39
#>   Black or African American (n)   18 (25.7%)    10 (19.6%)    12 (21.4%)    12 (24.0%)    13 (21.3%)    14 (20.6%)
#>     mean                             33.17         37.40         30.58         32.83         33.85         34.14
#>   White (n)                        8 (11.4%)     6 (11.8%)     7 (12.5%)     7 (14.0%)     8 (13.1%)    10 (14.7%)
#>     mean                             35.88         44.00         38.57         35.29         36.50         34.00