Create a custom splitting function
Arguments
- pre
(
list
)
zero or more functions which operate on the incoming data and return a new data frame that should split viacore_split
. They will be called on the data in the order they appear in the list.- core_split
(
function
orNULL
)
if non-NULL
, a function which accepts the same arguments thatdo_base_split
does, and returns the same type of named list. Custom functions which override this behavior cannot be used in column splits.- post
(
list
)
zero or more functions which should be called on the list output by splitting.
Details
Custom split functions can be thought of as (up to) 3 different types of manipulations of the splitting process:
Pre-processing of the incoming data to be split.
(Row-splitting only) Customization of the core mapping of incoming data to facets.
Post-processing operations on the set of facets (groups) generated by the split.
This function provides an interface to create custom split functions by implementing and specifying sets of operations in each of those classes of customization independently.
Pre-processing functions (1), must accept: df
, spl
, vals
, and labels
, and can optionally accept
.spl_context
. They then manipulate df
(the incoming data for the split) and return a modified data frame.
This modified data frame must contain all columns present in the incoming data frame, but can add columns if
necessary (though we note that these new columns cannot be used in the layout as split or analysis variables,
because they will not be present when validity checking is done).
The preprocessing component is useful for things such as manipulating factor levels, e.g., to trim unobserved ones or to reorder levels based on observed counts, etc.
Customization of core splitting (2) is currently only supported in row splits. Core splitting functions override the
fundamental splitting procedure, and are only necessary in rare cases. These must accept spl
, df
, vals
, and
labels
, and can optionally accept .spl_context
. They must return a named list with elements, all of the same
length, as follows:
datasplit
, containing a list ofdata.frame
objects.values
, containing values associated with the facets, which must becharacter
orSplitValue
objects. These values will appear in the paths of the resulting table.labels
, containing the character labels associated withvalues
Post-processing functions (3) must accept the result of the core split as their first argument (which can be
anything), in addition to spl
, and fulldf
, and can optionally accept .spl_context
. They must each return a
modified version of the same structure specified above for core splitting.
In both the pre- and post-processing cases, multiple functions can be specified. When this happens, they are applied
sequentially, in the order they appear in the list passed to the relevant argument (pre
and post
, respectively).
See also
custom_split_funs for a more detailed discussion on what custom split functions do.
Other make_custom_split:
add_combo_facet()
,
drop_facet_levels()
,
make_split_result()
,
trim_levels_in_facets()
Examples
mysplitfun <- make_split_fun(
pre = list(drop_facet_levels),
post = list(add_overall_facet("ALL", "All Arms"))
)
basic_table(show_colcounts = TRUE) %>%
split_cols_by("ARM", split_fun = mysplitfun) %>%
analyze("AGE") %>%
build_table(subset(DM, ARM %in% c("B: Placebo", "C: Combination")))
#> B: Placebo C: Combination All Arms
#> (N=106) (N=129) (N=235)
#> —————————————————————————————————————————————
#> Mean 33.02 34.57 33.87
## post (and pre) arguments can take multiple functions, here
## we add an overall facet and the reorder the facets
reorder_facets <- function(splret, spl, fulldf, ...) {
ord <- order(names(splret$values))
make_split_result(
splret$values[ord],
splret$datasplit[ord],
splret$labels[ord]
)
}
mysplitfun2 <- make_split_fun(
pre = list(drop_facet_levels),
post = list(
add_overall_facet("ALL", "All Arms"),
reorder_facets
)
)
basic_table(show_colcounts = TRUE) %>%
split_cols_by("ARM", split_fun = mysplitfun2) %>%
analyze("AGE") %>%
build_table(subset(DM, ARM %in% c("B: Placebo", "C: Combination")))
#> All Arms B: Placebo C: Combination
#> (N=235) (N=106) (N=129)
#> —————————————————————————————————————————————
#> Mean 33.87 33.02 34.57
very_stupid_core <- function(spl, df, vals, labels, .spl_context) {
make_split_result(c("stupid", "silly"),
datasplit = list(df[1:10, ], df[11:30, ]),
labels = c("first 10", "second 20")
)
}
dumb_30_facet <- add_combo_facet("dumb",
label = "thirty patients",
levels = c("stupid", "silly")
)
nonsense_splfun <- make_split_fun(
core_split = very_stupid_core,
post = list(dumb_30_facet)
)
## recall core split overriding is not supported in column space
## currently, but we can see it in action in row space
lyt_silly <- basic_table() %>%
split_rows_by("ARM", split_fun = nonsense_splfun) %>%
summarize_row_groups() %>%
analyze("AGE")
silly_table <- build_table(lyt_silly, DM)
silly_table
#> all obs
#> ———————————————————————————
#> first 10 10 (2.8%)
#> Mean 31.10
#> second 20 20 (5.6%)
#> Mean 34.25
#> thirty patients 30 (8.4%)
#> Mean 33.20