Use these functions to calculate multiple summaries of nested or hierarchical data
in a single call.
- ard_stack_hierarchical(): Calculates rates of events (e.g. adverse events) utilizing the denominator and id arguments to identify the rows in data to include in each rate calculation.
- ard_stack_hierarchical_count(): Calculates counts of events utilizing all rows for each tabulation.
Usage
ard_stack_hierarchical(
  data,
  variables,
  by = dplyr::group_vars(data),
  id,
  denominator,
  include = everything(),
  statistic = everything() ~ c("n", "N", "p"),
  overall = FALSE,
  over_variables = FALSE,
  attributes = FALSE,
  total_n = FALSE,
  shuffle = FALSE
)

ard_stack_hierarchical_count(
  data,
  variables,
  by = dplyr::group_vars(data),
  denominator = NULL,
  include = everything(),
  overall = FALSE,
  over_variables = FALSE,
  attributes = FALSE,
  total_n = FALSE,
  shuffle = FALSE
)
Arguments
- data
  (data.frame) a data frame
- variables
  (tidy-select) Specifies the nested/hierarchical structure of the data. The variables that are specified here and in the include argument will have summary statistics calculated.
- by
  (tidy-select) variables to perform tabulations by. All combinations of the variables specified here appear in results. Default is dplyr::group_vars(data).
- id
  (tidy-select) argument used to subset data, identifying the rows in data used to calculate event rates in ard_stack_hierarchical(). See details below.
- denominator
  (data.frame, integer) used to define the denominator and enhance the output. The argument is required for ard_stack_hierarchical() and optional for ard_stack_hierarchical_count().
  - the univariate tabulations of the by variables are calculated with denominator, when a data frame is passed, e.g. tabulation of the treatment assignment counts that may appear in the header of a table.
  - the denominator argument must be specified when id is used to calculate the event rates.
  - if total_n=TRUE, the denominator argument is used to return the total N
- include
  (tidy-select) Specify the subset of columns indicated in the variables argument for which summary statistics will be returned. Default is everything().
- statistic
  (formula-list-selector) a named list, a list of formulas, or a single formula where the list element is one or more of c("n", "N", "p", "n_cum", "p_cum") (on the RHS of a formula).
- overall
  (scalar logical) logical indicating whether overall statistics should be calculated (i.e. repeat the operations with by=NULL in most cases, see below for details). Default is FALSE.
- over_variables
  (scalar logical) logical indicating whether summary statistics should be calculated over or across the columns listed in the variables argument. Default is FALSE.
- attributes
  (scalar logical) logical indicating whether to include the results of ard_attributes() for all variables represented in the ARD. Default is FALSE.
- total_n
  (scalar logical) logical indicating whether to include the results of ard_total_n(denominator) in the returned ARD. Default is FALSE.
- shuffle
  (scalar logical) logical indicating whether to perform shuffle_ard() on the final result. Default is FALSE.
Subsetting Data for Rate Calculations
To calculate event rates, the ard_stack_hierarchical() function identifies
rows to include in the calculation.
First, the primary data frame is sorted by the columns identified in
the id, by, and variables arguments.
As the function cycles over the variables specified in the variables argument,
the data frame is grouped by id, intersect(by, names(denominator)), and variables,
utilizing the last row within each of the groups.
For example, if the call is
ard_stack_hierarchical(data = ADAE, variables = c(AESOC, AEDECOD), id = USUBJID),
then we'd first subset ADAE to be one row within the grouping c(USUBJID, AESOC, AEDECOD)
to calculate the event rates in 'AEDECOD'. We'd then repeat and
subset ADAE to be one row within the grouping c(USUBJID, AESOC)
to calculate the event rates in 'AESOC'.
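The subsetting described above can be sketched with dplyr. This is a minimal illustration of the idea, not the package's internal code: the toy data frame `adae` and the object names `adae_sorted`, `by_decod`, and `by_soc` are hypothetical, and no by columns are used.

```r
library(dplyr)

# Hypothetical toy stand-in for ADAE: subject 01 reports TACHYCARDIA twice
adae <- data.frame(
  USUBJID = c("01", "01", "01", "02"),
  AESOC   = c("CARDIAC", "CARDIAC", "CARDIAC", "CARDIAC"),
  AEDECOD = c("TACHYCARDIA", "TACHYCARDIA", "PALPITATIONS", "TACHYCARDIA")
)

# Sort by the id and variables columns (there are no by columns in this sketch)
adae_sorted <- arrange(adae, USUBJID, AESOC, AEDECOD)

# Keep the last row per subject within AESOC/AEDECOD: basis for the AEDECOD rates
by_decod <- adae_sorted |>
  group_by(USUBJID, AESOC, AEDECOD) |>
  slice_tail(n = 1L) |>
  ungroup()

# Keep the last row per subject within AESOC: basis for the AESOC rates
by_soc <- adae_sorted |>
  group_by(USUBJID, AESOC) |>
  slice_tail(n = 1L) |>
  ungroup()

nrow(by_decod)
#> [1] 3
nrow(by_soc)
#> [1] 2
```

Each subject contributes at most one row per group, so a repeated event is counted once when the rate is calculated.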
Overall Argument
When we set overall=TRUE, we wish to re-run our calculations removing the
stratifying columns. For example, if we ran the code below, the results would
include results from re-running the code chunk with by=NULL.
ard_stack_hierarchical(
  data = ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL |> dplyr::rename(TRTA = ARM),
  id = USUBJID,
  overall = TRUE
)
But there is another case to be aware of: when the by argument includes
columns that are not present in the denominator, for example when tabulating
results by AE grade or severity in addition to treatment assignment.
In the example below, we're tabulating results by treatment assignment and
AE severity. By specifying overall=TRUE, we will re-run the calculations to get
results with by = AESEV and again with by = NULL.
ard_stack_hierarchical(
  data = ADAE,
  variables = c(AESOC, AEDECOD),
  by = c(TRTA, AESEV),
  denominator = ADSL |> dplyr::rename(TRTA = ARM),
  id = USUBJID,
  overall = TRUE
)
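One way to picture which by settings the re-runs use is the base R sketch below. It only mirrors the behaviour described above and is not taken from the package source; the vector `denominator_names` is an assumed set of column names for the renamed ADSL, and the other object names are hypothetical.

```r
# by columns from the example call, written as character names
by <- c("TRTA", "AESEV")

# Assumed column names of the denominator data frame after the rename
denominator_names <- c("USUBJID", "TRTA")

# by columns that also exist in the denominator
by_in_denominator <- intersect(by, denominator_names)

# First re-run: drop the columns shared with the denominator
by_first_rerun <- setdiff(by, by_in_denominator)

# Second re-run: no stratifying columns at all
by_second_rerun <- NULL

by_first_rerun
#> [1] "AESEV"
```

When every by column is present in the denominator, the first re-run has nothing left to stratify by, which matches the simpler by=NULL case.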
Examples
ard_stack_hierarchical(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL |> dplyr::rename(TRTA = ARM),
  id = USUBJID
)
#> {cards} data frame: 2394 x 13
#> group1 group1_level group2 group2_level variable variable_level stat_name
#> 1 TRTA Placebo <NA> AESOC CARDIAC … n
#> 2 TRTA Placebo <NA> AESOC CARDIAC … N
#> 3 TRTA Placebo <NA> AESOC CARDIAC … p
#> 4 TRTA Placebo <NA> AESOC CONGENIT… n
#> 5 TRTA Placebo <NA> AESOC CONGENIT… N
#> 6 TRTA Placebo <NA> AESOC CONGENIT… p
#> 7 TRTA Placebo <NA> AESOC EAR AND … n
#> 8 TRTA Placebo <NA> AESOC EAR AND … N
#> 9 TRTA Placebo <NA> AESOC EAR AND … p
#> 10 TRTA Placebo <NA> AESOC EYE DISO… n
#> stat_label stat
#> 1 n 13
#> 2 N 86
#> 3 % 0.151
#> 4 n 0
#> 5 N 86
#> 6 % 0
#> 7 n 1
#> 8 N 86
#> 9 % 0.012
#> 10 n 4
#> ℹ 2384 more rows
#> ℹ Use `print(n = ...)` to see more rows
#> ℹ 4 more variables: context, fmt_fn, warning, error
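The printed values above are consistent with p being the proportion n / N. The check below reproduces the first three Placebo rows under that assumption; the vector names are illustrative only.

```r
# n for the first three Placebo AESOC levels shown in the output above
n <- c(13, 0, 1)

# N for the Placebo arm, as shown in the output above
N <- 86

# Proportion of subjects with an event, rounded as in the printout
p <- round(n / N, 3)
p
#> [1] 0.151 0.000 0.012
```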
ard_stack_hierarchical_count(
  ADAE,
  variables = c(AESOC, AEDECOD),
  by = TRTA,
  denominator = ADSL |> dplyr::rename(TRTA = ARM)
)
#> {cards} data frame: 804 x 13
#> group1 group1_level group2 group2_level variable variable_level stat_name
#> 1 TRTA Placebo <NA> AESOC CARDIAC … n
#> 2 TRTA Placebo <NA> AESOC CONGENIT… n
#> 3 TRTA Placebo <NA> AESOC EAR AND … n
#> 4 TRTA Placebo <NA> AESOC EYE DISO… n
#> 5 TRTA Placebo <NA> AESOC GASTROIN… n
#> 6 TRTA Placebo <NA> AESOC GENERAL … n
#> 7 TRTA Placebo <NA> AESOC HEPATOBI… n
#> 8 TRTA Placebo <NA> AESOC IMMUNE S… n
#> 9 TRTA Placebo <NA> AESOC INFECTIO… n
#> 10 TRTA Placebo <NA> AESOC INJURY, … n
#> stat_label stat
#> 1 n 27
#> 2 n 0
#> 3 n 2
#> 4 n 8
#> 5 n 26
#> 6 n 48
#> 7 n 1
#> 8 n 0
#> 9 n 35
#> 10 n 9
#> ℹ 794 more rows
#> ℹ Use `print(n = ...)` to see more rows
#> ℹ 4 more variables: context, fmt_fn, warning, error