The goal of this article is to illustrate which {cards} functions are used to create long data summaries: think summaries from ADAE, ADLB, ADCM, and other similarly structured data sets.
Generally, the solution to long data summaries lies with ard_stack_hierarchical*(), ard_strata(), or even a call to a more basic function like ard_tabulate(). Herein, we will review these functions and when each is needed.
Hierarchical or Nested Summaries
The ard_stack_hierarchical*() family of functions is useful when tabulating hierarchical or nested data and the tabulation needs to be repeated across more than one level of the hierarchy. The most common example is the summary of adverse event (AE) data.
Primary System Organ Class / Dictionary-Derived Term | Placebo N = 86 | Xanomeline High Dose N = 84 | Xanomeline Low Dose N = 84 |
---|---|---|---|
Number Subjects with AE | 69 (80%) | 79 (94%) | 77 (92%) |
GASTROINTESTINAL DISORDERS | 17 (20%) | 21 (25%) | 15 (18%) |
DIARRHOEA | 9 (10%) | 4 (4.8%) | 5 (6.0%) |
GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS | 21 (24%) | 40 (48%) | 47 (56%) |
APPLICATION SITE ERYTHEMA | 3 (3.5%) | 15 (18%) | 12 (14%) |
APPLICATION SITE PRURITUS | 6 (7.0%) | 22 (26%) | 22 (26%) |
APPLICATION SITE VESICLES | 1 (1.2%) | 6 (7.1%) | 4 (4.8%) |
SKIN AND SUBCUTANEOUS TISSUE DISORDERS | 21 (24%) | 42 (50%) | 42 (50%) |
ERYTHEMA | 9 (10%) | 14 (17%) | 15 (18%) |
PRURITUS GENERALISED | 0 (0%) | 1 (1.2%) | 1 (1.2%) |
Printing a few illustrative rows from the full table.
In the table above, the AE rates are reported for both the system organ class (SOC) and the AE term. That is, each AE is counted once per subject and then the rate for each AE is calculated; this is repeated for SOC. A call to the ard_stack_hierarchical() function will return an ARD with the adverse event rates, the system organ class rates, the overall rates (row one of the example table), and the counts that appear in the header.
To create the ARD for this table, use the ard_stack_hierarchical() function.
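# {cards} provides the ard_*() functions; {dplyr} provides filter() and select().
# ADAE and ADSL are assumed to be available as data frames in the session.
library(cards)
library(dplyr)

ard_ae <-
  ADAE |>
  ard_stack_hierarchical(
    variables = c(AESOC, AEDECOD), # report rates for SOC and AE within SOC
    by = TRTA,                     # report all statistics by treatment
    id = USUBJID,                  # used to remove duplicate AEs within subject
    denominator = ADSL,            # specifies the denominator for rate calculations
    over_variables = TRUE          # include summary statistics for Any AE
  )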
The returned ARD contains four stacked sections: AE rates, SOC rates, Any AE rates, and Treatment counts. Let’s inspect each of these four sections.
Adverse Event Rates
To calculate the AE rates, the ADAE data frame is first subset to remove duplicate AEs reported within each subject. From there, the rates are calculated using the ADSL data frame passed in the denominator argument.
ard_ae |>
  filter(variable == "AEDECOD")
#> {cards} data frame: 2178 x 13
#> group1 group1_level group2 group2_level variable variable_level stat_name stat_label stat
#> 1 TRTA Placebo AESOC CARDIAC … AEDECOD ATRIAL F… n n 1
#> 2 TRTA Placebo AESOC CARDIAC … AEDECOD ATRIAL F… N N 86
#> 3 TRTA Placebo AESOC CARDIAC … AEDECOD ATRIAL F… p % 0.012
#> 4 TRTA Xanomeli… AESOC CARDIAC … AEDECOD ATRIAL F… n n 3
#> 5 TRTA Xanomeli… AESOC CARDIAC … AEDECOD ATRIAL F… N N 84
#> 6 TRTA Xanomeli… AESOC CARDIAC … AEDECOD ATRIAL F… p % 0.036
#> 7 TRTA Xanomeli… AESOC CARDIAC … AEDECOD ATRIAL F… n n 1
#> 8 TRTA Xanomeli… AESOC CARDIAC … AEDECOD ATRIAL F… N N 84
#> 9 TRTA Xanomeli… AESOC CARDIAC … AEDECOD ATRIAL F… p % 0.012
#> 10 TRTA Placebo AESOC CARDIAC … AEDECOD ATRIAL F… n n 0
#> ℹ 2168 more rows
#> ℹ Use `print(n = ...)` to see more rows
#> ℹ 4 more variables: context, fmt_fun, warning, error
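The deduplicate-then-divide logic described above can be sketched in base R. This is only an illustration on hypothetical toy data (the `adae` and `adsl` data frames and the "Drug" arm are invented here), not how {cards} implements the calculation.

```r
# Toy data (hypothetical): one row per reported AE; subject 01 reports ERYTHEMA twice
adae <- data.frame(
  USUBJID = c("01", "01", "02", "03", "03"),
  TRTA    = c("Placebo", "Placebo", "Placebo", "Drug", "Drug"),
  AEDECOD = c("ERYTHEMA", "ERYTHEMA", "DIARRHOEA", "ERYTHEMA", "DIARRHOEA")
)
# Toy subject-level data (hypothetical): includes subjects with no AE at all
adsl <- data.frame(
  USUBJID = c("01", "02", "05", "03", "04"),
  TRTA    = c("Placebo", "Placebo", "Placebo", "Drug", "Drug")
)

# keep one row per subject and AE term, so repeated events count once
adae_unique <- unique(adae[c("USUBJID", "TRTA", "AEDECOD")])

# numerator: number of subjects with each AE, by treatment
n <- table(adae_unique$AEDECOD, adae_unique$TRTA)

# denominator: all subjects in each arm, taken from the subject-level data
N <- table(adsl$TRTA)

# rate = n / N, dividing each treatment column by its arm size
p <- sweep(n, 2, as.vector(N[colnames(n)]), "/")
```

Taking the denominator from the subject-level data rather than from the AE data is what lets subjects without any AE contribute to N.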
System Organ Class Rates
The AE rate process is repeated for SOC.
ard_ae |>
  filter(variable == "AESOC") |>
  select(-all_missing_columns())
#> {cards} data frame: 207 x 9
#> group1 group1_level variable variable_level stat_name stat_label stat
#> 1 TRTA Placebo AESOC CARDIAC … n n 13
#> 2 TRTA Placebo AESOC CARDIAC … N N 86
#> 3 TRTA Placebo AESOC CARDIAC … p % 0.151
#> 4 TRTA Xanomeli… AESOC CARDIAC … n n 18
#> 5 TRTA Xanomeli… AESOC CARDIAC … N N 84
#> 6 TRTA Xanomeli… AESOC CARDIAC … p % 0.214
#> 7 TRTA Xanomeli… AESOC CARDIAC … n n 13
#> 8 TRTA Xanomeli… AESOC CARDIAC … N N 84
#> 9 TRTA Xanomeli… AESOC CARDIAC … p % 0.155
#> 10 TRTA Placebo AESOC CONGENIT… n n 0
#> ℹ 197 more rows
#> ℹ Use `print(n = ...)` to see more rows
#> ℹ 2 more variables: context, fmt_fun
Any AE Rates
The process is then repeated to calculate rates of any adverse event.
ard_ae |>
  filter(variable == "..ard_hierarchical_overall..") |>
  select(-all_missing_columns())
#> {cards} data frame: 9 x 9
#> group1 group1_level variable variable_level stat_name stat_label stat
#> 1 TRTA Placebo ..ard_hierarchical_overall.. TRUE n n 69
#> 2 TRTA Placebo ..ard_hierarchical_overall.. TRUE N N 86
#> 3 TRTA Placebo ..ard_hierarchical_overall.. TRUE p % 0.802
#> 4 TRTA Xanomeli… ..ard_hierarchical_overall.. TRUE n n 79
#> 5 TRTA Xanomeli… ..ard_hierarchical_overall.. TRUE N N 84
#> 6 TRTA Xanomeli… ..ard_hierarchical_overall.. TRUE p % 0.94
#> 7 TRTA Xanomeli… ..ard_hierarchical_overall.. TRUE n n 77
#> 8 TRTA Xanomeli… ..ard_hierarchical_overall.. TRUE N N 84
#> 9 TRTA Xanomeli… ..ard_hierarchical_overall.. TRUE p % 0.917
#> ℹ 2 more variables: context, fmt_fun
Treatment Counts
Finally, a univariate tabulation of the TRTA column from ADSL is included.
ard_ae |>
  filter(variable == "TRTA") |>
  select(-all_missing_columns())
#> {cards} data frame: 9 x 7
#> variable variable_level context stat_name stat_label stat
#> 1 TRTA Placebo tabulate n n 86
#> 2 TRTA Placebo tabulate N N 254
#> 3 TRTA Placebo tabulate p % 0.339
#> 4 TRTA Xanomeli… tabulate n n 84
#> 5 TRTA Xanomeli… tabulate N N 254
#> 6 TRTA Xanomeli… tabulate p % 0.331
#> 7 TRTA Xanomeli… tabulate n n 84
#> 8 TRTA Xanomeli… tabulate N N 254
#> 9 TRTA Xanomeli… tabulate p % 0.331
#> ℹ 1 more variable: fmt_fun
The package exports a similar function for counting adverse events, rather than calculating rates: ard_stack_hierarchical_count().
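The difference between counting events and counting subjects can be seen on a small example. The sketch below uses base R and a hypothetical toy `adae` data frame; it illustrates the two kinds of count and is not the {cards} implementation.

```r
# Toy data (hypothetical): subject 01 reports ERYTHEMA twice
adae <- data.frame(
  USUBJID = c("01", "01", "02"),
  AEDECOD = c("ERYTHEMA", "ERYTHEMA", "DIARRHOEA")
)

# event counts: every reported row contributes
event_n <- table(adae$AEDECOD)

# subject counts: each subject contributes at most once per term
subject_n <- table(unique(adae)$AEDECOD)
```

Here ERYTHEMA has two events but only one affected subject, which is why the two kinds of summary need different functions.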
Stratified Summaries
There are many types of stratified summaries that may be needed to report results from a trial. We will focus on a common lab summary where summary statistics are reported by lab type, visit and treatment.
Lab / Visit | Placebo (N = 86): Value at Visit | Placebo (N = 86): Change from Baseline | Xanomeline High Dose (N = 84): Value at Visit | Xanomeline High Dose (N = 84): Change from Baseline | Xanomeline Low Dose (N = 84): Value at Visit | Xanomeline Low Dose (N = 84): Change from Baseline |
---|---|---|---|---|---|---|
Bilirubin (umol/L) | | | | | | |
Baseline | | | | | | |
n | 7 | | 7 | | 6 | |
Mean (SD) | 8.6 (3.0) | | 10.7 (5.1) | | 9.1 (3.0) | |
Median | 8.6 | | 10.3 | | 7.7 | |
Min - Max | 5.1 - 12.0 | | 5.1 - 18.8 | | 6.8 - 13.7 | |
Week 24 | | | | | | |
n | 5 | 5 | 4 | 4 | 2 | 2 |
Mean (SD) | 7.87 (2.59) | -0.34 (3.06) | 10.26 (4.19) | -2.99 (0.86) | 9.41 (1.21) | 0.00 (4.84) |
Median | 6.84 | 0.00 | 9.41 | -3.42 | 9.41 | 0.00 |
Min - Max | 5.13 - 11.97 | -3.42 - 3.42 | 6.84 - 15.39 | -3.42 - -1.71 | 8.55 - 10.26 | -3.42 - 3.42 |
Creatinine (umol/L) | | | | | | |
Baseline | | | | | | |
n | 7 | | 7 | | 6 | |
Mean (SD) | 97 (18) | | 102 (15) | | 106 (22) | |
Median | 88 | | 106 | | 111 | |
Min - Max | 80 - 124 | | 80 - 124 | | 80 - 133 | |
Week 24 | | | | | | |
n | 5 | 5 | 4 | 4 | 2 | 2 |
Mean (SD) | 99 (20) | 5.3 (4.8) | 106 (23) | 2.2 (8.5) | 102 (31) | 0.0 (0.0) |
Median | 97 | 8.8 | 106 | 4.4 | 102 | 0.0 |
Min - Max | 80 - 133 | 0.0 - 8.8 | 80 - 133 | -8.8 - 8.8 | 80 - 124 | 0.0 - 0.0 |
Printing a few illustrative rows from the full table.
To build the ARD for this table, we use ard_summary() with the following arguments:
- ard_summary(by="TRTA"): Using the by argument ensures each level of treatment has all associated summary statistics, even if some combinations are unobserved or all NA.
- ard_summary(strata=c("PARAM", "AVISIT")): The strata argument produces summary statistics for all observed combinations of 'PARAM' and 'AVISIT'. We opt to use strata because it is common that a trial will not collect all labs at each visit, and we don't want to report, for example, that bilirubin had no observations at week xx when it was never meant to be collected at that visit.
- ard_summary(variables=c("AVAL", "CHG")): These are the variables that will be summarized within 'TRTA', 'PARAM', and 'AVISIT'.
ADLB |>
  # subset on two labs and two study visits
  filter(
    PARAMCD %in% c("BILI", "CREAT"),
    AVISIT %in% c("Baseline", "Week 24")
  ) |>
  ard_summary(
    # calculate statistics for the observed combinations of PARAM and AVISIT
    strata = c("PARAM", "AVISIT"),
    # `by='TRTA'` will provide results for each of the treatments, even if unobserved
    by = "TRTA",
    # provide summaries for the measurement and its change from baseline
    variables = c("AVAL", "CHG")
  )
#> {cards} data frame: 192 x 14
#> group1 group1_level group2 group2_level group3 group3_level variable stat_name stat_label stat
#> 1 TRTA Placebo PARAM Bilirubi… AVISIT Baseline AVAL N N 7
#> 2 TRTA Placebo PARAM Bilirubi… AVISIT Baseline AVAL mean Mean 8.55
#> 3 TRTA Placebo PARAM Bilirubi… AVISIT Baseline AVAL sd SD 2.962
#> 4 TRTA Placebo PARAM Bilirubi… AVISIT Baseline AVAL median Median 8.55
#> 5 TRTA Placebo PARAM Bilirubi… AVISIT Baseline AVAL p25 Q1 5.13
#> 6 TRTA Placebo PARAM Bilirubi… AVISIT Baseline AVAL p75 Q3 11.97
#> 7 TRTA Placebo PARAM Bilirubi… AVISIT Baseline AVAL min Min 5.13
#> 8 TRTA Placebo PARAM Bilirubi… AVISIT Baseline AVAL max Max 11.97
#> 9 TRTA Placebo PARAM Bilirubi… AVISIT Baseline CHG N N 0
#> 10 TRTA Placebo PARAM Bilirubi… AVISIT Baseline CHG mean Mean NaN
#> ℹ 182 more rows
#> ℹ Use `print(n = ...)` to see more rows
#> ℹ 4 more variables: context, fmt_fun, warning, error
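The distinction between by and strata described above can be sketched in base R: strata keeps only the combinations that occur in the data, while by crosses every treatment level with each of them. The toy `adlb` data frame and the "Drug" arm are hypothetical, and the sketch mirrors the documented behavior rather than the {cards} internals.

```r
# Toy lab data (hypothetical): bilirubin was never collected at Week 24,
# and no subject on "Drug" has a bilirubin record
adlb <- data.frame(
  TRTA   = factor(c("Placebo", "Placebo", "Drug", "Placebo"),
                  levels = c("Placebo", "Drug")),
  PARAM  = c("Bilirubin", "Creatinine", "Creatinine", "Creatinine"),
  AVISIT = c("Baseline", "Baseline", "Baseline", "Week 24")
)

# strata-like behavior: only the PARAM/AVISIT combinations observed in the data
strata_combos <- unique(adlb[c("PARAM", "AVISIT")])

# by-like behavior: every treatment level is crossed with each observed stratum
# (merge() with no shared columns returns the Cartesian product)
cells <- merge(strata_combos, data.frame(TRTA = levels(adlb$TRTA)))
```

The result has six cells rather than the eight of a full crossing: Bilirubin at Week 24 is never reported, while Bilirubin at Baseline still gets a row for the "Drug" arm even though it has no observations there.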
In some cases, slightly different behavior is needed within each stratum. In these cases, use the ard_strata() function. For example, if we were tabulating character AVALC values, and the possible values differ depending on PARAM, the code may look something like this:
ADLB |>
  ard_strata(
    .strata = "PARAM",
    .f = \(data_param) {
      # set the factor levels depending on the PARAM value of this stratum
      if (data_param$PARAM[1] == "XXX") {
        data_param$AVALC <- factor(data_param$AVALC, levels = c("No", "Yes"))
      }
      if (data_param$PARAM[1] == "YYY") {
        data_param$AVALC <- factor(data_param$AVALC, levels = c("Low", "High"))
      }

      # tabulate AVALC within this PARAM, by visit and treatment
      ard_tabulate(
        data_param,
        strata = "AVISIT",
        by = "TRTA",
        variables = "AVALC"
      )
    }
  )
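Setting the factor levels per stratum matters because it lets levels that were possible but never observed be tabulated with a count of zero. A minimal base R sketch of the same split-and-apply idea follows; the toy `adlb` data and the `levels_by_param` lookup are hypothetical, and ard_strata() itself is not used.

```r
# Toy data (hypothetical): "No" and "Low" are possible values but never observed
adlb <- data.frame(
  PARAM = c("XXX", "XXX", "YYY"),
  AVALC = c("Yes", "Yes", "High")
)

# hypothetical lookup of the allowed values for each parameter
levels_by_param <- list(XXX = c("No", "Yes"), YYY = c("Low", "High"))

# split by PARAM, then tabulate each piece with its own factor levels
counts <- lapply(split(adlb, adlb$PARAM), function(data_param) {
  param <- data_param$PARAM[1]
  table(factor(data_param$AVALC, levels = levels_by_param[[param]]))
})
```

Each element of `counts` reports only the levels relevant to its parameter, with a zero for the unobserved ones.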