Using functions from the {gtsummary} package, we can produce the nested and hierarchical tables commonly used in clinical reporting.

Hierarchical Tables

Event Rate Tables: tbl_hierarchical()

To begin, let’s create a basic tbl_hierarchical() adverse event table summarizing adverse events (AEs) within System Organ Class (SOC).

ADAE_subset <- cards::ADAE |>
  dplyr::filter(AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS", "EAR AND LABYRINTH DISORDERS")) |>
  dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)

tbl <-
  tbl_hierarchical(
    data = ADAE_subset,
    variables = c(AEBODSYS, AEDECOD),
    by = TRTA,
    denominator = cards::ADSL,
    id = USUBJID,
    overall_row = TRUE
  )

tbl
Body System or Organ Class
    Dictionary-Derived Term    Placebo (N = 86)¹    Xanomeline High Dose (N = 84)¹    Xanomeline Low Dose (N = 84)¹
Number of patients with event 3 (3.5%) 3 (3.6%) 5 (6.0%)
EAR AND LABYRINTH DISORDERS 1 (1.2%) 1 (1.2%) 2 (2.4%)
    CERUMEN IMPACTION 0 (0%) 0 (0%) 1 (1.2%)
    EAR PAIN 1 (1.2%) 0 (0%) 0 (0%)
    TINNITUS 0 (0%) 0 (0%) 1 (1.2%)
    VERTIGO 0 (0%) 1 (1.2%) 1 (1.2%)
SKIN AND SUBCUTANEOUS TISSUE DISORDERS 2 (2.3%) 2 (2.4%) 3 (3.6%)
    ACTINIC KERATOSIS 0 (0%) 1 (1.2%) 0 (0%)
    ERYTHEMA 1 (1.2%) 0 (0%) 3 (3.6%)
    PRURITUS 1 (1.2%) 0 (0%) 2 (2.4%)
    PRURITUS GENERALISED 0 (0%) 0 (0%) 1 (1.2%)
    RASH 0 (0%) 1 (1.2%) 0 (0%)
1 n (%)

The gtsummary::tbl_hierarchical() function was created to build hierarchy tables, which are most commonly seen when reporting rates of adverse events.

The variables argument takes a vector of variable names, where the first corresponds to the outermost hierarchy level and the last corresponds to the innermost hierarchy level.

AE rates are calculated within each hierarchy for each level of the innermost variable.

By default, summary AE rates are also calculated on the label rows at each outer level of the hierarchy. This is controlled by the include argument: rows for any variable not listed in include carry no summary statistics. For example, the table above includes summaries for both AEs and SOCs. Had we specified include = AEDECOD, the resulting table would still contain rows for both AEs and SOCs, but only the AE rows would have summary statistics.

The by parameter can be used to set a single variable on which the table will be split column-wise. To include an overall row where AE rates are calculated for the overall dataset at the top of the table, set overall_row = TRUE.

If multiple AEs are recorded for a subject, that subject is counted only once in any given row of the table. Unique subjects are identified via the variable supplied to the required id argument.

The dataset specified via the required denominator argument is used to calculate the denominators in percentage calculations as well as any N counts included in the table header.
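To make the roles of id and denominator concrete, the following base R sketch reproduces the arithmetic behind a single rate by hand. The toy data frames (toy_adae, toy_adsl) and their values are invented for illustration and are not part of {gtsummary} or {cards}; the sketch only mirrors the counting rule described above and is not the package's internal implementation.

```r
# Toy AE records (hypothetical): subject "01" has two events but should count once
toy_adae <- data.frame(
  USUBJID = c("01", "01", "02", "03"),
  TRTA    = c("Placebo", "Placebo", "Placebo", "Drug")
)

# Toy subject-level data (hypothetical) supplying the denominators
toy_adsl <- data.frame(
  USUBJID = c("01", "02", "04", "03", "05"),
  TRTA    = c("Placebo", "Placebo", "Placebo", "Drug", "Drug")
)

# n: number of unique subjects with at least one event, per arm
n_by_arm <- tapply(toy_adae$USUBJID, toy_adae$TRTA, function(x) length(unique(x)))

# N: number of subjects per arm in the denominator data
N_by_arm <- table(toy_adsl$TRTA)

# Combine into one row per arm and compute the percentage
rates <- data.frame(
  TRTA = names(n_by_arm),
  n    = as.integer(n_by_arm),
  N    = as.integer(N_by_arm[names(n_by_arm)])
)
rates$p <- 100 * rates$n / rates$N

rates
```

Subjects "04" and "05" have no AE records, yet they still contribute to N because the denominators come from the subject-level data rather than from the AE data.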

Note that only observed combinations of hierarchy levels are included in the table: if all event rates within a given row are zero, the row is excluded.

As is typical of most tbl_*() functions in {gtsummary}, the tbl_hierarchical() function also has the statistic, label, and digits arguments to further customize hierarchy tables. Valid statistics include n, N, and p. The label argument can specify the top-left labels for each variable as well as the label used for the “overall” row (if overall_row = TRUE).

See below the same example with further customization applied.

tbl <-
  tbl_hierarchical(
    data = ADAE_subset,
    variables = c(AEBODSYS, AEDECOD),
    by = TRTA,
    denominator = cards::ADSL,
    id = USUBJID,
    include = AEDECOD,
    digits = everything() ~ list(p = 0),
    overall_row = TRUE,
    label = list(..ard_hierarchical_overall.. = "Any Adverse Event")
  ) |>
  add_overall()

tbl
Body System or Organ Class
    Dictionary-Derived Term    Overall (N = 254)¹    Placebo (N = 86)¹    Xanomeline High Dose (N = 84)¹    Xanomeline Low Dose (N = 84)¹
Any Adverse Event 11 (4%) 3 (3%) 3 (4%) 5 (6%)
EAR AND LABYRINTH DISORDERS
    CERUMEN IMPACTION 1 (0%) 0 (0%) 0 (0%) 1 (1%)
    EAR PAIN 1 (0%) 1 (1%) 0 (0%) 0 (0%)
    TINNITUS 1 (0%) 0 (0%) 0 (0%) 1 (1%)
    VERTIGO 2 (1%) 0 (0%) 1 (1%) 1 (1%)
SKIN AND SUBCUTANEOUS TISSUE DISORDERS
    ACTINIC KERATOSIS 1 (0%) 0 (0%) 1 (1%) 0 (0%)
    ERYTHEMA 4 (2%) 1 (1%) 0 (0%) 3 (4%)
    PRURITUS 3 (1%) 1 (1%) 0 (0%) 2 (2%)
    PRURITUS GENERALISED 1 (0%) 0 (0%) 0 (0%) 1 (1%)
    RASH 1 (0%) 0 (0%) 1 (1%) 0 (0%)
1 n (%)
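The statistic argument mentioned earlier accepts a glue-style string built from the valid statistics n, N, and p. As a minimal sketch, assuming a recent {gtsummary} release and the {cards} example data, the call below displays the denominator alongside the count and percentage. The object names ADAE_small and tbl_stat are illustrative only.

```r
library(gtsummary)

# Keep a single SOC so the resulting table stays small
ADAE_small <- cards::ADAE |>
  dplyr::filter(AEBODSYS == "EAR AND LABYRINTH DISORDERS")

# Show "n / N (p%)" in every cell instead of the default "n (p%)"
tbl_stat <-
  tbl_hierarchical(
    data = ADAE_small,
    variables = c(AEBODSYS, AEDECOD),
    by = TRTA,
    denominator = cards::ADSL,
    id = USUBJID,
    statistic = everything() ~ "{n} / {N} ({p}%)"
  )

tbl_stat
```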

Event Count Tables: tbl_hierarchical_count()

Like tbl_hierarchical(), the tbl_hierarchical_count() function operates on nested data structures. Instead of event rates, it calculates event counts, and it has no id argument. This means that events from every row of data are counted, regardless of whether they occur multiple times per subject. If denominator is specified, it is used only to calculate the N values in the table header.

Only the n statistic is available when using this function.

ADAE_subset <- cards::ADAE |>
  dplyr::filter(AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS", "EAR AND LABYRINTH DISORDERS")) |>
  dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)

tbl <-
  tbl_hierarchical_count(
    data = ADAE_subset,
    variables = c(AEBODSYS, AEDECOD),
    by = TRTA,
    denominator = cards::ADSL,
    overall_row = TRUE
  )

tbl
Body System or Organ Class
    Dictionary-Derived Term    Placebo (N = 86)¹    Xanomeline High Dose (N = 84)¹    Xanomeline Low Dose (N = 84)¹
Total number of events 6 4 15
EAR AND LABYRINTH DISORDERS 2 1 3
    CERUMEN IMPACTION 0 0 1
    EAR PAIN 2 0 0
    TINNITUS 0 0 1
    VERTIGO 0 1 1
SKIN AND SUBCUTANEOUS TISSUE DISORDERS 4 3 12
    ACTINIC KERATOSIS 0 1 0
    ERYTHEMA 3 0 5
    PRURITUS 1 0 3
    PRURITUS GENERALISED 0 0 4
    RASH 0 2 0
1 n
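To see how event counts differ from subject counts, the two can be tabulated side by side with {dplyr}. This is only a cross-check sketch on the same subset of cards::ADAE, not how the table functions compute their results internally; the column names n_events and n_subjects are made up for the illustration.

```r
# Same subset as used for the tables in this section
ADAE_subset <- cards::ADAE |>
  dplyr::filter(AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS", "EAR AND LABYRINTH DISORDERS")) |>
  dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)

# Event counts: every row of the AE data contributes
event_counts <- dplyr::count(ADAE_subset, AEBODSYS, TRTA, name = "n_events")

# Subject counts: each subject contributes at most once per SOC and arm
subject_counts <- ADAE_subset |>
  dplyr::distinct(USUBJID, AEBODSYS, TRTA) |>
  dplyr::count(AEBODSYS, TRTA, name = "n_subjects")

# Side-by-side comparison; n_events can never be smaller than n_subjects
comparison <- dplyr::left_join(event_counts, subject_counts, by = c("AEBODSYS", "TRTA"))

comparison
```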

Tables of Event Rates by Highest Severity

In addition to the standard implementation of tbl_hierarchical() to create tables of event rates, this function can also be used to generate tables of event rates by highest severity.

This means that for each subject in data, only one event (the one with the highest severity) is reported in each section of the table. To do so, the innermost hierarchy variable (the last variable in variables) must be converted to a factor variable whose levels are ordered from lowest to highest severity. For example, consider the following table:

ADAE_subset <- cards::ADAE |>
  dplyr::filter(
    AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS", "EAR AND LABYRINTH DISORDERS", "CARDIAC DISORDERS")
  ) |>
  dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20) |>
  dplyr::mutate(AESEV = factor(AESEV, ordered = TRUE, levels = c("MILD", "MODERATE", "SEVERE")))

tbl_hierarchical(
  data = ADAE_subset,
  variables = c(AESOC, AESEV),
  id = USUBJID,
  denominator = cards::ADSL,
  by = TRTA,
  overall_row = TRUE,
  label = list(AESEV = "Highest Severity")
)
#>  Denominator set by "TRTA" column in `denominator` data
#> frame.
Primary System Organ Class
    Highest Severity    Placebo (N = 86)¹    Xanomeline High Dose (N = 84)¹    Xanomeline Low Dose (N = 84)¹
Number of patients with event 6 (7.0%) 8 (9.5%) 5 (6.0%)
CARDIAC DISORDERS 4 (4.7%) 5 (6.0%) 1 (1.2%)
    MILD 2 (2.3%) 4 (4.8%) 0 (0%)
    MODERATE 1 (1.2%) 1 (1.2%) 1 (1.2%)
    SEVERE 1 (1.2%) 0 (0%) 0 (0%)
EAR AND LABYRINTH DISORDERS 1 (1.2%) 1 (1.2%) 2 (2.4%)
    MILD 1 (1.2%) 0 (0%) 0 (0%)
    MODERATE 0 (0%) 1 (1.2%) 2 (2.4%)
SKIN AND SUBCUTANEOUS TISSUE DISORDERS 2 (2.3%) 2 (2.4%) 3 (3.6%)
    MILD 1 (1.2%) 2 (2.4%) 1 (1.2%)
    MODERATE 1 (1.2%) 0 (0%) 2 (2.4%)
1 n (%)

Since each subject contributes exactly one event to each section in this type of table, the numbers within each hierarchy section always add up to the number in the preceding summary row. For example, in the table above the number of Placebo subjects in the CARDIAC DISORDERS summary row is 4, and the Placebo counts for the severity levels listed below it (2 + 1 + 1) also add up to 4.
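The highest-severity rule can be illustrated with a few lines of base R. The toy data frame below is invented, and the sketch ignores the SOC and treatment groupings for brevity; it only shows why the severity variable needs ordered levels, and it is not the package's own implementation.

```r
# Toy data (hypothetical): subject "01" has three events of differing severity
toy_ae <- data.frame(
  USUBJID = c("01", "01", "01", "02", "03", "03"),
  AESEV   = c("MILD", "SEVERE", "MODERATE", "MILD", "MODERATE", "MILD")
)

# Levels ordered from lowest to highest severity
toy_ae$AESEV <- factor(
  toy_ae$AESEV,
  levels = c("MILD", "MODERATE", "SEVERE"),
  ordered = TRUE
)

# Sort so that each subject's most severe event comes first ...
sorted <- toy_ae[order(toy_ae$USUBJID, -as.integer(toy_ae$AESEV)), ]

# ... then keep only the first row per subject
highest <- sorted[!duplicated(sorted$USUBJID), ]

# Each subject now appears exactly once, so the counts sum to the number of subjects
table(highest$AESEV)
```

Because every subject appears exactly once in highest, the severity counts necessarily sum to the number of subjects with an event, which is the additivity property described above.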

ARD-First Hierarchical Tables: tbl_ard_hierarchical()

In addition to the above two functions which use a data frame-first approach to creating hierarchy tables, an ARD-first approach can be taken using the gtsummary::tbl_ard_hierarchical() function.

This function takes a hierarchical ARD object of class "card" and converts it to a hierarchical table. Hierarchical ARDs can be constructed using the cards::ard_stack_hierarchical() (for event rates) and cards::ard_stack_hierarchical_count() (for event counts) functions.

For example:

# Build the ARD
ard <-
  cards::ard_stack_hierarchical(
    data = ADAE_subset,
    variables = c(AESOC, AETERM),
    by = TRTA,
    denominator = cards::ADSL,
    id = USUBJID
  )

# Build the table from the ARD
tbl <-
  tbl_ard_hierarchical(
    cards = ard,
    variables = c(AESOC, AETERM),
    by = TRTA,
    label = list(AESOC = "Body System or Organ Class", AETERM = "Dictionary-Derived Term")
  )

tbl
Body System or Organ Class
    Dictionary-Derived Term    Placebo (N = 86)¹    Xanomeline High Dose (N = 84)¹    Xanomeline Low Dose (N = 84)¹
CARDIAC DISORDERS 4 (4.7%) 5 (6.0%) 1 (1.2%)
    ATRIAL FIBRILLATION 0 (0.0%) 1 (1.2%) 0 (0.0%)
    ATRIAL FLUTTER 0 (0.0%) 1 (1.2%) 0 (0.0%)
    ATRIOVENTRICULAR BLOCK SECOND DEGREE 2 (2.3%) 1 (1.2%) 0 (0.0%)
    BUNDLE BRANCH BLOCK LEFT 1 (1.2%) 0 (0.0%) 0 (0.0%)
    CARDIAC DISORDER 0 (0.0%) 1 (1.2%) 0 (0.0%)
    MYOCARDIAL INFARCTION 1 (1.2%) 1 (1.2%) 0 (0.0%)
    PALPITATIONS 0 (0.0%) 0 (0.0%) 1 (1.2%)
    SINUS BRADYCARDIA 0 (0.0%) 2 (2.4%) 0 (0.0%)
    TACHYCARDIA 1 (1.2%) 0 (0.0%) 0 (0.0%)
    WOLFF-PARKINSON-WHITE SYNDROME 0 (0.0%) 0 (0.0%) 1 (1.2%)
EAR AND LABYRINTH DISORDERS 1 (1.2%) 1 (1.2%) 2 (2.4%)
    CERUMEN IMPACTION 0 (0.0%) 0 (0.0%) 1 (1.2%)
    EAR PAIN 1 (1.2%) 0 (0.0%) 0 (0.0%)
    TINNITUS 0 (0.0%) 0 (0.0%) 1 (1.2%)
    VERTIGO 0 (0.0%) 1 (1.2%) 1 (1.2%)
SKIN AND SUBCUTANEOUS TISSUE DISORDERS 2 (2.3%) 2 (2.4%) 3 (3.6%)
    ACTINIC KERATOSIS 0 (0.0%) 1 (1.2%) 0 (0.0%)
    ERYTHEMA 1 (1.2%) 0 (0.0%) 3 (3.6%)
    PRURITUS 1 (1.2%) 0 (0.0%) 2 (2.4%)
    PRURITUS GENERALISED 0 (0.0%) 0 (0.0%) 1 (1.2%)
    RASH 0 (0.0%) 1 (1.2%) 0 (0.0%)
1 n (%)
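The same ARD-first workflow should apply to event counts. The sketch below assumes that tbl_ard_hierarchical() accepts a statistic argument and that "{n}" is the appropriate template for an ARD built with cards::ard_stack_hierarchical_count(); check the function documentation for your installed version. The names ard_count and tbl_count are illustrative.

```r
library(gtsummary)

# Subset of the example AE data used throughout this section
ADAE_subset <- cards::ADAE |>
  dplyr::filter(
    AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS", "EAR AND LABYRINTH DISORDERS", "CARDIAC DISORDERS")
  ) |>
  dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)

# Build an ARD of event counts (no id argument is needed for counts)
ard_count <-
  cards::ard_stack_hierarchical_count(
    data = ADAE_subset,
    variables = c(AESOC, AETERM),
    by = TRTA,
    denominator = cards::ADSL
  )

# Build the table from the ARD, displaying counts only
tbl_count <-
  tbl_ard_hierarchical(
    cards = ard_count,
    variables = c(AESOC, AETERM),
    by = TRTA,
    statistic = ~"{n}"
  )

tbl_count
```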

Combining Hierarchical and Other Tables

We can stack hierarchical table results with results from another table if additional rows are needed. For example, if we wanted to add an “Any AE” row which calculates the number of unique subjects for which any AE was observed to the previous hierarchical table, we could do so as follows using gtsummary::tbl_stack():

# Calculate subjects with any AE across TRTA and create a single-line table for this result
any_ae <-
  ADAE_subset |>
  dplyr::distinct(USUBJID, TRTA) |>
  dplyr::mutate(AE_ANY = TRUE) |>
  cards::ard_tabulate_value(
    by = TRTA,
    variables = AE_ANY,
    denominator = cards::ADSL
  ) |>
  tbl_ard_summary(by = TRTA, label = list(AE_ANY = "Any AE"))

any_ae
Characteristic    Placebo¹    Xanomeline High Dose¹    Xanomeline Low Dose¹
Any AE 6 (7.0%) 8 (9.5%) 5 (6.0%)
1 n (%)

The “Any AE” table header differs from the previous table header as it does not include labels or the column N values, so we specify attr_order = 2 in the following tbl_stack() call to keep the table header from the second table in the stack and not the first (and quiet = TRUE to silence the message about differing headers).

# Stack tables so that the "Any AE" row is added at the top of the previous table
tbl_stack(
  tbls = list(any_ae, tbl),
  attr_order = 2,
  quiet = TRUE
)
Body System or Organ Class
    Dictionary-Derived Term    Placebo (N = 86)¹    Xanomeline High Dose (N = 84)¹    Xanomeline Low Dose (N = 84)¹
Any AE 6 (7.0%) 8 (9.5%) 5 (6.0%)
CARDIAC DISORDERS 4 (4.7%) 5 (6.0%) 1 (1.2%)
    ATRIAL FIBRILLATION 0 (0.0%) 1 (1.2%) 0 (0.0%)
    ATRIAL FLUTTER 0 (0.0%) 1 (1.2%) 0 (0.0%)
    ATRIOVENTRICULAR BLOCK SECOND DEGREE 2 (2.3%) 1 (1.2%) 0 (0.0%)
    BUNDLE BRANCH BLOCK LEFT 1 (1.2%) 0 (0.0%) 0 (0.0%)
    CARDIAC DISORDER 0 (0.0%) 1 (1.2%) 0 (0.0%)
    MYOCARDIAL INFARCTION 1 (1.2%) 1 (1.2%) 0 (0.0%)
    PALPITATIONS 0 (0.0%) 0 (0.0%) 1 (1.2%)
    SINUS BRADYCARDIA 0 (0.0%) 2 (2.4%) 0 (0.0%)
    TACHYCARDIA 1 (1.2%) 0 (0.0%) 0 (0.0%)
    WOLFF-PARKINSON-WHITE SYNDROME 0 (0.0%) 0 (0.0%) 1 (1.2%)
EAR AND LABYRINTH DISORDERS 1 (1.2%) 1 (1.2%) 2 (2.4%)
    CERUMEN IMPACTION 0 (0.0%) 0 (0.0%) 1 (1.2%)
    EAR PAIN 1 (1.2%) 0 (0.0%) 0 (0.0%)
    TINNITUS 0 (0.0%) 0 (0.0%) 1 (1.2%)
    VERTIGO 0 (0.0%) 1 (1.2%) 1 (1.2%)
SKIN AND SUBCUTANEOUS TISSUE DISORDERS 2 (2.3%) 2 (2.4%) 3 (3.6%)
    ACTINIC KERATOSIS 0 (0.0%) 1 (1.2%) 0 (0.0%)
    ERYTHEMA 1 (1.2%) 0 (0.0%) 3 (3.6%)
    PRURITUS 1 (1.2%) 0 (0.0%) 2 (2.4%)
    PRURITUS GENERALISED 0 (0.0%) 0 (0.0%) 1 (1.2%)
    RASH 0 (0.0%) 1 (1.2%) 0 (0.0%)
1 n (%)

Sorting & Filtering Hierarchical Tables

Hierarchical tables can be further customized after creation by sorting and filtering them using the gtsummary::sort_hierarchical() and gtsummary::filter_hierarchical() functions, respectively.

By default, hierarchical tables are sorted alphanumerically at each level of the hierarchy.

Applying sort_hierarchical() to an existing hierarchical table can change the sort order at each level of the hierarchy to “descending”, which sorts rows by the descending sum of their event rates or counts so that the most prevalent events are listed first.

The sort argument to sort_hierarchical() can accept either:

  • a string specifying the sorting criteria ("descending" or "alphanumeric") to apply at all levels of the hierarchy, or
  • a named list, a list of formulas, or a single formula specifying the sorting criteria to apply at specific levels of the hierarchy.

By default sort_hierarchical() changes the sorting criteria to “descending” at all levels of the hierarchy.
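As a sketch of the list form of sort, the call below keeps SOCs in alphanumeric order while sorting AE terms within each SOC by descending frequency. It assumes a {gtsummary} version recent enough to accept per-variable formulas in sort, as described above; the object names are illustrative.

```r
library(gtsummary)

ADAE_subset <- cards::ADAE |>
  dplyr::filter(AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS", "EAR AND LABYRINTH DISORDERS")) |>
  dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)

tbl_to_sort <-
  tbl_hierarchical(
    data = ADAE_subset,
    variables = c(AEBODSYS, AEDECOD),
    by = TRTA,
    denominator = cards::ADSL,
    id = USUBJID
  )

# Different sorting criteria at each level of the hierarchy
tbl_sorted <-
  tbl_to_sort |>
  sort_hierarchical(
    sort = list(AEBODSYS ~ "alphanumeric", AEDECOD ~ "descending")
  )

tbl_sorted
```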

In addition to sorting, hierarchical tables can also be filtered using filter_hierarchical(), which uses a given expression to filter out rows of the table.

For examples of possible filters please see the documentation for the filter_hierarchical() function.

Filtering can be applied at any hierarchy level, but defaults to the innermost level of the hierarchy with all summary rows from corresponding outer hierarchy levels kept.

To change the variable level of the hierarchy at which filtering is applied you can specify the var argument to filter_hierarchical().

If all rows of a section are filtered out then the summary row for that section is also removed unless keep_empty = TRUE is specified.

If the table contains an overall column then this column is ignored when filtering.
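The effect of keep_empty can be sketched as follows. The filter expression sum(n) > 1 keeps innermost rows whose counts across the treatment columns total more than one; with keep_empty = TRUE, a SOC summary row should remain even if every AE beneath it is filtered out. Object names are illustrative, and the example assumes a recent {gtsummary} release.

```r
library(gtsummary)

ADAE_subset <- cards::ADAE |>
  dplyr::filter(AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS", "EAR AND LABYRINTH DISORDERS")) |>
  dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)

tbl_unfiltered <-
  tbl_hierarchical(
    data = ADAE_subset,
    variables = c(AEBODSYS, AEDECOD),
    by = TRTA,
    denominator = cards::ADSL,
    id = USUBJID
  )

# Keep AE rows seen in more than one subject across arms; retain emptied SOC rows
tbl_filtered <-
  tbl_unfiltered |>
  filter_hierarchical(filter = sum(n) > 1, keep_empty = TRUE)

tbl_filtered
```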

See the following example, where a hierarchy table is sorted in descending order and filtered so that only rows with an overall count greater than one (n_overall > 1) are kept.

tbl <-
  tbl_hierarchical(
    data = ADAE_subset,
    variables = c(AEBODSYS, AEDECOD),
    by = TRTA,
    denominator = cards::ADSL,
    id = USUBJID,
    overall_row = TRUE
  ) |>
  add_overall()

tbl |>
  sort_hierarchical(everything() ~ "descending") |>
  filter_hierarchical(filter = n_overall > 1)
Body System or Organ Class
    Dictionary-Derived Term    Overall (N = 254)¹    Placebo (N = 86)¹    Xanomeline High Dose (N = 84)¹    Xanomeline Low Dose (N = 84)¹
Number of patients with event 19 (7.5%) 6 (7.0%) 8 (9.5%) 5 (6.0%)
CARDIAC DISORDERS 10 (3.9%) 4 (4.7%) 5 (6.0%) 1 (1.2%)
    ATRIOVENTRICULAR BLOCK SECOND DEGREE 3 (1.2%) 2 (2.3%) 1 (1.2%) 0 (0%)
    MYOCARDIAL INFARCTION 2 (0.8%) 1 (1.2%) 1 (1.2%) 0 (0%)
    SINUS BRADYCARDIA 2 (0.8%) 0 (0%) 2 (2.4%) 0 (0%)
SKIN AND SUBCUTANEOUS TISSUE DISORDERS 7 (2.8%) 2 (2.3%) 2 (2.4%) 3 (3.6%)
    ERYTHEMA 4 (1.6%) 1 (1.2%) 0 (0%) 3 (3.6%)
    PRURITUS 3 (1.2%) 1 (1.2%) 0 (0%) 2 (2.4%)
EAR AND LABYRINTH DISORDERS 4 (1.6%) 1 (1.2%) 1 (1.2%) 2 (2.4%)
    VERTIGO 2 (0.8%) 0 (0%) 1 (1.2%) 1 (1.2%)
1 n (%)

Tables with Stratified Nesting: tbl_strata_nested_stack()

Oftentimes in clinical trial reporting we want a similar nested structure to hierarchical tables but don’t want to calculate event rates or counts.

For these calculations, we use the gtsummary::tbl_strata_nested_stack() function so that instead of treating the variables provided as a hierarchy they are used as nested stratifying variables. Stratified sections are then stacked to produce the full table.

This function is similar to tbl_strata() but nests and indents each stratified section. Take for example the standard laboratory results table which reports summary statistics of the AVAL variable stratified by the PARAM and AVISIT variables:

ADLB_subset <- pharmaverseadam::adlb |>
  dplyr::filter(
    PARAMCD %in% c("ALB", "ALKPH"),
    AVISIT %in% c("Baseline", "Week 2", "Week 4")
  )

tbl_strata_nested_stack(
  ADLB_subset,
  strata = c(PARAM, AVISIT),
  .tbl_fun = ~ .x |>
    crane::tbl_roche_summary(
      include = AVAL,
      by = TRT01A,
      nonmissing = "always",
      type = list(AVAL = "continuous2"),
      statistic = list(AVAL = c("{mean} ({sd})", "{median}", "{min} - {max}"))
    ) |>
    modify_header(all_stat_cols() ~ "**{level}**")
)
                              Placebo          Xanomeline High Dose   Xanomeline Low Dose
Albumin (g/L)
    Baseline
        Analysis Value
            n                 86               72                     94
            Mean (SD)         39.84 (2.81)     40.40 (2.74)           39.74 (2.67)
            Median            40.00            40.00                  40.00
            Min - Max         32.00 - 46.00    35.00 - 49.00          32.00 - 46.00
    Week 2
        Analysis Value
            n                 83               70                     88
            Mean (SD)         38.9 (3.1)       39.0 (2.8)             38.7 (3.1)
            Median            39.0             39.0                   39.0
            Min - Max         31.0 - 46.0      33.0 - 46.0            31.0 - 46.0
    Week 4
        Analysis Value
            n                 79               72                     72
            Mean (SD)         38.8 (3.3)       39.1 (3.0)             38.6 (2.8)
            Median            39.0             39.0                   38.5
            Min - Max         29.0 - 47.0      30.0 - 48.0            31.0 - 45.0
Alkaline Phosphatase (U/L)
    Baseline
        Analysis Value
            n                 86               72                     92
            Mean (SD)         78 (58)          72 (41)                72 (20)
            Median            70               66                     71
            Min - Max         28 - 565         33 - 386               31 - 155
    Week 2
        Analysis Value
            n                 84               70                     88
            Mean (SD)         78 (70)          72 (42)                73 (22)
            Median            67               66                     69
            Min - Max         32 - 672         34 - 390               28 - 146
    Week 4
        Analysis Value
            n                 82               72                     72
            Mean (SD)         78 (69)          71 (41)                73 (23)
            Median            68               65                     71
            Min - Max         31 - 657         34 - 382               26 - 166
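For readers without the {crane} and {pharmaverseadam} packages installed, a smaller sketch of the same idea can be built with only {gtsummary} and its bundled trial dataset. Here tbl_summary() stands in for crane::tbl_roche_summary(), and a single stratifying variable (trt) is used instead of two, so this is a simplified variant rather than a reproduction of the laboratory table above.

```r
library(gtsummary)

# One indented, stacked section per level of `trt`
tbl_nested <-
  tbl_strata_nested_stack(
    trial,
    strata = trt,
    .tbl_fun = ~ .x |>
      tbl_summary(include = c(age, grade), missing = "no") |>
      modify_header(all_stat_cols() ~ "**Summary Statistics**")
  )

tbl_nested
```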