Skip to contents

Compute Analysis Results Data (ARD) for dichotomous summary statistics.

Usage

ard_dichotomous(data, ...)

# S3 method for class 'data.frame'
ard_dichotomous(
  data,
  variables,
  by = dplyr::group_vars(data),
  strata = NULL,
  value = maximum_variable_value(data[variables]),
  statistic = everything() ~ c("n", "N", "p"),
  denominator = NULL,
  fmt_fn = NULL,
  stat_label = everything() ~ default_stat_labels(),
  ...
)

Arguments

data

(data.frame)
a data frame

...

Arguments passed to methods.

variables

(tidy-select)
columns to include in summaries. Default is everything().

by, strata

(tidy-select)
columns to use for grouping or stratifying the table output. Arguments are similar, but with an important distinction:

by: results are tabulated by all combinations of the columns specified, including unobserved combinations and unobserved factor levels.

strata: results are tabulated by all observed combinations of the columns specified.

Arguments may be used in conjunction with one another.

value

(named list)
named list of dichotomous values to tabulate. Default is maximum_variable_value(data), which returns the largest/last value after a sort.

statistic

(formula-list-selector)
a named list, a list of formulas, or a single formula where the list element one or more of c("n", "N", "p", "n_cum", "p_cum") (on the RHS of a formula).

denominator

(string, data.frame, integer)
Specify this argument to change the denominator, e.g. the "N" statistic. Default is 'column'. See below for details.

fmt_fn

(formula-list-selector)
a named list, a list of formulas, or a single formula where the list element is a named list of functions (or the RHS of a formula), e.g. list(mpg = list(mean = \(x) round(x, digits = 2) |> as.character())).

stat_label

(formula-list-selector)
a named list, a list of formulas, or a single formula where the list element is either a named list or a list of formulas defining the statistic labels, e.g. everything() ~ list(n = "n", p = "pct") or everything() ~ list(n ~ "n", p ~ "pct").

Value

an ARD data frame of class 'card'

Denominators

By default, the ard_categorical() function returns the statistics "n", "N", and "p", where little "n" are the counts for the variable levels, and big "N" is the number of non-missing observations. The default calculation for the percentage is merely p = n/N.

However, it is sometimes necessary to provide a different "N" to use as the denominator in this calculation. For example, in a calculation of the rates of various observed adverse events, you may need to update the denominator to the number of enrolled subjects.

In such cases, use the denominator argument to specify a new definition of "N", and subsequently "p". The argument expects one of the following inputs:

  • a string: one of "column", "row", or "cell".

    • "column", the default, returns percentages where the sum is equal to one within the variable after the data frame has been subset with by/strata.

    • "row" gives 'row' percentages where by/strata columns are the 'top' of a cross table, and the variables are the rows. This is well-defined for a single by or strata variable, and care must be taken when there are more to ensure the the results are as you expect.

    • "cell" gives percentages where the denominator is the number of non-missing rows in the source data frame.

  • a data frame. Any columns in the data frame that overlap with the by/strata columns will be used to calculate the new "N".

  • an integer. This single integer will be used as the new "N"

  • a structured data frame. The data frame will include columns from by/strata. The last column must be named "...ard_N...". The integers in this column will be used as the updated "N" in the calculations.

Examples

ard_dichotomous(mtcars, by = vs, variables = c(cyl, am), value = list(cyl = 4))
#> {cards} data frame: 12 x 11
#>    group1 group1_level variable variable_level stat_name stat_label  stat
#> 1      vs            0      cyl              4         n          n     1
#> 2      vs            0      cyl              4         N          N    18
#> 3      vs            0      cyl              4         p          % 0.056
#> 4      vs            0       am              1         n          n     6
#> 5      vs            0       am              1         N          N    18
#> 6      vs            0       am              1         p          % 0.333
#> 7      vs            1      cyl              4         n          n    10
#> 8      vs            1      cyl              4         N          N    14
#> 9      vs            1      cyl              4         p          % 0.714
#> 10     vs            1       am              1         n          n     7
#> 11     vs            1       am              1         N          N    14
#> 12     vs            1       am              1         p          %   0.5
#>  4 more variables: context, fmt_fn, warning, error

mtcars |>
  dplyr::group_by(vs) |>
  ard_dichotomous(
    variables = c(cyl, am),
    value = list(cyl = 4),
    statistic = ~"p"
  )
#> {cards} data frame: 4 x 11
#>   group1 group1_level variable variable_level stat_name stat_label  stat
#> 1     vs            0      cyl              4         p          % 0.056
#> 2     vs            0       am              1         p          % 0.333
#> 3     vs            1      cyl              4         p          % 0.714
#> 4     vs            1       am              1         p          %   0.5
#>  4 more variables: context, fmt_fn, warning, error