Summarize Variables — summarize

The S3 generic function s_summary() implements summaries for the different classes of x. It is used as the Statistics Function in combination with the Analyze Function summarize_vars().
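To illustrate how one generic can serve several classes of x, here is a minimal base-R sketch of S3 dispatch. The generic toy_summary() and its methods are hypothetical names invented for this illustration; they only mimic the general shape of the methods listed under Usage and are not the package's implementation.

```r
# Hypothetical toy generic illustrating S3 dispatch; this is not tern's s_summary().
toy_summary <- function(x, ...) UseMethod("toy_summary")

# Numeric vectors get descriptive statistics.
toy_summary.numeric <- function(x, ...) {
  list(n = length(x), mean = mean(x))
}

# Factors get one count per level.
toy_summary.factor <- function(x, ...) {
  counts <- vapply(levels(x), function(lv) sum(x == lv), integer(1))
  list(n = length(x), count = as.list(counts))
}

# Characters are converted to factor and forwarded to the factor method.
toy_summary.character <- function(x, ...) {
  toy_summary(factor(x), ...)
}

# Logicals report how many values are TRUE.
toy_summary.logical <- function(x, ...) {
  list(n = length(x), count = sum(x))
}

str(toy_summary(c(1.5, 2.5)))
str(toy_summary(c("a", "a", "b")))
str(toy_summary(c(TRUE, FALSE, TRUE)))
```

The class of x alone decides which method runs, which is why a single Analyze Function can accept columns of different types.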

Usage

s_summary(x, na.rm = TRUE, denom, .N_row, .N_col, na_level, .var, ...)

# S3 method for numeric
s_summary(
  x,
  na.rm = TRUE,
  denom,
  .N_row,
  .N_col,
  na_level,
  .var,
  control = control_summarize_vars(),
  ...
)

# S3 method for factor
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  na_level = "<Missing>",
  ...
)

# S3 method for character
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  na_level = "<Missing>",
  .var,
  verbose = TRUE,
  ...
)

# S3 method for logical
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  ...
)

a_summary(x, ..., .N_row, .N_col, .var)

# S3 method for numeric
a_summary(
  x,
  na.rm = TRUE,
  denom,
  .N_row,
  .N_col,
  na_level,
  .var,
  control = control_summarize_vars(),
  ...
)

# S3 method for factor
a_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  na_level = "<Missing>",
  ...
)

# S3 method for character
a_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  na_level = "<Missing>",
  .var,
  verbose = TRUE,
  ...
)

# S3 method for logical
a_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_row", "N_col"),
  .N_row,
  .N_col,
  ...
)

create_afun_summary(.stats, .formats, .labels, .indent_mods)

summarize_vars(
  lyt,
  vars,
  var_labels = vars,
  nested = TRUE,
  ...,
  show_labels = "default",
  table_names = vars,
  .stats = c("n", "mean_sd", "median", "range", "count_fraction"),
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL
)

Arguments

x

(numeric)
vector of numbers we want to analyze.

na.rm

(flag)
whether NA values should be removed from x prior to analysis.

denom

(string)
choice of denominator for proportion:
can be n (number of values in this row and column intersection), N_row (total number of values in this row across columns), or N_col (total number of values in this column across rows).

.N_row

(count)
row-wise N (row group count) for the group of observations being analyzed (i.e. with no column-based subsetting) that is passed by rtables.

.N_col

(count)
column-wise N (column count) for the full column that is passed by rtables.

na_level

(string)
used to replace all NA or empty values in factors.

.var

(string)
single variable name that is passed by rtables when requested by a statistics function.

...

arguments passed to s_summary().

control

a (list) of parameters for descriptive statistics details, specified by using
the helper function control_summarize_vars(). Some possible parameter options are:

conf_level: (proportion)
confidence level of the interval for mean and median.
quantiles: numeric vector of length two to specify the quantiles.
quantile_type: (numeric)
number between 1 and 9 selecting the quantile algorithm to be used.
See more about type in stats::quantile().
test_mean: (numeric)
value to test the mean against under the null hypothesis when calculating the p-value.
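As a rough base-R illustration of what these options influence, the sketch below computes a confidence interval for the mean, two quantiles and a p-value by hand. It does not call control_summarize_vars(); the variable names are stand-ins, and the t-based interval and quantile type 2 are assumptions chosen because they reproduce the values shown in the Examples for x = c(1, 2).

```r
# Hypothetical stand-ins for the control options; not the tern helper itself.
x <- c(1, 2)
conf_level <- 0.95
quantiles <- c(0.25, 0.75)
quantile_type <- 2
test_mean <- 0

n <- length(x)
se <- stats::sd(x) / sqrt(n)

# t-based confidence interval for the mean at the chosen conf_level (assumption).
t_crit <- stats::qt(1 - (1 - conf_level) / 2, df = n - 1)
mean_ci <- mean(x) + c(-1, 1) * t_crit * se

# Two sample quantiles using the selected algorithm (see ?stats::quantile).
q <- stats::quantile(x, probs = quantiles, type = quantile_type, names = FALSE)

# Two-sided p-value for H0: mean = test_mean.
p_value <- stats::t.test(x, mu = test_mean)$p.value

print(mean_ci)
print(q)
print(p_value)
```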

verbose

(flag)
whether warnings and messages should be printed (defaults to TRUE). It is mainly used to print out information about factor casting.

.stats

(character)
statistics to select for the table.

.formats

(named character or list)
formats for the statistics.

.labels

(named character)
labels for the statistics (without indent).

.indent_mods

(named integer)
indent modifiers for the labels.

lyt

(layout)
input layout where analyses will be added to.

vars

(character)
variable names for the primary analysis variable to be iterated over.

var_labels

(character)
labels for the variables in vars.

nested

boolean. Should this layout instruction be applied within the existing layout structure if possible (TRUE, the default) or as a new top-level element (FALSE). Ignored if it would nest a split underneath analyses, which is not allowed.

show_labels

label visibility: one of "default", "visible" and "hidden".

table_names

(character)
this can be customized in case the same vars are analyzed multiple times, to avoid warnings from rtables.

Value

If x is of class numeric, returns a list with named items:

n: the length() of x.
sum: the sum() of x.
mean: the mean() of x.
sd: the stats::sd() of x.
se: the standard error of the mean of x, i.e.: (sd()/sqrt(length())).
mean_sd: the mean() and stats::sd() of x.
mean_se: the mean() of x and its standard error (see above).
mean_ci: the CI for the mean of x (from stat_mean_ci()).
mean_sei: the SE interval for the mean of x, i.e.: (mean() -/+ stats::sd()/sqrt(length())).
mean_sdi: the SD interval for the mean of x, i.e.: (mean() -/+ stats::sd()).
mean_pval: the two-sided p-value of the mean of x (from stat_mean_pval()).
median: the stats::median() of x.
mad: the median absolute deviation of x, i.e.: (stats::median() of xc, where xc = x - stats::median()).
median_ci: the CI for the median of x (from stat_median_ci()).
quantiles: two sample quantiles of x (from stats::quantile()).
iqr: the stats::IQR() of x.
range: the range_noinf() of x.
min: the min() of x.
max: the max() of x.
cv: the coefficient of variation of x, i.e.: (sd()/mean() * 100).
geom_mean: the geometric mean of x, i.e.: (exp(mean(log(x)))).
geom_cv: the geometric coefficient of variation of x, i.e.: (sqrt(exp(sd(log(x))^2) - 1)*100).
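
The formulas in this list can be checked directly in base R. The following sketch recomputes a few of the items for x = c(1, 2), the non-missing values used in the Examples; it only restates the expressions given above and is not the package's code.

```r
x <- c(1, 2)
n <- length(x)

# Standard error of the mean: sd / sqrt(n).
se <- stats::sd(x) / sqrt(n)

# SE and SD intervals around the mean.
mean_sei <- mean(x) + c(-1, 1) * se
mean_sdi <- mean(x) + c(-1, 1) * stats::sd(x)

# Coefficient of variation, in percent.
cv <- stats::sd(x) / mean(x) * 100

# Geometric mean and geometric coefficient of variation.
geom_mean <- exp(mean(log(x)))
geom_cv <- sqrt(exp(stats::sd(log(x))^2) - 1) * 100

print(c(se = se, cv = cv, geom_mean = geom_mean, geom_cv = geom_cv))
print(mean_sei)
print(mean_sdi)
```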

If x is of class factor or converted from character, returns a list with named items:

n: the length() of x.
count: a list with the number of cases for each level of the factor x.
count_fraction: similar to count but also includes the proportion of cases for each level of the factor x relative to the denominator, or 0 if the denominator is zero.
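
The relationship between count, count_fraction and the denominator can be sketched in base R as follows. The helper count_fraction() is a hypothetical name used only for this illustration, and zero denominators are deliberately not handled here.

```r
# Hypothetical helper: pair each level count with its share of the denominator.
count_fraction <- function(x, denominator) {
  counts <- vapply(levels(x), function(lv) sum(x == lv), integer(1))
  lapply(counts, function(k) c(k, k / denominator))
}

x <- factor(c("a", "a", "b", "c", "a"))

# denom = "n": the number of values in x itself.
str(count_fraction(x, denominator = length(x)))

# denom = "N_row" with .N_row = 10, and denom = "N_col" with .N_col = 20.
str(count_fraction(x, denominator = 10L))
str(count_fraction(x, denominator = 20L))
```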

If x is of class logical, returns a list with named items:

n: the length() of x (possibly after removing NAs).
count: count of TRUE in x.
count_fraction: count and proportion of TRUE in x relative to the denominator, or NA if the denominator is zero. Note that NAs in x are never counted and never lead to NA here.

Functions

s_summary(): s_summary is an S3 generic function to produce an object description.
s_summary(numeric): Method for numeric class. Note that, if x is an empty vector, NA is returned. This is intended, so that rcell content can be returned in rtables when the intersection of a column and a row delimits an empty data selection. Also, when the mean function is applied to an empty vector, NA is returned instead of NaN, the latter being standard behavior in R.
s_summary(factor): Method for factor class. Note that, if x is an empty factor, then still a list is returned for counts with one element per factor level. If there are no levels in x, the function fails. If x contains NA, it is expected that NA have been conveyed to na_level appropriately beforehand with df_explicit_na() or explicit_na().
s_summary(character): Method for character class. This makes an automatic conversion to factor (with a warning) and then forwards to the method for factors.
s_summary(logical): Method for logical class.
a_summary(): S3 generic Formatted Analysis function to produce an object description. It is used as afun in rtables::analyze().
a_summary(numeric): Formatted Analysis function method for numeric.
a_summary(factor): Method for factor.
a_summary(character): Formatted Analysis function method for character.
a_summary(logical): Formatted Analysis function method for logical.
create_afun_summary(): Constructor function which creates a combined Formatted Analysis function for use in layout creating functions summarize_vars() and summarize_colvars().
summarize_vars(): Analyze Function to add a descriptive analyze layer to rtables pipelines. The analysis is applied to a vector and returns the summary in rcells. The ellipsis (...) conveys arguments to s_summary(), for instance na.rm = FALSE if missing data should be accounted for. When factor variables contain NA, it is expected that NA have been conveyed to na_level appropriately beforehand with df_explicit_na().
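
The note on the numeric method mentions that the mean of an empty vector is reported as NA rather than NaN. A minimal base-R sketch of that behavior is given below; mean_or_na() is a hypothetical helper written for this illustration and is not the package's internal code.

```r
# Hypothetical helper; not part of tern.
mean_or_na <- function(x, na.rm = TRUE) {
  # Optionally drop missing values first, as na.rm = TRUE would.
  if (na.rm) x <- x[!is.na(x)]
  # Return NA for an empty selection instead of base R's NaN.
  if (length(x) == 0) NA_real_ else mean(x)
}

mean(numeric())                       # base R gives NaN
mean_or_na(numeric())                 # NA instead of NaN
mean_or_na(c(NA_real_, 1))            # missing value dropped before averaging
mean_or_na(c(NA_real_, 1), na.rm = FALSE)
```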

Note

Automatic conversion of character to factor does not guarantee that the table can be generated correctly. In particular, this is very likely to fail for sparse tables. It is therefore better to always pre-process the dataset such that factors are manually created from character variables before passing the dataset to rtables::build_table().
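
One possible way to do this pre-processing in base R is sketched below. The data frame and its column names are made up for the illustration; the point is only that levels are fixed once on the full dataset, so that sparse subsets still carry every level.

```r
# Made-up example data; column names are hypothetical.
df <- data.frame(
  arm = c("A", "A", "B"),
  sex = c("F", NA, "M"),
  age = c(34, 51, 47),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor with explicit levels.
char_cols <- vapply(df, is.character, logical(1))
df[char_cols] <- lapply(
  df[char_cols],
  function(col) factor(col, levels = sort(unique(col)))
)

# An empty subset keeps both levels, so each level still gets a (zero) count.
empty <- df$arm[df$age > 100]
print(levels(empty))
print(table(empty))
```

Missing values stay NA after this step; making them an explicit level is a separate step, for which the page refers to df_explicit_na().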

Since a_summary() is generic and we want customization of the formatting arguments via rtables::make_afun(), we need to create another temporary generic function, with corresponding customized methods. Then in order for the methods to be found, we need to wrap them in a combined afun. Since this is required by two layout creating functions (and possibly others in the future), we provide a constructor that does this: create_afun_summary().

Formatting arguments

These additional formatting arguments can be passed to the layout creating function:

.stats: (character)
names of the statistics to use
.indent_mods: (integer)
named vector of indent modifiers for the labels
.formats: (character or list)
named vector of formats for the statistics
.labels: (character)
named vector of labels for the statistics (without indent)

Examples

# `s_summary.numeric`

## Basic usage: empty numeric returns NA-filled items.
s_summary(numeric())
#> $n
#> n 
#> 0 
#> 
#> $sum
#> sum 
#>  NA 
#> 
#> $mean
#> mean 
#>   NA 
#> 
#> $sd
#> sd 
#> NA 
#> 
#> $se
#> se 
#> NA 
#> 
#> $mean_sd
#> mean   sd 
#>   NA   NA 
#> 
#> $mean_se
#> mean   se 
#>   NA   NA 
#> 
#> $mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $mean_sei
#> mean_sei_lwr mean_sei_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $mean_pval
#> p_value 
#>      NA 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $median
#> median 
#>     NA 
#> 
#> $mad
#> mad 
#>  NA 
#> 
#> $median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $quantiles
#> quantile_0.25 quantile_0.75 
#>            NA            NA 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $iqr
#> iqr 
#>  NA 
#> 
#> $range
#> min max 
#>  NA  NA 
#> 
#> $min
#> min 
#>  NA 
#> 
#> $max
#> max 
#>  NA 
#> 
#> $cv
#> cv 
#> NA 
#> 
#> $geom_mean
#> geom_mean 
#>       NaN 
#> 
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $geom_cv
#> geom_cv 
#>      NA 
#> 

## Management of NA values.
x <- c(NA_real_, 1)
s_summary(x, na.rm = TRUE)
#> $n
#> n 
#> 1 
#> 
#> $sum
#> sum 
#>   1 
#> 
#> $mean
#> mean 
#>    1 
#> 
#> $sd
#> sd 
#> NA 
#> 
#> $se
#> se 
#> NA 
#> 
#> $mean_sd
#> mean   sd 
#>    1   NA 
#> 
#> $mean_se
#> mean   se 
#>    1   NA 
#> 
#> $mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $mean_sei
#> mean_sei_lwr mean_sei_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $mean_pval
#> p_value 
#>      NA 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $median
#> median 
#>      1 
#> 
#> $mad
#> mad 
#>   0 
#> 
#> $median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $quantiles
#> quantile_0.25 quantile_0.75 
#>             1             1 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $iqr
#> iqr 
#>   0 
#> 
#> $range
#> min max 
#>   1   1 
#> 
#> $min
#> min 
#>   1 
#> 
#> $max
#> max 
#>   1 
#> 
#> $cv
#> cv 
#> NA 
#> 
#> $geom_mean
#> geom_mean 
#>         1 
#> 
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $geom_cv
#> geom_cv 
#>      NA 
#> 
s_summary(x, na.rm = FALSE)
#> $n
#> n 
#> 2 
#> 
#> $sum
#> sum 
#>  NA 
#> 
#> $mean
#> mean 
#>   NA 
#> 
#> $sd
#> sd 
#> NA 
#> 
#> $se
#> se 
#> NA 
#> 
#> $mean_sd
#> mean   sd 
#>   NA   NA 
#> 
#> $mean_se
#> mean   se 
#>   NA   NA 
#> 
#> $mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $mean_sei
#> mean_sei_lwr mean_sei_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $mean_pval
#> p_value 
#>      NA 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $median
#> median 
#>     NA 
#> 
#> $mad
#> mad 
#>  NA 
#> 
#> $median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $quantiles
#> quantile_0.25 quantile_0.75 
#>            NA            NA 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $iqr
#> iqr 
#>  NA 
#> 
#> $range
#> min max 
#>  NA  NA 
#> 
#> $min
#> min 
#>  NA 
#> 
#> $max
#> max 
#>  NA 
#> 
#> $cv
#> cv 
#> NA 
#> 
#> $geom_mean
#> geom_mean 
#>        NA 
#> 
#> $geom_mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $geom_cv
#> geom_cv 
#>      NA 
#> 

x <- c(NA_real_, 1, 2)
s_summary(x, stats = NULL)
#> $n
#> n 
#> 2 
#> 
#> $sum
#> sum 
#>   3 
#> 
#> $mean
#> mean 
#>  1.5 
#> 
#> $sd
#>        sd 
#> 0.7071068 
#> 
#> $se
#>  se 
#> 0.5 
#> 
#> $mean_sd
#>      mean        sd 
#> 1.5000000 0.7071068 
#> 
#> $mean_se
#> mean   se 
#>  1.5  0.5 
#> 
#> $mean_ci
#> mean_ci_lwr mean_ci_upr 
#>   -4.853102    7.853102 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $mean_sei
#> mean_sei_lwr mean_sei_upr 
#>            1            2 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>    0.7928932    2.2071068 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $mean_pval
#>   p_value 
#> 0.2048328 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $median
#> median 
#>    1.5 
#> 
#> $mad
#> mad 
#>   0 
#> 
#> $median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $quantiles
#> quantile_0.25 quantile_0.75 
#>             1             2 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $iqr
#> iqr 
#>   1 
#> 
#> $range
#> min max 
#>   1   2 
#> 
#> $min
#> min 
#>   1 
#> 
#> $max
#> max 
#>   2 
#> 
#> $cv
#>       cv 
#> 47.14045 
#> 
#> $geom_mean
#> geom_mean 
#>  1.414214 
#> 
#> $geom_mean_ci
#>  mean_ci_lwr  mean_ci_upr 
#>   0.01729978 115.60839614 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $geom_cv
#>  geom_cv 
#> 52.10922 
#> 

## Benefits in `rtables` constructions:
require(rtables)
dta_test <- data.frame(
  Group = rep(LETTERS[1:3], each = 2),
  sub_group = rep(letters[1:2], each = 3),
  x = 1:6
)

## The summary obtained with `rtables`:
basic_table() %>%
  split_cols_by(var = "Group") %>%
  split_rows_by(var = "sub_group") %>%
  analyze(vars = "x", afun = s_summary) %>%
  build_table(df = dta_test)
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#> Warning: number of items to replace is not a multiple of replacement length
#>                                                  A                       B                      C                  
#> ———————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> a                                                                                                                  
#>   n                                              2                       1                      0                  
#>   sum                                            3                       3                      NA                 
#>   mean                                          1.5                      3                      NA                 
#>   sd                                     0.707106781186548              NA                      NA                 
#>   se                                            0.5                     NA                      NA                 
#>   mean_sd                              1.5, 0.707106781186548          3, NA                    NA                 
#>   mean_se                                     1.5, 0.5                 3, NA                    NA                 
#>   Mean 95% CI                   -4.85310236808735, 7.85310236808735     NA                      NA                 
#>   Mean -/+ 1xSE                                 1, 2                    NA                      NA                 
#>   Mean -/+ 1xSD                 0.792893218813452, 2.20710678118655     NA                      NA                 
#>   Mean p-value (H0: mean = 0)            0.204832764699133              NA                      NA                 
#>   median                                        1.5                      3                      NA                 
#>   mad                                            0                       0                      NA                 
#>   Median 95% CI                                  NA                     NA                      NA                 
#>   25% and 75%-ile                               1, 2                   3, 3                     NA                 
#>   iqr                                            1                       0                      NA                 
#>   range                                         1, 2                   3, 3                     NA                 
#>   min                                            1                       3                      NA                 
#>   max                                            2                       3                      NA                 
#>   cv                                      47.1404520791032              NA                      NA                 
#>   geom_mean                               1.41421356237309               3                      NA                 
#>   Geometric Mean 95% CI         0.0172997815631007, 115.608396135236    NA                      NA                 
#>   geom_cv                                 52.1092246837487              NA                      NA                 
#> b                                                                                                                  
#>   n                                              0                       1                      2                  
#>   sum                                            NA                      4                      11                 
#>   mean                                           NA                      4                     5.5                 
#>   sd                                             NA                     NA              0.707106781186548          
#>   se                                             NA                     NA                     0.5                 
#>   mean_sd                                        NA                    4, NA          5.5, 0.707106781186548       
#>   mean_se                                        NA                    4, NA                 5.5, 0.5              
#>   Mean 95% CI                                    NA                     NA     -0.853102368087347, 11.8531023680873
#>   Mean -/+ 1xSE                                  NA                     NA                     5, 6                
#>   Mean -/+ 1xSD                                  NA                     NA      4.79289321881345, 6.20710678118655 
#>   Mean p-value (H0: mean = 0)                    NA                     NA              0.0577158767526089         
#>   median                                         NA                      4                     5.5                 
#>   mad                                            NA                      0                      0                  
#>   Median 95% CI                                  NA                     NA                      NA                 
#>   25% and 75%-ile                                NA                    4, 4                    5, 6                
#>   iqr                                            NA                      0                      1                  
#>   range                                          NA                    4, 4                    5, 6                
#>   min                                            NA                      4                      5                  
#>   max                                            NA                      4                      6                  
#>   cv                                             NA                     NA               12.8564869306645          
#>   geom_mean                                      NA                      4               5.47722557505166          
#>   Geometric Mean 95% CI                          NA                     NA      1.71994304449266, 17.4424380482025 
#>   geom_cv                                        NA                     NA               12.945835316564           

## By comparison with `lapply`:
X <- split(dta_test, f = with(dta_test, interaction(Group, sub_group)))
lapply(X, function(x) s_summary(x$x))
#> $A.a
#> $A.a$n
#> n 
#> 2 
#> 
#> $A.a$sum
#> sum 
#>   3 
#> 
#> $A.a$mean
#> mean 
#>  1.5 
#> 
#> $A.a$sd
#>        sd 
#> 0.7071068 
#> 
#> $A.a$se
#>  se 
#> 0.5 
#> 
#> $A.a$mean_sd
#>      mean        sd 
#> 1.5000000 0.7071068 
#> 
#> $A.a$mean_se
#> mean   se 
#>  1.5  0.5 
#> 
#> $A.a$mean_ci
#> mean_ci_lwr mean_ci_upr 
#>   -4.853102    7.853102 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $A.a$mean_sei
#> mean_sei_lwr mean_sei_upr 
#>            1            2 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $A.a$mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>    0.7928932    2.2071068 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $A.a$mean_pval
#>   p_value 
#> 0.2048328 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $A.a$median
#> median 
#>    1.5 
#> 
#> $A.a$mad
#> mad 
#>   0 
#> 
#> $A.a$median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $A.a$quantiles
#> quantile_0.25 quantile_0.75 
#>             1             2 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $A.a$iqr
#> iqr 
#>   1 
#> 
#> $A.a$range
#> min max 
#>   1   2 
#> 
#> $A.a$min
#> min 
#>   1 
#> 
#> $A.a$max
#> max 
#>   2 
#> 
#> $A.a$cv
#>       cv 
#> 47.14045 
#> 
#> $A.a$geom_mean
#> geom_mean 
#>  1.414214 
#> 
#> $A.a$geom_mean_ci
#>  mean_ci_lwr  mean_ci_upr 
#>   0.01729978 115.60839614 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $A.a$geom_cv
#>  geom_cv 
#> 52.10922 
#> 
#> 
#> $B.a
#> $B.a$n
#> n 
#> 1 
#> 
#> $B.a$sum
#> sum 
#>   3 
#> 
#> $B.a$mean
#> mean 
#>    3 
#> 
#> $B.a$sd
#> sd 
#> NA 
#> 
#> $B.a$se
#> se 
#> NA 
#> 
#> $B.a$mean_sd
#> mean   sd 
#>    3   NA 
#> 
#> $B.a$mean_se
#> mean   se 
#>    3   NA 
#> 
#> $B.a$mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $B.a$mean_sei
#> mean_sei_lwr mean_sei_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $B.a$mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $B.a$mean_pval
#> p_value 
#>      NA 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $B.a$median
#> median 
#>      3 
#> 
#> $B.a$mad
#> mad 
#>   0 
#> 
#> $B.a$median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $B.a$quantiles
#> quantile_0.25 quantile_0.75 
#>             3             3 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $B.a$iqr
#> iqr 
#>   0 
#> 
#> $B.a$range
#> min max 
#>   3   3 
#> 
#> $B.a$min
#> min 
#>   3 
#> 
#> $B.a$max
#> max 
#>   3 
#> 
#> $B.a$cv
#> cv 
#> NA 
#> 
#> $B.a$geom_mean
#> geom_mean 
#>         3 
#> 
#> $B.a$geom_mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $B.a$geom_cv
#> geom_cv 
#>      NA 
#> 
#> 
#> $C.a
#> $C.a$n
#> n 
#> 0 
#> 
#> $C.a$sum
#> sum 
#>  NA 
#> 
#> $C.a$mean
#> mean 
#>   NA 
#> 
#> $C.a$sd
#> sd 
#> NA 
#> 
#> $C.a$se
#> se 
#> NA 
#> 
#> $C.a$mean_sd
#> mean   sd 
#>   NA   NA 
#> 
#> $C.a$mean_se
#> mean   se 
#>   NA   NA 
#> 
#> $C.a$mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $C.a$mean_sei
#> mean_sei_lwr mean_sei_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $C.a$mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $C.a$mean_pval
#> p_value 
#>      NA 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $C.a$median
#> median 
#>     NA 
#> 
#> $C.a$mad
#> mad 
#>  NA 
#> 
#> $C.a$median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $C.a$quantiles
#> quantile_0.25 quantile_0.75 
#>            NA            NA 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $C.a$iqr
#> iqr 
#>  NA 
#> 
#> $C.a$range
#> min max 
#>  NA  NA 
#> 
#> $C.a$min
#> min 
#>  NA 
#> 
#> $C.a$max
#> max 
#>  NA 
#> 
#> $C.a$cv
#> cv 
#> NA 
#> 
#> $C.a$geom_mean
#> geom_mean 
#>       NaN 
#> 
#> $C.a$geom_mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $C.a$geom_cv
#> geom_cv 
#>      NA 
#> 
#> 
#> $A.b
#> $A.b$n
#> n 
#> 0 
#> 
#> $A.b$sum
#> sum 
#>  NA 
#> 
#> $A.b$mean
#> mean 
#>   NA 
#> 
#> $A.b$sd
#> sd 
#> NA 
#> 
#> $A.b$se
#> se 
#> NA 
#> 
#> $A.b$mean_sd
#> mean   sd 
#>   NA   NA 
#> 
#> $A.b$mean_se
#> mean   se 
#>   NA   NA 
#> 
#> $A.b$mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $A.b$mean_sei
#> mean_sei_lwr mean_sei_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $A.b$mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $A.b$mean_pval
#> p_value 
#>      NA 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $A.b$median
#> median 
#>     NA 
#> 
#> $A.b$mad
#> mad 
#>  NA 
#> 
#> $A.b$median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $A.b$quantiles
#> quantile_0.25 quantile_0.75 
#>            NA            NA 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $A.b$iqr
#> iqr 
#>  NA 
#> 
#> $A.b$range
#> min max 
#>  NA  NA 
#> 
#> $A.b$min
#> min 
#>  NA 
#> 
#> $A.b$max
#> max 
#>  NA 
#> 
#> $A.b$cv
#> cv 
#> NA 
#> 
#> $A.b$geom_mean
#> geom_mean 
#>       NaN 
#> 
#> $A.b$geom_mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $A.b$geom_cv
#> geom_cv 
#>      NA 
#> 
#> 
#> $B.b
#> $B.b$n
#> n 
#> 1 
#> 
#> $B.b$sum
#> sum 
#>   4 
#> 
#> $B.b$mean
#> mean 
#>    4 
#> 
#> $B.b$sd
#> sd 
#> NA 
#> 
#> $B.b$se
#> se 
#> NA 
#> 
#> $B.b$mean_sd
#> mean   sd 
#>    4   NA 
#> 
#> $B.b$mean_se
#> mean   se 
#>    4   NA 
#> 
#> $B.b$mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $B.b$mean_sei
#> mean_sei_lwr mean_sei_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $B.b$mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>           NA           NA 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $B.b$mean_pval
#> p_value 
#>      NA 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $B.b$median
#> median 
#>      4 
#> 
#> $B.b$mad
#> mad 
#>   0 
#> 
#> $B.b$median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $B.b$quantiles
#> quantile_0.25 quantile_0.75 
#>             4             4 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $B.b$iqr
#> iqr 
#>   0 
#> 
#> $B.b$range
#> min max 
#>   4   4 
#> 
#> $B.b$min
#> min 
#>   4 
#> 
#> $B.b$max
#> max 
#>   4 
#> 
#> $B.b$cv
#> cv 
#> NA 
#> 
#> $B.b$geom_mean
#> geom_mean 
#>         4 
#> 
#> $B.b$geom_mean_ci
#> mean_ci_lwr mean_ci_upr 
#>          NA          NA 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $B.b$geom_cv
#> geom_cv 
#>      NA 
#> 
#> 
#> $C.b
#> $C.b$n
#> n 
#> 2 
#> 
#> $C.b$sum
#> sum 
#>  11 
#> 
#> $C.b$mean
#> mean 
#>  5.5 
#> 
#> $C.b$sd
#>        sd 
#> 0.7071068 
#> 
#> $C.b$se
#>  se 
#> 0.5 
#> 
#> $C.b$mean_sd
#>      mean        sd 
#> 5.5000000 0.7071068 
#> 
#> $C.b$mean_se
#> mean   se 
#>  5.5  0.5 
#> 
#> $C.b$mean_ci
#> mean_ci_lwr mean_ci_upr 
#>  -0.8531024  11.8531024 
#> attr(,"label")
#> [1] "Mean 95% CI"
#> 
#> $C.b$mean_sei
#> mean_sei_lwr mean_sei_upr 
#>            5            6 
#> attr(,"label")
#> [1] "Mean -/+ 1xSE"
#> 
#> $C.b$mean_sdi
#> mean_sdi_lwr mean_sdi_upr 
#>     4.792893     6.207107 
#> attr(,"label")
#> [1] "Mean -/+ 1xSD"
#> 
#> $C.b$mean_pval
#>    p_value 
#> 0.05771588 
#> attr(,"label")
#> [1] "Mean p-value (H0: mean = 0)"
#> 
#> $C.b$median
#> median 
#>    5.5 
#> 
#> $C.b$mad
#> mad 
#>   0 
#> 
#> $C.b$median_ci
#> median_ci_lwr median_ci_upr 
#>            NA            NA 
#> attr(,"conf_level")
#> [1] NA
#> attr(,"label")
#> [1] "Median 95% CI"
#> 
#> $C.b$quantiles
#> quantile_0.25 quantile_0.75 
#>             5             6 
#> attr(,"label")
#> [1] "25% and 75%-ile"
#> 
#> $C.b$iqr
#> iqr 
#>   1 
#> 
#> $C.b$range
#> min max 
#>   5   6 
#> 
#> $C.b$min
#> min 
#>   5 
#> 
#> $C.b$max
#> max 
#>   6 
#> 
#> $C.b$cv
#>       cv 
#> 12.85649 
#> 
#> $C.b$geom_mean
#> geom_mean 
#>  5.477226 
#> 
#> $C.b$geom_mean_ci
#> mean_ci_lwr mean_ci_upr 
#>    1.719943   17.442438 
#> attr(,"label")
#> [1] "Geometric Mean 95% CI"
#> 
#> $C.b$geom_cv
#>  geom_cv 
#> 12.94584 
#> 
#> 
# `s_summary.factor`

## Basic usage:
s_summary(factor(c("a", "a", "b", "c", "a")))
#> $n
#> [1] 5
#> 
#> $count
#> $count$a
#> [1] 3
#> 
#> $count$b
#> [1] 1
#> 
#> $count$c
#> [1] 1
#> 
#> 
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.6
#> 
#> $count_fraction$b
#> [1] 1.0 0.2
#> 
#> $count_fraction$c
#> [1] 1.0 0.2
#> 
#> 
#> $n_blq
#> [1] 0
#> 
## Empty factor: counts and fractions are zero for each level.
s_summary(factor(levels = c("a", "b", "c")))
#> $n
#> [1] 0
#> 
#> $count
#> $count$a
#> [1] 0
#> 
#> $count$b
#> [1] 0
#> 
#> $count$c
#> [1] 0
#> 
#> 
#> $count_fraction
#> $count_fraction$a
#> [1] 0 0
#> 
#> $count_fraction$b
#> [1] 0 0
#> 
#> $count_fraction$c
#> [1] 0 0
#> 
#> 
#> $n_blq
#> [1] 0
#> 

## Management of NA values.
x <- factor(c(NA, "Female"))
x <- explicit_na(x)
s_summary(x, na.rm = TRUE)
#> $n
#> [1] 1
#> 
#> $count
#> $count$Female
#> [1] 1
#> 
#> 
#> $count_fraction
#> $count_fraction$Female
#> [1] 1 1
#> 
#> 
#> $n_blq
#> [1] 0
#> 
s_summary(x, na.rm = FALSE)
#> $n
#> [1] 2
#> 
#> $count
#> $count$Female
#> [1] 1
#> 
#> $count$`<Missing>`
#> [1] 1
#> 
#> 
#> $count_fraction
#> $count_fraction$Female
#> [1] 1.0 0.5
#> 
#> $count_fraction$`<Missing>`
#> [1] 1.0 0.5
#> 
#> 
#> $n_blq
#> [1] 0
#> 

## Different denominators.
x <- factor(c("a", "a", "b", "c", "a"))
s_summary(x, denom = "N_row", .N_row = 10L)
#> $n
#> [1] 5
#> 
#> $count
#> $count$a
#> [1] 3
#> 
#> $count$b
#> [1] 1
#> 
#> $count$c
#> [1] 1
#> 
#> 
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.3
#> 
#> $count_fraction$b
#> [1] 1.0 0.1
#> 
#> $count_fraction$c
#> [1] 1.0 0.1
#> 
#> 
#> $n_blq
#> [1] 0
#> 
s_summary(x, denom = "N_col", .N_col = 20L)
#> $n
#> [1] 5
#> 
#> $count
#> $count$a
#> [1] 3
#> 
#> $count$b
#> [1] 1
#> 
#> $count$c
#> [1] 1
#> 
#> 
#> $count_fraction
#> $count_fraction$a
#> [1] 3.00 0.15
#> 
#> $count_fraction$b
#> [1] 1.00 0.05
#> 
#> $count_fraction$c
#> [1] 1.00 0.05
#> 
#> 
#> $n_blq
#> [1] 0
#> 
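
## How the fractions above follow from the denominator (base-R sketch, not tern's code).
# Each `count_fraction` pair is c(count, count / denominator); only the denominator
# changes with `denom`. The 10 and 20 are the `.N_row` and `.N_col` values used above.
x <- factor(c("a", "a", "b", "c", "a"))
counts <- c(table(x))
denominators <- c(n = length(x), N_row = 10, N_col = 20)

# One row per level, one column per denominator choice,
# e.g. fractions["a", "N_col"] is 3 / 20.
fractions <- outer(counts, denominators, "/")
fractions
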
# `s_summary.character`

## Basic usage:
s_summary(c("a", "a", "b", "c", "a"), .var = "x", verbose = FALSE)
#> $n
#> [1] 5
#> 
#> $count
#> $count$a
#> [1] 3
#> 
#> $count$b
#> [1] 1
#> 
#> $count$c
#> [1] 1
#> 
#> 
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.6
#> 
#> $count_fraction$b
#> [1] 1.0 0.2
#> 
#> $count_fraction$c
#> [1] 1.0 0.2
#> 
#> 
#> $n_blq
#> [1] 0
#> 
s_summary(c("a", "a", "b", "c", "a", ""), .var = "x", na.rm = FALSE, verbose = FALSE)
#> $n
#> [1] 6
#> 
#> $count
#> $count$a
#> [1] 3
#> 
#> $count$b
#> [1] 1
#> 
#> $count$c
#> [1] 1
#> 
#> $count$`<Missing>`
#> [1] 1
#> 
#> 
#> $count_fraction
#> $count_fraction$a
#> [1] 3.0 0.5
#> 
#> $count_fraction$b
#> [1] 1.0000000 0.1666667
#> 
#> $count_fraction$c
#> [1] 1.0000000 0.1666667
#> 
#> $count_fraction$`<Missing>`
#> [1] 1.0000000 0.1666667
#> 
#> 
#> $n_blq
#> [1] 0
#> 
# `s_summary.logical`

## Basic usage:
s_summary(c(TRUE, FALSE, TRUE, TRUE))
#> $n
#> [1] 4
#> 
#> $count
#> [1] 3
#> 
#> $count_fraction
#> [1] 3.00 0.75
#> 
#> $n_blq
#> [1] 0
#> 

## Management of NA values.
x <- c(NA, TRUE, FALSE)
s_summary(x, na.rm = TRUE)
#> $n
#> [1] 2
#> 
#> $count
#> [1] 1
#> 
#> $count_fraction
#> [1] 1.0 0.5
#> 
#> $n_blq
#> [1] 0
#> 
s_summary(x, na.rm = FALSE)
#> $n
#> [1] 3
#> 
#> $count
#> [1] 1
#> 
#> $count_fraction
#> [1] 1.0000000 0.3333333
#> 
#> $n_blq
#> [1] 0
#> 
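
## What `na.rm` changes for logical input (base-R sketch, not tern's code).
# Judging from the two outputs above, the count of `TRUE` values stays at 1 and only
# `n`, and with it the default denominator, depends on whether `NA` is dropped.
x <- c(NA, TRUE, FALSE)
count_true <- sum(x, na.rm = TRUE)
n_without_na <- sum(!is.na(x)) # 2, as with na.rm = TRUE
n_with_na <- length(x)         # 3, as with na.rm = FALSE

# Fractions 1 / 2 and 1 / 3, matching the two `count_fraction` results above.
c(count_true / n_without_na, count_true / n_with_na)
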

## Different denominators.
x <- c(TRUE, FALSE, TRUE, TRUE)
s_summary(x, denom = "N_row", .N_row = 10L)
#> $n
#> [1] 4
#> 
#> $count
#> [1] 3
#> 
#> $count_fraction
#> [1] 3.0 0.3
#> 
#> $n_blq
#> [1] 0
#> 
s_summary(x, denom = "N_col", .N_col = 20L)
#> $n
#> [1] 4
#> 
#> $count
#> [1] 3
#> 
#> $count_fraction
#> [1] 3.00 0.15
#> 
#> $n_blq
#> [1] 0
#> 
# `a_summary.numeric`
a_summary(rnorm(10), .N_col = 10, .N_row = 20, .var = "bla")
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#>        row_name formatted_cell indent_mod                   row_label
#> 1             n             10          0                           n
#> 2           sum            1.1          0                         Sum
#> 3          mean            0.1          0                        Mean
#> 4            sd            1.0          0                          SD
#> 5            se            0.3          0                          SE
#> 6       mean_sd      0.1 (1.0)          0                   Mean (SD)
#> 7       mean_se      0.1 (0.3)          0                   Mean (SE)
#> 8       mean_ci  (-0.63, 0.86)          0                 Mean 95% CI
#> 9      mean_sei  (-0.22, 0.44)          0               Mean -/+ 1xSE
#> 10     mean_sdi  (-0.93, 1.16)          0               Mean -/+ 1xSD
#> 11    mean_pval           0.74          0 Mean p-value (H0: mean = 0)
#> 12       median            0.2          0                      Median
#> 13          mad            0.0          0   Median Absolute Deviation
#> 14    median_ci  (-0.62, 1.12)          0               Median 95% CI
#> 15    quantiles     -0.3 - 0.7          0             25% and 75%-ile
#> 16          iqr            1.0          0                         IQR
#> 17        range     -2.2 - 1.5          0                   Min - Max
#> 18          min           -2.2          0                     Minimum
#> 19          max            1.5          0                     Maximum
#> 20           cv          918.5          0                      CV (%)
#> 21    geom_mean             NA          0              Geometric Mean
#> 22 geom_mean_ci             NA          0       Geometric Mean 95% CI
#> 23      geom_cv             NA          0         CV % Geometric Mean
# `a_summary.factor`
# We need to ungroup `count` and `count_fraction` first so that the rtables formatting
# functions can be applied correctly.
afun <- make_afun(
  getS3method("a_summary", "factor"),
  .ungroup_stats = c("count", "count_fraction")
)
afun(factor(c("a", "a", "b", "c", "a")), .N_row = 10, .N_col = 10)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#>   row_name formatted_cell indent_mod row_label
#> 1        n              5          0         n
#> 2        a              3          0         a
#> 3        b              1          0         b
#> 4        c              1          0         c
#> 5        a        3 (60%)          0         a
#> 6        b        1 (20%)          0         b
#> 7        c        1 (20%)          0         c
#> 8    n_blq              0          0     n_blq
# `a_summary.character`
afun <- make_afun(
  getS3method("a_summary", "character"),
  .ungroup_stats = c("count", "count_fraction")
)
afun(c("A", "B", "A", "C"), .var = "x", .N_col = 10, .N_row = 10, verbose = FALSE)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#>   row_name formatted_cell indent_mod row_label
#> 1        n              4          0         n
#> 2        A              2          0         A
#> 3        B              1          0         B
#> 4        C              1          0         C
#> 5        A        2 (50%)          0         A
#> 6        B        1 (25%)          0         B
#> 7        C        1 (25%)          0         C
#> 8    n_blq              0          0     n_blq
# `a_summary.logical`
afun <- make_afun(
  getS3method("a_summary", "logical")
)
afun(c(TRUE, FALSE, FALSE, TRUE, TRUE), .N_row = 10, .N_col = 10)
#> RowsVerticalSection (in_rows) object print method:
#> ----------------------------
#>         row_name formatted_cell indent_mod      row_label
#> 1              n              5          0              n
#> 2          count              3          0          count
#> 3 count_fraction        3 (60%)          0 count_fraction
#> 4          n_blq              0          0          n_blq
# `create_afun_summary()` to create combined `afun`

afun <- create_afun_summary(
  .stats = NULL,
  .formats = c(median = "xx."),
  .labels = c(median = "My median"),
  .indent_mods = c(median = 1L)
)
## Fabricated dataset.
dta_test <- data.frame(
  USUBJID = rep(1:6, each = 3),
  PARAMCD = rep("lab", 6 * 3),
  AVISIT  = rep(paste0("V", 1:3), 6),
  ARM     = rep(LETTERS[1:3], rep(6, 3)),
  AVAL    = c(9:1, rep(NA, 9))
)

l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  analyze(vars = "AVAL", afun = afun)

build_table(l, df = dta_test)
#>                                        A              B       C 
#> ————————————————————————————————————————————————————————————————
#> V1                                                              
#>   n                                    2              1       0 
#>   Sum                                15.0            3.0      NA
#>   Mean                                7.5            3.0      NA
#>   SD                                  2.1            NA       NA
#>   SE                                  1.5            NA       NA
#>   Mean (SD)                        7.5 (2.1)      3.0 (NA)    NA
#>   Mean (SE)                        7.5 (1.5)      3.0 (NA)    NA
#>   Mean 95% CI                   (-11.56, 26.56)      NA       NA
#>   Mean -/+ 1xSE                  (6.00, 9.00)        NA       NA
#>   Mean -/+ 1xSD                  (5.38, 9.62)        NA       NA
#>   Mean p-value (H0: mean = 0)        0.13            NA       NA
#>     My median                          8              3       NA
#>   Median Absolute Deviation           0.0            0.0      NA
#>   Median 95% CI                       NA             NA       NA
#>   25% and 75%-ile                  6.0 - 9.0      3.0 - 3.0   NA
#>   IQR                                 3.0            0.0      NA
#>   Min - Max                        6.0 - 9.0      3.0 - 3.0   NA
#>   CV (%)                             28.3            NA       NA
#>   Minimum                             6.0            3.0      NA
#>   Maximum                             9.0            3.0      NA
#>   Geometric Mean                      7.3            3.0      NA
#>   CV % Geometric Mean                29.3            NA       NA
#> V2                                                              
#>   n                                    2              1       0 
#>   Sum                                13.0            2.0      NA
#>   Mean                                6.5            2.0      NA
#>   SD                                  2.1            NA       NA
#>   SE                                  1.5            NA       NA
#>   Mean (SD)                        6.5 (2.1)      2.0 (NA)    NA
#>   Mean (SE)                        6.5 (1.5)      2.0 (NA)    NA
#>   Mean 95% CI                   (-12.56, 25.56)      NA       NA
#>   Mean -/+ 1xSE                  (5.00, 8.00)        NA       NA
#>   Mean -/+ 1xSD                  (4.38, 8.62)        NA       NA
#>   Mean p-value (H0: mean = 0)        0.14            NA       NA
#>     My median                          6              2       NA
#>   Median Absolute Deviation           0.0            0.0      NA
#>   Median 95% CI                       NA             NA       NA
#>   25% and 75%-ile                  5.0 - 8.0      2.0 - 2.0   NA
#>   IQR                                 3.0            0.0      NA
#>   Min - Max                        5.0 - 8.0      2.0 - 2.0   NA
#>   CV (%)                             32.6            NA       NA
#>   Minimum                             5.0            2.0      NA
#>   Maximum                             8.0            2.0      NA
#>   Geometric Mean                      6.3            2.0      NA
#>   CV % Geometric Mean                34.2            NA       NA
#> V3                                                              
#>   n                                    2              1       0 
#>   Sum                                11.0            1.0      NA
#>   Mean                                5.5            1.0      NA
#>   SD                                  2.1            NA       NA
#>   SE                                  1.5            NA       NA
#>   Mean (SD)                        5.5 (2.1)      1.0 (NA)    NA
#>   Mean (SE)                        5.5 (1.5)      1.0 (NA)    NA
#>   Mean 95% CI                   (-13.56, 24.56)      NA       NA
#>   Mean -/+ 1xSE                  (4.00, 7.00)        NA       NA
#>   Mean -/+ 1xSD                  (3.38, 7.62)        NA       NA
#>   Mean p-value (H0: mean = 0)        0.17            NA       NA
#>     My median                          6              1       NA
#>   Median Absolute Deviation           0.0            0.0      NA
#>   Median 95% CI                       NA             NA       NA
#>   25% and 75%-ile                  4.0 - 7.0      1.0 - 1.0   NA
#>   IQR                                 3.0            0.0      NA
#>   Min - Max                        4.0 - 7.0      1.0 - 1.0   NA
#>   CV (%)                             38.6            NA       NA
#>   Minimum                             4.0            1.0      NA
#>   Maximum                             7.0            1.0      NA
#>   Geometric Mean                      5.3            1.0      NA
#>   CV % Geometric Mean                41.2            NA       NA
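
## Why "My median" prints 6 for arm A at both V2 and V3 (illustration in base R).
# The arm A medians are 7.5, 6.5, and 5.5 for V1 to V3. With the "xx." format they
# appear as 8, 6, and 6. Assumption: the format rounds like base R's `round()`, which
# rounds exact halves to the even neighbour; the table is consistent with that, but
# the formatting code itself is not shown here.
medians_arm_a <- c(V1 = 7.5, V2 = 6.5, V3 = 5.5)
round(medians_arm_a)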

# `summarize_vars()` in `rtables` pipelines

## Default output within a `rtables` pipeline.
# Uses `dta_test` as defined in the `create_afun_summary()` example above.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  summarize_vars(vars = "AVAL")

build_table(l, df = dta_test)
#>                   A           B       C 
#> ————————————————————————————————————————
#> V1                                      
#>   n               2           1       0 
#>   Mean (SD)   7.5 (2.1)   3.0 (NA)    NA
#>   Median         7.5         3.0      NA
#>   Min - Max   6.0 - 9.0   3.0 - 3.0   NA
#> V2                                      
#>   n               2           1       0 
#>   Mean (SD)   6.5 (2.1)   2.0 (NA)    NA
#>   Median         6.5         2.0      NA
#>   Min - Max   5.0 - 8.0   2.0 - 2.0   NA
#> V3                                      
#>   n               2           1       0 
#>   Mean (SD)   5.5 (2.1)   1.0 (NA)    NA
#>   Median         5.5         1.0      NA
#>   Min - Max   4.0 - 7.0   1.0 - 1.0   NA
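
## Where the `n` values come from (base-R sketch on a reduced copy of the data).
# `dta_chk` is a hypothetical name; it is built like `dta_test` above, keeping only
# the columns needed for the count.
dta_chk <- data.frame(
  AVISIT = rep(paste0("V", 1:3), 6),
  ARM    = rep(LETTERS[1:3], rep(6, 3)),
  AVAL   = c(9:1, rep(NA, 9))
)

# Non-missing `AVAL` values per visit (rows) and arm (columns).
# Every visit has 2, 1, and 0 values in arms A, B, and C, as in the `n` rows above.
n_obs <- with(dta_chk, tapply(!is.na(AVAL), list(AVISIT, ARM), sum))
n_obs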

## Select and format statistics output.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  summarize_vars(
    vars = "AVAL",
    .stats = c("n", "mean_sd", "quantiles"),
    .formats = c("mean_sd" = "xx.x, xx.x"),
    .labels = c(n = "n", mean_sd = "Mean, SD", quantiles = "Q1 - Q3")
  )

results <- build_table(l, df = dta_test)
as_html(results)
#> <div class="rtables-all-parts-block rtables-container">
#>   <table class="table table-condensed table-hover">
#>     <tr>
#>       <th style="white-space:pre;"></th>
#>       <th class="text-center">A</th>
#>       <th class="text-center">B</th>
#>       <th class="text-center">C</th>
#>     </tr>
#>     <tr>
#>       <td class="text-left">V1</td>
#>       <td class="text-center"></td>
#>       <td class="text-center"></td>
#>       <td class="text-center"></td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">n</td>
#>       <td class="text-center">2</td>
#>       <td class="text-center">1</td>
#>       <td class="text-center">0</td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">Mean, SD</td>
#>       <td class="text-center">7.5, 2.1</td>
#>       <td class="text-center">3.0, NA</td>
#>       <td class="text-center">NA</td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">Q1 - Q3</td>
#>       <td class="text-center">6.0 - 9.0</td>
#>       <td class="text-center">3.0 - 3.0</td>
#>       <td class="text-center">NA</td>
#>     </tr>
#>     <tr>
#>       <td class="text-left">V2</td>
#>       <td class="text-center"></td>
#>       <td class="text-center"></td>
#>       <td class="text-center"></td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">n</td>
#>       <td class="text-center">2</td>
#>       <td class="text-center">1</td>
#>       <td class="text-center">0</td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">Mean, SD</td>
#>       <td class="text-center">6.5, 2.1</td>
#>       <td class="text-center">2.0, NA</td>
#>       <td class="text-center">NA</td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">Q1 - Q3</td>
#>       <td class="text-center">5.0 - 8.0</td>
#>       <td class="text-center">2.0 - 2.0</td>
#>       <td class="text-center">NA</td>
#>     </tr>
#>     <tr>
#>       <td class="text-left">V3</td>
#>       <td class="text-center"></td>
#>       <td class="text-center"></td>
#>       <td class="text-center"></td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">n</td>
#>       <td class="text-center">2</td>
#>       <td class="text-center">1</td>
#>       <td class="text-center">0</td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">Mean, SD</td>
#>       <td class="text-center">5.5, 2.1</td>
#>       <td class="text-center">1.0, NA</td>
#>       <td class="text-center">NA</td>
#>     </tr>
#>     <tr>
#>       <td class="text-left" style="padding-left: 3ch">Q1 - Q3</td>
#>       <td class="text-center">4.0 - 7.0</td>
#>       <td class="text-center">1.0 - 1.0</td>
#>       <td class="text-center">NA</td>
#>     </tr>
#>     <caption style="caption-side:top;"><div class="rtables-titles-block rtables-container">
#>         <div class="rtables-main-titles-block rtables-container">
#>           <p class="rtables-main-title"></p>
#>         </div>
#>         <div class="rtables-subtitles-block rtables-container"></div>
#>       </div>
#>     </caption>
#>   </table>
#>   <div class="rtables-footers-block rtables-container"></div>
#> </div>

## Use arguments interpreted by `s_summary`.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  summarize_vars(vars = "AVAL", na.rm = FALSE)

results <- build_table(l, df = dta_test)

## Make `NA` values explicit with `df_explicit_na()` before summarizing factors.
dta_test$AVISIT <- NA_character_
dta_test <- df_explicit_na(dta_test)
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  summarize_vars(vars = "AVISIT", na.rm = FALSE)

results <- build_table(l, df = dta_test)
if (FALSE) {
  Viewer(results)
}