Column Formatting
Emily de la Rua
2024-12-20
Source: vignettes/col_formatting.Rmd
Introduction
This vignette demonstrates how content in columns of a listing_df object can be customized using format configurations with the rlistings R package.
The following topics will be covered:
- Adjusting default column formatting settings in as_listing
- Applying custom formatting to specific columns in as_listing
- Applying custom formatting settings when adding a new column to a listing via add_listing_col
To learn more about how listings are constructed using the rlistings package, see the Getting Started vignette.
Default Formatting in as_listing
When creating a listing with the rlistings package, you may want to customize how content is rendered within one or more of your listing columns. In this section we will demonstrate how default formatting can be set within the as_listing function via the default_formatting parameter.
The default_formatting argument to as_listing accepts a named list of format configurations to apply within your listing. Format configurations are supplied as fmt_config objects, which contain three elements to control formatting:
- format: A format label (string) or format function to apply when rendering values (see all valid options with ?formatters::list_valid_format_labels()). Defaults to NULL.
- na_str: A string that should be displayed in place of missing values. Defaults to "NA".
- align: Alignment to use when rendering the listing column. Defaults to "center". Other options include "left", "right", "decimal", "dec_right", and "dec_left".
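As a rough sketch of how these three elements behave, the snippet below uses format_value and fmt_config from the formatters package directly, outside of any listing. The object name cfg and the values used are only illustrative.

```r
library(formatters)

# Check that "xx.xx" is among the valid one-dimensional format labels
"xx.xx" %in% list_valid_format_labels()[["1d"]]

# Render a single value with a format label
rendered <- format_value(6.46299057842479, format = "xx.xx")
rendered

# A missing value is rendered as the string given in na_str
rendered_na <- format_value(NA_real_, format = "xx.xx", na_str = "<No data>")
rendered_na

# Bundle format, na_str, and align into one format configuration
# (cfg is an illustrative name)
cfg <- fmt_config(format = "xx.xx", na_str = "<No data>", align = "decimal")
```

The cfg object can then be used as an element of the list passed to default_formatting.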
The default_formatting argument can use the same format configuration for all columns in a listing (as is the default), but it also allows you to set a different format configuration for each data class present in your listing. The list supplied to default_formatting must either contain a named element corresponding to every data class present in your listing, or include the all element with a configuration that will be applied to any data classes that are not explicitly covered.
To demonstrate, we will create a basic listing below and customize formatting using the default_formatting parameter.
We begin by loading in the rlistings package.
library(rlistings)
For this example, we will use the dummy ADAE dataset provided within the formatters package as our data frame, which consists of 48 columns of adverse event patient data, and one or more rows per patient. For the purpose of this example, we will subset the data and only use the first 15 records of the dataset. We will create some NA values in the data to showcase how NA values can be formatted, and sort the data by what will be our key columns.
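# Keep the first 15 records of the example ADAE data
adae <- ex_adae[1:15, ]
# Replace a random subset of values in each column with NA (seeded for reproducibility)
set.seed(1)
adae <- as.data.frame(lapply(adae, function(x) replace(x, sample(length(x), 0.1 * length(x)), NA)))
# Sort by the columns that will serve as key columns
adae <- adae %>% dplyr::arrange(USUBJID, AGE, TRTSDTM)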
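To see what the NA-injection step does in isolation, here is a minimal base R sketch on a small made-up data frame. The name toy and its columns are purely illustrative and are not part of the ADAE data.

```r
# Made-up data for illustration only (not part of ex_adae)
toy <- data.frame(
  id = sprintf("id-%02d", 1:10),
  age = 21:30,
  score = seq(0.5, 5, by = 0.5),
  stringsAsFactors = FALSE
)

set.seed(1)
# For each column, pick 10% of the row positions at random and set them to NA
toy_na <- as.data.frame(lapply(toy, function(x) {
  replace(x, sample(length(x), 0.1 * length(x)), NA)
}))

# Number of missing values introduced per column
colSums(is.na(toy_na))
```

Because the positions are sampled separately for each column, the missing values generally land in different rows from one column to the next.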
Now we will create a basic listing.
lsting_1 <- as_listing(
  df = adae,
  key_cols = c("USUBJID", "AGE", "TRTSDTM"),
  disp_cols = c("BMRKR1", "ASEQ", "AESEV")
)
lsting_1
#> Unique Subject Identifier Age Datetime of First Exposure to Treatment Continous Level Biomarker 1 Analysis Sequence Number Severity/Intensity
#> ———————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> AB12345-BRA-1-id-134 47 2021-06-10 13:26:53.956201 6.46299057842479 1 MODERATE
#> 6.46299057842479 3 MODERATE
#> NA 2021-06-10 13:26:53.956201 6.46299057842479 2 MODERATE
#> AB12345-BRA-1-id-141 35 2021-02-28 23:47:16.956201 7.51607612428241 2 MILD
#> 7.51607612428241 3 MILD
#> 7.51607612428241 4 MODERATE
#> 7.51607612428241 5 MILD
#> NA 6 NA
#> NA 7.51607612428241 1 MODERATE
#> AB12345-BRA-1-id-236 32 2021-08-21 18:13:25.956201 7.66300121077566 1 SEVERE
#> 7.66300121077566 2 SEVERE
#> 7.66300121077566 3 SEVERE
#> AB12345-BRA-1-id-265 25 2020-05-13 00:38:12.956201 10.323346349886 NA MODERATE
#> 10.323346349886 2 MODERATE
#> NA 47 2021-06-10 13:26:53.956201 6.46299057842479 4 MODERATE
Notice that all of the data in the table above is displayed as is, with no rounding or formatting applied. All columns are centered and all missing values are displayed as "NA".
Suppose we want to left align all of the columns in the listing and replace missing values with the string "<No data>".
This can be done by setting the all element in the list supplied to default_formatting, as shown in the following example.
default_fmt <- list(
  all = fmt_config(na_str = "<No data>", align = "left")
)
lsting_2 <- as_listing(
  df = adae,
  key_cols = c("USUBJID", "AGE", "TRTSDTM"),
  disp_cols = c("BMRKR1", "ASEQ", "AESEV"),
  default_formatting = default_fmt
)
lsting_2
#> Unique Subject Identifier Age Datetime of First Exposure to Treatment Continous Level Biomarker 1 Analysis Sequence Number Severity/Intensity
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> AB12345-BRA-1-id-134 47 2021-06-10 13:26:53.956201 6.46299057842479 1 MODERATE
#> 6.46299057842479 3 MODERATE
#> <No data> 2021-06-10 13:26:53.956201 6.46299057842479 2 MODERATE
#> AB12345-BRA-1-id-141 35 2021-02-28 23:47:16.956201 7.51607612428241 2 MILD
#> 7.51607612428241 3 MILD
#> 7.51607612428241 4 MODERATE
#> 7.51607612428241 5 MILD
#> <No data> 6 <No data>
#> <No data> 7.51607612428241 1 MODERATE
#> AB12345-BRA-1-id-236 32 2021-08-21 18:13:25.956201 7.66300121077566 1 SEVERE
#> 7.66300121077566 2 SEVERE
#> 7.66300121077566 3 SEVERE
#> AB12345-BRA-1-id-265 25 2020-05-13 00:38:12.956201 10.323346349886 <No data> MODERATE
#> 10.323346349886 2 MODERATE
#> <No data> 47 2021-06-10 13:26:53.956201 6.46299057842479 4 MODERATE
Now consider that we would like to display our numeric columns with two decimal places and then align these columns on the decimal point. This can be done by adding a "numeric" element to the default_formatting list as follows:
default_fmt <- list(
  all = fmt_config(na_str = "<No data>", align = "left"),
  numeric = fmt_config(format = "xx.xx", na_str = "<No data>", align = "decimal")
)
lsting_3 <- as_listing(
  df = adae,
  key_cols = c("USUBJID", "AGE", "TRTSDTM"),
  disp_cols = c("BMRKR1", "ASEQ", "AESEV"),
  default_formatting = default_fmt
)
lsting_3
#> Unique Subject Identifier Age Datetime of First Exposure to Treatment Continous Level Biomarker 1 Analysis Sequence Number Severity/Intensity
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> AB12345-BRA-1-id-134 47.00 2021-06-10 13:26:53.956201 6.46 1.00 MODERATE
#> 6.46 3.00 MODERATE
#> <No data> 2021-06-10 13:26:53.956201 6.46 2.00 MODERATE
#> AB12345-BRA-1-id-141 35.00 2021-02-28 23:47:16.956201 7.52 2.00 MILD
#> 7.52 3.00 MILD
#> 7.52 4.00 MODERATE
#> 7.52 5.00 MILD
#> <No data> 6.00 <No data>
#> <No data> 7.52 1.00 MODERATE
#> AB12345-BRA-1-id-236 32.00 2021-08-21 18:13:25.956201 7.66 1.00 SEVERE
#> 7.66 2.00 SEVERE
#> 7.66 3.00 SEVERE
#> AB12345-BRA-1-id-265 25.00 2020-05-13 00:38:12.956201 10.32 <No data> MODERATE
#> 10.32 2.00 MODERATE
#> <No data> 47.00 2021-06-10 13:26:53.956201 6.46 4.00 MODERATE
Along with the format strings listed by formatters::list_valid_format_labels, we can also specify a format function to allow for more customized formats in our listing. In the following example, we will define and apply a custom format function to format date (POSIXt class) columns in our listing.
# Custom format function - takes date format as input
date_fmt <- function(fmt) {
  function(x, ...) do.call(format, list(x = x, fmt))
}
default_fmt <- list(
  all = fmt_config(na_str = "<No data>", align = "left"),
  numeric = fmt_config(format = "xx.xx", na_str = "<No data>", align = "decimal"),
  POSIXt = fmt_config(format = date_fmt("%B %d, %Y @ %I:%M %p %Z"), na_str = "<No data>")
)
lsting_4 <- as_listing(
  df = adae,
  key_cols = c("USUBJID", "AGE", "TRTSDTM"),
  disp_cols = c("BMRKR1", "ASEQ", "AESEV"),
  default_formatting = default_fmt
)
lsting_4
#> Unique Subject Identifier Age Datetime of First Exposure to Treatment Continous Level Biomarker 1 Analysis Sequence Number Severity/Intensity
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> AB12345-BRA-1-id-134 47.00 June 10, 2021 @ 01:26 PM UTC 6.46 1.00 MODERATE
#> 6.46 3.00 MODERATE
#> <No data> June 10, 2021 @ 01:26 PM UTC 6.46 2.00 MODERATE
#> AB12345-BRA-1-id-141 35.00 February 28, 2021 @ 11:47 PM UTC 7.52 2.00 MILD
#> 7.52 3.00 MILD
#> 7.52 4.00 MODERATE
#> 7.52 5.00 MILD
#> <No data> 6.00 <No data>
#> <No data> 7.52 1.00 MODERATE
#> AB12345-BRA-1-id-236 32.00 August 21, 2021 @ 06:13 PM UTC 7.66 1.00 SEVERE
#> 7.66 2.00 SEVERE
#> 7.66 3.00 SEVERE
#> AB12345-BRA-1-id-265 25.00 May 13, 2020 @ 12:38 AM UTC 10.32 <No data> MODERATE
#> 10.32 2.00 MODERATE
#> <No data> 47.00 June 10, 2021 @ 01:26 PM UTC 6.46 4.00 MODERATE
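The custom format function can also be tried on its own, outside of a listing. The base R sketch below reuses the date_fmt definition from above and applies it to a single made-up timestamp. The names fmt_fun, iso_fun, and ts are illustrative, and the month name and AM/PM marker depend on the locale of the R session.

```r
# Same closure as in the vignette: returns a function that formats
# its input with the given date-time format string
date_fmt <- function(fmt) {
  function(x, ...) do.call(format, list(x = x, fmt))
}

# A single made-up timestamp in UTC (illustrative only)
ts <- as.POSIXct("2021-06-10 13:26:53", tz = "UTC")

# The format used in the listing; month name and AM/PM are locale-dependent
fmt_fun <- date_fmt("%B %d, %Y @ %I:%M %p %Z")
fmt_fun(ts)

# A locale-independent alternative using numeric fields only
iso_fun <- date_fmt("%Y-%m-%d %H:%M")
iso_fun(ts)
```

In an English locale, the first call should give the same text as the first row of the listing above, "June 10, 2021 @ 01:26 PM UTC".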
In the output above, the all format configuration, which originally applied to all columns in the listing, now only applies to the two character/factor variables (USUBJID and AESEV). This is because all other data classes in the listing are covered by other elements in the list provided to default_formatting. When format configurations are supplied to a listing, any other applicable configuration takes precedence over the all format configuration.
Column-Wise Formatting in as_listing
In this section, we will demonstrate how custom formatting can be applied on a column-by-column basis rather than to all columns of a specified data class or an entire listing at once.
Take, for example, lsting_4 created in the previous section.
lsting_4
#> Unique Subject Identifier Age Datetime of First Exposure to Treatment Continous Level Biomarker 1 Analysis Sequence Number Severity/Intensity
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> AB12345-BRA-1-id-134 47.00 June 10, 2021 @ 01:26 PM UTC 6.46 1.00 MODERATE
#> 6.46 3.00 MODERATE
#> <No data> June 10, 2021 @ 01:26 PM UTC 6.46 2.00 MODERATE
#> AB12345-BRA-1-id-141 35.00 February 28, 2021 @ 11:47 PM UTC 7.52 2.00 MILD
#> 7.52 3.00 MILD
#> 7.52 4.00 MODERATE
#> 7.52 5.00 MILD
#> <No data> 6.00 <No data>
#> <No data> 7.52 1.00 MODERATE
#> AB12345-BRA-1-id-236 32.00 August 21, 2021 @ 06:13 PM UTC 7.66 1.00 SEVERE
#> 7.66 2.00 SEVERE
#> 7.66 3.00 SEVERE
#> AB12345-BRA-1-id-265 25.00 May 13, 2020 @ 12:38 AM UTC 10.32 <No data> MODERATE
#> 10.32 2.00 MODERATE
#> <No data> 47.00 June 10, 2021 @ 01:26 PM UTC 6.46 4.00 MODERATE
This listing applies the same format configuration to all numeric columns, but in some cases this may not produce the result we want. In the above listing, the “Age” and “Analysis Sequence Number” columns contain only integer values, so we would prefer not to render these columns with two decimal places and instead apply the current numeric format configuration only to the “Continuous Level Biomarker 1” column. To do so, we make use of the col_formatting argument to as_listing. Like default_formatting, this argument takes a named list of format configurations (fmt_config objects) as input, but unlike default_formatting, the names of the list elements correspond to column names. The col_formatting argument can be used in combination with the default_formatting argument or on its own, and for any number of columns present in your listing, depending on your requirements.
See the following example, which demonstrates how col_formatting can be used with the BMRKR1 column. We will use the "xx" format and right alignment for the two remaining numeric columns.
default_fmt <- list(
  all = fmt_config(na_str = "<No data>", align = "left"),
  numeric = fmt_config(format = "xx", na_str = "<No data>", align = "right"),
  POSIXt = fmt_config(format = date_fmt("%B %d, %Y @ %I:%M %p %Z"), na_str = "<No data>")
)
col_fmt <- list(
  BMRKR1 = fmt_config(format = "xx.xx", na_str = "<No data>", align = "decimal")
)
lsting_5 <- as_listing(
  df = adae,
  key_cols = c("USUBJID", "AGE", "TRTSDTM"),
  disp_cols = c("BMRKR1", "ASEQ", "AESEV"),
  default_formatting = default_fmt,
  col_formatting = col_fmt
)
lsting_5
#> Unique Subject Identifier Age Datetime of First Exposure to Treatment Continous Level Biomarker 1 Analysis Sequence Number Severity/Intensity
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> AB12345-BRA-1-id-134 47 June 10, 2021 @ 01:26 PM UTC 6.46 1 MODERATE
#> 6.46 3 MODERATE
#> <No data> June 10, 2021 @ 01:26 PM UTC 6.46 2 MODERATE
#> AB12345-BRA-1-id-141 35 February 28, 2021 @ 11:47 PM UTC 7.52 2 MILD
#> 7.52 3 MILD
#> 7.52 4 MODERATE
#> 7.52 5 MILD
#> <No data> 6 <No data>
#> <No data> 7.52 1 MODERATE
#> AB12345-BRA-1-id-236 32 August 21, 2021 @ 06:13 PM UTC 7.66 1 SEVERE
#> 7.66 2 SEVERE
#> 7.66 3 SEVERE
#> AB12345-BRA-1-id-265 25 May 13, 2020 @ 12:38 AM UTC 10.32 <No data> MODERATE
#> 10.32 2 MODERATE
#> <No data> 47 June 10, 2021 @ 01:26 PM UTC 6.46 4 MODERATE
Now all of the columns present in our listing are formatted according to our specifications. Note that format configurations supplied to col_formatting for individual columns take precedence over any format configurations from default_formatting.
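As noted earlier, col_formatting can also be used on its own. The following is a minimal sketch of that case, assuming the unmodified ex_adae data from formatters. The object name lsting_sketch, the choice of columns, and the "xx.x" format are illustrative only. Columns not named in col_formatting fall back to the default settings.

```r
library(rlistings)

# Only BMRKR1 receives a custom configuration; no default_formatting is supplied
lsting_sketch <- as_listing(
  df = ex_adae[1:5, ],
  key_cols = c("USUBJID", "AGE"),
  disp_cols = c("BMRKR1", "AESEV"),
  col_formatting = list(
    BMRKR1 = fmt_config(format = "xx.x", align = "right")
  )
)
lsting_sketch
```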
Adding Formatted Columns to a Listing via add_listing_col
In some cases, you may want to add a new column with its own formatting settings to a pre-existing listing. In this section, we will demonstrate how this can be accomplished using the add_listing_col function. Columns added after a listing has already been created with as_listing will not inherit format configurations previously applied, so formatting for the new column must be specified within the add_listing_col function.
Instead of creating a fmt_config object, the format, na_str, and align specifications are supplied directly to add_listing_col using its parameters of the same names. If these parameters are not specified, default values of NULL, "NA", and "left" will be used as format, na_str, and align, respectively. The add_listing_col function can be used in sequence as many times as needed to add new columns to a listing.
In this example, we will add a column to lsting_5, created in the previous section. This new column will calculate the length of the analysis (in days) by subtracting “Analysis Start Relative Day” (ASTDY) from “Analysis End Relative Day” (AENDY). This can be done as follows:
lsting_6 <- lsting_5 %>%
  add_listing_col(
    name = "Length of\nAnalysis",
    fun = function(df) df$AENDY - df$ASTDY,
    format = "xx.x",
    na_str = "NE",
    align = "center"
  )
lsting_6
#> Length of
#> Unique Subject Identifier Age Datetime of First Exposure to Treatment Continous Level Biomarker 1 Analysis Sequence Number Severity/Intensity Analysis
#> —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
#> AB12345-BRA-1-id-134 47 June 10, 2021 @ 01:26 PM UTC 6.46 1 MODERATE 267.0
#> 6.46 3 MODERATE 228.0
#> <No data> June 10, 2021 @ 01:26 PM UTC 6.46 2 MODERATE 255.0
#> AB12345-BRA-1-id-141 35 February 28, 2021 @ 11:47 PM UTC 7.52 2 MILD 420.0
#> 7.52 3 MILD 23.0
#> 7.52 4 MODERATE 93.0
#> 7.52 5 MILD 43.0
#> <No data> 6 <No data> NE
#> <No data> 7.52 1 MODERATE 7.0
#> AB12345-BRA-1-id-236 32 August 21, 2021 @ 06:13 PM UTC 7.66 1 SEVERE 410.0
#> 7.66 2 SEVERE 517.0
#> 7.66 3 SEVERE 4.0
#> AB12345-BRA-1-id-265 25 May 13, 2020 @ 12:38 AM UTC 10.32 <No data> MODERATE 44.0
#> 10.32 2 MODERATE 162.0
#> <No data> 47 June 10, 2021 @ 01:26 PM UTC 6.46 4 MODERATE NE
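The function passed to fun in the example above is plain R and can be checked on its own. The base R sketch below applies the same function to a small made-up data frame; the names toy and length_fun and the values are illustrative. It shows that a missing start or end day yields NA, which the listing then displays using na_str ("NE" in the example above).

```r
# Made-up relative day values for illustration only
toy <- data.frame(
  ASTDY = c(10, 25, NA),
  AENDY = c(15, 60, 40)
)

# Same computation as the fun argument above: end day minus start day
length_fun <- function(df) df$AENDY - df$ASTDY

# One value per row; the third is NA because ASTDY is missing
length_fun(toy)
```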
Summary
In this vignette, you have learned how column formatting can be configured using the default_formatting and col_formatting arguments to as_listing and the add_listing_col function to customize how listings are rendered.
For more information, please explore the rlistings website.