Joining datasets is an essential step when working with relational datasets.
To support this, two functions are provided depending on how to
process the data_extract_spec object:
-
merge_expression_modulecan be used when there is no need to process the list ofdata_extract_spec. This function reads the data and the list ofdata_extract_specobjects and applies the merging. Essentially, it serves as a wrapper that combinesdata_extract_multiple_srv()andmerge_expression_srv(). -
merge_expression_srvanddata_extract_multiple_srvcan be used in scenarios where additional processing of the list ofdata_extract_specis necessary ordata_extract_srv()to customize theselector_listinput.
The following sections provide examples for both scenarios.
merge_expression_module
Using merge_expression_module alone requires a list of
data_extract_spec objects for the data_extract
argument, a list of reactive or non-reactive data.frame
objects, and a list of join keys corresponding to each
data.frame object.
Step 1/5 - Preparing the Data
library(teal.transform)
library(teal.data)
#> Loading required package: teal.code
library(shiny)
# Define data.frame objects
ADSL <- teal.data::rADSL
ADTTE <- teal.data::rADTTE
# create a list of reactive data.frame objects
datasets <- list(
ADSL = reactive(ADSL),
ADTTE = reactive(ADTTE)
)
# create join_keys
join_keys <- join_keys(
join_key("ADSL", "ADSL", c("STUDYID", "USUBJID")),
join_key("ADSL", "ADTTE", c("STUDYID", "USUBJID")),
join_key("ADTTE", "ADTTE", c("STUDYID", "USUBJID", "PARAMCD"))
)Step 2/5 - Creating the Data Extracts
adsl_extract <- data_extract_spec(
dataname = "ADSL",
select = select_spec(
label = "Select variable:",
choices = c("AGE", "BMRKR1"),
selected = "AGE",
multiple = TRUE,
fixed = FALSE
)
)
adtte_extract <- data_extract_spec(
dataname = "ADTTE",
select = select_spec(
choices = c("AVAL", "ASEQ"),
selected = "AVAL",
multiple = TRUE,
fixed = FALSE
)
)
data_extracts <- list(adsl_extract = adsl_extract, adtte_extract = adtte_extract)Step 3/5 - Creating the UI
merge_ui <- function(id, data_extracts) {
ns <- NS(id)
sidebarLayout(
sidebarPanel(
h3("Encoding"),
tags$div(
data_extract_ui(
ns("adsl_extract"), # must correspond with data_extracts list names
label = "ADSL extract",
data_extracts[[1]]
),
data_extract_ui(
ns("adtte_extract"), # must correspond with data_extracts list names
label = "ADTTE extract",
data_extracts[[2]]
)
)
),
mainPanel(
h3("Output"),
verbatimTextOutput(ns("expr")),
dataTableOutput(ns("data"))
)
)
}Step 4/5 - Creating the Server Logic
merge_srv <- function(id, datasets, data_extracts, join_keys) {
moduleServer(id, function(input, output, session) {
merged_data <- merge_expression_module(
data_extract = data_extracts,
datasets = datasets,
join_keys = join_keys,
merge_function = "dplyr::left_join"
)
ANL <- reactive({
data_list <- lapply(datasets, function(ds) ds())
eval(envir = list2env(data_list), expr = as.expression(merged_data()$expr))
})
output$expr <- renderText(paste(merged_data()$expr, collapse = "\n"))
output$data <- renderDataTable(ANL())
})
}
data_extract_multiple_srv +
merge_expression_srv
In the scenario above, if the user deselects the ADTTE
variable, the merging between ADTTE and ADSL
would still occur, even though ADTTE is not used or needed.
Here, the developer might update the selector_list input in
a reactive manner so that it gets updated based on conditions set by the
developer. Below, we reuse the input from above and update the app
server so that the adtte_extract is removed from the
selector_list input when no ADTTE variable is selected. The
reactive_selector_list is then passed to
merge_expression_srv:
Modifying the Server Logic
merge_srv <- function(id, datasets, data_extracts, join_keys) {
moduleServer(id, function(input, output, session) {
selector_list <- data_extract_multiple_srv(data_extracts, datasets, join_keys)
reactive_selector_list <- reactive({
if (is.null(selector_list()$adtte_extract) || length(selector_list()$adtte_extract()$select) == 0) {
selector_list()[names(selector_list()) != "adtte_extract"]
} else {
selector_list()
}
})
merged_data <- merge_expression_srv(
selector_list = reactive_selector_list,
datasets = datasets,
join_keys = join_keys,
merge_function = "dplyr::left_join"
)
ANL <- reactive({
data_list <- lapply(datasets, function(ds) ds())
eval(envir = list2env(data_list), expr = as.expression(merged_data()$expr))
})
output$expr <- renderText(paste(merged_data()$expr, collapse = "\n"))
output$data <- renderDataTable(ANL())
})
}Updating the shiny app
shinyApp(
ui = fluidPage(merge_ui("data_merge", data_extracts)),
server = function(input, output, session) {
merge_srv("data_merge", datasets, data_extracts, join_keys)
}
)merge_expression_module is replaced here with three
parts:
-
selector_list: the output ofdata_extract_multiple_srv, which loops over the list ofdata_extractgiven and runsdata_extract_srvfor each one, returning a list of reactive objects. -
reactive_selector_list: an intermediate reactive list updatingselector_listcontent. -
merged_data: the output ofmerge_expression_srvusingreactive_selector_listas input.
Output from merging
Both merge functions, merge_expression_srv and
merge_expression_module, return a reactive object which
contains a list of the following elements:
-
expr: code needed to replicate merged dataset. -
columns_source: list of columns selected per selector. -
keys: the keys of the merged dataset. -
filter_info: filters that are applied on the data.
These elements can be further used inside the server to retrieve and use information about the selections, data, filters, etc.
Merging of non CDISC datasets
General datasets do not have the same relationships as
CDISC datasets, so these relationships must be specified
using the join_keys functions. For more information, please
refer to the Join Keys vignette.
The data merge module respects the relationships given by the user. In
the case of multiple datasets to merge, the order is specified by the
order of elements in the data_extract argument of the
merge_expression_module function. Merging groups of
datasets with complex relationships can quickly become challenging to
specify so please take extra care when setting this up.