Using Delayed Data Loading (Basic)
Dawid Kałędkowski
15.05.2022
using-delayed-data-basic.Rmd
Basic understanding
Delayed data loading (DDL) objects are R objects that contain instructions on how to acquire data. In practice, you pass these DDL objects, with their instructions, into a teal application so that you can launch the teal app first and pull the data afterwards.
The main difference between a DDL object and a non-DDL object is that data in a non-DDL object is available immediately after the object is created. In contrast, data in a DDL object is not available after construction, only after pulling it (executing the instructions stored in the object).
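The distinction can be illustrated without teal at all. The following is a minimal base R sketch, not the teal.data implementation: make_delayed and its pull, is_pulled and get_data elements are hypothetical names chosen for this illustration. It stores the instructions as an unevaluated expression and only runs them when the data is pulled.

```r
# Non-delayed: the data exists as soon as the object is created.
immediate <- list(name = "IRIS_HEAD", data = head(iris, 3))

# Hypothetical helper (not part of teal.data): keep the instructions as an
# unevaluated expression and record whether they have been executed.
make_delayed <- function(name, instructions) {
  data <- NULL
  pulled <- FALSE
  list(
    name = name,
    is_pulled = function() pulled,
    pull = function() {
      # Execute the stored instructions only now, on request.
      data <<- eval(instructions, envir = new.env(parent = globalenv()))
      pulled <<- TRUE
      invisible(TRUE)
    },
    get_data = function() {
      if (!pulled) stop("data for '", name, "' has not been pulled yet")
      data
    }
  )
}

delayed <- make_delayed("IRIS_HEAD", quote(head(iris, 3)))

nrow(immediate$data)      # data is already there
delayed$is_pulled()       # only the instructions exist so far
delayed$pull()            # run the instructions
nrow(delayed$get_data())  # data is available after pulling
```

Calling get_data before pull raises an error in this sketch, which mirrors the idea that a delayed object holds no data until its instructions have been executed.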
Key concepts
- A TealDatasetConnector is an object used to pull a single delayed data set into a teal app. Connectors to pull data from proprietary data storage are not included in this package.
library(teal.data)
library(scda)

# specialized function to create delayed data using scda
# (the scda_cdisc_dataset_connector variant could be used instead of setting keys explicitly)
adlb <- scda_dataset_connector(
  "ADLB",
  "adlb",
  keys = get_cdisc_keys("ADLB")
)

# generalized function to create delayed data from code - see package help for other connectors
x <- code_dataset_connector(
  dataname = "ADSL",
  keys = get_cdisc_keys("ADSL"),
  code = "library(scda)\nADSL <- synthetic_cdisc_data(\"latest\")$adsl"
)
- A TealDataConnector is an object used to pull into a teal app a set of delayed data sets that all share a common connection (see Delayed Data Loading for the definition of a connection object).
# using scda
adsl <- scda_cdisc_dataset_connector(dataname = "ADSL", scda_dataname = "adsl")
adae <- scda_cdisc_dataset_connector(dataname = "ADAE", scda_dataname = "adae")
adsl_adae <- relational_data_connector(
  connection = data_connection(),
  connectors = list(adsl, adae)
)
- The cdisc_data function takes a set of TealDataConnector, TealDatasetConnector and/or cdisc_dataset objects (non-delayed datasets) to create the TealData object, which is used to create teal applications.
# create a TealDatasetConnector for ADVS
advs <- scda_cdisc_dataset_connector(dataname = "ADVS", scda_dataname = "advs")
# use cdisc_data() to create a `DDL` object
delayed_data <- cdisc_data(adsl_adae, advs)
Constructors
Below is a list of all the constructors available in teal.data to create TealDataset and delayed TealDatasetConnector objects:
Class | Description | Base Constructor | Constructor Wrappers
---|---|---|---
TealDataset | Dataframe with name (and optionally keys) | dataset, dataset_file | dataset, cdisc_dataset
TealDatasetConnector | Delayed Dataset | dataset_connector, dataset_connector_file | (see note 1 below) rds_dataset_connector, script_dataset_connector, code_dataset_connector, csv_dataset_connector, fun_dataset_connector, python_dataset_connector, scda_dataset_connector
TealDataConnector | Group of TealDatasetConnector | |
TealData | Group of TealDatasetConnector, TealDataConnector, TealDataset | teal_data, teal_data_file | (see note 2 below) cdisc_data, cdisc_data_file
Notes:
1. All xyz_dataset_connector functions have an equivalent xyz_cdisc_dataset_connector function (for example rds_cdisc_dataset_connector) which specifies additional dataset metadata.
2. cdisc_data is the standard function used to create a data object to be used within teal apps for standard CDISC study data. The more general teal_data function can be used to allow arbitrary relational data to be used within teal apps.
Dataset dependencies
The datasets passed into teal_data and cdisc_data are pulled in the order in which they are supplied. If a dataset depends on other datasets being available, it should therefore be placed later in the argument list than the datasets it depends on:
adsl <- scda_cdisc_dataset_connector(dataname = "ADSL", scda_dataname = "adsl")
adsl_2 <- code_cdisc_dataset_connector(
  "ADSL_2",
  code = "head(ADSL, 5)",
  keys = get_cdisc_keys("ADSL"),
  ADSL = adsl
)
# launch method will be able to load the data as adsl will be pulled first
cdisc_data(adsl, adsl_2)
## A CDISCTealData object containing 2 TealDataset/TealDatasetConnector object(s) as element(s):
## --> Element 1:
## A CDISCTealDatasetConnector object, named ADSL, containing a TealDataset object that has not been loaded/pulled
## --> Element 2:
## A CDISCTealDatasetConnector object, named ADSL_2, containing a TealDataset object that has not been loaded/pulled
# launch method will not be able to load the data as adsl_2 is pulled first but it depends on adsl
cdisc_data(adsl_2, adsl)
## A CDISCTealData object containing 2 TealDataset/TealDatasetConnector object(s) as element(s):
## --> Element 1:
## A CDISCTealDatasetConnector object, named ADSL_2, containing a TealDataset object that has not been loaded/pulled
## --> Element 2:
## A CDISCTealDatasetConnector object, named ADSL, containing a TealDataset object that has not been loaded/pulled
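Why the order matters can be sketched in base R. This is an illustration under stated assumptions rather than how teal.data pulls data internally: pull_in_order is a hypothetical helper that evaluates each dataset's code in a shared environment, in the order supplied, and a small made-up data frame stands in for ADSL.

```r
# Hypothetical helper (not part of teal.data): evaluate each dataset's code
# in turn, in a shared environment, in the order the datasets are supplied.
pull_in_order <- function(code_by_name) {
  env <- new.env(parent = globalenv())
  for (dataname in names(code_by_name)) {
    value <- eval(parse(text = code_by_name[[dataname]]), envir = env)
    assign(dataname, value, envir = env)
  }
  mget(names(code_by_name), envir = env)
}

# Made-up stand-in for ADSL, plus a dataset derived from it.
adsl_code <- "data.frame(USUBJID = paste0('S', 1:10), AGE = 31:40)"
adsl_2_code <- "head(ADSL, 5)"

# ADSL is pulled first, so the code for ADSL_2 can use it.
ok <- pull_in_order(list(ADSL = adsl_code, ADSL_2 = adsl_2_code))
nrow(ok$ADSL_2)

# ADSL_2 is pulled first and fails because ADSL does not exist yet.
failed <- tryCatch(
  pull_in_order(list(ADSL_2 = adsl_2_code, ADSL = adsl_code)),
  error = function(e) conditionMessage(e)
)
failed
```

In the second call the error message is captured instead of stopping the script, so both orderings can be compared side by side.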
Suggested development workflow
The following workflow facilitates building teal apps with DDL by minimizing debugging overhead.
- Run a teal app configured without delayed data to verify that the app starts as expected.
- Replace the cdisc_dataset functions with the appropriate TealDatasetConnector objects.
- Add preprocessing code (see Delayed Data Advanced for preprocessing documentation) and verify once again by running the newly created object’s $launch method.
- First, include just a single module to verify that the teal app starts and that it loads all the expected data.
- Then, add the rest of the modules one by one, replacing all dataset calls with strings and iteratively verifying that the app functions as expected.