Using Delayed Data Loading (Basic)
Dawid Kałędkowski
15.05.2022
Basic understanding
Delayed data loading (DDL) objects are R objects that contain
instructions on how to acquire data rather than the data itself. In
practice, you pass these DDL objects into a teal application so that
you can launch the teal app first and pull the data afterwards.
The main difference between a DDL object and a non-DDL object is when
the data becomes available. A non-DDL object holds its data as soon as
it is created. A DDL object holds no data after construction; the data
becomes available only after it is pulled, that is, after the
instructions stored in the object are executed.
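The distinction can be sketched in base R. This is only a conceptual illustration and not how teal.data implements its connectors; `make_delayed()` is a hypothetical helper that stores loading code as a string and evaluates it on demand.

```r
# Conceptual sketch in base R; NOT the teal.data implementation.
# make_delayed() is a hypothetical helper: it keeps the loading code
# and only evaluates it when pull() is called.
make_delayed <- function(code) {
  data <- NULL
  list(
    is_pulled = function() !is.null(data),
    pull = function() {
      data <<- eval(parse(text = code), envir = new.env())
      invisible(data)
    },
    get = function() {
      if (is.null(data)) stop("data has not been pulled yet")
      data
    }
  )
}

# Non-delayed: the data frame exists as soon as the object is created.
immediate <- head(iris, 3)

# Delayed: only the instructions exist until pull() is called.
delayed <- make_delayed("head(iris, 3)")
delayed$is_pulled()                  # FALSE
delayed$pull()
delayed$is_pulled()                  # TRUE
identical(delayed$get(), immediate)  # TRUE
```

Calling `get()` before `pull()` raises an error, which mirrors the idea that a DDL object has no data until its instructions are executed.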
Key concepts
- A TealDatasetConnector is an object used to pull a single delayed dataset into a teal app. Connectors to pull data from proprietary data storage are not included in this package.
library(teal.data)
library(scda)
# specialized function to create delayed data using scda (could also set keys when using cdisc_dataset_connector)
adlb <- scda_dataset_connector(
  "ADLB",
  "adlb",
  keys = get_cdisc_keys("ADLB")
)
# generalized function to create delayed data from code - see package help for other connectors
x <- code_dataset_connector(
  dataname = "ADSL",
  keys = get_cdisc_keys("ADSL"),
  code = "library(scda)\nADSL <- synthetic_cdisc_data(\"latest\")$adsl"
)
- A TealDataConnector is an object used to pull a set of delayed datasets into a teal app that all share a common connection (see Delayed Data Loading for the definition of a connection object).
# using scda
adsl <- scda_cdisc_dataset_connector(dataname = "ADSL", scda_dataname = "adsl")
adae <- scda_cdisc_dataset_connector(dataname = "ADAE", scda_dataname = "adae")
adsl_adae <- relational_data_connector(
  connection = data_connection(),
  connectors = list(adsl, adae)
)
- The cdisc_data function takes a set of TealDataConnector, TealDatasetConnector and/or cdisc_dataset objects (non-delayed datasets) to create the TealData object, which is used to create teal applications.
# create a TealDatasetConnector for ADVS
advs <- scda_cdisc_dataset_connector(dataname = "ADVS", scda_dataname = "advs")
# use cdisc_data() to create a `DDL` object
delayed_data <- cdisc_data(adsl_adae, advs)
Constructors
Below are the constructors available in teal.data for creating
non-delayed TealDataset objects, delayed TealDatasetConnector objects,
and the objects that group them:
| Class | Description | Base Constructor | Constructor Wrappers |
|---|---|---|---|
| TealDataset | Data frame with name (and optionally keys) | dataset, dataset_file | dataset, cdisc_dataset |
| TealDatasetConnector | Delayed dataset | dataset_connector, dataset_connector_file | (see note 1 below) rds_dataset_connector, script_dataset_connector, code_dataset_connector, csv_dataset_connector, fun_dataset_connector, python_dataset_connector, scda_dataset_connector |
| TealDataConnector | Group of TealDatasetConnector | | |
| TealData | Group of TealDatasetConnector, TealDataConnector, TealDataset | teal_data, teal_data_file | (see note 2 below) cdisc_data, cdisc_data_file |
Notes:
1. All xyz_dataset_connector functions have an equivalent xyz_cdisc_dataset_connector function (for example rds_cdisc_dataset_connector) which specifies additional dataset metadata.
2. cdisc_data is the standard function used to create a data object to be used within teal apps for standard CDISC study data. The more general teal_data function can be used to allow arbitrary relational data to be used within teal apps.
Dataset dependencies
The datasets passed into teal_data and
cdisc_data are pulled in the order in which they are passed. A dataset
that depends on another dataset must therefore be placed after it in
the argument list:
adsl <- scda_cdisc_dataset_connector(dataname = "ADSL", scda_dataname = "adsl")
# ADSL_2 is derived from ADSL, which is supplied through the ADSL = adsl argument
adsl_2 <- code_cdisc_dataset_connector(
  "ADSL_2",
  code = "head(ADSL, 5)",
  keys = get_cdisc_keys("ADSL"),
  ADSL = adsl
)
# launch method will be able to load the data as adsl will be pulled first
cdisc_data(adsl, adsl_2)
## A TealData object containing 2 TealDataset/TealDatasetConnector object(s) as element(s):
## --> Element 1:
## A CDISCTealDatasetConnector object, named ADSL, containing a TealDataset object that has not been loaded/pulled
## --> Element 2:
## A CDISCTealDatasetConnector object, named ADSL_2, containing a TealDataset object that has not been loaded/pulled
# launch method will not be able to load the data as adsl_2 is pulled first but it depends on adsl
cdisc_data(adsl_2, adsl)
## A TealData object containing 2 TealDataset/TealDatasetConnector object(s) as element(s):
## --> Element 1:
## A CDISCTealDatasetConnector object, named ADSL_2, containing a TealDataset object that has not been loaded/pulled
## --> Element 2:
## A CDISCTealDatasetConnector object, named ADSL, containing a TealDataset object that has not been loaded/pulled
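Why the order matters can be sketched in base R. The `pull_in_order()` helper below is hypothetical and is not part of teal.data; it simply evaluates each piece of loading code in turn and makes the result available to the code that follows. It assumes a fresh session in which no object named ADSL already exists.

```r
# Conceptual sketch in base R; pull_in_order() is a hypothetical helper,
# not a teal.data function. Each element is loading code, and each result
# is stored under the element's name so later elements can refer to it.
pull_in_order <- function(code_list) {
  env <- new.env()
  for (name in names(code_list)) {
    assign(name, eval(parse(text = code_list[[name]]), envir = env), envir = env)
  }
  as.list(env)
}

# ADSL_2 is derived from ADSL, so ADSL must come first.
ok <- pull_in_order(list(ADSL = "head(iris, 10)", ADSL_2 = "head(ADSL, 5)"))
nrow(ok$ADSL_2)  # 5

# Reversing the order fails: ADSL does not exist yet when ADSL_2 is evaluated.
wrong <- try(
  pull_in_order(list(ADSL_2 = "head(ADSL, 5)", ADSL = "head(iris, 10)")),
  silent = TRUE
)
inherits(wrong, "try-error")  # TRUE
```

As in the printed output above, constructing the objects in the wrong order raises no error by itself; in this sketch, too, the failure only appears once the data is actually pulled.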
Suggested development workflow
The following workflow facilitates building teal apps
with DDL by minimizing debugging overhead.
- Run a teal app configured without delayed data to verify that the app starts as expected.
- Replace the cdisc_dataset functions with the appropriate TealDatasetConnector objects.
- Add preprocessing code (see Delayed Data Advanced for preprocessing documentation) and verify once again by running the newly created object’s $launch method.
- First, include just a single module to verify that the teal app starts and that it loads all the expected data.
- Then, add the rest of the modules one by one, replacing all dataset calls with strings, and iteratively verify that the app functions as expected.
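The first two steps of this workflow might look roughly as follows. This is a minimal sketch that assumes the teal.data version described in this document and uses only functions named above; the variable names `adsl_df`, `data_immediate` and `data_delayed` are chosen for illustration only.

```r
library(teal.data)
library(scda)

# Step 1: non-delayed data, available immediately after construction.
adsl_df <- synthetic_cdisc_data("latest")$adsl
data_immediate <- cdisc_data(cdisc_dataset("ADSL", adsl_df))

# Step 2: the same dataset expressed as a connector.
# Nothing is loaded until the data is pulled.
data_delayed <- cdisc_data(
  scda_cdisc_dataset_connector(dataname = "ADSL", scda_dataname = "adsl")
)
```

Keeping both objects side by side while developing makes it easier to tell whether a problem comes from the app configuration or from the data loading instructions.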