Overview

Once your delayed data object has been created as described in Delayed Data Objects, teal.data provides a set of functions for examining the object outside of a shiny application, for example from the global environment of an interactive R session. The complete list of these functions, by object class, is given below; a dash means the function is not available for that class:

| Action | TealDataset | TealDatasetConnector | TealDataConnector & TealData |
|---|---|---|---|
| Get Reproducible Code (Optionally Deparsed) | get_code | get_code | get_code |
| Get data.frame | get_raw_data | get_raw_data | get_raw_data |
| Get Dataset Name | get_dataname | get_dataname | get_dataname |
| Get Single Dataset Object | get_dataset | get_dataset | get_dataset |
| Get All Dataset Objects | - | - | get_datasets |
| Load Data | - | load_dataset | load_datasets |
| Check if Loaded | - | is_pulled | is_pulled |
| Mutate Single Dataset | mutate_dataset | mutate_dataset | mutate_dataset |
| Mutate All Datasets | - | - | mutate_data |

The most basic function get_dataname returns the name of the dataset or datasets in your delayed data object:

library(scda)
## 
library(teal.data)
## Loading required package: shiny
adsl_cf <- callable_function(function() synthetic_cdisc_data("latest")$adsl)
adsl <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = adsl_cf,
  keys = get_cdisc_keys("ADSL")
)
get_dataname(adsl) # "ADSL"
## [1] "ADSL"
adae_cf <- callable_function(function() synthetic_cdisc_data("latest")$adae)
adae <- cdisc_dataset_connector(
  dataname = "ADAE",
  pull_callable = adae_cf,
  keys = get_cdisc_keys("ADAE")
)
delayed_data <- cdisc_data(adsl, adae)
get_dataname(delayed_data) # "ADSL" "ADAE"
## [1] "ADSL" "ADAE"

Each of the delayed data objects described above also has a launch method, which can be used to test the data loading screen:

if (interactive()) {
  delayed_data$launch()
}

There is also a pull method to test that the data can be loaded without launching a shiny app. See Delayed Data Loading.

Alternatively, teal.data provides two functions: load_dataset, for <...>Dataset<...> objects, pulls the data without launching the delayed loading screen, while load_datasets, for <...>Data<...> objects, launches the delayed loading screen and uses it to pull the datasets from the connection.

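After loading the data, use the is_pulled function to check that it has been pulled successfully. In the example below, load_datasets runs only in an interactive session, so in a non-interactive run is_pulled returns FALSE: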

if (interactive()) {
  load_datasets(delayed_data)
}
is_pulled(delayed_data)
## [1] FALSE
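
For a single connector, the same check can be sketched without the loading screen by calling load_dataset directly. This is a minimal sketch, assuming the teal.data version documented here and the scda synthetic data used above; the object name adsl_check is hypothetical and is kept separate so the objects defined earlier are left untouched. One would expect the flag to change from FALSE to TRUE.

```r
library(scda)
library(teal.data)

# Hypothetical connector used only for this check
adsl_check <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = callable_function(function() synthetic_cdisc_data("latest")$adsl),
  keys = get_cdisc_keys("ADSL")
)

pulled_before <- is_pulled(adsl_check) # state before any data has been pulled
load_dataset(adsl_check)               # pulls the data without the loading screen
pulled_after <- is_pulled(adsl_check)  # state after the pull

c(before = pulled_before, after = pulled_after)
```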

Aside: Loading page UI

It is possible to set default values of the boxes on the loading page using the set_ui_input method:

adae$set_ui_input(function(ns) {
  # pickerInput() is exported by the shinyWidgets package
  list(shinyWidgets::pickerInput(
    "name", label = "Version of the dataset",
    choices = ls_synthetic_cdisc_data(), selected = "latest"
  ))
})

Testing data loading continued

Once the data are loaded, it’s also possible to access the individual dataset objects using the get_dataset function, or for <...>Data<...> objects, retrieve all dataset objects using the get_datasets function:

lapply(delayed_data$get_items(), function(item) item$pull())

# return a particular dataset by name
get_dataset(delayed_data, dataname = "ADSL")

# or return all datasets
load_datasets(delayed_data)
get_datasets(delayed_data)
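
As a sketch of what can be done with the result of get_datasets, the dataset objects can be iterated over to build a quick summary. This example assumes the teal.data version documented here and the scda synthetic data; the names adsl_conn, adae_conn, demo_data and row_counts are hypothetical, and the connectors are pulled with the same pull method used above.

```r
library(scda)
library(teal.data)

# Hypothetical connectors and data object used only for this summary
adsl_conn <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = callable_function(function() synthetic_cdisc_data("latest")$adsl),
  keys = get_cdisc_keys("ADSL")
)
adae_conn <- cdisc_dataset_connector(
  dataname = "ADAE",
  pull_callable = callable_function(function() synthetic_cdisc_data("latest")$adae),
  keys = get_cdisc_keys("ADAE")
)
demo_data <- cdisc_data(adsl_conn, adae_conn)

# Pull every connector without launching the loading screen
invisible(lapply(demo_data$get_items(), function(item) item$pull()))

# Count the rows of each dataset and label the counts with the dataset names
datasets <- get_datasets(demo_data)
row_counts <- vapply(datasets, function(ds) nrow(get_raw_data(ds)), integer(1))
names(row_counts) <- vapply(datasets, get_dataname, character(1))
row_counts
```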

Note that when a connector is loaded, the result is a dataset object:

# "CDISCTealDatasetConnector" "TealDatasetConnector" "R6"
class(adsl)
## [1] "CDISCTealDatasetConnector" "TealDatasetConnector"     
## [3] "R6"
# "CDISCTealDataset" "TealDataset" "R6"
class(get_dataset(adsl))
## [1] "CDISCTealDataset" "TealDataset"      "R6"

To view the raw dataframe object, use the get_raw_data function:

# for a single <...>Dataset<...> object
head(get_raw_data(adsl), 3)
##   STUDYID               USUBJID SUBJID SITEID AGE  AGEU SEX
## 1 AB12345  AB12345-CHN-3-id-128 id-128  CHN-3  32 YEARS   M
## 2 AB12345 AB12345-CHN-15-id-262 id-262 CHN-15  35 YEARS   M
## 3 AB12345  AB12345-RUS-3-id-378 id-378  RUS-3  30 YEARS   F
##                        RACE                 ETHNIC COUNTRY DTHFL         INVID
## 1                     ASIAN NOT HISPANIC OR LATINO     CHN     N  INV ID CHN-3
## 2 BLACK OR AFRICAN AMERICAN NOT HISPANIC OR LATINO     CHN     N INV ID CHN-15
## 3                     ASIAN NOT HISPANIC OR LATINO     RUS     N  INV ID RUS-3
##           INVNAM            ARM ARMCD         ACTARM ACTARMCD         TRT01P
## 1  Dr. CHN-3 Doe      A: Drug X ARM A      A: Drug X    ARM A      A: Drug X
## 2 Dr. CHN-15 Doe C: Combination ARM C C: Combination    ARM C C: Combination
## 3  Dr. RUS-3 Doe C: Combination ARM C C: Combination    ARM C C: Combination
##           TRT01A REGION1 STRATA1 STRATA2    BMRKR1 BMRKR2 ITTFL SAFFL BMEASIFL
## 1      A: Drug X    Asia       C      S2 14.424934 MEDIUM     Y     Y        Y
## 2 C: Combination    Asia       C      S1  4.055463    LOW     Y     Y        N
## 3 C: Combination Eurasia       A      S1  2.803240   HIGH     Y     Y        Y
##   BEP01FL     RANDDT             TRTSDTM             TRTEDTM    EOSSTT
## 1       Y 2019-02-22 2019-02-24 11:09:18 2021-02-23 22:47:42 COMPLETED
## 2       N 2019-02-26 2019-02-26 09:05:00 2021-02-25 20:43:24 COMPLETED
## 3       N 2019-02-24 2019-02-28 03:19:08 2021-02-27 14:57:32 COMPLETED
##      EOTSTT      EOSDT EOSDY DCSREAS DTHDT DTHCAUS DTHCAT LDDTHELD LDDTHGR1
## 1 COMPLETED 2021-02-23   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
## 2 COMPLETED 2021-02-25   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
## 3 COMPLETED 2021-02-27   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
##     LSTALVDT DTHADY study_duration_secs
## 1 2021-03-05     NA            63113904
## 2 2021-03-15     NA            63113904
## 3 2021-03-15     NA            63113904
# or for a <...>Data<...> object containing multiple datasets, specify the name of the dataset of interest
raw <- get_raw_data(delayed_data, "ADSL")
head(raw, 3)
##   STUDYID               USUBJID SUBJID SITEID AGE  AGEU SEX
## 1 AB12345  AB12345-CHN-3-id-128 id-128  CHN-3  32 YEARS   M
## 2 AB12345 AB12345-CHN-15-id-262 id-262 CHN-15  35 YEARS   M
## 3 AB12345  AB12345-RUS-3-id-378 id-378  RUS-3  30 YEARS   F
##                        RACE                 ETHNIC COUNTRY DTHFL         INVID
## 1                     ASIAN NOT HISPANIC OR LATINO     CHN     N  INV ID CHN-3
## 2 BLACK OR AFRICAN AMERICAN NOT HISPANIC OR LATINO     CHN     N INV ID CHN-15
## 3                     ASIAN NOT HISPANIC OR LATINO     RUS     N  INV ID RUS-3
##           INVNAM            ARM ARMCD         ACTARM ACTARMCD         TRT01P
## 1  Dr. CHN-3 Doe      A: Drug X ARM A      A: Drug X    ARM A      A: Drug X
## 2 Dr. CHN-15 Doe C: Combination ARM C C: Combination    ARM C C: Combination
## 3  Dr. RUS-3 Doe C: Combination ARM C C: Combination    ARM C C: Combination
##           TRT01A REGION1 STRATA1 STRATA2    BMRKR1 BMRKR2 ITTFL SAFFL BMEASIFL
## 1      A: Drug X    Asia       C      S2 14.424934 MEDIUM     Y     Y        Y
## 2 C: Combination    Asia       C      S1  4.055463    LOW     Y     Y        N
## 3 C: Combination Eurasia       A      S1  2.803240   HIGH     Y     Y        Y
##   BEP01FL     RANDDT             TRTSDTM             TRTEDTM    EOSSTT
## 1       Y 2019-02-22 2019-02-24 11:09:18 2021-02-23 22:47:42 COMPLETED
## 2       N 2019-02-26 2019-02-26 09:05:00 2021-02-25 20:43:24 COMPLETED
## 3       N 2019-02-24 2019-02-28 03:19:08 2021-02-27 14:57:32 COMPLETED
##      EOTSTT      EOSDT EOSDY DCSREAS DTHDT DTHCAUS DTHCAT LDDTHELD LDDTHGR1
## 1 COMPLETED 2021-02-23   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
## 2 COMPLETED 2021-02-25   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
## 3 COMPLETED 2021-02-27   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
##     LSTALVDT DTHADY study_duration_secs
## 1 2021-03-05     NA            63113904
## 2 2021-03-15     NA            63113904
## 3 2021-03-15     NA            63113904
# note that the raw data is now a regular R data frame (here a tibble)
class(raw)
## [1] "tbl_df"     "tbl"        "data.frame"

Call the get_code function to check that the processing code is as expected, and to obtain the code needed to reproduce the data:

get_code(delayed_data)
## [1] "ADSL <- (function() synthetic_cdisc_data(\"latest\")$adsl)()\nADAE <- (function() synthetic_cdisc_data(\"latest\")$adae)()"

See the section on pre-processing Delayed Data for how to specify additional code that transforms your delayed data; this code is also added to the output of get_code.
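
The effect of pre-processing code on get_code can be sketched as follows. This assumes the teal.data version documented here and the scda synthetic data; the object name adsl_mut and the derived column AGE_MONTHS are hypothetical and serve only as an illustration.

```r
library(scda)
library(teal.data)

# Hypothetical connector used only for this illustration
adsl_mut <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = callable_function(function() synthetic_cdisc_data("latest")$adsl),
  keys = get_cdisc_keys("ADSL")
)

# Register a transformation (AGE_MONTHS is a made-up column), then pull the data
mutate_dataset(adsl_mut, code = "ADSL$AGE_MONTHS <- ADSL$AGE * 12")
load_dataset(adsl_mut)

# The returned string should contain both the pull call and the mutate code
code <- get_code(adsl_mut)
cat(code)

# The derived column should also be present in the raw data
has_column <- "AGE_MONTHS" %in% names(get_raw_data(adsl_mut))
has_column
```

Printing with cat shows the code with real line breaks instead of the escaped \n characters seen in the raw string.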

Aside: Piping functions

The examples above called each function separately, but there is a natural sequence to the loading and inspection of a delayed data object. For this reason, the magrittr pipe %>% works well for many pre-processing tasks.

library(teal.data)
library(scda)
library(magrittr)

adsl_cf <- callable_function(function() synthetic_cdisc_data("latest")$adsl)
cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = adsl_cf,
  keys = get_cdisc_keys("ADSL")
) %>%
  mutate_dataset("ADSL$TRTDUR <- round(as.numeric(ADSL$TRTEDTM - ADSL$TRTSDTM), 1)") %>%
  load_dataset() %>%
  get_raw_data() %>%
  head(n = 3)
##   STUDYID               USUBJID SUBJID SITEID AGE  AGEU SEX
## 1 AB12345  AB12345-CHN-3-id-128 id-128  CHN-3  32 YEARS   M
## 2 AB12345 AB12345-CHN-15-id-262 id-262 CHN-15  35 YEARS   M
## 3 AB12345  AB12345-RUS-3-id-378 id-378  RUS-3  30 YEARS   F
##                        RACE                 ETHNIC COUNTRY DTHFL         INVID
## 1                     ASIAN NOT HISPANIC OR LATINO     CHN     N  INV ID CHN-3
## 2 BLACK OR AFRICAN AMERICAN NOT HISPANIC OR LATINO     CHN     N INV ID CHN-15
## 3                     ASIAN NOT HISPANIC OR LATINO     RUS     N  INV ID RUS-3
##           INVNAM            ARM ARMCD         ACTARM ACTARMCD         TRT01P
## 1  Dr. CHN-3 Doe      A: Drug X ARM A      A: Drug X    ARM A      A: Drug X
## 2 Dr. CHN-15 Doe C: Combination ARM C C: Combination    ARM C C: Combination
## 3  Dr. RUS-3 Doe C: Combination ARM C C: Combination    ARM C C: Combination
##           TRT01A REGION1 STRATA1 STRATA2    BMRKR1 BMRKR2 ITTFL SAFFL BMEASIFL
## 1      A: Drug X    Asia       C      S2 14.424934 MEDIUM     Y     Y        Y
## 2 C: Combination    Asia       C      S1  4.055463    LOW     Y     Y        N
## 3 C: Combination Eurasia       A      S1  2.803240   HIGH     Y     Y        Y
##   BEP01FL     RANDDT             TRTSDTM             TRTEDTM    EOSSTT
## 1       Y 2019-02-22 2019-02-24 11:09:18 2021-02-23 22:47:42 COMPLETED
## 2       N 2019-02-26 2019-02-26 09:05:00 2021-02-25 20:43:24 COMPLETED
## 3       N 2019-02-24 2019-02-28 03:19:08 2021-02-27 14:57:32 COMPLETED
##      EOTSTT      EOSDT EOSDY DCSREAS DTHDT DTHCAUS DTHCAT LDDTHELD LDDTHGR1
## 1 COMPLETED 2021-02-23   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
## 2 COMPLETED 2021-02-25   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
## 3 COMPLETED 2021-02-27   731    <NA>  <NA>    <NA>   <NA>       NA     <NA>
##     LSTALVDT DTHADY study_duration_secs TRTDUR
## 1 2021-03-05     NA            63113904  730.5
## 2 2021-03-15     NA            63113904  730.5
## 3 2021-03-15     NA            63113904  730.5

These objects are R6 objects, so functions such as mutate_dataset and load_dataset modify the object that is given to them rather than a copy. There is therefore no need to assign the result.
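
This in-place behaviour can be checked with a short sketch, assuming the teal.data version documented here and the scda synthetic data. The object name adsl_ref is hypothetical. The result of the pipe is deliberately not assigned, yet the original object should report that it has been pulled.

```r
library(scda)
library(teal.data)
library(magrittr)

# Hypothetical connector used only for this check
adsl_ref <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = callable_function(function() synthetic_cdisc_data("latest")$adsl),
  keys = get_cdisc_keys("ADSL")
)

# The pipe result is discarded; adsl_ref itself is modified
invisible(
  adsl_ref %>%
    mutate_dataset("ADSL$TRTDUR <- round(as.numeric(ADSL$TRTEDTM - ADSL$TRTSDTM), 1)") %>%
    load_dataset()
)

pulled <- is_pulled(adsl_ref)
has_trtdur <- "TRTDUR" %in% names(get_raw_data(adsl_ref))
c(pulled = pulled, has_trtdur = has_trtdur)
```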

For an introduction to pipes, refer to the documentation for %>% or other resources on pipes.