Pre-processing data
NEST CoreDev
2022-07-26
preprocessing-data.Rmd
Data preprocessing refers to the code that contains:
- Data import calls
- Data modification
Including the preprocessing code is an important step towards reproducibility, and it is handled by teal.data functions.
In the following example, only the line that reads the data, adsl <- readRDS("adsl.rds"), is considered preprocessing code:
library(teal.data)
saveRDS(example_cdisc_data("ADSL"), "adsl.rds")
## preprocessing -------------------
adsl <- readRDS("adsl.rds")
## -------------------
data <- cdisc_data(cdisc_dataset("ADSL", adsl))
data$get_code()
## [1] ""
If you run the example above, the get_code method returns an empty string, reflecting that the preprocessing is empty. To show the preprocessing code correctly, the code argument of the cdisc_data function needs to be specified.
For the above example this would be:
library(teal.data)
saveRDS(example_cdisc_data("ADSL"), "adsl.rds")
## preprocessing -------------------
adsl <- readRDS("adsl.rds")
## -------------------
unlink("adsl.rds")
data <- cdisc_data(
  cdisc_dataset("ADSL", adsl),
  code = 'ADSL <- readRDS("adsl.rds")'
)
data$get_code()
## [1] "ADSL <- readRDS(\"adsl.rds\")"
The code used to get the ADSL dataset is returned as expected. The data object can then be passed to the data argument of teal::init to ensure reproducibility in teal apps.
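To illustrate what reproducibility means here, the following is a minimal base R sketch of the idea that a preprocessing code string, evaluated on its own, should recreate the dataset. It does not use teal.data and is not how the package implements this. The helper run_preprocessing, the temporary file, and the small stand-in data frame are all hypothetical and introduced only for illustration. The code string covers both kinds of preprocessing listed above: a data import call and a data modification.

```r
# Hypothetical helper (not part of teal.data): evaluate a preprocessing
# code string in a fresh environment and return the named object.
run_preprocessing <- function(code, dataname) {
  env <- new.env(parent = globalenv())
  eval(parse(text = code), envir = env)
  get(dataname, envir = env, inherits = FALSE)
}

# Stand-in data written to a temporary file (an assumption for this sketch;
# a real app would read its own data file).
path <- tempfile(fileext = ".rds")
saveRDS(data.frame(USUBJID = c("01", "02", "03"), AGE = c(34, 51, 47)), path)

# Preprocessing code as a string: a data import call and a data modification.
code <- paste(
  sprintf("ADSL <- readRDS(%s)", deparse(path)),
  "ADSL$AGE_GROUP <- ifelse(ADSL$AGE >= 50, \">=50\", \"<50\")",
  sep = "\n"
)

# The same preprocessing, run directly in the session.
ADSL <- readRDS(path)
ADSL$AGE_GROUP <- ifelse(ADSL$AGE >= 50, ">=50", "<50")

# The code string alone should recreate an identical object.
reproduced <- run_preprocessing(code, "ADSL")
is_reproducible <- identical(ADSL, reproduced)

unlink(path)
```

If the code string omitted the modification step, the comparison would fail, which is the kind of mismatch that supplying complete preprocessing code is meant to avoid.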