Encode Categorical Missing Values in a DM Object
dm_explicit_na.Rd
Usage
dm_explicit_na(
data,
omit_tables = NULL,
omit_columns = NULL,
char_as_factor = TRUE,
logical_as_factor = FALSE,
na_level = "<Missing>"
)
Arguments
- data
  (dm) object to be transformed.
- omit_tables
  (character) the names of the tables to omit from processing.
- omit_columns
  (character) the names of the columns to omit from processing.
- char_as_factor
  (logical) should character columns be transformed into factors.
- logical_as_factor
  (logical) should logical columns be transformed into factors.
- na_level
  (character) the label used to encode missing values.
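To illustrate how omit_tables and omit_columns could interact, the following base R sketch uses a plain named list of data frames as a stand-in for a dm object. The helper encode_tables and the sample tables are hypothetical and are not part of the package. The sketch assumes that omit_columns applies to every table that is processed, and it handles only columns that are already factors.

```r
# Hypothetical stand-in for a dm object: a named list of data frames.
tables <- list(
  df1 = data.frame(fact = factor(c("f1", NA)), id = factor(c("x", NA))),
  df2 = data.frame(fact = factor(c("f2", NA)))
)

# Sketch only, not the package's implementation: add an explicit level to
# factor columns, skipping the omitted tables and columns.
encode_tables <- function(tables, omit_tables = NULL, omit_columns = NULL,
                          na_level = "<Missing>") {
  for (tab in setdiff(names(tables), omit_tables)) {
    for (col in setdiff(names(tables[[tab]]), omit_columns)) {
      x <- tables[[tab]][[col]]
      if (is.factor(x)) {
        # Register the new level first, then assign it to the missing entries.
        levels(x) <- union(levels(x), na_level)
        x[is.na(x)] <- na_level
        tables[[tab]][[col]] <- x
      }
    }
  }
  tables
}

# df2 is skipped entirely; the id column is skipped in every processed table.
res <- encode_tables(tables, omit_tables = "df2", omit_columns = "id")
```

In this sketch only res$df1$fact receives the explicit level; res$df1$id and every column of res$df2 keep their NA entries.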
Details
This is a helper function to encode missing entries across groups of categorical variables in potentially all tables of a dm object. The label attribute of the columns is preserved.
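The per-column behavior described above can be sketched in base R. The helper explicit_na_column below is hypothetical and only approximates what the function might do for a single column; placing the new level last is an assumption of this sketch, not something stated in the documentation.

```r
# Hypothetical helper, not part of the package: encode NA as an explicit
# factor level for one column while keeping its "label" attribute.
explicit_na_column <- function(x, na_level = "<Missing>",
                               char_as_factor = TRUE,
                               logical_as_factor = FALSE) {
  label <- attr(x, "label")  # remember the label before any conversion

  # Optionally convert character and logical columns to factors.
  if (is.character(x) && char_as_factor) x <- factor(x)
  if (is.logical(x) && logical_as_factor) x <- factor(x)

  if (is.factor(x)) {
    lvls <- union(levels(x), na_level)  # assumption: new level goes last
    values <- as.character(x)
    values[is.na(values)] <- na_level
    x <- factor(values, levels = lvls)
  }

  attr(x, "label") <- label  # restore the label (no-op when it was NULL)
  x
}

# Example column carrying a label attribute.
char_col <- c("a", NA, "b")
attr(char_col, "label") <- "Example label"
encoded <- explicit_na_column(char_col)
```

With the defaults, a logical column is returned unchanged, which matches the logi column in the example output where NA is left as is.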
Examples
library(dm)
#>
#> Attaching package: ‘dm’
#> The following object is masked from ‘package:stats’:
#>
#> filter
df1 <- data.frame(
"char" = c("a", "b", NA, "a", "k", "x"),
"fact" = factor(c("f1", "f2", NA, NA, "f1", "f1")),
"logi" = c(NA, FALSE, TRUE, NA, FALSE, NA)
)
df2 <- data.frame(
"char" = c("a", "b", NA, "a", "k", "x"),
"fact" = factor(c("f1", "f2", NA, NA, "f1", "f1")),
"num" = 1:6
)
db <- dm(df1, df2)
dm_fact <- dm_explicit_na(db)
dm_fact$df1
#>        char      fact  logi
#> 1         a        f1    NA
#> 2         b        f2 FALSE
#> 3 <Missing> <Missing>  TRUE
#> 4         a <Missing>    NA
#> 5         k        f1 FALSE
#> 6         x        f1    NA
dm_fact$df2
#>        char      fact num
#> 1         a        f1   1
#> 2         b        f2   2
#> 3 <Missing> <Missing>   3
#> 4         a <Missing>   4
#> 5         k        f1   5
#> 6         x        f1   6