Encode Categorical Missing Values in a DM Object

Usage

dm_explicit_na(
  data,
  omit_tables = NULL,
  omit_columns = NULL,
  char_as_factor = TRUE,
  logical_as_factor = FALSE,
  na_level = "<Missing>"
)

Arguments

data

(dm) object to be transformed.

omit_tables

(character) the names of the tables to omit from processing.

omit_columns

(character) the names of the columns to omit from processing.

char_as_factor

(logical) whether character columns should be converted to factors.

logical_as_factor

(logical) whether logical columns should be converted to factors.

na_level

(character) the factor level used to represent missing values.

Value

A dm object in which missing categorical values are encoded as an explicit factor level.

Details

This helper function encodes missing entries in the categorical variables of all tables in a dm object, except for the tables and columns listed in omit_tables and omit_columns. The label attribute of each column is preserved.
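
The per-column transformation described above can be illustrated in base R. This is only a minimal sketch under stated assumptions, not the package's implementation: explicit_na_sketch is a hypothetical helper introduced here for illustration, and placing the missing level last is an assumption.

```r
# Hypothetical helper (not part of the package): encode NA as an explicit
# factor level for a single vector while keeping its "label" attribute.
explicit_na_sketch <- function(x, na_level = "<Missing>") {
  lbl <- attr(x, "label")                 # remember the label, if any
  if (is.character(x)) x <- factor(x)     # mirrors char_as_factor = TRUE
  if (is.factor(x)) {
    # Assumption: the missing level is appended after the existing levels.
    lv <- union(levels(x), na_level)
    x_chr <- as.character(x)
    x_chr[is.na(x_chr)] <- na_level       # replace NA with the explicit label
    x <- factor(x_chr, levels = lv)
  }
  attr(x, "label") <- lbl                 # restore the label attribute
  x
}

x <- c("a", "b", NA, "a")
attr(x, "label") <- "Example variable"

res <- explicit_na_sketch(x)
levels(res)
attr(res, "label")
```

Numeric and logical vectors pass through this sketch unchanged, which corresponds to the default logical_as_factor = FALSE.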

Examples

library(dm)
#> 
#> Attaching package: ‘dm’
#> The following object is masked from ‘package:stats’:
#> 
#>     filter

df1 <- data.frame(
  "char" = c("a", "b", NA, "a", "k", "x"),
  "fact" = factor(c("f1", "f2", NA, NA, "f1", "f1")),
  "logi" = c(NA, FALSE, TRUE, NA, FALSE, NA)
)
df2 <- data.frame(
  "char" = c("a", "b", NA, "a", "k", "x"),
  "fact" = factor(c("f1", "f2", NA, NA, "f1", "f1")),
  "num" = 1:6
)

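# Combine both data frames into a single dm object.
db <- dm(df1, df2)

# Encode missing values in character and factor columns; logical columns
# are left unchanged because logical_as_factor defaults to FALSE.
dm_fact <- dm_explicit_na(db)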
dm_fact$df1
#>        char      fact  logi
#> 1         a        f1    NA
#> 2         b        f2 FALSE
#> 3 <Missing> <Missing>  TRUE
#> 4         a <Missing>    NA
#> 5         k        f1 FALSE
#> 6         x        f1    NA
dm_fact$df2
#>        char      fact num
#> 1         a        f1   1
#> 2         b        f2   2
#> 3 <Missing> <Missing>   3
#> 4         a <Missing>   4
#> 5         k        f1   5
#> 6         x        f1   6