S3 generic for creating an information summary about the duplicate key values in a dataset
Usage
get_key_duplicates(dataset, keys = NULL)
# S3 method for TealDataset
get_key_duplicates(dataset, keys = NULL)
# S3 method for data.frame
get_key_duplicates(dataset, keys = NULL)
Arguments
- dataset
TealDataset or data.frame. The dataset to test.
- keys
character vector of variable names in dataset that make up the key, or a keys object with a primary element that holds such a vector. Optional, default: NULL
Details
The information summary provides row numbers and number of duplicates for each duplicated key value.
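The shape of this summary can be illustrated with a short base R sketch. It is not the package's implementation: summarise_key_duplicates is a hypothetical helper introduced here only to show how row numbers and counts per duplicated key value could be derived from a plain data.frame.

```r
# Hypothetical helper (not part of the package): summarise duplicated key values.
summarise_key_duplicates <- function(df, keys) {
  stopifnot(is.data.frame(df), is.character(keys), all(keys %in% names(df)))
  key_df <- df[, keys, drop = FALSE]

  # Flag every row whose key value occurs more than once, first occurrence included.
  is_dup <- duplicated(key_df) | duplicated(key_df, fromLast = TRUE)
  dup_rows <- which(is_dup)

  # No duplicates: return an empty summary with the same columns.
  if (length(dup_rows) == 0) {
    out <- key_df[0, , drop = FALSE]
    out$rows <- character(0)
    out$n <- integer(0)
    return(out)
  }

  # Build one label per key value, then group the row numbers by that label.
  key_cols <- unname(as.list(key_df[dup_rows, , drop = FALSE]))
  label <- do.call(paste, c(key_cols, sep = "\r"))
  groups <- split(dup_rows, factor(label, levels = unique(label)))

  # One output row per duplicated key value.
  first <- unname(vapply(groups, function(idx) idx[1], integer(1)))
  out <- key_df[first, , drop = FALSE]
  out$rows <- unname(vapply(groups, paste, character(1), collapse = ","))
  out$n <- unname(lengths(groups))
  rownames(out) <- NULL
  out
}

df <- data.frame(
  a = c("a", "a", "b", "b", "c"),
  b = c(1, 2, 3, 3, 4),
  c = c(1, 2, 3, 4, 5)
)
summarise_key_duplicates(df, keys = c("a", "b"))
```

For this df the sketch yields a single row with a = "b", b = 3, rows = "3,4" and n = 2, which has the same columns as the summary described above.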
Note
Raises an exception when this function cannot determine the primary key columns of the tested object.
Examples
library(scda)
adsl <- synthetic_cdisc_data("latest")$adsl
# create a TealDataset with default keys
rel_adsl <- cdisc_dataset("ADSL", adsl)
get_key_duplicates(rel_adsl)
#> # A tibble: 0 × 4
#> # ℹ 4 variables: STUDYID <chr>, USUBJID <chr>, rows <chr>, n <int>
df <- as.data.frame(
  list(a = c("a", "a", "b", "b", "c"), b = c(1, 2, 3, 3, 4), c = c(1, 2, 3, 4, 5))
)
res <- get_key_duplicates(df, keys = c("a", "b")) # duplicated keys are in rows 3 and 4
print(res) # prints the summary: one row per duplicated key value
#>   a b rows n
#> 1 b 3  3,4 2
if (FALSE) {
  get_key_duplicates(df) # raises an exception: keys is NULL and no primary key can be determined for df
}
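A possible follow-up is to use the summary to look up the offending rows in the original data. The sketch below assumes, based on the printed output above, that rows is a character column holding comma-separated row numbers. The res object is rebuilt by hand here so that the block runs without the package, and dup_rows is a name chosen only for illustration.

```r
# Same data as in the example above.
df <- data.frame(
  a = c("a", "a", "b", "b", "c"),
  b = c(1, 2, 3, 3, 4),
  c = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Hand-built stand-in for the summary printed above (assumption: rows is a
# comma-separated string of row numbers).
res <- data.frame(a = "b", b = 3, rows = "3,4", n = 2L, stringsAsFactors = FALSE)

# Split each comma-separated string into integer row numbers,
# giving one integer vector per duplicated key value.
dup_rows <- lapply(strsplit(res$rows, ",", fixed = TRUE), as.integer)

# Pull the rows of df that share the first duplicated key value.
df[dup_rows[[1]], ]
```

Keeping the row numbers as a list, one element per duplicated key value, lets each group of duplicates be inspected or resolved separately.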