S3 generic for creating an information summary about the duplicate key values in a dataset
get_key_duplicates.Rd
Usage
get_key_duplicates(dataset, keys = NULL)
# S3 method for TealDataset
get_key_duplicates(dataset, keys = NULL)
# S3 method for data.frame
get_key_duplicates(dataset, keys = NULL)
Arguments
- dataset
TealDataset or data.frame: the dataset to be tested.
- keys
character vector of variable names in dataset that make up the key, or a keys object with a primary element containing a vector of variable names in dataset that make up the key. Optional, default: NULL.
Details
The information summary provides row numbers and number of duplicates for each duplicated key value.
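The shape of this summary can be illustrated with a minimal base R sketch. The helper `summarise_key_duplicates()` below is hypothetical and is not the package's implementation; it only assumes that the summary holds one row per duplicated key value, with the row numbers collapsed into a `rows` string and the count in `n`. It also ignores rows with missing key values, which the real function may treat differently.

```r
# Hypothetical helper, not part of teal: summarise duplicated key values in base R.
summarise_key_duplicates <- function(df, keys) {
  stopifnot(is.data.frame(df), is.character(keys), all(keys %in% names(df)))
  # Row positions grouped by each distinct combination of key values
  groups <- split(seq_len(nrow(df)), df[keys], drop = TRUE)
  # Keep only key combinations that occur more than once
  groups <- groups[lengths(groups) > 1]
  if (length(groups) == 0) {
    # No duplicates: return an empty summary with the same columns
    out <- df[0, keys, drop = FALSE]
    out$rows <- character(0)
    out$n <- integer(0)
    return(out)
  }
  # Take the key values from the first row of each duplicated group
  first <- unname(vapply(groups, `[`, integer(1), 1))
  out <- df[first, keys, drop = FALSE]
  out$rows <- unname(vapply(groups, paste, character(1), collapse = ","))
  out$n <- unname(lengths(groups))
  rownames(out) <- NULL
  out
}

df <- data.frame(
  a = c("a", "a", "b", "b", "c"),
  b = c(1, 2, 3, 3, 4),
  c = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)
res <- summarise_key_duplicates(df, keys = c("a", "b"))
```

Here the only duplicated key value is `a = "b"`, `b = 3`, found in rows 3 and 4, so `res` has a single row.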
Note
Raises an exception when this function cannot determine the primary key columns of the tested object.
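In a script, this exception can be caught so that a missing key definition does not abort the whole run. The following is a sketch only: `try_key_duplicates()` is a hypothetical wrapper, and `stub_check()` is a stand-in that imitates the documented failure so the example runs without the package. In practice `get_key_duplicates` would be passed in place of the stub.

```r
# Hypothetical wrapper (not part of teal): run a duplicate-key check and
# capture the error instead of letting it stop the script.
try_key_duplicates <- function(check_fun, dataset, keys = NULL) {
  tryCatch(
    list(ok = TRUE, value = check_fun(dataset, keys = keys), message = NULL),
    error = function(e) list(ok = FALSE, value = NULL, message = conditionMessage(e))
  )
}

# Stand-in for get_key_duplicates(), used only to keep this sketch self-contained.
# It mimics the documented behaviour of failing when no keys can be determined.
stub_check <- function(dataset, keys = NULL) {
  if (is.null(keys)) stop("cannot determine the primary key columns")
  is_dup <- duplicated(dataset[keys]) | duplicated(dataset[keys], fromLast = TRUE)
  dataset[is_dup, , drop = FALSE]
}

df2 <- data.frame(a = c("x", "x", "y"), b = c(1, 1, 2))
failed <- try_key_duplicates(stub_check, df2)                      # no keys supplied
passed <- try_key_duplicates(stub_check, df2, keys = c("a", "b"))  # keys supplied
```

The caller can then branch on `failed$ok` and report `failed$message` rather than relying on the error propagating.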
Examples
library(scda)
adsl <- synthetic_cdisc_data("latest")$adsl
# create a TealDataset with default keys
rel_adsl <- cdisc_dataset("ADSL", adsl)
get_key_duplicates(rel_adsl)
#> # A tibble: 0 × 4
#> # ℹ 4 variables: STUDYID <chr>, USUBJID <chr>, rows <chr>, n <int>
df <- as.data.frame(
  list(a = c("a", "a", "b", "b", "c"), b = c(1, 2, 3, 3, 4), c = c(1, 2, 3, 4, 5))
)
res <- get_key_duplicates(df, keys = c("a", "b")) # duplicated keys are in rows 3 and 4
print(res) # prints a tibble
#>   a b rows n
#> 1 b 3  3,4 2
if (FALSE) {
  get_key_duplicates(df) # raises an exception, because keys are missing with no default
}