Cutting data by group

Usage

cut_by_group(df, col_data, col_group, group, cat_col)

Arguments

df

(data.frame) long-format data with a column of values to be cut and a column specifying the group of each observation.

col_data

(character) the column containing the data to be cut.

col_group

(character) the column containing the names of the groups according to which the data should be split.

group

(nested list) with one entry for each parameter value that should be analyzed categorically. Each entry contains the name of the parameter (character), a series of breakpoints (numeric) where the first breakpoint is typically -Inf and the last Inf, and a series of names describing each category (character).

cat_col

(character) the name of the new column in which the cut label should be stored.
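
As an illustration of the structure expected for one entry of group, the following minimal sketch builds the "Height" entry used in the Examples and runs two sanity checks. The checks are assumptions based on how interval cutting works in general (n breakpoints delimit n - 1 intervals); they are not documented behavior of cut_by_group, and the names entry, is_consistent and is_sorted are hypothetical.

```r
# One entry of `group`: parameter name, breakpoints, category labels.
entry <- list(
  "Height",
  c(-Inf, 150, 170, Inf),
  c("=<150", "150-170", ">170")
)

# n breakpoints delimit n - 1 intervals, so n - 1 labels are needed.
is_consistent <- length(entry[[3]]) == length(entry[[2]]) - 1

# Strictly increasing breakpoints keep the labels lined up with the intervals.
is_sorted <- !is.unsorted(entry[[2]], strictly = TRUE)
```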

Value

The input data.frame with an additional column, named by cat_col, containing the categorical values.

Details

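Function used to categorize numeric data stored in long format depending on their group. Intervals are closed on the right and open on the left, so a value equal to a breakpoint falls into the lower category. Rows whose group has no entry in group receive NA in the new column, as the "other" rows in the example output show.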
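
The behavior described above can be approximated with base R's cut(). The sketch below is not the package's implementation: cut_by_group_sketch is a hypothetical stand-in, it stores the labels as character (the type of the real output column is not stated here), and toy and toy_group are made-up illustration data.

```r
# Hypothetical re-implementation for illustration only.
cut_by_group_sketch <- function(df, col_data, col_group, group, cat_col) {
  # Start with NA everywhere; groups without an entry keep NA.
  df[[cat_col]] <- NA_character_
  for (entry in group) {
    # Rows belonging to the parameter named in this entry.
    rows <- which(df[[col_group]] == entry[[1]])
    # right = TRUE gives intervals of the form (a, b].
    df[[cat_col]][rows] <- as.character(
      cut(df[[col_data]][rows],
          breaks = entry[[2]], labels = entry[[3]], right = TRUE)
    )
  }
  df
}

toy <- data.frame(
  PARAM = c("Height", "Height", "Weight", "other"),
  AVAL = c(150, 171, 65.5, 0.2)
)
toy_group <- list(
  list("Height", c(-Inf, 150, 170, Inf), c("=<150", "150-170", ">170")),
  list("Weight", c(-Inf, 65, Inf), c("=<65", ">65"))
)

result <- cut_by_group_sketch(toy, "AVAL", "PARAM", toy_group, "category")
result$category
```

Here the height of exactly 150 lands in "=<150" because the interval is closed on the right, and the "other" row stays NA because toy_group has no entry for it.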

Examples

group <- list(
  list(
    "Height",
    c(-Inf, 150, 170, Inf),
    c("=<150", "150-170", ">170")
  ),
  list(
    "Weight",
    c(-Inf, 65, Inf),
    c("=<65", ">65")
  ),
  list(
    "Age",
    c(-Inf, 31, Inf),
    c("=<31", ">31")
  ),
  list(
    "PreCondition",
    c(-Inf, 1, Inf),
    c("=<1", ">1")
  )
)
data <- data.frame(
  SUBJECT = rep(letters[1:10], 4),
  PARAM = rep(c("Height", "Weight", "Age", "other"), each = 10),
  AVAL = c(rnorm(10, 165, 15), rnorm(10, 65, 5), runif(10, 18, 65), rnorm(10, 0, 1)),
  index = 1:40
)

cut_by_group(data, "AVAL", "PARAM", group, "my_new_categories")
#>    SUBJECT  PARAM        AVAL index my_new_categories
#> 1        a Height 157.2725530     1           150-170
#> 2        b Height 154.9778764     2           150-170
#> 3        c Height 168.8468203     3           150-170
#> 4        d Height 164.0055743     4           150-170
#> 5        e Height 162.5247391     5           150-170
#> 6        f Height 171.2907129     6              >170
#> 7        g Height 155.1490217     7           150-170
#> 8        h Height 173.1627275     8              >170
#> 9        i Height 164.6799735     9           150-170
#> 10       j Height 165.5452721    10           150-170
#> 11       a Weight  65.5641024    11               >65
#> 12       b Weight  70.3865852    12               >65
#> 13       c Weight  62.3552697    13              =<65
#> 14       d Weight  61.8770028    14              =<65
#> 15       e Weight  62.7078588    15              =<65
#> 16       f Weight  68.5815406    16               >65
#> 17       g Weight  62.4612351    17              =<65
#> 18       h Weight  60.4098953    18              =<65
#> 19       i Weight  62.2419759    19              =<65
#> 20       j Weight  63.4265753    20              =<65
#> 21       a    Age  28.1285390    21              =<31
#> 22       b    Age  21.6080439    22              =<31
#> 23       c    Age  42.2775297    23               >31
#> 24       d    Age  44.2139626    24               >31
#> 25       e    Age  35.9991782    25               >31
#> 26       f    Age  32.2212677    26               >31
#> 27       g    Age  34.6173737    27               >31
#> 28       h    Age  60.5048244    28               >31
#> 29       i    Age  32.9622315    29               >31
#> 30       j    Age  30.7340572    30              =<31
#> 31       a  other   1.0118062    31              <NA>
#> 32       b  other   0.7266998    32              <NA>
#> 33       c  other  -0.6311388    33              <NA>
#> 34       d  other   0.6448591    34              <NA>
#> 35       e  other   0.1394645    35              <NA>
#> 36       f  other  -0.5889006    36              <NA>
#> 37       g  other  -1.1900977    37              <NA>
#> 38       h  other   0.8526860    38              <NA>
#> 39       i  other   0.8197633    39              <NA>
#> 40       j  other   0.5939371    40              <NA>