[Experimental]

Joins genetic data with a CDISC data set using an inner join on the patient identifier and any additional keys.

Usage

inner_join_cdisc(
  gene_data,
  cdisc_data,
  patient_key = "USUBJID",
  additional_keys = character()
)

Arguments

gene_data

(data.frame or DataFrame)
genetic data.

cdisc_data

(data.frame)
CDISC data (typically patient level data).

patient_key

(string)
patient identifier.

additional_keys

(character)
optional additional keys, present in both data sets, to join on.

Value

A data.frame which contains columns from both data sets merged by the keys.

Note

Columns that appear in both data sets but are not specified as keys are taken from gene_data, not from cdisc_data.
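
The rule above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper join_sketch() and the toy data frames are hypothetical and only mimic the documented behavior, assuming an inner join on the keys with shared non-key columns kept from gene_data.

```r
# Hypothetical helper that mimics the documented behavior using base R only.
join_sketch <- function(gene_data, cdisc_data, keys = "USUBJID") {
  # Columns present in both data sets that are not join keys.
  shared <- setdiff(intersect(names(gene_data), names(cdisc_data)), keys)
  # Drop them from the CDISC side so that the gene_data versions are kept.
  cdisc_reduced <- cdisc_data[, setdiff(names(cdisc_data), shared), drop = FALSE]
  # merge() performs an inner join by default (all = FALSE).
  merge(gene_data, cdisc_reduced, by = keys)
}

# Toy data (made up for illustration): SEX exists in both data sets.
gene_toy <- data.frame(
  USUBJID = c("P1", "P2", "P3"),
  SEX = c("F", "M", "F"),
  GeneID.1820 = c(16, 54, 567)
)
cdisc_toy <- data.frame(
  USUBJID = c("P1", "P2"),
  SEX = c("U", "U"),
  extra = 1:2
)

joined <- join_sketch(gene_toy, cdisc_toy)
joined
```

In this sketch, P3 is dropped because it has no match in cdisc_toy, and the SEX column in the result carries the values from gene_toy rather than the "U" placeholders from cdisc_toy.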

Examples

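# Sample data combined with the counts of one gene.
gene_data <- col_data_with_genes(hermes_data, "counts", gene_spec("GeneID:1820"))
# CDISC data covering only the first 10 patients of gene_data.
cdisc_data <- data.frame(
  USUBJID = head(gene_data$USUBJID, 10),
  extra = 1:10
)
# Patients in gene_data without a match in cdisc_data are dropped with a warning.
result <- inner_join_cdisc(gene_data, cdisc_data)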
#> Warning: Patients AB12345-CHN-7-id-28, AB12345-CHN-4-id-73, AB12345-RUS-1-id-52, AB12345-PAK-11-id-268, AB12345-CHN-13-id-102, AB12345-CHN-17-id-84, AB12345-BRA-11-id-9, AB12345-CHN-4-id-115, AB12345-CHN-15-id-245, AB12345-CHN-4-id-370 from gene data set were lost because they could not be joined to CDISC data set
result
#>                  USUBJID Filename       SampleID         AGEGRP AGE18 STDDRS
#> 1   AB12345-CHN-1-id-307     eset 06520046C0018R 12 - <18 years  < 18  DEATH
#> 2  AB12345-CHN-11-id-220     eset 06520105C0017R 12 - <18 years  < 18       
#> 3  AB12345-CHN-15-id-201     eset 06520103C0017R    >= 18 years >= 18  DEATH
#> 4  AB12345-CHN-15-id-262     eset 06520067C0018R   2 - <6 years  < 18       
#> 5   AB12345-CHN-3-id-128     eset 06520011B0023R    >= 18 years >= 18  DEATH
#> 6   AB12345-CHN-7-id-267     eset 06520092C0017R 12 - <18 years  < 18  DEATH
#> 7  AB12345-NGA-11-id-173     eset 06520062C0017R  6 - <12 years  < 18  DEATH
#> 8   AB12345-RUS-3-id-378     eset 06520063C0043R 12 - <18 years  < 18  DEATH
#> 9   AB12345-USA-1-id-261     eset 06520022C0017R 12 - <18 years  < 18       
#> 10   AB12345-USA-1-id-45     eset 06520001B0023R    >= 18 years >= 18  DEATH
#>                                STDDRSD    STDSSDT              TRTDRS
#> 1  DEATH DUE TO PROGRESSION OF DISEASE 07/24/2016 PROGRESSIVE DISEASE
#> 2                                                 PROGRESSIVE DISEASE
#> 3  DEATH DUE TO PROGRESSION OF DISEASE 08/12/2016 PROGRESSIVE DISEASE
#> 4                                                 PROGRESSIVE DISEASE
#> 5  DEATH DUE TO PROGRESSION OF DISEASE 05/31/2016       ADVERSE EVENT
#> 6  DEATH DUE TO PROGRESSION OF DISEASE 02/16/2017 PROGRESSIVE DISEASE
#> 7  DEATH DUE TO PROGRESSION OF DISEASE 09/11/2016 PROGRESSIVE DISEASE
#> 8  DEATH DUE TO PROGRESSION OF DISEASE 08/05/2016 PROGRESSIVE DISEASE
#> 9                                                 PROGRESSIVE DISEASE
#> 10 DEATH DUE TO PROGRESSION OF DISEASE 01/08/2016 PROGRESSIVE DISEASE
#>                   TRTDRSD BHDCIRC BHDCIRCU ADAFL BLANP BKPS BLKS BTANNER
#> 1  PROGRESSION OF DISEASE      NA              Y    NA   80   80      NA
#> 2  PROGRESSION OF DISEASE      NA              Y   100   NA  100      NA
#> 3  PROGRESSION OF DISEASE      NA              Y    NA   90   90      NA
#> 4  PROGRESSION OF DISEASE      NA              Y    NA  100  100      NA
#> 5           ADVERSE EVENT      NA              Y   100   NA  100      NA
#> 6  PROGRESSION OF DISEASE      NA              Y   100   NA  100      NA
#> 7  PROGRESSION OF DISEASE      NA              Y    NA  100  100      NA
#> 8  PROGRESSION OF DISEASE      NA              Y    90   NA   90      NA
#> 9  PROGRESSION OF DISEASE      NA              Y    NA   90   90      NA
#> 10 PROGRESSION OF DISEASE      NA              Y    NA   90   90      NA
#>             FRPST     DURIDX    DURSAF    DURSUR LNTHRPY AENCIFL STUDYID
#> 1  POST-MENARCHAL  61.667351 1.6755647  5.158111       5      NA AB12345
#> 2   PRE-MENARCHAL  14.981520 3.0882957  5.815195       4      NA AB12345
#> 3                  10.841889 0.9856263  3.055441       4      NA AB12345
#> 4                 131.088296 1.6755647 19.252567       3      NA AB12345
#> 5                  55.030801 0.9856263  5.026694       3      NA AB12345
#> 6                   7.687885 3.0225873 12.418891       1      NA AB12345
#> 7                  99.285421 3.5811088  6.340862       7      NA AB12345
#> 8                  93.568789 0.9856263  4.665298       3      NA AB12345
#> 9                  41.626283 1.6755647 18.858316       3      NA AB12345
#> 10                 45.733060 0.6570842  2.299795       2      NA AB12345
#>             RFSTDTC          RFENDTC         RFXSTDTC         RFXENDTC
#> 1  2016-03-10T14:05 2016-03-31T15:49 2016-03-10T14:05 2016-03-31T15:49
#> 2  2017-04-11T12:35 2017-06-14T11:45 2017-04-11T12:35 2017-06-14T11:45
#> 3  2016-05-31T14:10 2016-05-31T14:10 2016-05-31T14:10 2016-05-31T14:10
#> 4  2016-01-11T14:30 2016-02-01T13:30 2016-01-11T14:30 2016-02-01T13:30
#> 5  2016-01-14T12:37 2016-01-14T12:37 2016-01-14T12:37 2016-01-14T12:37
#> 6  2016-02-12T11:05 2016-04-14T12:00 2016-02-12T11:05 2016-04-14T12:00
#> 7  2016-03-21T15:40 2016-06-08T18:00 2016-03-21T15:40 2016-06-08T18:00
#> 8  2016-03-24T14:25 2016-03-24T14:25 2016-03-24T14:25 2016-03-24T14:25
#> 9  2016-02-08T12:37 2016-02-29T14:15 2016-02-08T12:37 2016-02-29T14:15
#> 10 2015-11-05T11:00 2015-11-05T11:00 2015-11-05T11:00 2015-11-05T11:00
#>       RFICDTC   RFPENDTC     DTHDTC DTHFL SITEID  INVID AGE  AGEU SEX
#> 1  2016-02-18 2016-07-24 2016-07-24     Y 283495 223804  12 YEARS   F
#> 2  2017-04-03                             283694 456732  15 YEARS   M
#> 3  2016-05-11 2016-08-12 2016-08-12     Y 282087 468105  27 YEARS   F
#> 4  2016-01-07                             282087 468105   2 YEARS   F
#> 5  2015-12-30 2016-05-31 2016-05-31     Y 283495 223804  19 YEARS   F
#> 6  2016-02-04 2017-02-16 2017-02-16     Y 280959  20842  16 YEARS   M
#> 7  2016-03-02 2016-09-11 2016-09-11     Y 283497 241874   7 YEARS   F
#> 8  2016-03-16 2016-08-05 2016-08-05     Y 284024 457432  13 YEARS   M
#> 9  2016-02-01                             281049 457179  16 YEARS   F
#> 10 2015-10-30 2016-01-08 2016-01-08     Y 283971 235545  19 YEARS   F
#>                         RACE                 ETHNIC ARMCD       ARM ACTARMCD
#> 1                      WHITE NOT HISPANIC OR LATINO  COH3  COHORT 3     COH3
#> 2                      WHITE NOT HISPANIC OR LATINO COH12 COHORT 12    COH12
#> 3                    UNKNOWN           NOT REPORTED COH9E COHORT 9E    COH9E
#> 4                    UNKNOWN           NOT REPORTED  COH1  COHORT 1     COH1
#> 5                   MULTIPLE NOT HISPANIC OR LATINO  COH1  COHORT 1     COH1
#> 6                      WHITE NOT HISPANIC OR LATINO COH9O COHORT 9O    COH9O
#> 7  BLACK OR AFRICAN AMERICAN     HISPANIC OR LATINO  COH6  COHORT 6     COH6
#> 8                      WHITE NOT HISPANIC OR LATINO  COH8  COHORT 8     COH8
#> 9                      ASIAN NOT HISPANIC OR LATINO  COH1  COHORT 1     COH1
#> 10                   UNKNOWN     HISPANIC OR LATINO  COH6  COHORT 6     COH6
#>       ACTARM COUNTRY      DMDTC DMDY BAGE BAGEU    BWT BWTU   BHT BHTU     BBMI
#> 1   COHORT 3     CHN 2016-02-18  -21   12 YEARS  50.00   kg 157.0   cm 20.28480
#> 2  COHORT 12     CHN 2017-04-04   -7   15 YEARS  26.15   kg 136.0   cm 14.13819
#> 3   COHORT 9     CHN 2016-05-25   -6   27 YEARS  61.60   kg 173.0   cm 20.58204
#> 4   COHORT 1     CHN 2016-01-07   -4    2 YEARS  64.60   kg 177.1   cm 20.59659
#> 5   COHORT 1     CHN 2015-12-30  -15   19 YEARS  40.60   kg 154.0   cm 17.11924
#> 6   COHORT 9     CHN 2016-02-04   -8   16 YEARS  45.20   kg 161.1   cm 17.41596
#> 7   COHORT 6     NGA 2016-03-02  -19    7 YEARS  53.90   kg 176.0   cm 17.40057
#> 8   COHORT 8     RUS 2016-03-22   -2   13 YEARS  25.40   kg 125.0   cm 16.25600
#> 9   COHORT 1     USA 2016-02-01   -7   16 YEARS 104.70   kg 172.0   cm 35.39075
#> 10  COHORT 6     USA 2015-10-30   -6   19 YEARS  57.00   kg 172.0   cm 19.26717
#>    ITTFL SAFFL    INFCODT     RANDDT          TRTSDTC             TRTSDTM
#> 1      Y     Y 2016-02-18 2016-03-09 2016-03-10T14:05 2016-03-10 14:05:00
#> 2      Y     Y 2017-04-03 2017-04-11 2017-04-11T12:35 2017-04-11 12:35:00
#> 3      Y     Y 2016-05-11 2016-05-31 2016-05-31T14:10 2016-05-31 14:10:00
#> 4      Y     Y 2016-01-07 2016-01-11 2016-01-11T14:30 2016-01-11 14:30:00
#> 5      Y     Y 2015-12-30 2016-01-14 2016-01-14T12:37 2016-01-14 12:37:00
#> 6      Y     Y 2016-02-04 2016-02-09 2016-02-12T11:05 2016-02-12 11:05:00
#> 7      Y     Y 2016-03-02 2016-03-21 2016-03-21T15:40 2016-03-21 15:40:00
#> 8      Y     Y 2016-03-16 2016-03-24 2016-03-24T14:25 2016-03-24 14:25:00
#> 9      Y     Y 2016-02-01 2016-02-03 2016-02-08T12:37 2016-02-08 12:37:00
#> 10     Y     Y 2015-10-30 2015-11-05 2015-11-05T11:00 2015-11-05 11:00:00
#>    TRTSTMF             TRTEDTM TRTETMF TRTDUR DISCSTUD DISCDEAT DISCAE DISTRTFL
#> 1        S 2016-03-31 16:55:59       S     22        Y        Y      N        Y
#> 2        S 2017-06-14 12:15:59       S     65        N        N      N        Y
#> 3        S 2016-05-31 15:10:59       S      1        Y        Y      N        Y
#> 4        S 2016-02-01 14:30:59       S     22        N        N      N        Y
#> 5        S 2016-01-14 13:37:59       S      1        Y        Y      N        Y
#> 6        S 2016-04-14 12:30:59       S     63        Y        Y      N        Y
#> 7        S 2016-06-08 18:30:59       S     80        Y        Y      N        Y
#> 8        S 2016-03-24 15:30:59       S      1        Y        Y      N        Y
#> 9        S 2016-02-29 14:48:59       S     22        N        N      N        Y
#> 10       S 2015-11-05 12:00:59       S      1        Y        Y      N        Y
#>    AEWITHFL     ALIVDT
#> 1         N 2016-07-24
#> 2         N 2017-09-27
#> 3         N 2016-08-12
#> 4         N 2017-08-15
#> 5         Y 2016-05-31
#> 6         N 2017-02-16
#> 7         N 2016-09-11
#> 8         N 2016-08-05
#> 9         N 2017-08-28
#> 10        N 2016-01-08
#>                                                           COHORT
#> 1                                       Cohort 3 (NEUROBLASTOMA)
#> 2                    Cohort 12 (ATYPICAL TERATOID RHABOID TUMOR)
#> 3  Cohort 9 (OTHER TUMOR TYPES WITH DOCUMENTED PD-L1 EXPRESSION)
#> 4                                       Cohort 1 (EWING SARCOMA)
#> 5                                       Cohort 1 (EWING SARCOMA)
#> 6  Cohort 9 (OTHER TUMOR TYPES WITH DOCUMENTED PD-L1 EXPRESSION)
#> 7                                        Cohort 6 (OSTEOSARCOMA)
#> 8                                         Cohort 8 (WILMS TUMOR)
#> 9                                       Cohort 1 (EWING SARCOMA)
#> 10                                       Cohort 6 (OSTEOSARCOMA)
#>                                                                                                                                        TTYPE
#> 1                                                                                                                              NEUROBLASTOMA
#> 2                                                                                                           ATYPICAL TERATOID RHABDOID TUMOR
#> 3                                                 GERM CELL TUMOR - YOLK SAC TUMOR (ENDODERMAL SINUS TUMOR) WITH DOCUMENTED PD-L1 EXPRESSION
#> 4                                                                                                                              EWING SARCOMA
#> 5                                                                                                                              EWING SARCOMA
#> 6  OTHER TUMOR TYPES WITH DOCUMENTED PD-L1 EXPRESSION ON EITHER TUMOR CELLS OR IMMUNE CELLS (TUMORS TYPE MUST NOT BE INCLUDED IN LIST ABOVE)
#> 7                                                                                                                               OSTEOSARCOMA
#> 8                                                                                                                                WILMS TUMOR
#> 9                                                                                                                              EWING SARCOMA
#> 10                                                                                                                              OSTEOSARCOMA
#>    STDSSDY low_depth_flag tech_failure_flag GeneID.1820 extra
#> 1      137          FALSE             FALSE          16    10
#> 2       NA          FALSE             FALSE          54     4
#> 3       74          FALSE             FALSE         567     6
#> 4       NA          FALSE             FALSE         153     2
#> 5      139          FALSE             FALSE          49     1
#> 6      371          FALSE             FALSE         111     5
#> 7      175           TRUE             FALSE         180     9
#> 8      135           TRUE             FALSE          94     3
#> 9       NA          FALSE             FALSE         171     8
#> 10      65          FALSE             FALSE         118     7
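
The warning in the example names the patients from gene_data that have no match in cdisc_data. As a hedged sketch, such patients could be listed before joining with base R alone; the identifiers below are hypothetical toy values, not taken from hermes_data.

```r
# Hypothetical patient identifiers on each side of the join.
gene_ids <- c("P1", "P2", "P3", "P4")
cdisc_ids <- c("P1", "P3", "P5")

# Patients present in the gene data but absent from the CDISC data;
# these are the ones an inner join would drop.
lost <- setdiff(gene_ids, cdisc_ids)
lost
```

With real data, the same check would compare the patient_key columns of the two data sets, for example gene_data$USUBJID and cdisc_data$USUBJID.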