R/pca_cor_samplevar.R
h_pca_df_r2_matrix.Rd
This function processes sample variables from AnyHermesData and the
corresponding principal components matrix, and then generates the matrix of R2 values.
h_pca_df_r2_matrix(pca, df)
pca (matrix)
comprises principal components generated by calc_pca().
df (data.frame)
from the SummarizedExperiment::colData() of an
AnyHermesData object.
A matrix with R2 values for all combinations of sample variables and principal components.
Note that only the df columns which are numeric, character, factor or
logical are included in the resulting matrix, because other variable types are not
supported.
In addition, df columns which are constant, all NA, or character or factor
columns with too many levels are also dropped before the analysis.
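The column filtering described above can be sketched in base R. This is only an illustration of the rules, not the hermes implementation: the helper name `keep_usable_columns` and the `max_levels` threshold are assumptions, since the text does not say how many levels count as "too many".

```r
# Hypothetical helper, not part of hermes: keeps only the columns that the
# rules above would allow into the R2 matrix.
# `max_levels` is an assumed threshold for "too many levels".
keep_usable_columns <- function(df, max_levels = 10) {
  # Supported types: numeric, character, factor or logical.
  is_supported <- vapply(df, function(x) {
    is.numeric(x) || is.character(x) || is.factor(x) || is.logical(x)
  }, logical(1))
  # Constant and all-NA columns carry no information.
  is_informative <- vapply(df, function(x) {
    length(unique(x[!is.na(x)])) > 1
  }, logical(1))
  # Character and factor columns must not have too many distinct values.
  has_few_levels <- vapply(df, function(x) {
    if (is.character(x) || is.factor(x)) {
      length(unique(x[!is.na(x)])) <= max_levels
    } else {
      TRUE
    }
  }, logical(1))
  df[, is_supported & is_informative & has_few_levels, drop = FALSE]
}

toy <- data.frame(
  age = c(34, 51, 47, 29),
  sex = c("F", "M", "F", "M"),
  site = rep("A", 4),              # constant
  missing = rep(NA_real_, 4),      # all NA
  id = c("s1", "s2", "s3", "s4"),  # one level per sample
  stringsAsFactors = FALSE
)
toy$visit <- as.Date("2020-01-01") + 0:3  # unsupported type

kept <- names(keep_usable_columns(toy, max_levels = 3))
print(kept)
```

In this toy data only `age` and `sex` pass all three checks, which mirrors why the result matrix can have far fewer columns than `df`.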
h_pca_var_rsquared(), which is used internally to calculate the R2 for one
sample variable.
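To make the idea of "R2 for one sample variable" concrete, here is a minimal sketch that regresses each principal component on a single variable with `lm()` and reads off the R2. It is an assumption about the general approach, not the code inside h_pca_var_rsquared(); the helper name `r2_one_variable` and the simulated data are hypothetical.

```r
# Hypothetical helper, not the hermes implementation: R2 of each principal
# component (column of `pca`) when regressed on one sample variable `x`.
r2_one_variable <- function(pca, x) {
  apply(pca, 2, function(pc) summary(lm(pc ~ x))$r.squared)
}

# Simulated data: PC1 is driven by `group`, PC2 is unrelated noise.
set.seed(1)
group <- factor(rep(c("a", "b"), each = 10))
pca <- cbind(
  PC1 = as.numeric(group) + rnorm(20, sd = 0.1),
  PC2 = rnorm(20)
)
r2 <- r2_one_variable(pca, group)
print(round(r2, 2))

# Applying the helper to every column of a data frame yields a matrix with
# principal components in rows and sample variables in columns.
vars <- data.frame(group = group, dose = seq_len(20))
r2_matrix <- sapply(vars, function(x) r2_one_variable(pca, x))
print(dim(r2_matrix))
```

The resulting layout, with principal components as rows and sample variables as columns, matches the dimnames shown in the `str()` output of the example.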
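# Assumes the hermes package is attached.
library(hermes)
# Prepare the example data: flag, filter and normalize.
object <- hermes_data %>%
  add_quality_flags() %>%
  filter() %>%
  normalize()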
# Obtain the principal components.
pca <- calc_pca(object)$x
# Obtain the `colData` as a `data.frame`.
df <- as.data.frame(colData(object))
# Correlate them.
r2_all <- h_pca_df_r2_matrix(pca, df)
str(r2_all)
#> num [1:18, 1:39] 0.102 0.183 0.147 0.391 0.662 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:18] "PC1" "PC2" "PC3" "PC4" ...
#> ..$ : chr [1:39] "AGEGRP" "AGE18" "STDDRS" "STDDRSD" ...
# We can see that only about half of the columns from `df` were
# used for the correlations.
ncol(r2_all)
#> [1] 39
ncol(df)
#> [1] 74