
Calculation of R2 Matrix between Sample Variables and Principal Components
Source: R/pca_cor_samplevar.R
h_pca_df_r2_matrix.Rd
This function processes sample variables from AnyHermesData and the
corresponding principal components matrix, and then generates the matrix of R2 values.
Arguments
- pca
(matrix)
comprises principal components generated by calc_pca().
- df
(data.frame)
from the SummarizedExperiment::colData() of an AnyHermesData object.
Details
Note that only the df columns which are numeric, character, factor or logical are included in the resulting matrix, because other variable types are not supported.
In addition, df columns which are constant, all NA, or character or factor columns with too many levels are also dropped before the analysis.
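The column rules above can be illustrated with a small base R sketch. This is not the hermes implementation: the helper `keep_column()` and its `max_levels` threshold are hypothetical, and the actual cutoff for "too many levels" is not stated here.

```r
# Hypothetical helper (not part of hermes): decide whether a column would be kept.
# `max_levels` is an assumed threshold chosen only for illustration.
keep_column <- function(x, max_levels = 10L) {
  # Only numeric, character, factor and logical columns are supported.
  supported <- is.numeric(x) || is.character(x) || is.factor(x) || is.logical(x)
  if (!supported) return(FALSE)
  # Drop columns that are entirely NA.
  if (all(is.na(x))) return(FALSE)
  # Drop constant columns (a single distinct non-missing value).
  n_unique <- length(unique(x[!is.na(x)]))
  if (n_unique <= 1L) return(FALSE)
  # Drop character or factor columns with too many levels.
  if ((is.character(x) || is.factor(x)) && n_unique > max_levels) return(FALSE)
  TRUE
}

# Toy sample data; all column names are made up.
df <- data.frame(
  age = c(31, 45, 52, 38),            # numeric, varies: kept
  sex = c("F", "M", "F", "M"),        # character, 2 levels: kept
  site = "A",                         # constant: dropped
  empty = NA,                         # all NA: dropped
  id = c("s1", "s2", "s3", "s4"),     # too many levels for max_levels = 3: dropped
  stringsAsFactors = FALSE
)
df$when <- as.Date("2020-01-01") + 0:3  # Date: unsupported type, dropped

kept <- vapply(df, keep_column, logical(1), max_levels = 3L)
names(df)[kept]
```

A filter of this kind explains why the result can have fewer columns than df.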
See also
h_pca_var_rsquared() which is used internally to calculate the R2 for one
sample variable.
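As a rough sketch of the idea, an R2 value for one principal component and one sample variable can be obtained from a linear model, and repeating this over all pairs gives a matrix with principal components in rows and sample variables in columns. The helper `r2_one()` and the toy data are hypothetical, and using `stats::lm()` here is an assumption for illustration, not necessarily how h_pca_var_rsquared() computes the value.

```r
# Hypothetical helper: R2 from regressing one principal component on one variable.
r2_one <- function(pc, x) {
  summary(stats::lm(pc ~ x))$r.squared
}

# Toy principal components (rows are samples, columns are components).
pca <- cbind(
  PC1 = c(1, 2, 3, 4, 10, 11, 12, 13),
  PC2 = c(5, 1, 4, 2, 3, 6, 2, 5)
)

# Toy sample variables for the same 8 samples.
df <- data.frame(
  group = factor(rep(c("a", "b"), each = 4)),
  dose = c(1, 3, 2, 4, 1, 3, 2, 4)
)

# For each sample variable, compute R2 against every principal component.
# The result has one row per component and one column per variable.
r2 <- sapply(df, function(x) apply(pca, 2, r2_one, x = x))
dim(r2)
dimnames(r2)
```

In this toy data PC1 separates the two groups almost completely, so the R2 for group on PC1 is close to 1, while the other entries are much smaller.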
Examples
# Load hermes, which provides `hermes_data` and the functions used below.
library(hermes)
object <- hermes_data %>%
  add_quality_flags() %>%
  filter() %>%
  normalize()
# Obtain the principal components.
pca <- calc_pca(object)$x
# Obtain the `colData` as a `data.frame`.
df <- as.data.frame(colData(object))
# Correlate them.
r2_all <- h_pca_df_r2_matrix(pca, df)
str(r2_all)
#> num [1:18, 1:39] 0.102 0.183 0.147 0.391 0.662 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:18] "PC1" "PC2" "PC3" "PC4" ...
#> ..$ : chr [1:39] "AGEGRP" "AGE18" "STDDRS" "STDDRSD" ...
# We can see that only about half of the columns from `df` (39 of 74)
# were used for the correlations.
ncol(r2_all)
#> [1] 39
ncol(df)
#> [1] 74