class: HermesData
assays(1): counts
genes(5085): GeneID:11185 GeneID:10677 ... GeneID:9087 GeneID:9426
additional gene information(12): HGNC HGNCGeneName ... chromosome_name
LowExpressionFlag
samples(20): 06520011B0023R 06520067C0018R ... 06520015C0016R
06520019C0023R
additional sample information(74): Filename SampleID ... LowDepthFlag
TechnicalFailureFlag
RNAG1
RNAseq QC Graphs
This page can be used as a template for using the available hermes functions for simple QC analyses of RNA-seq gene expression data, and for creating interactive QC graphs using teal.modules.hermes.
We start by creating a HermesData object from a SummarizedExperiment (SE) object. An example SummarizedExperiment object named summarized_experiment is available in hermes.
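The code chunk for this step is not shown on this page, so the following is a minimal sketch of the conversion described above. It assumes only that the hermes package is installed; the variable name object is an illustrative choice.

```r
library(hermes)

# `summarized_experiment` is the example SummarizedExperiment shipped with hermes.
object <- HermesData(summarized_experiment)

# Printing the object lists its class, assays, genes, and samples.
object
```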
At this point we can also take the already prepared object hermes_data instead.
First we add all quality flags (low expression, low read depth, technical failure).
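A minimal sketch of the flagging step, assuming the default thresholds returned by control_quality() are acceptable. The thresholds actually used to produce the output on this page are not shown, so results may differ; the variable names are illustrative.

```r
library(hermes)

# Start from the prepared example object.
object <- hermes_data

# Default quality-control thresholds (an assumption; tune them for real data
# through the arguments of control_quality()).
my_controls <- control_quality()

# Adds the low expression flag for genes, and the low depth and
# technical failure flags for samples.
object_flagged <- add_quality_flags(object, control = my_controls)
```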
We can look at how many samples or genes have been flagged.
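One way to tabulate the flags is sketched here, using the hermes accessor functions. It assumes default thresholds, so the counts may not match the tables printed on this page.

```r
library(hermes)

# Flag with default thresholds (an assumption, see control_quality()).
object_flagged <- add_quality_flags(hermes_data)

# One entry per gene.
table(get_low_expression(object_flagged))

# One entry per sample.
table(get_low_depth(object_flagged))
table(get_tech_failure(object_flagged))
```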
FALSE TRUE
2391 2694
FALSE TRUE
19 1
FALSE
20
We then filter for genes and samples that pass all quality checks (i.e. all flags are FALSE).
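A minimal sketch of the filtering step, assuming the flags were added with default thresholds. Note that loading hermes provides its own filter() generic, which is the one used here.

```r
library(hermes)

object_flagged <- add_quality_flags(hermes_data)

# Keep only the genes and samples whose quality flags are all FALSE.
object_flagged_filtered <- filter(object_flagged)

# Compare dimensions before and after filtering.
dim(object_flagged)
dim(object_flagged_filtered)
```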
It is also possible to filter only genes or only samples via the what argument of filter().
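The what argument can be sketched as follows, again assuming default quality thresholds and illustrative variable names.

```r
library(hermes)

object_flagged <- add_quality_flags(hermes_data)

# Drop flagged genes only; all samples are kept.
object_genes_filtered <- filter(object_flagged, what = "genes")

# Drop flagged samples only; all genes are kept.
object_samples_filtered <- filter(object_flagged, what = "samples")
```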
Now that the samples are properly filtered, we can apply our normalization method. By default, if no method is specified for normalize(), then five methods (cpm, rpkm, tpm, voom, vst) are performed and the results are saved as separate assays. If only one method is needed, it can be specified in the methods argument. In addition, if the rlog transformation method is preferred, it can also be specified in the methods argument.
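The three normalization variants described above can be sketched like this. It assumes the object was flagged and filtered with default thresholds; the rlog call relies on DESeq2 and can be noticeably slower on larger data.

```r
library(hermes)

object_filtered <- filter(add_quality_flags(hermes_data))

# Default: all five methods, each stored as a separate assay.
object_normalized <- normalize(object_filtered)
assayNames(object_normalized)

# A single method only.
object_cpm <- normalize(object_filtered, methods = "cpm")

# rlog is not part of the default set and must be requested explicitly.
object_rlog <- normalize(object_filtered, methods = "rlog")
```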
The hermes package offers a series of draw_* functions to help in the QC process. First we introduce draw_libsize_hist(), which displays a histogram of the sample library sizes.
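A minimal sketch of the histogram call. The bins and fill values in the second call are illustrative choices, not the settings used for this page.

```r
library(hermes)

object <- hermes_data

# Histogram of library sizes with default settings.
p_hist <- draw_libsize_hist(object)
p_hist

# Fewer bins and a different fill colour (illustrative values).
draw_libsize_hist(object, bins = 10L, fill = "blue")
```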
The draw_libsize_qq() function displays a QQ plot of the sample library sizes. Here we look for potential outliers.
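The QQ plot can be produced with a single call, sketched here on the example object with default settings.

```r
library(hermes)

object <- hermes_data

# Points far from the reference line suggest samples with unusual library sizes.
p_qq <- draw_libsize_qq(object)
p_qq
```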
The draw_libsize_densities() function displays a density plot of the (log) count distributions. Each line corresponds to one sample.
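A sketch of the density plot on the example object, using default settings.

```r
library(hermes)

object <- hermes_data

# One density line per sample.
p_dens <- draw_libsize_densities(object)
p_dens
```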
The draw_nonzero_boxplot() function displays a box plot of the number of non-zero expressed genes per sample.
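A sketch of the box plot call on the example object, using default settings.

```r
library(hermes)

object <- hermes_data

# Each point is one sample; the y-axis counts genes with non-zero expression.
p_box <- draw_nonzero_boxplot(object)
p_box
```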
It’s also possible to add an additional ggplot2 layer to get the sample ID of any points of interest, reusing the same position for labeling. If the labels of several data points overlap, the parameters of position_jitter() or geom_text_repel() can be adjusted to avoid that.
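The labeling approach can be sketched as below. This is not necessarily the code used for this page: it assumes that the plot data contains one row per sample in the column order of the object, so that colnames(object) lines up with the points, and the jitter width of 0.5 is an illustrative value.

```r
library(hermes)
library(ggplot2)
library(ggrepel)

object <- hermes_data

# Create one position object and share it between points and labels,
# so that each label is placed at its jittered point.
pos <- position_jitter(0.5)

p_labeled <- draw_nonzero_boxplot(object, position = pos) +
  geom_text_repel(
    # Assumption: rows of the plot data follow the sample order of `object`.
    label = colnames(object),
    position = pos
  )
p_labeled
```

Sharing the same position object matters because a position_jitter() created without an explicit seed draws its seed once at creation, so both layers receive the same offsets.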
The draw_genes_barplot() function displays a bar plot of the number of genes per chromosome.
We can also select specific chromosomes by specifying the values in the chromosomes parameter. For example, to display only chromosomes 1 and 2 separately and ignore all others:
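A sketch of the chromosome selection, assuming that draw_genes_barplot() is the hermes function taking the chromosomes argument and that include_others = FALSE is what drops the remaining chromosomes.

```r
library(hermes)

object <- hermes_data

# Default: genes per chromosome across all chromosomes.
p_all <- draw_genes_barplot(object)
p_all

# Only chromosomes 1 and 2, without a combined bar for the rest.
p_chr <- draw_genes_barplot(
  object,
  chromosomes = c("1", "2"),
  include_others = FALSE
)
p_chr
```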
It is also possible to call all available draw_* functions with one function, autoplot(). Note: autoplot() does not allow for customization of the plots. We recommend using the appropriate draw_* function if you wish to make adjustments to the plots.
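A sketch of the one-call variant on the example object. Since autoplot() takes no styling arguments here, it is best suited to a quick first look.

```r
library(hermes)

object <- hermes_data

# Draws all QC plots in one go, with default settings for each.
autoplot(object)
```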
We start by importing a MultiAssayExperiment; here we use the example multi_assay_experiment available in hermes. It is wrapped as a teal::dataset. We can then use the provided teal module tm_g_quality to include a QC module in our teal app.
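The app code is not shown on this page, so the following is only a sketch under stated assumptions. It uses teal_data() to wrap the data, which is the approach in recent teal versions such as the one listed in the session info below; the dataset name "MAE" and the module label are illustrative choices.

```r
library(hermes)
library(teal.modules.hermes)

# Wrap the example MultiAssayExperiment for teal under the name "MAE"
# (assumption: teal_data() is available, as in teal 0.15 and later).
data <- teal_data(MAE = hermes::multi_assay_experiment)

app <- init(
  data = data,
  modules = modules(
    tm_g_quality(
      label = "Quality Control", # illustrative label
      mae_name = "MAE"
    )
  )
)

# Only launch the app in an interactive session.
if (interactive()) {
  shinyApp(app$ui, app$server)
}
```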
[INFO] 2024-09-14 17:32:21.9249 pid:5196 token:[fcb95150] teal Initializing reporter_previewer_module
Warning: 'experiments' dropped; see 'drops()'
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] teal.modules.hermes_0.1.6 teal_0.15.2
[3] teal.slice_0.5.1 teal.data_0.6.0
[5] teal.code_0.5.0 shiny_1.9.1
[7] ggrepel_0.9.6 hermes_1.8.1
[9] SummarizedExperiment_1.34.0 Biobase_2.64.0
[11] GenomicRanges_1.56.1 GenomeInfoDb_1.40.1
[13] IRanges_2.38.1 S4Vectors_0.42.1
[15] BiocGenerics_0.50.0 MatrixGenerics_1.16.0
[17] matrixStats_1.4.1 ggfortify_0.4.17
[19] ggplot2_3.5.1
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 jsonlite_1.8.8
[3] shape_1.4.6.1 MultiAssayExperiment_1.30.3
[5] magrittr_2.0.3 farver_2.1.2
[7] rmarkdown_2.28 ragg_1.3.3
[9] GlobalOptions_0.1.2 zlibbioc_1.50.0
[11] vctrs_0.6.5 memoise_2.0.1
[13] webshot_0.5.5 BiocBaseUtils_1.7.3
[15] htmltools_0.5.8.1 S4Arrays_1.4.1
[17] forcats_1.0.0 progress_1.2.3
[19] curl_5.2.2 SparseArray_1.4.8
[21] sass_0.4.9 bslib_0.8.0
[23] fontawesome_0.5.2 htmlwidgets_1.6.4
[25] testthat_3.2.1.1 httr2_1.0.4
[27] cachem_1.1.0 teal.widgets_0.4.2
[29] mime_0.12 lifecycle_1.0.4
[31] iterators_1.0.14 pkgconfig_2.0.3
[33] webshot2_0.1.1 Matrix_1.7-0
[35] R6_2.5.1 fastmap_1.2.0
[37] GenomeInfoDbData_1.2.12 rbibutils_2.2.16
[39] clue_0.3-65 digest_0.6.37
[41] colorspace_2.1-1 shinycssloaders_1.1.0
[43] ps_1.8.0 AnnotationDbi_1.66.0
[45] DESeq2_1.44.0 textshaping_0.4.0
[47] RSQLite_2.3.7 filelock_1.0.3
[49] labeling_0.4.3 fansi_1.0.6
[51] httr_1.4.7 abind_1.4-8
[53] compiler_4.4.1 bit64_4.0.5
[55] withr_3.0.1 doParallel_1.0.17
[57] backports_1.5.0 BiocParallel_1.38.0
[59] DBI_1.2.3 logger_0.3.0
[61] biomaRt_2.60.1 rappdirs_0.3.3
[63] DelayedArray_0.30.1 rjson_0.2.22
[65] chromote_0.3.1 tools_4.4.1
[67] httpuv_1.6.15 glue_1.7.0
[69] callr_3.7.6 promises_1.3.0
[71] grid_4.4.1 checkmate_2.3.2
[73] cluster_2.1.6 generics_0.1.3
[75] gtable_0.3.5 websocket_1.4.2
[77] tidyr_1.3.1 hms_1.1.3
[79] xml2_1.3.6 utf8_1.2.4
[81] XVector_0.44.0 foreach_1.5.2
[83] pillar_1.9.0 stringr_1.5.1
[85] limma_3.60.4 later_1.3.2
[87] circlize_0.4.16 dplyr_1.1.4
[89] BiocFileCache_2.12.0 lattice_0.22-6
[91] bit_4.0.5 tidyselect_1.2.1
[93] ComplexHeatmap_2.20.0 locfit_1.5-9.10
[95] Biostrings_2.72.1 knitr_1.48
[97] gridExtra_2.3 teal.logger_0.2.0
[99] edgeR_4.2.1 xfun_0.47
[101] statmod_1.5.0 brio_1.1.5
[103] stringi_1.8.4 UCSC.utils_1.0.0
[105] yaml_2.3.10 shinyWidgets_0.8.6
[107] evaluate_0.24.0 codetools_0.2-20
[109] tibble_3.2.1 cli_3.6.3
[111] systemfonts_1.1.0 xtable_1.8-4
[113] Rdpack_2.6.1 processx_3.8.4
[115] jquerylib_0.1.4 munsell_0.5.1
[117] teal.reporter_0.3.1 Rcpp_1.0.13
[119] EnvStats_3.0.0 dbplyr_2.5.0
[121] png_0.1-8 parallel_4.4.1
[123] assertthat_0.2.1 blob_1.2.4
[125] prettyunits_1.2.0 scales_1.3.0
[127] purrr_1.0.2 crayon_1.5.3
[129] GetoptLong_1.0.5 rlang_1.1.4
[131] formatR_1.14 KEGGREST_1.44.1
[133] shinyjs_2.1.0