The HermesData class is an extension of SummarizedExperiment::SummarizedExperiment
with additional validation criteria.
HermesData(object)
HermesDataFromMatrix(counts, ...)
object: (SummarizedExperiment)
input to create the HermesData object from.
If this is a RangedSummarizedExperiment, then the result will be
RangedHermesData.
counts: (matrix)
counts to create the HermesData object from.
...: additional arguments, e.g. rowData, colData, etc. passed to
SummarizedExperiment::SummarizedExperiment() internally. Note that if rowRanges
is passed instead of rowData, then the result will be a RangedHermesData object.
An object of class AnyHermesData (HermesData or RangedHermesData).
The additional criteria are:
The first assay must be counts containing non-missing, integer, non-negative values.
The following columns must be in rowData:
symbol (also often called HGNC or similar, example: "INMT")
desc (the gene name, example: "indolethylamine N-methyltransferase")
chromosome (the chromosome as string, example: "7")
size (the size of the gene in base pairs, example: 5468)
low_expression_flag (can be populated with add_quality_flags())
The following columns must be in colData:
low_depth_flag (can be populated with add_quality_flags())
tech_failure_flag (can be populated with add_quality_flags())
The object must have unique row and column names. The row names are the gene names and the column names are the sample names.
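The criteria on the counts and on the row and column names can be checked on a plain matrix before it is passed to a constructor. The following is a minimal base R sketch of such a pre-flight check; `check_counts()` is a hypothetical helper written for illustration, not a function provided alongside HermesData, and it does not replicate the actual validity method.

```r
# Hypothetical helper (illustration only): collects violations of the
# criteria listed above for a candidate counts matrix.
check_counts <- function(counts) {
  problems <- character()
  if (!is.matrix(counts) || !is.integer(counts)) {
    problems <- c(problems, "counts must be an integer matrix")
  }
  if (anyNA(counts)) {
    problems <- c(problems, "counts must not contain missing values")
  }
  if (any(counts < 0, na.rm = TRUE)) {
    problems <- c(problems, "counts must be non-negative")
  }
  if (is.null(rownames(counts)) || anyDuplicated(rownames(counts)) > 0) {
    problems <- c(problems, "row names must be present and unique")
  }
  if (is.null(colnames(counts)) || anyDuplicated(colnames(counts)) > 0) {
    problems <- c(problems, "column names must be present and unique")
  }
  problems
}

# A small matrix that satisfies the criteria (made-up gene and sample names).
good <- matrix(
  c(10L, 0L, 5L, 3L),
  nrow = 2,
  dimnames = list(c("GeneID:1", "GeneID:2"), c("sample_1", "sample_2"))
)

# A copy with a missing value and duplicated column names.
bad <- good
colnames(bad) <- c("sample_1", "sample_1")
bad[1, 1] <- NA

check_counts(good)
check_counts(bad)
```

An empty result means no problem was found by this sketch; the object constructors still run their own validation.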
Analogously, RangedHermesData is an extension of
SummarizedExperiment::RangedSummarizedExperiment and has the same
additional validation requirements. Methods can be defined for both classes at the
same time with the AnyHermesData signature.
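The mechanism behind a shared signature can be sketched with toy S4 classes. The example below assumes that AnyHermesData acts as a union of the two classes; the names `ToyData`, `RangedToyData`, `AnyToyData`, and `n_genes()` are hypothetical stand-ins that only illustrate how one method definition can serve both classes.

```r
library(methods)

# Two unrelated toy classes standing in for the two concrete classes.
setClass("ToyData", representation(n = "integer"))
setClass("RangedToyData", representation(n = "integer", ranges = "character"))

# A class union plays the role of the shared signature.
setClassUnion("AnyToyData", c("ToyData", "RangedToyData"))

# One generic and one method definition, dispatched for both classes.
setGeneric("n_genes", function(object) standardGeneric("n_genes"))
setMethod("n_genes", signature = "AnyToyData", function(object) object@n)

n_genes(new("ToyData", n = 3L))
n_genes(new("RangedToyData", n = 5L, ranges = "chr7"))
```

Defining the method once on the union avoids duplicating the same body for each class.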
A Biobase::ExpressionSet object can be imported in two steps: first convert it to a
SummarizedExperiment::SummarizedExperiment object with
SummarizedExperiment::makeSummarizedExperimentFromExpressionSet(), then pass
the result to HermesData().
prefix: common prefix of the gene IDs (row names).
Note that we use S4Vectors::setValidity2() to define the validity
method, which allows us to turn off the validity checks in internal
functions where intermediate objects may not be valid within the scope of
the function.
It can be helpful to convert character and logical variables to factors in colData()
(before or after the HermesData creation). We provide the utility function
df_cols_to_factor() to simplify this task, but leave it to the user to allow
for full control of the details.
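To make the idea concrete, here is a plain base R sketch of converting character and logical columns to factors. It works on an ordinary data.frame with made-up columns standing in for colData(), and it is not the implementation of df_cols_to_factor().

```r
# Hypothetical sample annotations standing in for colData().
samples <- data.frame(
  sex = c("F", "M", "F"),
  treated = c(TRUE, FALSE, TRUE),
  age = c(34, 51, 47),
  stringsAsFactors = FALSE
)

# Identify character and logical columns; numeric columns are left untouched.
to_convert <- vapply(
  samples,
  function(x) is.character(x) || is.logical(x),
  logical(1)
)

# Convert only the selected columns to factors.
samples[to_convert] <- lapply(samples[to_convert], factor)

str(samples)
```

Doing the conversion explicitly like this keeps decisions such as level order or the handling of missing values in the user's hands.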
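# Attach the packages used below. The example objects `expression_set` and
# `summarized_experiment` are assumed to be available in the session.
library(hermes)
library(SummarizedExperiment)
# Convert an `ExpressionSet` to a `RangedSummarizedExperiment`.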
ranged_summarized_experiment <- makeSummarizedExperimentFromExpressionSet(expression_set)
# Then convert to `RangedHermesData`.
HermesData(ranged_summarized_experiment)
#> class: RangedHermesData
#> assays(1): counts
#> genes(5085): GeneID:11185 GeneID:10677 ... GeneID:9087 GeneID:9426
#> additional gene information(12): HGNC HGNCGeneName ... chromosome_name
#> LowExpressionFlag
#> samples(20): 06520011B0023R 06520067C0018R ... 06520015C0016R
#> 06520019C0023R
#> additional sample information(74): Filename SampleID ... LowDepthFlag
#> TechnicalFailureFlag
# Create objects starting from a `SummarizedExperiment`.
hermes_data <- HermesData(summarized_experiment)
hermes_data
#> class: HermesData
#> assays(1): counts
#> genes(5085): GeneID:11185 GeneID:10677 ... GeneID:9087 GeneID:9426
#> additional gene information(12): HGNC HGNCGeneName ... chromosome_name
#> LowExpressionFlag
#> samples(20): 06520011B0023R 06520067C0018R ... 06520015C0016R
#> 06520019C0023R
#> additional sample information(74): Filename SampleID ... LowDepthFlag
#> TechnicalFailureFlag
# Create objects from a matrix. Note that additional arguments are not required but possible.
counts_matrix <- assay(summarized_experiment)
counts_hermes_data <- HermesDataFromMatrix(counts_matrix)