R/AllGenerics.R, R/GeneSetDb-methods.R
conform.Rd
Conforming a GeneSetDb to a target expression object is an important step required prior to performing any type of GSEA. This function maps the featureIds used in the GeneSetDb to the elements of a target expression object (i.e., the rows of an expression matrix, or the elements of a vector of gene-level statistics).
After conformation, each geneset in the GeneSetDb is flagged as active (or inactive) based on the number of its features that are successfully mapped to target and on the minimum and maximum number of genes per geneset, as specified by the min.gs.size and max.gs.size parameters, respectively. Only genesets that are marked with active = TRUE will be used in any downstream gene set operations.
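The flagging rule described above can be sketched in base R. This is a minimal illustration and not the package's implementation: the named list `gene_sets`, the vector `target_ids`, and the variables `min_gs_size` and `max_gs_size` are hypothetical stand-ins for a GeneSetDb, the target's feature IDs, and the two size parameters. Treating both size bounds as inclusive is also an assumption.

```r
# Hypothetical, simplified stand-in for a gene set collection: a named list of
# feature IDs. This is NOT the GeneSetDb class itself.
gene_sets <- list(
  set_a = c("g1", "g2", "g3", "g4"),
  set_b = c("g2", "g9"),
  set_c = c("g7", "g8", "g9")
)

# Stand-in for the target: row names of an expression matrix, or the names of
# a vector of gene-level statistics.
target_ids <- c("g1", "g2", "g3", "g5", "g6")

# Stand-ins for min.gs.size and max.gs.size (bounds assumed inclusive).
min_gs_size <- 2L
max_gs_size <- Inf

# N: number of features defined per set; n: number found in the target.
N <- lengths(gene_sets)
n <- vapply(gene_sets, function(ids) sum(ids %in% target_ids), integer(1))

# A set is active only when its mapped size falls inside the allowed range.
active <- n >= min_gs_size & n <= max_gs_size

conformed <- data.frame(name = names(gene_sets), active = active, N = N, n = n,
                        row.names = NULL)
print(conformed)
```

In this toy input only `set_a` keeps enough mapped features to stay active; the other two sets fall below the minimum size once unmapped features are dropped.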
conform(x, ...)
unconform(x, ...)
# S4 method for class 'GeneSetDb'
conform(
x,
target,
unique.by = c("none", "mean", "var"),
min.gs.size = 2L,
max.gs.size = Inf,
match.tolerance = 0.25,
...
)
# S4 method for class 'GeneSetDb'
unconform(x, ...)
is.conformed(x, to)
x: The GeneSetDb.
...: Additional arguments.
target: The expression object/matrix to conform to. This can also be a character vector of IDs.
unique.by: If multiple rows map to the identifiers used in the genesets, this determines how the single row for that ID is picked.
min.gs.size: Ensures that the genesets that make their way to GeneSetDb@table are of a minimum size.
max.gs.size: Ensures that the genesets that make their way to GeneSetDb@table are smaller than this size.
match.tolerance: Numeric value in [0, 1]. If the fraction of feature_ids used in x that match rownames(target) is below this number, a warning is issued.
to: The object to test conformation to.
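The match.tolerance check can be pictured as a simple fraction test. The helper below, `check_match_fraction()`, is hypothetical and is not part of the package API; it only sketches the idea under the assumption that the fraction is computed over the unique feature IDs.

```r
# Hypothetical helper illustrating the match.tolerance idea. It is not a
# function exported by the package.
check_match_fraction <- function(feature_ids, target_ids, match_tolerance = 0.25) {
  stopifnot(
    length(feature_ids) > 0,
    match_tolerance >= 0,
    match_tolerance <= 1
  )
  # Assumption: each feature ID is counted once, however many sets use it.
  feature_ids <- unique(feature_ids)
  frac <- mean(feature_ids %in% target_ids)
  if (frac < match_tolerance) {
    warning(sprintf("Only %.0f%% of feature IDs match the target", 100 * frac))
  }
  frac
}

# Two of the four feature IDs are present in the target, so no warning fires.
ok <- check_match_fraction(c("g1", "g2", "g3", "g4"), c("g1", "g2", "g9"))
```

A very low match fraction usually points to an identifier mismatch (for example, gene symbols in the gene sets but Entrez IDs in the target), which is why a warning is more useful here than silently deactivating most gene sets.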
A GeneSetDb that has been matched/conformed to the expression object target.
is.conformed(): Checks to see if GeneSetDb x is conformed to a target object to.
es <- exampleExpressionSet()
gdb <- exampleGeneSetDb()
head(geneSets(gdb))
#> collection name active N n
#> 1 c2 BIOCARTA_AGPCR_PATHWAY FALSE 13 NA
#> 2 c2 BOYAULT_LIVER_CANCER_SUBCLASS_G123_DN FALSE 51 NA
#> 3 c2 BURTON_ADIPOGENESIS_PEAK_AT_2HR FALSE 51 NA
#> 4 c2 BYSTRYKH_HEMATOPOIESIS_STEM_CELL_IL3RA FALSE 9 NA
#> 5 c2 CAIRO_PML_TARGETS_BOUND_BY_MYC_UP FALSE 23 NA
#> 6 c2 CHARAFE_BREAST_CANCER_BASAL_VS_MESENCHYMAL_DN FALSE 50 NA
gdb <- conform(gdb, es)
## Note the updated values of the `active` flag and of `n` (the number of
## features mapped per gene set)
head(geneSets(gdb))
#> collection name active N n
#> 1 c2 BIOCARTA_AGPCR_PATHWAY TRUE 13 11
#> 2 c2 BOYAULT_LIVER_CANCER_SUBCLASS_G123_DN TRUE 51 41
#> 3 c2 BURTON_ADIPOGENESIS_PEAK_AT_2HR TRUE 51 50
#> 4 c2 BYSTRYKH_HEMATOPOIESIS_STEM_CELL_IL3RA TRUE 9 6
#> 5 c2 CAIRO_PML_TARGETS_BOUND_BY_MYC_UP TRUE 23 23
#> 6 c2 CHARAFE_BREAST_CANCER_BASAL_VS_MESENCHYMAL_DN TRUE 50 45