Adds preliminary support for data-frame based inputs to multiGSEA() to more easily support pre-ranked and ORA-like tests. When x is a data.frame, use the rank_by parameter to specify the column name that holds the rank metric, and rank_order to specify how to order the ranks ("descending", "ascending", or "ordered").
The implementation is still tied to the original design of the package, which only supported a numeric ranking vector, and hijacks the whole xmeta. workaround. Consider using FacileAnalysis::ffsea(data.frame, gdb) if you want a more coherent interface.
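As a rough illustration of what the three rank_order modes mean, the base R sketch below turns a data.frame into a named ranking vector. It is not the multiGSEA implementation: the helper rank_df, the feature_id column name, and the reading of "ordered" as "rows are already in rank order" are all assumptions made for this example.

```r
# Hypothetical helper (not part of multiGSEA): build a named ranking vector
# from a data.frame, following the three rank_order modes described above.
rank_df <- function(x, rank_by,
                    rank_order = c("descending", "ascending", "ordered"),
                    id_col = "feature_id") {  # id_col is an assumed column name
  rank_order <- match.arg(rank_order)
  stopifnot(is.data.frame(x), rank_by %in% names(x), id_col %in% names(x))
  ranks <- setNames(x[[rank_by]], x[[id_col]])
  switch(rank_order,
         descending = sort(ranks, decreasing = TRUE),
         ascending  = sort(ranks, decreasing = FALSE),
         ordered    = ranks)  # assumption: rows already arrive in rank order
}

# Toy input: three features with a logFC rank metric.
genes <- data.frame(
  feature_id = c("g1", "g2", "g3"),
  logFC      = c(0.5, -2.0, 1.5),
  stringsAsFactors = FALSE
)

ranked <- rank_df(genes, rank_by = "logFC", rank_order = "descending")
print(names(ranked))
```

With "descending", the feature with the largest rank metric comes first; "ordered" leaves the rows exactly as supplied.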
The geneSet,MultiGSEAResult method does not remove duplicated columns present in both geneSet(gdb) and logFC, but rather appends a *.gs suffix to the duplicated columns from geneSet. This happened because I defined a geneset with a logFC column (from the experiment the set was derived from), but this collided with (and superseded) the logFC value for the gene returned from logFC(mgresult), and messed up interpretation of a geneset’s move across a contrast.
scoreSingleSamples now defaults to returning a melted data.frame of results (previously we returned a (list of) matrix of scores). The melted=FALSE parameter was changed to as.matrix=FALSE.
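For readers unsure how the melted and matrix forms relate, here is a minimal base R sketch that pivots a long-format table of scores back into a geneset-by-sample matrix. The column names (name, sample, score) and the toy values are assumptions for illustration and may not match what scoreSingleSamples actually returns.

```r
# Toy long-format ("melted") scores; column names are illustrative only.
melted <- data.frame(
  name   = c("setA", "setA", "setB", "setB"),
  sample = c("s1", "s2", "s1", "s2"),
  score  = c(0.1, 0.4, -0.3, 0.2),
  stringsAsFactors = FALSE
)

# Pivot to a geneset x sample matrix. Each (name, sample) pair occurs once,
# so sum() simply passes the single score through.
score_matrix <- with(melted, tapply(score, list(name, sample), sum))
print(score_matrix)
```

The melted form is convenient for plotting and joining with sample annotations, while the matrix form suits heatmaps and other matrix-based tools.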
Internal MSigDB collections updated to v5.1 (as of v0.4.30). The major difference in v5.1 vs v5.0 seems to be an updated c7 (immunologic) collection, which comes on the heels of this paper: Compendium of Immune Signatures Identifies Conserved and Species-Specific Biology in Response to Inflammation http://www.cell.com/immunity/abstract/S1074-7613(15)00532-4
All methods now default to returning data.frame(s) instead of data.table(s). data.table is still used internally; however, I understand that some people don’t want to have them returned into their workspace. A global option (multiGSEA.df.return) is available for you to tweak this behavior, i.e. set options(multiGSEA.df.return='data.table') to have geneSets(gdb) return a data.table instead of a data.frame.
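A short sketch of setting, reading, and restoring the global option with base R follows. It assumes the option value is the string "data.table" and that an unset option is treated as "data.frame"; how multiGSEA consults the option internally is not shown here.

```r
# Opt in to data.table returns; options() hands back the previous setting.
old <- options(multiGSEA.df.return = "data.table")
current <- getOption("multiGSEA.df.return")
print(current)

# Restore whatever was set before (an unset option is removed again).
options(old)

# Assumption: when the option is unset, the default behavior is data.frame.
restored <- getOption("multiGSEA.df.return", default = "data.frame")
```

Saving the result of options() and passing it back is the usual way to make a temporary change without clobbering a user's existing setting.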
Changed geneSetFeatureStatistics to geneSetsStats
A use.treat parameter has been added to relevant places to run the internal differential testing via the “treat” framework made available in limma and edgeR via the treat and glmTreat “pipelines,” respectively. By default, no pipeline uses the treat framework (but perhaps they should).
goseq, fry, and romer testing methods have been added.
Support for DGEList expression input fully baked in. Previously, a DGEList was shot through voom to be used internally. Internal differential gene expression statistics on DGELists now use edgeR’s quasi-likelihood framework. See ?edgeR::glmQLFit for more information, and particularly reference Aaron Lun’s tutorial on using this approach here:
Adds support for PANTHER pathway (and GOSLIM) genesets from the PANTHER.db package via the getPantherGeneSetDb function.
Added geneSetFeatureStats to return the logFC (and membership) information for a geneset after a multiGSEA run.
Added constructors and as() coercion functions to create GeneSetDbs from GeneSetCollection(s) and data.frames (and vice versa).
scoreSingleSamples is another wrapper function which calls several single sample geneset scoring algorithms, including the ones provided by the GSVA package (such as plage and ssGSEA).
Adds expression-utils.R for CPM/RPKM and meltx. It also has methods to facilitate “universal access” to pData, fData, and expression data from the different often-used expression containers (ExpressionSet, EList, DGEList, SummarizedExperiment).
MSigDB definitions were updated to v5.0, which adds a new “hallmark (h)” gene set. A call to getMSigDBset without a version parameter will return the v5.0 data.
Note that MSigDB provides the Entrez IDs for each geneset as human genes. The mouse version of these gene sets was constructed by mapping the human Entrez IDs to their mouse orthologs via igis::orthologs. When this mapping returns multiple mouse IDs for a single human ID, all of the mouse IDs are kept in the mouse geneset.
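The one-to-many rule described above can be sketched in base R as follows. The ortholog table, its IDs, and the to_mouse helper are made up for illustration; this does not call igis::orthologs and only mirrors the "keep every mouse ID" behavior.

```r
# Toy ortholog table; the IDs are invented for this example.
# Human gene "2" maps to two mouse genes.
orthologs <- data.frame(
  human = c("1", "2", "2", "3"),
  mouse = c("101", "201", "202", "301"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a human gene set to mouse, keeping every ortholog.
to_mouse <- function(human_ids, map) {
  unique(map$mouse[map$human %in% human_ids])
}

human_set <- c("1", "2")
mouse_set <- to_mouse(human_set, orthologs)
print(mouse_set)
```

One consequence of this rule is that a mouse gene set can contain more IDs than the human set it was derived from.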
To get the previous v4.0 genesets (provided by WEHI), you would simply specify version='v4.0' in the getMSigDBSet call like so: