R/AllGenerics.R, R/GeneSetDb-methods.R, R/SparrowResult-methods.R
collectionMetadata.Rd
Associates key:value metadata with a gene set collection of a GeneSetDb().
collectionMetadata(x, collection, name, ...)
geneSetURL(x, i, j, ...)
featureIdType(x, i, ...)
featureIdType(x, i) <- value
# S4 method for class 'GeneSetDb,missing,missing'
collectionMetadata(x, collection, name, as.dt = FALSE)
# S4 method for class 'GeneSetDb,character,missing'
collectionMetadata(x, collection, name, as.dt = FALSE)
# S4 method for class 'GeneSetDb,character,character'
collectionMetadata(x, collection, name, as.dt = FALSE)
# S4 method for class 'GeneSetDb'
geneSetURL(x, i, j, ...)
# S4 method for class 'GeneSetDb'
featureIdType(x, i) <- value
# S4 method for class 'GeneSetDb'
featureIdType(x, i, ...)
addCollectionMetadata(
x,
xcoll,
xname,
value,
validate.value.fn = NULL,
allow.add = TRUE
)
# S4 method for class 'SparrowResult'
geneSetURL(x, i, j, ...)
The gene set collection to query
The name of the metadata variable to get the value for
not used yet
The collection,name compound key identifier of the gene set
The value of the metadata variable
If FALSE (default), the data.frame-like object that this function returns is converted to a data.frame. Set this to TRUE to keep the object as a data.table.
The collection name
The name of the metadata variable
If a function is provided, it is run on value and must return TRUE for the addition to be made.
If FALSE, the xcoll,xname entry is expected to be in the GeneSetDb already; if it is not, the call fails, because something is deeply wrong.
A character vector of URLs for each of the genesets identified by i, j. NA is returned for genesets i,j that are not found in x.
The updated GeneSetDb.
The design of the GeneSetDb is such that we assume that groups of gene sets are usually defined together and will therefore share similar metadata. These groups of gene sets will fall into the same "collection", and, therefore, metadata for particular gene sets are tracked at the collection level.
Types of metadata being referred to could be things like the organism that a batch of gene sets were defined in, the type of feature identifiers that a collection of gene sets uses (e.g. GSEABase::EntrezIdentifier()), or a URL pattern built from the collection,name compound key that one can browse to in order to find out more information about the gene set.
There are explicit helper functions that set and get these aforementioned metadata, namely featureIdType(), geneSetCollectionURLfunction(), and geneSetURL(). Arbitrary metadata can be stored at the collection level using the addCollectionMetadata() function. More details are provided below.
collectionMetadata(x = GeneSetDb, collection = missing, name = missing): Returns metadata for all collections.
collectionMetadata(x = GeneSetDb, collection = character, name = missing): Returns all metadata for a specific collection.
collectionMetadata(x = GeneSetDb, collection = character, name = character): Returns the name metadata value for a given collection.
geneSetURL(GeneSetDb): Returns the URL for a geneset.
featureIdType(GeneSetDb) <- value: Sets the feature id type for a collection.
featureIdType(GeneSetDb): Retrieves the feature id type for a collection.
geneSetURL(SparrowResult): Returns the URL for a geneset from a SparrowResult object.
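The three collectionMetadata() signatures listed above can be exercised as in the following sketch. It assumes the package that provides these functions is installed and is named sparrow (an assumption based on the SparrowResult class), and it reuses the 'c2' collection and the 'foo'/'bar' placeholder pair from the examples on this page. No output is shown because the exact columns returned are not specified here.

```r
library(sparrow)  # assumed package name

gdb <- exampleGeneSetDb()

# Store a placeholder key:value pair so there is a known entry to look up.
gdb <- addCollectionMetadata(gdb, 'c2', 'foo', 'bar')

# 1. No collection, no name: metadata for every collection.
all.meta <- collectionMetadata(gdb)

# 2. Collection only: all metadata for the 'c2' collection.
c2.meta <- collectionMetadata(gdb, 'c2')

# 3. Collection and name: the value stored under 'foo' for 'c2'.
foo.value <- collectionMetadata(gdb, 'c2', 'foo')
```

With the default as.dt = FALSE, the first two calls should return plain data.frame objects.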
A URL function can be defined per collection that takes the collection,name compound key and generates a URL for the gene set that the user can browse to for further information. For instance, the geneSetCollectionURLfunction() for the MSigDB collections is defined like so:
url.fn <- function(collection, name) {
  url <- 'http://www.broadinstitute.org/gsea/msigdb/cards/%s.html'
  sprintf(url, name)
}
gdb <- getMSigGeneSetDb('H')
geneSetCollectionURLfunction(gdb, 'H') <- url.fn
In this way, a call to geneSetURL(gdb, 'H', 'HALLMARK_ANGIOGENESIS') will return http://www.broadinstitute.org/gsea/msigdb/cards/HALLMARK_ANGIOGENESIS.html.
This function is vectorized over i and j.
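The vectorization can be seen in the URL function itself, without any package code. The sketch below uses only base R; the second gene set name is included purely for illustration, and how geneSetURL() dispatches to the function internally is not shown.

```r
# Same URL function as in the text above.
url.fn <- function(collection, name) {
  url <- 'http://www.broadinstitute.org/gsea/msigdb/cards/%s.html'
  sprintf(url, name)
}

# sprintf() is vectorized over its arguments, so one call yields one URL per
# gene set name. This particular function accepts collection but ignores it.
set.collections <- c('H', 'H')
set.names <- c('HALLMARK_ANGIOGENESIS', 'HALLMARK_HYPOXIA')
urls <- url.fn(set.collections, set.names)
print(urls)
```

A URL function written this way returns a character vector with one element per name, which is the shape a vectorized lookup needs.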
When defining a set of gene sets in a collection, the identifiers used must be of the same type. Most often you'll probably be working with Entrez identifiers, simply because that's what most of the annotations work with.
As such, you'd declare that your collection uses Entrez identifiers through the featureIdType<- replacement method.
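A minimal sketch of declaring the identifier type for a collection, reusing the calls from the examples on this page. It assumes the package is installed under the name sparrow (an assumption based on the SparrowResult class) together with GSEABase, and it uses the 'c2' collection of the example database.

```r
library(sparrow)  # assumed package name

gdb <- exampleGeneSetDb()

# Declare that the gene sets in the 'c2' collection use Entrez identifiers.
featureIdType(gdb, 'c2') <- GSEABase::EntrezIdentifier()

# Read the identifier type back for the same collection.
id.type <- featureIdType(gdb, 'c2')
```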
Adds arbitrary metadata to a gene set collection of a GeneSetDb.
Note that this is not a replacement method! You must catch the returned object to keep the one with the updated collectionMetadata. Although this function is exported, I imagine this being used mostly through predefined replacement methods that use this as a utility function, such as featureIdType<-, geneSetCollectionURLfunction<-, etc.
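The sketch below shows the assign-the-result pattern together with the validate.value.fn argument. It assumes the package is installed under the name sparrow (an assumption based on the SparrowResult class); the validator is.single.string is a hypothetical helper written for this illustration, and 'foo'/'bar' is the placeholder pair used in the examples on this page.

```r
library(sparrow)  # assumed package name

gdb <- exampleGeneSetDb()

# Hypothetical validator: accept only a single character string.
is.single.string <- function(value) {
  is.character(value) && length(value) == 1L
}

# Not a replacement method: assign the result to keep the updated object.
gdb.updated <- addCollectionMetadata(
  gdb, 'c2', 'foo', 'bar',
  validate.value.fn = is.single.string
)
```

Passing a validator is optional; it is a convenient place to reject malformed values before they are stored.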
gdb <- exampleGeneSetDb()
# Gene Set URLs
geneSetURL(gdb, 'c2', 'BIOCARTA_AGPCR_PATHWAY')
#> c2
#> "http://www.broadinstitute.org/gsea/msigdb/cards/BIOCARTA_AGPCR_PATHWAY.html"
geneSetURL(gdb, c('c2', 'c7'),
c('BIOCARTA_AGPCR_PATHWAY', 'GSE14308_TH2_VS_TH1_UP'))
#> c2
#> "http://www.broadinstitute.org/gsea/msigdb/cards/BIOCARTA_AGPCR_PATHWAY.html"
#> c7
#> "http://www.broadinstitute.org/gsea/msigdb/cards/GSE14308_TH2_VS_TH1_UP.html"
# feature id types
featureIdType(gdb, "c2") <- GSEABase::EntrezIdentifier()
featureIdType(gdb, "c2")
#> geneIdType: EntrezId
## Arbitrary metadata
gdb <- addCollectionMetadata(gdb, 'c2', 'foo', 'bar')
cmh <- collectionMetadata(gdb, 'c2', as.dt = TRUE) ## print this to see