eigenWeightedMean.Rd
Weights for the genes in x are calculated from the percentage that each gene contributes to the principal component indicated by eigengene.
eigenWeightedMean(
  x,
  eigengene = 1L,
  center = TRUE,
  scale = TRUE,
  uncenter = center,
  unscale = scale,
  retx = FALSE,
  weights = NULL,
  normalize = FALSE,
  all.x = NULL,
  ...,
  .drop.sd = 1e-04
)
Argument | Description |
---|---|
x | An expression matrix of genes x samples. When using this to score geneset activity, you want to reduce the rows of |
eigengene | the PC used to extract the gene weights from |
center | center and/or scale data before scoring? |
scale | center and/or scale data before scoring? |
uncenter | uncenter and unscale the data on the way out? Defaults to the respective values of center and scale |
unscale | uncenter and unscale the data on the way out? Defaults to the respective values of center and scale |
retx | Works the same as |
weights | a user can pass in a prespecified set of weights using a named numeric vector. The names must be a superset of |
normalize | If |
all.x | if the user is trying to normalize these scores, an expression matrix that has a superset of the control genes needs to be provided, where the columns of |
... | these aren't used in here |
.drop.sd | When zero-sd (non-varying) features are scaled, their values are |
A list of useful transformation information. The caller is likely most interested in the $score vector, but other bits related to the SVD/PCA decomposition come along for the ride.
You will generally want the rows of the gene x sample matrix `x` to be z-transformed. If it is not already, ensure that `center` and `scale` are set to `TRUE`.
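As a rough illustration of what a row-wise z-transformation looks like in base R, the sketch below standardizes each gene of a small made-up matrix. The toy data and the object names (`x`, `z`) are assumptions for illustration; this is not the package's internal code.

```r
# Toy genes x samples matrix (made-up data, for illustration only)
set.seed(1)
x <- matrix(rnorm(5 * 8, mean = 10, sd = 2), nrow = 5,
            dimnames = list(paste0("g", 1:5), paste0("s", 1:8)))

# scale() standardizes columns, so transpose to z-transform each gene (row),
# then transpose back to the genes x samples orientation
z <- t(scale(t(x), center = TRUE, scale = TRUE))

rowMeans(z)      # numerically zero for every gene
apply(z, 1, sd)  # one for every gene
```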
When uncenter and/or unscale are FALSE, the scores are calculated on the centered and/or scaled values, respectively.
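The following is a minimal sketch of one plausible reading of the scoring idea, written in base R on made-up data. It assumes that a gene's "percent contribution" to a principal component is its squared loading divided by the sum of squared loadings; the package's actual computation may differ, and all object names here are hypothetical.

```r
# Toy genes x samples matrix (made-up data, for illustration only)
set.seed(42)
x <- matrix(rnorm(6 * 10), nrow = 6,
            dimnames = list(paste0("g", 1:6), paste0("s", 1:10)))

# z-transform each gene (row); the vectors recycle down the columns,
# so each row is shifted and scaled by its own mean and sd
ctr <- rowMeans(x)
sds <- apply(x, 1, sd)
z <- (x - ctr) / sds

# PCA with samples as observations and genes as variables
pca <- prcomp(t(z), center = FALSE, scale. = FALSE)

# ASSUMPTION: weight = squared loading on the chosen PC, rescaled to sum to 1
eigengene <- 1L
loadings <- pca$rotation[, eigengene]
w <- loadings^2 / sum(loadings^2)

# Weighted mean per sample on the z-scored values
# (analogous to uncenter = FALSE, unscale = FALSE)
score_z <- colSums(z * w)

# Weighted mean per sample on the original values
# (analogous to uncenter = TRUE, unscale = TRUE)
score_x <- colSums(x * w)
```

Because the weights sum to one, both vectors are weighted means across genes; they differ only in whether each gene's own mean and standard deviation have been removed before averaging.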
Scores can be normalized against a set of control genes. This results in negative and positive sample scores. Positive scores are ones where the specific geneset score is higher than the aggregate control-geneset score.
Genes used for the control set can either be randomly sampled from the rows of the all.x expression matrix (when normalize = TRUE), or explicitly specified by a row-identifier character vector passed to the normalize parameter. In both cases, the code prefers to select a random-control geneset of the same size as nrow(x). If that's not possible, we use as many genes as we can get.
Note that normalization requires an expression matrix to be passed into the all.x parameter, whose columns match 1:1 to the columns in x.
Calling scoreSingleSamples() with method = "ewm", normalize = TRUE handles this transparently.
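To make the control-set idea concrete, here is a hedged base R sketch on made-up data. It uses a plain per-sample mean as a stand-in scorer (`score_fn`, a hypothetical helper) and assumes that control genes are drawn from rows of all.x outside the gene set and that the control score is subtracted from the geneset score. The package may select controls and combine scores differently.

```r
# Toy "universe" expression matrix and a gene set drawn from it
# (made-up data, for illustration only)
set.seed(7)
all.x <- matrix(rnorm(200 * 6), nrow = 200,
                dimnames = list(paste0("g", 1:200), paste0("s", 1:6)))
geneset <- paste0("g", 1:15)
x <- all.x[geneset, , drop = FALSE]

# Hypothetical stand-in scorer: the plain mean of the genes in each sample
score_fn <- function(m) colMeans(m)

# ASSUMPTION: candidate controls are the rows of all.x not in the gene set
pool <- setdiff(rownames(all.x), rownames(x))

# Prefer as many control genes as nrow(x); otherwise take what is available
n_ctrl <- min(nrow(x), length(pool))
ctrl <- sample(pool, n_ctrl)

# Positive values: the gene set scores higher than the control set
normalized <- score_fn(x) - score_fn(all.x[ctrl, , drop = FALSE])
```

The columns of x and all.x line up one to one here because x was subset from all.x, which mirrors the requirement stated above.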
The idea to implement this method of normalization was inspired by the ctrl.score normalization found in Seurat's Seurat::AddModuleScore() function.
scoreSingleSamples
vm <- exampleExpressionSet(do.voom=TRUE)
gdb <- conform(getMSigGeneSetDb('H', "human", "entrez"), vm)
features <- featureIds(gdb, 'H', 'HALLMARK_INTERFERON_GAMMA_RESPONSE', value='x.idx')
scores <- eigenWeightedMean(vm[features,])$score

## Use scoreSingleSamples to facilitate scoring of all gene sets
scores.all <- scoreSingleSamples(gdb, vm, 'ewm')
s2 <- with(subset(scores.all, name == 'HALLMARK_INTERFERON_GAMMA_RESPONSE'),
           setNames(score, sample_id))
all.equal(s2, scores)
#> [1] TRUE