Find most variable rows
var_rows.Rd
Calculates the variance of each row of the chi-square component matrix and retains the rows with the highest variance (the top 5,000 by default).
Arguments
- mat
A numeric matrix. For sequencing data, a count matrix of gene expression values with genes in rows and samples/cells in columns. Should contain row and column names.
- residuals
Character string specifying which kind of residuals to calculate. Can be "pearson" (default), "freemantukey" or "NB" (negative binomial).
- top
Integer. Number of most variable rows to retain. Default 5000.
- ...
Further arguments for `calc_residuals`.
Examples
set.seed(1234)
# Simulate counts
cnts <- mapply(function(x){rpois(n = 500, lambda = x)},
x = sample(1:20, 50, replace = TRUE))
rownames(cnts) <- paste0("gene_", 1:nrow(cnts))
colnames(cnts) <- paste0("cell_", 1:ncol(cnts))
# Request the top 5000 most variable genes. The matrix has only 500 rows,
# so `top` is reduced to nrow(cnts) (see the warning below).
cnts <- var_rows(mat = cnts, top = 5000)
#> Warning: Top is larger than the number of rows in matrix. Top was set to nrow(mat).
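The selection step described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: `top_var_rows_sketch` is a hypothetical helper, and it assumes that the chi-square components are the squared Pearson residuals of the correspondence matrix. It does not call `calc_residuals` and covers only the "pearson" case.

```r
# Hypothetical helper, not part of the package: illustrates the idea only.
# Assumes chi-square components = squared Pearson residuals.
top_var_rows_sketch <- function(mat, top = 5000) {
  stopifnot(is.matrix(mat), is.numeric(mat))
  top <- min(top, nrow(mat))          # cap `top` at the number of rows
  P <- mat / sum(mat)                 # correspondence matrix (proportions)
  E <- outer(rowSums(P), colSums(P))  # expected proportions under independence
  S <- (P - E) / sqrt(E)              # Pearson residuals
  chisq <- S^2                        # chi-square components
  v <- apply(chisq, 1, var)           # variance of each row
  keep <- order(v, decreasing = TRUE)[seq_len(top)]
  mat[keep, , drop = FALSE]           # keep the most variable rows
}

set.seed(1)
# Small simulated matrix; adding 1 avoids all-zero rows or columns.
m <- matrix(rpois(200, lambda = 5) + 1, nrow = 20,
            dimnames = list(paste0("gene_", 1:20), paste0("cell_", 1:10)))
sub <- top_var_rows_sketch(m, top = 5)
dim(sub)
```

The sketch caps `top` silently, whereas `var_rows` emits the warning shown above when `top` exceeds the number of rows.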