caclust
`caclust()` performs biclustering on either a "cacomp" or "SingleCellExperiment" object.
Usage
caclust(
obj,
k,
algorithm = "leiden",
SNN_prune = 1/15,
loops = FALSE,
mode = "out",
select_genes = TRUE,
prune_overlap = TRUE,
overlap = 0.2,
calc_gene_cell_kNN = FALSE,
resolution = 1,
marker_genes = NULL,
n.int = 10,
rand_seed = 2358,
use_gap = TRUE,
nclust = NULL,
spectral_method = "kmeans",
iter_max = 10,
num_seeds = 10,
return_eig = TRUE,
dims = NULL,
cast_to_dense = TRUE,
...
)
# S4 method for cacomp
caclust(
obj,
k,
algorithm = "leiden",
SNN_prune = 1/15,
loops = FALSE,
mode = "out",
select_genes = TRUE,
prune_overlap = TRUE,
overlap = 0.2,
calc_gene_cell_kNN = FALSE,
resolution = 1,
marker_genes = NULL,
n.int = 10,
rand_seed = 2358,
use_gap = TRUE,
nclust = NULL,
spectral_method = "kmeans",
iter_max = 10,
num_seeds = 10,
return_eig = TRUE,
dims = NULL,
cast_to_dense = TRUE,
method = BiocNeighbors::KmknnParam(),
BPPARAM = BiocParallel::SerialParam(),
leiden_pack = "igraph",
...
)
# S4 method for SingleCellExperiment
caclust(
obj,
k,
algorithm = "leiden",
SNN_prune = 1/15,
loops = FALSE,
mode = "out",
select_genes = TRUE,
prune_overlap = TRUE,
overlap = 0.2,
calc_gene_cell_kNN = FALSE,
resolution = 1,
marker_genes = NULL,
n.int = 10,
rand_seed = 2358,
use_gap = TRUE,
nclust = NULL,
spectral_method = "kmeans",
iter_max = 10,
num_seeds = 10,
return_eig = TRUE,
dims = NULL,
cast_to_dense = TRUE,
method = BiocNeighbors::KmknnParam(),
BPPARAM = BiocParallel::SerialParam(),
leiden_pack = "igraph",
...,
caclust_meta_name = "caclust",
cacomp_meta_name = "CA"
)
Arguments
- obj
A cacomp object or SingleCellExperiment object
- k
Either an integer (same k for all subgraphs) or a vector of exactly four integers specifying in this order: the k_c for the cell-cell kNN-graph, k_g for the gene-gene kNN-graph, k_cg for the cell-gene kNN-graph, k_gc for the gene-cell kNN-graph.
- algorithm
Character. Algorithm for clustering. Options are "leiden" or "spectral". Default: "leiden".
- SNN_prune
Numeric. Value between 0 and 1. Sets the cutoff of acceptable Jaccard similarity scores for the neighbourhood overlap of vertices in the SNN graph. Edges with values less than this will be set to 0. The default value is 1/15.
- loops
TRUE/FALSE. If TRUE self-loops are allowed, otherwise not.
- mode
The type of neighbouring vertices to use for calculating similarity scores (Jaccard index). Three options: "out", "in" and "all":
"out": Selecting neighbouring vertices by out-going edges;
"in": Selecting neighbouring vertices by in-coming edges;
"all": Selecting neighbouring vertices by both in-coming and out-going edges.
- select_genes
TRUE/FALSE. Should genes be selected by whether they have an edge in the cell-gene kNN graph?
- prune_overlap
TRUE/FALSE. If TRUE, edges to genes that share less than `overlap` of genes with the nearest neighbours of the cell are removed. Pruning is only performed if select_genes = TRUE.
- overlap
Numeric between 0 and 1. Overlap cutoff applied if prune_overlap = TRUE.
- calc_gene_cell_kNN
TRUE/FALSE. If TRUE, a cell-gene graph is calculated by choosing the `k_gc` nearest cells for each gene. If FALSE, the cell-gene graph is transposed.
- resolution
Numeric. Resolution for the leiden algorithm.
- marker_genes
Character. Optional. Names of known marker genes that should be exempt from any pruning on the graph and be kept.
- n.int
Integer. Number of iterations for leiden algorithm.
- rand_seed
integer. Random seed.
- use_gap
Logical, TRUE/FALSE. If TRUE, the 'eigengap' method will be used to find the most important eigenvectors automatically, and the number of output clusters equals the number of selected eigenvectors. If FALSE, 'nclust' (integer) should be specified. The eigenvectors corresponding to the smallest 'nclust' eigenvalues will be selected and 'nclust' clusters will be detected by skmeans/kmeans/GMM.
- nclust
Integer. Number of clusters.
- spectral_method
Character. Name of the method used to cluster the eigenvectors. Can be one of the following three:
"kmeans": k-means clustering
"skmeans": spherical k-means clustering
"GMM": Gaussian-Mixture-Model fuzzy clustering.
- iter_max
Number of iterations for k-means clustering and GMM.
- num_seeds
Number of times k-means clustering is repeated.
- return_eig
Logical. Whether or not to return eigenvectors and store them in caclust-object.
- dims
Integer. Number of dimensions to choose from SVD of graph laplacian.
- cast_to_dense
logical. Casting sparse SNN adjacency matrix to dense speeds up the leiden algorithm.
- ...
further arguments
- method
BiocNeighbors::BiocNeighborParam object specifying the algorithm to use. See Details.
- BPPARAM
BiocParallel settings parameter. By default single core `BiocParallel::SerialParam()`, but other parameters can be passed.
- leiden_pack
character. Optional values are 'igraph'(default) and 'leiden', the package used for leiden clustering.
- caclust_meta_name
Character. The name of the caclust object stored in metadata(SingleCellExperiment object). Default: 'caclust'.
- cacomp_meta_name
Character. The name of the cacomp object stored in metadata(SingleCellExperiment object). Default: 'CA'.
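To make the interplay of the `mode` and `SNN_prune` arguments more concrete, the following base R sketch computes Jaccard similarities between vertex neighbourhoods on a toy directed kNN graph and zeroes out weak edges. The helper `snn_jaccard()` and the toy matrix `adj` are hypothetical and exist only for illustration; the package's own SNN construction may differ in its details.

```r
# Toy directed kNN graph: adj[i, j] = 1 means j is a nearest neighbour of i.
adj <- matrix(0, nrow = 4, ncol = 4,
              dimnames = list(letters[1:4], letters[1:4]))
adj["a", c("b", "c")] <- 1
adj["b", c("a", "c")] <- 1
adj["c", c("a", "b")] <- 1
adj["d", c("a", "c")] <- 1

# Hypothetical helper (not part of the package): Jaccard similarity of
# vertex neighbourhoods, followed by pruning of edges below `prune`.
snn_jaccard <- function(adj, mode = c("out", "in", "all"), prune = 1/15) {
  mode <- match.arg(mode)
  # Pick which edges define the neighbourhood of a vertex.
  nb <- switch(mode,
               "out" = adj,
               "in"  = t(adj),
               "all" = (adj + t(adj) > 0) * 1)
  shared <- nb %*% t(nb)                        # size of the intersection
  sizes  <- rowSums(nb)                         # neighbourhood sizes
  uni    <- outer(sizes, sizes, "+") - shared   # size of the union
  jac    <- ifelse(uni > 0, shared / uni, 0)
  diag(jac) <- 0                                # no self-loops
  jac[jac < prune] <- 0                         # drop weak edges
  jac
}

snn_default <- snn_jaccard(adj, mode = "out")               # cutoff 1/15
snn_strict  <- snn_jaccard(adj, mode = "out", prune = 0.5)  # stricter cutoff
```

With the stricter cutoff only the edge between `b` and `d` survives, because these are the only two vertices with identical out-neighbourhoods.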
Details
Convenience wrapper around `make_SNN` and `run_leiden`/`run_spectral`. `run_caclust` takes a cacomp object and biclusters cells and genes.
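For the spectral route, the eigengap idea behind `use_gap = TRUE` can be sketched in base R as follows. This is only an illustration on a toy graph under simple assumptions (a symmetric normalised Laplacian, the largest gap between consecutive eigenvalues, plain k-means on the selected eigenvectors); it is not the package's implementation, and all object names are made up for the example.

```r
# Toy symmetric similarity graph: two triangles joined by one weak edge.
A <- matrix(0, nrow = 6, ncol = 6)
A[1:3, 1:3] <- 1
A[4:6, 4:6] <- 1
diag(A) <- 0
A[3, 4] <- A[4, 3] <- 0.01

# Symmetric normalised graph Laplacian: L = I - D^(-1/2) A D^(-1/2).
d <- rowSums(A)
D_inv_sqrt <- diag(1 / sqrt(d))
L <- diag(nrow(A)) - D_inv_sqrt %*% A %*% D_inv_sqrt

# eigen() returns eigenvalues in decreasing order; reverse to ascending.
eig  <- eigen(L, symmetric = TRUE)
ord  <- rev(seq_along(eig$values))
vals <- eig$values[ord]
vecs <- eig$vectors[, ord, drop = FALSE]

# Eigengap heuristic: the position of the largest gap between consecutive
# eigenvalues is taken as the number of eigenvectors (and clusters).
n_clust <- which.max(diff(vals))

# Cluster the rows of the selected eigenvectors with k-means.
set.seed(2358)
emb <- vecs[, seq_len(n_clust), drop = FALSE]
km  <- kmeans(emb, centers = n_clust, nstart = 10, iter.max = 10)
```

Here the two smallest eigenvalues are close to zero and the remaining ones are near 1.5, so the largest gap follows the second eigenvalue and two clusters are requested.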
See also
Other biclustering: make_SNN(), run_leiden(), run_spectral()
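Examples
A minimal sketch of how `caclust()` might be called, using only arguments from the Usage section above. The wrapper names `bicluster_leiden()` and `bicluster_spectral()` are hypothetical, the values chosen for `k` and `nclust` are arbitrary, and `ca` is assumed to be a cacomp object that already exists; nothing is computed until the wrappers are called with such an object and the package providing `caclust()` is attached.

```r
# Hypothetical wrapper: leiden-based biclustering of an existing cacomp object.
# `k` is either one integer or four integers in the order k_c, k_g, k_cg, k_gc.
bicluster_leiden <- function(ca, k = c(70, 30, 30, 70)) {
  stopifnot(is.numeric(k), length(k) %in% c(1, 4))
  caclust(
    obj = ca,
    k = k,
    algorithm = "leiden",
    SNN_prune = 1/15,
    mode = "all",
    select_genes = TRUE,
    prune_overlap = TRUE,
    overlap = 0.2,
    resolution = 1,
    rand_seed = 2358
  )
}

# Hypothetical wrapper: spectral biclustering with a fixed number of clusters.
# With use_gap = FALSE, `nclust` has to be supplied.
bicluster_spectral <- function(ca, k, nclust) {
  stopifnot(is.numeric(k), length(k) %in% c(1, 4))
  caclust(
    obj = ca,
    k = k,
    algorithm = "spectral",
    use_gap = FALSE,
    nclust = nclust,
    spectral_method = "kmeans"
  )
}
```

Passing a single integer, for example `bicluster_leiden(ca, k = 50)`, would use the same k for all four subgraphs.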