`caclust()` performs biclustering on either a "cacomp" or "SingleCellExperiment" object.

Usage

caclust(
  obj,
  k,
  algorithm = "leiden",
  SNN_prune = 1/15,
  loops = FALSE,
  mode = "out",
  select_genes = TRUE,
  prune_overlap = TRUE,
  overlap = 0.2,
  calc_gene_cell_kNN = FALSE,
  resolution = 1,
  marker_genes = NULL,
  n.int = 10,
  rand_seed = 2358,
  use_gap = TRUE,
  nclust = NULL,
  spectral_method = "kmeans",
  iter_max = 10,
  num_seeds = 10,
  return_eig = TRUE,
  dims = NULL,
  cast_to_dense = TRUE,
  ...
)

# S4 method for cacomp
caclust(
  obj,
  k,
  algorithm = "leiden",
  SNN_prune = 1/15,
  loops = FALSE,
  mode = "out",
  select_genes = TRUE,
  prune_overlap = TRUE,
  overlap = 0.2,
  calc_gene_cell_kNN = FALSE,
  resolution = 1,
  marker_genes = NULL,
  n.int = 10,
  rand_seed = 2358,
  use_gap = TRUE,
  nclust = NULL,
  spectral_method = "kmeans",
  iter_max = 10,
  num_seeds = 10,
  return_eig = TRUE,
  dims = NULL,
  cast_to_dense = TRUE,
  method = BiocNeighbors::KmknnParam(),
  BPPARAM = BiocParallel::SerialParam(),
  leiden_pack = "igraph",
  ...
)

# S4 method for SingleCellExperiment
caclust(
  obj,
  k,
  algorithm = "leiden",
  SNN_prune = 1/15,
  loops = FALSE,
  mode = "out",
  select_genes = TRUE,
  prune_overlap = TRUE,
  overlap = 0.2,
  calc_gene_cell_kNN = FALSE,
  resolution = 1,
  marker_genes = NULL,
  n.int = 10,
  rand_seed = 2358,
  use_gap = TRUE,
  nclust = NULL,
  spectral_method = "kmeans",
  iter_max = 10,
  num_seeds = 10,
  return_eig = TRUE,
  dims = NULL,
  cast_to_dense = TRUE,
  method = BiocNeighbors::KmknnParam(),
  BPPARAM = BiocParallel::SerialParam(),
  leiden_pack = "igraph",
  ...,
  caclust_meta_name = "caclust",
  cacomp_meta_name = "CA"
)

Arguments

obj

A cacomp object or SingleCellExperiment object

k

Either an integer (same k for all subgraphs) or a vector of exactly four integers specifying in this order: the k_c for the cell-cell kNN-graph, k_g for the gene-gene kNN-graph, k_cg for the cell-gene kNN-graph, k_gc for the gene-cell kNN-graph.
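As an illustration of the two accepted forms of k, the following sketch expands a single value into the four per-graph values. The helper `expand_k` is hypothetical and not part of the package; how `caclust()` handles k internally is not shown in this page.

```r
# Hypothetical helper (not part of the package): normalise `k` to four
# named values in the documented order k_c, k_g, k_cg, k_gc.
expand_k <- function(k) {
  stopifnot(
    is.numeric(k),
    length(k) %in% c(1L, 4L),
    all(k > 0),
    all(k == round(k))
  )
  k <- as.integer(k)
  # A single value is reused for all four subgraphs.
  if (length(k) == 1L) k <- rep(k, 4L)
  names(k) <- c("k_c", "k_g", "k_cg", "k_gc")
  k
}

expand_k(30)                 # same k for every subgraph
expand_k(c(30, 15, 10, 10))  # one k per subgraph
```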

algorithm

Character. Algorithm for clustering. Options are "leiden" or "spectral". Default: "leiden".

SNN_prune

Numeric. Value between 0 and 1. Sets the cutoff for acceptable Jaccard similarity scores for the neighborhood overlap of vertices in the SNN graph. Edges with values below this cutoff are set to 0. Default: 1/15.

loops

TRUE/FALSE. If TRUE self-loops are allowed, otherwise not.

mode

The type of neighboring vertices to use for calculating similarity scores (Jaccard index). Three options: "out", "in" and "all":

  • "out": Selecting neighboring vertices by out-going edges;

  • "in": Selecting neighboring vertices by in-coming edges;

  • "all": Selecting neighboring vertices by both in-coming and out-going edges.
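The interplay of mode and SNN_prune can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `jaccard_snn` is a hypothetical helper, the input is a small directed kNN adjacency matrix where `adj[i, j] = 1` means an edge from i to j, and the sketch always drops self-similarities.

```r
# Hypothetical helper: Jaccard similarity of neighbour sets with pruning.
jaccard_snn <- function(adj, mode = c("out", "in", "all"), prune = 1/15) {
  mode <- match.arg(mode)
  # Choose which edges define a vertex's neighbour set.
  nb <- switch(mode,
    out  = adj,           # neighbours reached by out-going edges
    "in" = t(adj),        # neighbours reached by in-coming edges
    all  = adj + t(adj)   # either direction
  )
  nb <- (nb > 0) * 1                        # binary neighbour indicators
  inter <- nb %*% t(nb)                     # size of shared neighbour sets
  sizes <- rowSums(nb)
  union <- outer(sizes, sizes, "+") - inter # size of combined neighbour sets
  snn <- ifelse(union > 0, inter / union, 0)
  snn[snn < prune] <- 0                     # prune weak edges
  diag(snn) <- 0                            # this sketch keeps no self-loops
  snn
}

# Toy directed graph: 1 -> {2, 3}, 2 -> {1, 3}, 3 -> 4, 4 -> 3
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 1, 0,
                0, 0, 0, 1,
                0, 0, 1, 0), nrow = 4, byrow = TRUE)

jaccard_snn(adj, mode = "out")               # default cutoff of 1/15
jaccard_snn(adj, mode = "out", prune = 0.4)  # stricter cutoff removes edge 1-2
```

In this toy graph, vertices 1 and 2 share one of three out-neighbours (similarity 1/3), so their edge survives the default cutoff but not a cutoff of 0.4.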

select_genes

TRUE/FALSE. Should genes be selected by whether they have an edge in the cell-gene kNN graph?

prune_overlap

TRUE/FALSE. If TRUE, edges to genes that share less than the overlap fraction of genes with the nearest neighbours of the cell are removed. Pruning is only performed if select_genes = TRUE.

overlap

Numeric between 0 and 1. Overlap cutoff applied if prune_overlap = TRUE.

calc_gene_cell_kNN

TRUE/FALSE. If TRUE, a cell-gene graph is calculated by choosing the k_gc nearest cells for each gene. If FALSE, the cell-gene graph is transposed.

resolution

Numeric. Resolution for the Leiden algorithm.

marker_genes

Character. Optional. Names of known marker genes that should be exempt from any pruning on the graph and be kept.

n.int

Integer. Number of iterations for the Leiden algorithm.

rand_seed

Integer. Random seed.

use_gap

Logical, TRUE/FALSE. If TRUE, the 'eigengap' method is used to find the most important eigenvectors automatically, and the number of output clusters equals the number of selected eigenvectors. If FALSE, 'nclust' (integer) should be specified. The eigenvectors corresponding to the smallest 'nclust' eigenvalues will be selected and 'nclust' clusters will be detected by skmeans/kmeans/GMM.
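The eigengap idea can be illustrated with a small base R sketch. It assumes an unnormalised graph Laplacian and picks the number of clusters at the largest gap between consecutive sorted eigenvalues; `eigengap_k` is a hypothetical helper, and the Laplacian variant used by the package may differ.

```r
# Hypothetical helper: choose the number of clusters with the eigengap heuristic.
eigengap_k <- function(W, max_k = nrow(W) - 1L) {
  # Unnormalised graph Laplacian L = D - W (an assumption of this sketch).
  L <- diag(rowSums(W)) - W
  # Eigenvalues in increasing order.
  ev <- sort(eigen(L, symmetric = TRUE, only.values = TRUE)$values)
  # Gap i is the distance between eigenvalue i and eigenvalue i + 1.
  gaps <- diff(ev[seq_len(max_k + 1L)])
  which.max(gaps)
}

# Two triangles joined by one weak edge: two clusters are expected.
W <- matrix(0, 6, 6)
W[1:3, 1:3] <- 1
W[4:6, 4:6] <- 1
diag(W) <- 0
W[3, 4] <- W[4, 3] <- 0.1

eigengap_k(W)
```

The two smallest eigenvalues stay close to zero while the third is close to 3, so the largest gap sits after the second eigenvalue.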

nclust

Integer. Number of clusters.

spectral_method

Character. Name of the method used to cluster the eigenvectors. Can be one of the following three:

  • "kmeans": k-means clustering

  • "skmeans": spherical k-means clustering

  • "GMM": Gaussian-Mixture-Model fuzzy clustering.

iter_max

Number of iterations for k-means clustering and GMM.

num_seeds

Number of times k-means clustering is repeated.
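For the "kmeans" option, the roles of nclust, iter_max and num_seeds can be sketched as follows. This is only an illustration: `spectral_sketch` is a hypothetical helper, the unnormalised Laplacian is an assumption, and mapping iter_max and num_seeds onto the `iter.max` and `nstart` arguments of `stats::kmeans()` is a guess at the intent rather than a description of the package internals.

```r
# Hypothetical helper: cluster the eigenvectors of the smallest eigenvalues.
spectral_sketch <- function(W, nclust, iter_max = 10, num_seeds = 10,
                            seed = 2358) {
  L <- diag(rowSums(W)) - W          # unnormalised Laplacian (assumption)
  eig <- eigen(L, symmetric = TRUE)
  n <- nrow(W)
  # eigen() sorts eigenvalues in decreasing order, so the last columns
  # belong to the smallest eigenvalues.
  U <- eig$vectors[, seq(n, n - nclust + 1L), drop = FALSE]
  set.seed(seed)
  # iter.max bounds the iterations per run; nstart repeats k-means from
  # several random starts and keeps the best run.
  stats::kmeans(U, centers = nclust, iter.max = iter_max,
                nstart = num_seeds)$cluster
}

# Two weighted triangles joined by one weak edge.
W <- matrix(0, 6, 6)
W[1, 2] <- 1.0; W[1, 3] <- 0.9; W[2, 3] <- 0.8
W[4, 5] <- 1.0; W[4, 6] <- 0.7; W[5, 6] <- 0.9
W[3, 4] <- 0.05
W <- W + t(W)

cl <- spectral_sketch(W, nclust = 2)
split(seq_along(cl), cl)  # vertex indices grouped by cluster label
```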

return_eig

Logical. Whether or not to return eigenvectors and store them in caclust-object.

dims

Integer. Number of dimensions to choose from SVD of graph laplacian.

cast_to_dense

Logical. If TRUE, the sparse SNN adjacency matrix is cast to a dense matrix, which speeds up the Leiden algorithm.

...

further arguments

method

BiocNeighbors::BiocNeighborParam object specifying the algorithm to use. See Details.

BPPARAM

BiocParallel settings parameter. The default is the single-core BiocParallel::SerialParam(), but other parameters can be passed.

leiden_pack

Character. The package used for Leiden clustering. Options are 'igraph' (default) and 'leiden'.

caclust_meta_name

Character. The name of the caclust object stored in metadata(SingleCellExperiment object). Default: 'caclust'.

cacomp_meta_name

Character. The name of the cacomp object stored in metadata(SingleCellExperiment object). Default: 'CA'.

Value

A caclust object or SingleCellExperiment object

Details

Convenient wrapper around `make_SNN` and `run_leiden`/`run_spectral`. `run_caclust` takes a cacomp object and biclusters cells and genes.
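A call might be assembled as in the sketch below. It only uses arguments listed under Usage; `caobj` is a hypothetical, pre-computed cacomp object, the chosen values are arbitrary, and the call is guarded so that the snippet also runs when neither the object nor the function is available.

```r
# Arguments taken from the Usage section; the values are illustrative only.
caclust_args <- list(
  k = c(30, 30, 30, 30),   # k_c, k_g, k_cg, k_gc
  algorithm = "leiden",
  SNN_prune = 1/15,
  mode = "all",
  select_genes = TRUE,
  prune_overlap = TRUE,
  overlap = 0.2,
  resolution = 1,
  rand_seed = 2358
)

# `caobj` is assumed to exist already (hypothetical cacomp object).
if (exists("caobj") && exists("caclust", mode = "function")) {
  res <- do.call(caclust, c(list(obj = caobj), caclust_args))
}
```

Keeping the arguments in a list makes it easy to rerun the clustering with a single changed value, for example a different resolution.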

See also

Other biclustering: make_SNN(), run_leiden(), run_spectral()