Clustering with BayesSpace

A wrapper around the BayesSpace clustering pipeline introduced by Zhao et al. (2021).

runBayesSpaceClustering(
  object,
  name = "bayes_space",
  n.Pcs = 15,
  n.HVGs = 2000,
  skip.PCA = FALSE,
  log.normalize = TRUE,
  assay.type = "logcounts",
  BSPARAM = BiocSingular::ExactParam(),
  qs = 3:15,
  burn.in = c(100, 1000),
  nrep = c(1000, 50000),
  use.dimred = "PCA",
  d = 15,
  init.method = "mclust",
  model = "t",
  gamma = 3,
  mu0 = NULL,
  lambda0 = NULL,
  alpha = 1,
  beta = 0.01,
  save.chain = FALSE,
  chain.fname = NULL,
  prefix = "B",
  return_model = TRUE,
  empty_remove = FALSE,
  overwrite = FALSE,
  assign_sce = FALSE,
  assign_envir = .GlobalEnv,
  seed = 123,
  verbose = NULL,
  ...
)

Arguments

object

An object of class SPATA2 or, in case of S4 generics, objects of classes for which a method has been defined.

name

Character value. The name the cluster variable has in the feature data of the SPATA2 object. Defaults to bayes_space.

n.HVGs

Number of highly variable genes to run PCA upon.

skip.PCA

Skip PCA (if a dimensionality reduction was previously computed).

log.normalize

Whether to log-normalize the input data with scater. May be omitted if log-normalization was previously computed.

assay.type

Name of assay in sce containing normalized counts. Leave as "logcounts" unless you explicitly pre-computed a different normalization and added it to sce under another assay. Note that we do not recommend running BayesSpace on PCs computed from raw counts.

BSPARAM

A BiocSingularParam object specifying which algorithm should be used to perform the PCA. By default, an exact PCA is performed, as current spatial datasets are generally small (<10,000 spots). To perform a faster approximate PCA, please specify FastAutoParam() and set a random seed to ensure reproducibility.

qs

The values of q to evaluate. If qs contains a single value, that value is passed directly to the argument q of BayesSpace::spatialCluster(). Otherwise, the optimal q among all provided values is identified using BayesSpace::qTune().

burn.in, nrep

Integer vectors specifying the range of burn-in iterations and MCMC repetitions to compute. Default to c(100, 1000) and c(1000, 50000), respectively.

use.dimred

Name of a reduced dimensionality result in reducedDims(sce). If provided, cluster on these features directly.

d

Number of top principal components to use when clustering.

init.method

If init is not provided, cluster the top d PCs with this method to obtain initial cluster assignments.

model

Error model ('normal' or 't').

gamma

Smoothing parameter. Defaults to 2 for platform="ST" and 3 for platform="Visium". (Values in range of 1-3 seem to work well.)

mu0

Prior mean hyperparameter for mu. If not provided, mu0 is set to the mean of PCs over all spots.

lambda0

Prior precision hyperparameter for mu. If not provided, lambda0 is set to a diagonal matrix \(0.01 I\).

alpha

Hyperparameter for Wishart distributed precision lambda.

beta

Hyperparameter for Wishart distributed precision lambda.

save.chain

If true, save the MCMC chain to an HDF5 file.

chain.fname

File path for saved chain. Tempfile used if not provided.

prefix

Character value. Prefix of the cluster groups.

overwrite

Logical value. If TRUE, an existing feature called name in the feature data of the SPATA2 object is overwritten.

assign_sce

Character value or NULL. If character, specifies the name under which the BayesSpace output (an object of class SingleCellExperiment) is assigned to the global environment. This makes the whole output of the BayesSpace pipeline available, instead of only adding the clustering output as a grouping variable to the SPATA2 object.

verbose

Logical. If TRUE, informative messages regarding the computational progress will be printed.

(Warning messages will always be printed.)

...

Additional arguments given to BayesSpace::spatialCluster(). The arguments sce and q are excluded, since they are specified within the function.

q_force

Numeric value or FALSE. If numeric, the number of output clusters is forced to the input value. If FALSE, the number of clusters is set to the optimal q, determined by the elbow point of the BayesSpace::qTune() results.
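
The elbow criterion mentioned for q_force can be illustrated with a minimal base R sketch. This is not the SPATA2 or BayesSpace implementation: find_elbow() is a hypothetical helper, the scores are made-up numbers standing in for a decreasing tuning curve, and the rule used (largest distance from the chord joining the first and last point) is only one common heuristic.

```r
# Hypothetical helper (not part of SPATA2 or BayesSpace): returns the q at
# which a decreasing curve bends most sharply, measured as the largest
# perpendicular distance from the chord joining the first and last point.
find_elbow <- function(q, y) {
  stopifnot(length(q) == length(y), length(q) >= 3, max(y) > min(y))
  n <- length(q)
  # Rescale both axes to [0, 1] so that neither dominates the distance.
  qx <- (q - min(q)) / (max(q) - min(q))
  yy <- (y - min(y)) / (max(y) - min(y))
  dx <- qx[n] - qx[1]
  dy <- yy[n] - yy[1]
  # Point-to-line distance for every candidate q.
  dist <- abs(dy * (qx - qx[1]) - dx * (yy - yy[1])) / sqrt(dx^2 + dy^2)
  q[which.max(dist)]
}

# Made-up scores for q = 3, ..., 10 (illustration only).
qs <- 3:10
score <- c(1000, 600, 400, 380, 370, 365, 362, 360)

best_q <- find_elbow(qs, score)
best_q
```

With these made-up scores the curve flattens after the third value, so the helper selects q = 5.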

Value

The updated input object, containing the added, removed or computed results.

Details

This function is a wrapper around readVisium(), spatialPreprocess(), qTune() and spatialCluster() of the BayesSpace package. The results are stored as a grouping variable in the feature data.frame of the returned SPATA2 object.
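
As a rough sketch of the tuning and clustering calls being wrapped, the snippet below runs qTune() and spatialCluster() directly on the simulated data returned by BayesSpace::exampleSCE() rather than on a SPATA2 object. The small iteration counts are chosen for speed only, and q = 4 is an arbitrary illustrative choice. How SPATA2 chains these calls internally is not shown here.

```r
library(BayesSpace)

set.seed(149)
# Small simulated SingleCellExperiment shipped with BayesSpace.
sce <- exampleSCE()

# Evaluate several candidate numbers of clusters and plot the tuning curve.
sce <- qTune(sce, qs = seq(3, 7), burn.in = 10, nrep = 100)
qPlot(sce)

# Cluster with a single q (picked arbitrarily here for illustration).
sce <- spatialCluster(sce, q = 4, nrep = 100, burn.in = 10)

# The cluster labels are stored in the column data of the result.
table(sce$spatial.cluster)
```

With runBayesSpaceClustering(), the equivalent labels end up as a grouping variable in the SPATA2 object instead of in a SingleCellExperiment.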

References

Zhao E, Stone MR, Ren X, Guenthoer J, Smythe KS, Pulliam T, Williams SR, Uytingco CR, Taylor SEB, Nghiem P, Bielas JH, Gottardo R. Spatial transcriptomics at subspot resolution with BayesSpace. Nat Biotechnol. 2021 Nov;39(11):1375-1384. doi: 10.1038/s41587-021-00935-2. Epub 2021 Jun 3. PMID: 34083791; PMCID: PMC8763026.

Examples

library(SPATA2)

data("example_data")

object <- example_data$object_UKF313T_diet

# tests options for q from 3 to 15 and picks the best
object <- runBayesSpaceClustering(object, name = "new_bspace", qs = 3:15)

plotLoglik(object)

# run with qs = 10 to force 10 clusters in the output
object <- runBayesSpaceClustering(object, name = "bspace_10", qs = 10)
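
For larger datasets, the BSPARAM argument can be switched to an approximate PCA. The sketch below only constructs the two parameter objects and defines a hypothetical convenience wrapper, cluster_with_fast_pca(), which is not part of SPATA2; the name "bspace_fast" is likewise just an illustration. The wrapper assumes that its input is a SPATA2 object such as the one used above.

```r
library(BiocSingular)

# Exact PCA (the documented default) versus a faster approximate PCA.
exact_param <- ExactParam()
fast_param <- FastAutoParam()

# Hypothetical wrapper (not part of SPATA2): `object` is assumed to be a
# SPATA2 object. The approximate PCA is randomized, so a seed is set.
cluster_with_fast_pca <- function(object, seed = 123) {
  set.seed(seed)
  SPATA2::runBayesSpaceClustering(
    object,
    name = "bspace_fast",
    BSPARAM = fast_param,
    seed = seed
  )
}

# Usage, given a SPATA2 object:
# object <- cluster_with_fast_pca(object)
```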
