Create spatial annotations based on numeric values

Creates spatial annotations based on gene expression or any other continuous data variable (e.g. read counts, copy number alterations). See details for more.

createNumericAnnotations(
  object,
  variable,
  threshold,
  id,
  tags = NULL,
  tags_expand = TRUE,
  use_dbscan = TRUE,
  inner_borders = TRUE,
  eps = recDbscanEps(object),
  minPts = recDbscanMinPts(object),
  force1 = FALSE,
  fct_incr = 1,
  min_size = nObs(object) * 0.01,
  method_outline = "concaveman",
  alpha = recAlpha(object),
  concavity = 2,
  method_gs = NULL,
  transform_with = NULL,
  overwrite = FALSE,
  verbose = NULL,
  ...
)

Arguments

object

An object of class SPATA2 or, in case of S4 generics, objects of classes for which a method has been defined.

variable

Character value. The name of the numeric variable of interest.

threshold

Character value. Determines the method and/or the threshold by which the data points are filtered. Valid input options are 'kmeans_high', 'kmeans_low' and operator-value combinations such as '>0.75' or '<=0.5'. See details for more.

id

Character value. The ID of the spatial annotation. If NULL, the ID of the annotation is created by combining the string 'spat_ann' with the index the new annotation has in the list of all annotations.

tags

A character vector of tags for the spatial annotation.

tags_expand

Logical value. If TRUE, the tags with which the spatial annotations are tagged are expanded by the unsuffixed id, the variable, the threshold and 'createGroupAnnotations'.

use_dbscan

Logical value. If TRUE, the DBSCAN algorithm is used to identify spatial clusters and outliers before the outline of the spatial annotation is drawn.

inner_borders

Logical value. If TRUE, the algorithm checks whether the annotation requires inner borders and sets them accordingly. If FALSE, only an outer border is created.

eps

Distance measure. Given to eps of dbscan::dbscan(). Determines the size (radius) of the epsilon neighborhood.

minPts

Numeric value. Given to minPts of dbscan::dbscan(). Determines the minimum number of points required in the eps neighborhood for core points (including the point itself).

force1

Logical value. If TRUE, spatial subgroups identified by DBSCAN are merged into one cluster. Note: If FALSE (the default), the input for id is suffixed with an index to label each spatial annotation uniquely, regardless of how many are eventually created. E.g. if id = "my_ann" and the algorithm created two spatial annotations, they are named my_ann_1 and my_ann_2.

min_size

Numeric value. The minimum number of data points a dbscan cluster must have in order not to be discarded as a spatial outlier.

method_outline

Character value. The method used to create the outline of the spatial annotations. Either 'concaveman' or 'alphahull'.

'concaveman': A fast algorithm that creates concave hulls with adjustable detail. It captures more intricate shapes and is generally computationally efficient, but may produce less smooth outlines compared to alpha shapes. concavity determines the level of detail.
'alphahull': (BETA) Generates an alpha shape outline by controlling the boundary tightness with the alpha parameter. Smaller alpha values produce highly detailed boundaries, while larger values approximate convex shapes. It’s more precise for capturing complex edges but can be computationally more intensive.

alpha

Numeric value. Given to alpha of alphahull::ahull(). Default is platform dependent.

concavity

Numeric value. Given to argument concavity of concaveman::concaveman(). Determines the relative measure of concavity. 1 results in a relatively detailed shape, Inf results in a convex hull. Values lower than 1 are allowed, but they can produce highly irregular shapes.

method_gs

Character value. The method according to which gene sets will be handled, specified as a character of length one. This can be either 'mean' or one of 'gsva', 'ssgsea', 'zscore', or 'plage'. The latter four will be given to GSVA::gsva().

transform_with

List or NULL. If list, can be used to transform continuous variables before usage. Names of the list slots refer to the variable. The content of the slot refers to the transforming functions. E.g., if the variable of interest is GFAP gene expression, the following would work:

Single function: transform_with = list(GFAP = log10)
Multiple functions: transform_with = list(GFAP = list(log10, log2))

In case of plotting: Useful if you want to apply more than one transformation on variables mapped to plotting aesthetics. Input for transform_with is applied before the respective <aes>_trans argument.
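
The following is a minimal base R sketch of how a named list of transformations could be applied to a variable in the order given. The helper apply_transformations() is hypothetical and not part of SPATA2; it only illustrates the list structure described above.

```r
# Hypothetical helper (not part of SPATA2): applies the functions listed
# for each variable in `transform_with`, one after another.
apply_transformations <- function(df, transform_with) {
  for (var_name in names(transform_with)) {
    fns <- transform_with[[var_name]]
    # accept a single function as well as a list of functions
    if (is.function(fns)) fns <- list(fns)
    for (fn in fns) {
      df[[var_name]] <- fn(df[[var_name]])
    }
  }
  df
}

# toy data standing in for GFAP expression values
df <- data.frame(GFAP = c(10, 100, 1000))

# single function: log10 only
single <- apply_transformations(df, list(GFAP = log10))

# multiple functions: log10 first, then log2 on the result
multi <- apply_transformations(df, list(GFAP = list(log10, log2)))

single$GFAP
multi$GFAP
```

In this sketch the order of the list matters: list(log10, log2) computes log2(log10(x)), not the reverse.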

overwrite

Logical value. Must be TRUE to allow overwriting.

verbose

Logical. If TRUE, informative messages regarding the computational progress will be printed.

(Warning messages will always be printed.)

...

Additional slot content given to methods::new() when constructing the SpatialAnnotation object.

Value

The updated input object, containing the added, removed or computed results.

Details

The function createNumericAnnotations() facilitates the mapping of expression values associated with data points (spots or cells) to an image. This process is achieved by identifying data points that meet the criteria set by the threshold input, encompassing them within a polygon that serves as the foundation for creating a SpatialAnnotation. The annotation procedure, based on the position of data points showcasing specific expression values, involves the following key steps.

Data point filtering: The data points from the coordinates data.frame are selectively retained based on the values of the variable specified in the variable argument. How the filtering is conducted depends on threshold.
Grouping: The remaining data points are organized into groups, a behavior influenced by the values of use_dbscan and force1 arguments.
Outlining: Each group of data points is passed to the algorithm chosen via method_outline (concaveman or alphahull), resulting in the creation of an outlining polygon.
Spatial annotation: The generated concave polygons serve as the foundation for crafting spatial annotations.

In-depth Explanation: Initially, the coordinates data.frame is joined with the variable indicated in the variable argument. Subsequently, the threshold input is applied. Two primary methods exist for conducting thresholding. If threshold is either 'kmeans_high' or 'kmeans_low', the data points undergo clustering based solely on their variable values, with centers = 2. Depending on the chosen approach, the group of data points with the highest or lowest mean is retained, while the other group is excluded.
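
The k-means variant of the thresholding can be sketched with stats::kmeans() on toy values. This is an illustration of the idea described above, not the SPATA2 implementation; the toy vector and the object names are assumptions.

```r
set.seed(1)

# toy values: a low group around 0.2 and a high group around 0.8
values <- c(0.15, 0.20, 0.25, 0.18, 0.80, 0.85, 0.78, 0.90)

# cluster solely on the variable values with two centers
km <- stats::kmeans(values, centers = 2)

# mean of the variable within each of the two clusters
cluster_means <- tapply(values, km$cluster, mean)

# 'kmeans_high' would keep the cluster with the higher mean,
# 'kmeans_low' the cluster with the lower mean
high_cluster <- as.integer(names(which.max(cluster_means)))
low_cluster  <- as.integer(names(which.min(cluster_means)))

keep_high <- km$cluster == high_cluster
keep_low  <- km$cluster == low_cluster

values[keep_high]
values[keep_low]
```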

Alternatively, the threshold can comprise a combination of a comparison operator ('>', '>=', '<=', or '<') and a numeric value. This combination filters the data points accordingly. For instance, using variable = 'GFAP' and threshold = '> 0.75' results in retaining only those data points with a GFAP value greater than 0.75. Use '>= 0.75' to also retain data points with a value of exactly 0.75.
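
As a rough sketch of how an operator-value string could be split and applied, consider the base R helper below. filter_by_threshold() is hypothetical and not part of SPATA2; the actual parsing inside createNumericAnnotations() may differ.

```r
# Hypothetical helper (not part of SPATA2): turns a string such as '> 0.75'
# into a logical filter over a numeric vector.
filter_by_threshold <- function(values, threshold) {
  threshold <- gsub("\\s+", "", threshold)  # '> 0.75' becomes '>0.75'
  operator <- regmatches(threshold, regexpr("^(>=|<=|>|<)", threshold))
  if (length(operator) == 0) {
    stop("`threshold` must start with one of '>', '>=', '<=' or '<'.")
  }
  cutoff <- suppressWarnings(as.numeric(sub("^(>=|<=|>|<)", "", threshold)))
  if (is.na(cutoff)) {
    stop("`threshold` must end with a numeric value.")
  }
  # look up the comparison function that belongs to the operator
  compare <- match.fun(operator)
  compare(values, cutoff)
}

# toy values standing in for GFAP expression
gfap <- c(0.50, 0.75, 0.80, 0.95)

strictly_above <- gfap[filter_by_threshold(gfap, "> 0.75")]
at_or_above    <- gfap[filter_by_threshold(gfap, ">=0.75")]
```

Note that '>' excludes the value 0.75 itself, whereas '>=' includes it.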

Following filtering, if use_dbscan is TRUE, the DBSCAN algorithm identifies spatial outliers, which are then removed. Furthermore, if DBSCAN detects multiple dense clusters, they can be merged into a single group if force1 is also set to TRUE.

It is essential to note that bypassing the DBSCAN step may lead to the inclusion of individual data points dispersed across the sample. This results in a spatial annotation that essentially spans the entirety of the sample, lacking the segregation of specific variable expressions. Similarly, enabling force1 might unify multiple segregated areas, present on both sides of the sample, into one group and subsequently, one spatial annotation encompassing the whole sample. Consider allowing the creation of multiple spatial annotations (suffixed with an index) and merging them afterwards via mergeSpatialAnnotations() if they are too close together.
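
The grouping step can be illustrated with dbscan::dbscan() on toy coordinates. This sketch assumes the dbscan package is installed; the values for eps, minPts and min_size are chosen for the toy data only, and the way min_size and force1 are applied here is an assumption about the general idea rather than the SPATA2 source.

```r
library(dbscan)

# toy coordinates: two dense 3 x 3 grids plus one isolated point
grid_a  <- expand.grid(x = 0:2, y = 0:2)
grid_b  <- expand.grid(x = 10:12, y = 10:12)
outlier <- data.frame(x = 50, y = 50)
coords  <- rbind(grid_a, grid_b, outlier)

res <- dbscan::dbscan(as.matrix(coords), eps = 1.5, minPts = 3)

# dbscan labels noise points (spatial outliers) with cluster 0
coords$cluster <- res$cluster
kept <- coords[coords$cluster != 0, ]

# discard clusters with fewer data points than min_size
min_size <- 5
cluster_sizes <- table(kept$cluster)
big_enough <- names(cluster_sizes)[cluster_sizes >= min_size]
kept <- kept[as.character(kept$cluster) %in% big_enough, ]

# with force1 = TRUE all remaining clusters would be merged into one group
force1 <- FALSE
kept$group <- if (force1) 1L else kept$cluster

table(kept$group)
```

With force1 <- TRUE, both grids would end up in a single group and the resulting outline would span the gap between them, which is the effect the paragraph above warns about.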

Lastly, the remaining data points are fed into either the concaveman or the alphahull algorithm on a per-group basis. The algorithm calculates polygons outlining the groups of data points. If use_dbscan is FALSE, all data points that remained after the initial filtering are submitted to the algorithm. Subsequently, these polygons are passed to addSpatialAnnotation() along with the unsuffixed id and tags input arguments. The ID is suffixed with an index for each group.
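
A per-group outline with concaveman::concaveman() might look like the sketch below. It assumes the concaveman package is installed and uses simulated points; the naming of the results only mimics the index suffixing described above and does not call addSpatialAnnotation().

```r
library(concaveman)

set.seed(42)

# toy data: two spatially separated groups of points
pts <- data.frame(
  x = c(runif(50, 0, 1), runif(50, 5, 6)),
  y = c(runif(50, 0, 1), runif(50, 5, 6)),
  group = rep(c(1, 2), each = 50)
)

id <- "my_ann"

# compute one outlining polygon per group
outlines <- lapply(split(pts, pts$group), function(grp) {
  concaveman::concaveman(as.matrix(grp[, c("x", "y")]), concavity = 2)
})

# suffix the id with an index for each group
names(outlines) <- paste0(id, "_", seq_along(outlines))

names(outlines)
```

Each element of outlines is a two-column matrix of polygon vertices. Larger concavity values give smoother, more convex outlines.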

Distance measures

The vignette on distance measures in SPATA2 has been replaced.

References

P. J. de Oliveira and A. C. P. F. da Silva (2012). alphahull: Generalization of the convex hull of a sample of points in the plane. R package version 2.1. https://CRAN.R-project.org/package=alphahull

Graham, D., & Heaton, D. (2018). concaveman: A very fast 2D concave hull algorithm. R package version 1.1.0. https://CRAN.R-project.org/package=concaveman

Examples


library(SPATA2)
library(tidyverse)
library(patchwork)

data("example_data")

object <- example_data$object_UKF275T_diet

# create a spatial annotation based on the segregated area of
# high expression in hypoxia signatures
object <-
  createNumericAnnotations(
    object = object,
    variable = "HM_HYPOXIA",
    threshold = "kmeans_high",
    id = "hypoxia",
    tags = "hypoxic"
  )

# visualize both: the surface plot on the left, the image with the
# annotation outline on the right (combined via patchwork)
(plotSurface(object, color_by = "HM_HYPOXIA") + legendLeft()) +
  (plotImage(object) + ggpLayerSpatAnnOutline(object, tags = c("hypoxic")))