createNumericAnnotations.Rd
Creates spatial annotations based on gene expression or any other continuous data variable (e.g. read counts, copy number alterations). See details for more.
createNumericAnnotations(
  object,
  variable,
  threshold,
  id,
  tags = NULL,
  tags_expand = TRUE,
  use_dbscan = TRUE,
  inner_borders = TRUE,
  eps = recDbscanEps(object),
  minPts = recDbscanMinPts(object),
  force1 = FALSE,
  fct_incr = 1,
  min_size = nObs(object) * 0.01,
  concavity = 2,
  method_gs = NULL,
  transform_with = NULL,
  overwrite = FALSE,
  verbose = NULL,
  ...
)
An object of class SPATA2 or, in case of S4 generics, objects of classes for which a method has been defined.
Character value. The name of the numeric variable of interest.
Character value. Determines the method and/or the threshold by which the data points are filtered. Valid input options are 'kmeans_high', 'kmeans_low' and operator-value combinations such as '>0.75' or '<=0.5'. See details for more.
Character value. The ID of the spatial annotation. If NULL, the ID of the annotation is created by combining the string 'spat_ann' with the index the new annotation has in the list of all annotations.
A character vector of tags for the spatial annotation.
Logical value. If TRUE, the tags with which the image annotations are tagged are expanded by the unsuffixed id, the variable, the threshold and 'createGroupAnnotations'.
Logical value. If TRUE, the DBSCAN algorithm is used to identify spatial clusters and outliers before the outline of the spatial annotation is drawn.
Logical value. If TRUE, the algorithm checks whether the annotation requires inner borders and sets them accordingly. If FALSE, only an outer border is created.
Distance measure. Given to eps of dbscan::dbscan(). Determines the size (radius) of the epsilon neighborhood.
Numeric value. Given to dbscan::dbscan(). Determines the minimum number of points required in the eps neighborhood for core points (including the point itself).
Logical value. If TRUE, spatial sub groups identified by DBSCAN are merged into one cluster. Note: If FALSE (the default), the input for id is suffixed with an index to label each spatial annotation created uniquely, regardless of how many are eventually created. E.g., if id = "my_ann" and the algorithm created two spatial annotations, they are named my_ann_1 and my_ann_2.
Numeric value. The minimum number of data points a dbscan cluster must have in order not to be discarded as a spatial outlier.
Numeric value. Given to argument concavity of concaveman::concaveman(). Determines the relative measure of concavity. 1 results in a relatively detailed shape, Infinity results in a convex hull. You can use values lower than 1, but they can produce pretty crazy shapes.
Character value. The method according to which gene sets will be handled, specified as a character of length one. This can be either 'mean' or one of 'gsva', 'ssgsea', 'zscore', or 'plage'. The latter four will be given to GSVA::gsva().
List or NULL.
If list, can be used to transform continuous variables before usage.
Names of the list slots refer to the variable. The content of the slot refers to the transforming functions.
E.g., if the variable of interest is GFAP gene expression, the following would work:
Single function: transform_with = list(GFAP = log10)
Multiple functions: transform_with = list(GFAP = list(log10, log2))
In case of plotting:
Useful if you want to apply more than one transformation on variables mapped to plotting aesthetics. Input for transform_with is applied before the respective <aes>_trans argument.
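As a rough illustration of the list structure described above, the following base R sketch applies the functions stored under a variable name one after another. The helper apply_transforms() and the sample values are hypothetical and not part of SPATA2, and the assumption that functions are applied in list order is ours; the package's internal handling may differ.

```r
# Hypothetical helper (not a SPATA2 function): applies one function or a
# list of functions to a numeric vector, in the order they are listed.
apply_transforms <- function(values, fns) {
  if (is.function(fns)) fns <- list(fns)   # allow a single function
  Reduce(function(x, f) f(x), fns, values) # f1 first, then f2, ...
}

# Same shape as the documented input: slot name = variable name
transform_with <- list(GFAP = list(log10, log2))

# Made-up GFAP values for illustration
gfap <- c(10, 100, 1000)

gfap_transformed <- apply_transforms(gfap, transform_with[["GFAP"]])
gfap_single      <- apply_transforms(gfap, log10)
```

Here gfap_transformed corresponds to log2(log10(gfap)), while gfap_single shows that a bare function is treated like a list of length one.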
Logical value. Must be TRUE to allow overwriting.
Logical. If TRUE, informative messages regarding the computational progress will be printed. (Warning messages will always be printed.)
Additional slot content given to methods::new() when constructing the SpatialAnnotation object.
The updated input object, containing the added, removed or computed results.
The function createNumericAnnotations() facilitates the mapping of expression values associated with data points (spots or cells) to an image. This process is achieved by identifying data points that meet the criteria set by the threshold input, encompassing them within a polygon that serves as the foundation for creating a SpatialAnnotation. The annotation procedure, based on the position of data points showcasing specific expression values, involves the following key steps.
Data point filtering: The data points from the coordinates data.frame are selectively retained based on the values of the variable specified in the variable argument. How the filtering is conducted depends on threshold.
Grouping: The remaining data points are organized into groups, a behavior influenced by the values of the use_dbscan and force1 arguments.
Outlining: Each group of data points is subject to the concaveman algorithm, resulting in the creation of an outlining polygon.
Spatial annotation: The generated concave polygons serve as the foundation for crafting spatial annotations.
In-depth Explanation:
Initially, the coordinates data.frame is joined with the variable indicated in the variable argument. Subsequently, the threshold input is applied.
Two primary methods exist for conducting thresholding. If threshold is either 'kmeans_high' or 'kmeans_low', the data points undergo clustering based solely on their variable values, with centers = 2. Depending on the chosen approach, the group of data points with the highest or lowest mean is retained, while the other group is excluded.
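The k-means variant of thresholding can be sketched in base R as follows. This is a minimal illustration with made-up values, not the package's internal code; only stats::kmeans() with centers = 2 is taken from the description above, everything else (variable names, the seed, nstart) is an assumption.

```r
set.seed(1)

# Made-up expression values: a low and a high population
values <- c(0.10, 0.15, 0.20, 0.12, 0.85, 0.90, 0.95, 0.88)

# Cluster on the variable values only, with two centers
km <- stats::kmeans(values, centers = 2, nstart = 5)

# Identify the cluster with the highest / lowest mean
high_cluster <- which.max(km$centers[, 1])
low_cluster  <- which.min(km$centers[, 1])

keep_high <- km$cluster == high_cluster  # analogous to 'kmeans_high'
keep_low  <- km$cluster == low_cluster   # analogous to 'kmeans_low'

retained_high <- values[keep_high]
retained_low  <- values[keep_low]
```

With 'kmeans_high' only the data points in retained_high would move on to the grouping step; with 'kmeans_low' it would be those in retained_low.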
Alternatively, the threshold can comprise a combination of a comparison operator ('>', '>=', '<=', or '<') and a numeric value. This combination filters the data points accordingly. For instance, using variable = 'GFAP' and threshold = '>0.75' results in retaining only those data points with a GFAP value greater than 0.75. To include data points with a value of exactly 0.75, use '>=0.75'.
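The operator-value form of threshold can be illustrated with a small base R sketch. The helpers parse_threshold() and apply_threshold() are hypothetical names introduced here for illustration; how SPATA2 parses the string internally is not shown in this documentation.

```r
# Hypothetical helper: split a string such as '>0.75' or '<= 0.5' into
# a comparison operator and a numeric value.
parse_threshold <- function(threshold) {
  pattern <- "^[[:space:]]*(>=|<=|>|<)[[:space:]]*(-?[0-9.]+)[[:space:]]*$"
  m <- regmatches(threshold, regexec(pattern, threshold))[[1]]
  if (length(m) != 3) {
    stop("threshold must look like '>0.75' or '<=0.5'")
  }
  list(op = m[2], value = as.numeric(m[3]))
}

# Hypothetical helper: TRUE for every value that passes the threshold.
apply_threshold <- function(x, threshold) {
  th <- parse_threshold(threshold)
  match.fun(th$op)(x, th$value)
}

# Made-up GFAP values for illustration
gfap <- c(0.2, 0.75, 0.8, 1.1)

keep_gt <- apply_threshold(gfap, ">0.75")   # strictly greater than 0.75
keep_ge <- apply_threshold(gfap, ">=0.75")  # 0.75 itself is retained
```

The difference between keep_gt and keep_ge is the data point with a value of exactly 0.75, which only the '>=' operator retains.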
Following filtering, if use_dbscan is TRUE, the DBSCAN algorithm identifies spatial outliers, which are then removed. Furthermore, if DBSCAN detects multiple dense clusters, they can be merged into a single group if force1 is also set to TRUE.
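The effect of min_size and force1 on the grouping can be sketched in base R, starting from cluster labels of the kind dbscan::dbscan() returns, where 0 marks noise. The helper label_groups() and the label vector are hypothetical; in particular, leaving the ID unsuffixed when force1 = TRUE is our reading of the argument description, not confirmed package behavior.

```r
# Hypothetical helper: drops noise (label 0) and clusters smaller than
# min_size, then assigns group indices and annotation IDs.
label_groups <- function(cluster, id, min_size, force1 = FALSE) {
  sizes <- table(cluster[cluster != 0])
  kept  <- names(sizes)[sizes >= min_size]
  if (length(kept) == 0) {
    return(list(group = rep(NA_integer_, length(cluster)), ids = character(0)))
  }
  # NA marks data points discarded as noise or as too-small clusters
  group <- match(as.character(cluster), kept)
  if (force1) {
    group[!is.na(group)] <- 1L   # merge all remaining clusters into one
    ids <- id                    # assumption: no suffix for a single group
  } else {
    ids <- paste0(id, "_", seq_along(kept))
  }
  list(group = group, ids = ids)
}

# Made-up labels for 12 retained data points: clusters of size 5, 4 and 1,
# plus two noise points
cluster <- c(1, 1, 1, 1, 1, 0, 2, 2, 2, 2, 3, 0)

separate <- label_groups(cluster, id = "my_ann", min_size = 3)
merged   <- label_groups(cluster, id = "my_ann", min_size = 3, force1 = TRUE)
```

In this sketch separate$ids contains my_ann_1 and my_ann_2, matching the naming example given for the force1 argument, while merged places all nine remaining data points in a single group.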
It is essential to note that bypassing the DBSCAN step may lead to the inclusion of individual data points dispersed across the sample. This results in a spatial annotation that essentially spans the entirety of the sample, lacking the segregation of specific variable expressions. Similarly, enabling force1 might unify multiple segregated areas, present on both sides of the sample, into one group and subsequently, one spatial annotation encompassing the whole sample. Consider allowing the creation of multiple spatial annotations (suffixed with an index) and merging them afterwards via mergeSpatialAnnotations() if they are too close together.
Lastly, the remaining data points are fed into the concaveman algorithm on a per-group basis. The algorithm calculates concave polygons outlining the groups of data points. If use_dbscan is FALSE, all data points that remained after the initial filtering are submitted to the algorithm. Subsequently, these polygons are integrated into addSpatialAnnotation() along with the unsuffixed id and tags input arguments. The ID is suffixed with an index for each group.
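The per-group outlining can be approximated in base R with a convex hull, which the concavity argument describes as the limiting case for a value of Infinity. The sketch below uses grDevices::chull() purely as a dependency-free stand-in; concaveman::concaveman() would produce tighter, concave outlines. The coordinates and group labels are made up.

```r
# Made-up coordinates of retained data points in two spatial groups,
# each a unit square with one interior point
pts <- data.frame(
  x     = c(0, 1, 1, 0, 0.5,  5, 6, 6, 5, 5.5),
  y     = c(0, 0, 1, 1, 0.5,  5, 5, 6, 6, 5.5),
  group = rep(c(1, 2), each = 5)
)

# One outline per group: chull() returns the indices of the hull vertices
outlines <- lapply(split(pts, pts$group), function(g) {
  idx <- grDevices::chull(g$x, g$y)
  g[idx, c("x", "y")]
})

n_vertices <- vapply(outlines, nrow, integer(1))
```

Each element of outlines is a data.frame of polygon vertices for one group; the interior points are not part of the outline.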
The vignette on distance measures in SPATA2 has been replaced. Click
here
to read it.
library(SPATA2)
library(tidyverse)
library(patchwork)
data("example_data")
object <- example_data$object_UKF275T_diet
# create an image annotation based on the segregated area of
# high expression in hypoxia signatures
object <-
  createNumericAnnotations(
    object = object,
    variable = "HM_HYPOXIA",
    threshold = "kmeans_high",
    id = "hypoxia",
    tags = "hypoxic"
  )
# visualize both
plotSurface(object, color_by = "HM_HYPOXIA") +
  legendLeft() +
  plotImage(object) +
  ggpLayerSpatAnnOutline(object, tags = c("hypoxic"))