Screens the sample for numeric variables that stand in a meaningful spatial relation to annotated structures/areas. For a detailed explanation of how to define the parameters distance, n_bins_circle, binwidth, angle_span and n_bins_angle, see the Details section.

imageAnnotationScreening(
  object,
  id,
  variables,
  distance = NA_integer_,
  n_bins_circle = NA_integer_,
  binwidth = getCCD(object),
  angle_span = c(0, 360),
  n_bins_angle = 1,
  include_area = FALSE,
  summarize_with = "mean",
  normalize_by = "sample",
  method_padj = "fdr",
  model_subset = NULL,
  model_remove = NULL,
  model_add = NULL,
  mtr_name = NULL,
  bcsp_exclude = NA_character_,
  verbose = NULL,
  ...
)

Arguments

object

An object of class spata2.

id

Character value. The ID of the image annotation of interest.

variables

Character vector. All numeric variables (meaning genes, gene-sets and numeric features) that are supposed to be included in the screening process.

distance

Distance value. Specifies the distance from the border of the image annotation to the horizon in the periphery up to which the screening is conducted. (See details for more.) - See details of ?is_dist for more information about distance values.

n_bins_circle

Numeric value or vector of length 2. Specifies how many times the area is buffered with the value denoted in binwidth. (See details for more.)

binwidth

Distance value. The width of the circular bins to which the barcode-spots are assigned. We recommend to set it equal to the center-center distance: binwidth = getCCD(object). (See details for more.) - See details of ?is_dist for more information about distance values.

angle_span

Numeric vector of length 2. Confines the area screened by an angle span relative to the center of the image annotation. (See details for more.)

n_bins_angle

Numeric value. Number of bins that are created by angle. (See details for more.)

summarize_with

Character value. Either 'mean' or 'median'. Specifies the function with which the bins are summarized.

method_padj

Character value. The method with which adjusted p-values are calculated. Use validPadjMethods() to obtain all valid input options.

model_subset

Character value. Used as a regex to subset models. Use validModelNames() to obtain all model names that are known to SPATA2 and showModels() to visualize them.

model_remove

Character value. Used as a regex to remove models that are not supposed to be included.

model_add

Named list. Every slot in the list must be either a formula containing a function that takes a numeric vector as input and returns a numeric vector with the same length as its input vector, or a numeric vector with the same length as the input vector. Test models with showModels().

bcsp_exclude

Character value containing name(s) of barcode-spots to be excluded from the analysis.

verbose

Logical. If set to TRUE, informative messages regarding the computational progress will be printed.

(Warning messages will always be printed.)

...

Used to absorb deprecated arguments or functions.

Value

An object of class ImageAnnotationScreening. See documentation with ?ImageAnnotationScreening for more information.

Details

In conjunction with argument id which provides the ID of the image annotation of interest the arguments distance, binwidth, n_bins_circle, angle_span and n_bins_angle can be used to specify the exact area that is screened as well as the resolution of the screening.

How the algorithm works: During the IAS-algorithm the barcode-spots are binned according to their localisation relative to the image annotation. Every bin's mean expression of a given gene is then aligned in ascending bin order - mean expression of bin 1, mean expression of bin 2, ... up to the last bin, the bin with the barcode-spots that lie farthest away from the image annotation. This makes it possible to infer the gene expression changes in relation to the image annotation and to screen for genes whose expression changes resemble specific biological behaviors. E.g. linear ascending: gene expression increases linearly with the distance to the image annotation. E.g. immediate descending: gene expression is high in close proximity to the image annotation and declines logarithmically with the distance to the image annotation.
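The summarizing step described above can be sketched in base R. This is only an illustration, not the SPATA2 implementation: the data frame spots and its columns bin and gene are hypothetical toy data, and the bin of every barcode-spot is assumed to be known already.

```r
# Hypothetical toy data: one row per barcode-spot.
spots <- data.frame(
  bin  = c(1, 1, 2, 2, 3, 3),             # circular bin of each spot (1 = closest)
  gene = c(4.0, 6.0, 3.0, 5.0, 1.0, 2.0)  # expression of a single gene
)

# Summarize every bin with the mean (compare summarize_with = "mean").
bin_means <- tapply(spots$gene, spots$bin, mean)

# Align the summaries in ascending bin order: bin 1, bin 2, ..., last bin.
expr_profile <- as.numeric(bin_means[order(as.numeric(names(bin_means)))])
```

The resulting vector expr_profile holds one value per bin and is what the later model fitting would operate on in this sketch.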

How circular binning works: To bin barcode-spots according to their localisation to the image annotation three parameters are required:

  • distance: The distance from the border of the image annotation to the horizon in the periphery up to which the screening is conducted. The distance is measured in pixels, as is the image.

  • binwidth: The width of every bin. Unit is pixel.

  • n_bins_circle: The number of bins that are created.

Regarding parameter n_bins_circle: The suffix _circle is used, for one thing, to emphasize that bins are created in a circular fashion around the image annotation (although the shape of the polygon that was created to encircle the image annotation is maintained). Additionally, the suffix is needed to distinguish it from argument n_bins_angle, which can be used to increase the resolution of the screening.

These three parameters stand in the following relation to each other:

  1. n_bins_circle = distance / binwidth

  2. distance = n_bins_circle * binwidth

  3. binwidth = distance / n_bins_circle

Therefore, only two of the three arguments must be specified, as the remaining one is calculated. We recommend sticking to the first option: specifying distance and binwidth and letting the function calculate n_bins_circle.
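The relation between the three parameters can be illustrated with a small helper. Note that complete_circle_params() is a hypothetical function written for this example and is not part of SPATA2. It works on plain numbers (pixel) and does not cover how the package handles distance values with units or non-integer results.

```r
# Hypothetical helper, not part of SPATA2: derives the missing parameter
# from the two that were specified.
complete_circle_params <- function(distance = NA, binwidth = NA, n_bins_circle = NA) {
  n_given <- sum(!is.na(c(distance, binwidth, n_bins_circle)))
  if (n_given != 2) {
    stop("Specify exactly two of distance, binwidth and n_bins_circle.")
  }
  if (is.na(n_bins_circle)) {
    n_bins_circle <- distance / binwidth       # relation 1
  } else if (is.na(distance)) {
    distance <- n_bins_circle * binwidth       # relation 2
  } else {
    binwidth <- distance / n_bins_circle       # relation 3
  }
  list(distance = distance, binwidth = binwidth, n_bins_circle = n_bins_circle)
}

# Recommended option: specify distance and binwidth.
params <- complete_circle_params(distance = 200, binwidth = 10)
params$n_bins_circle
```

With a distance of 200 pixel and a binwidth of 10 pixel the helper yields 20 bins.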

Once the parameters are set and calculated, the polygon that is used to define the borders of the image annotation (the one you draw with createImageAnnotation()) is repeatedly expanded by the distance indicated by parameter binwidth. The number of times this expansion is repeated is equal to the parameter n_bins_circle. Every time the polygon is expanded, the newly enclosed barcode-spots are binned (grouped) and the bin is given a number that is equal to the number of the expansion. Thus, barcode-spots that are adjacent to the image annotation are binned into bin 1, barcode-spots that lie a distance of binwidth away are binned into bin 2, etc.
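The effect of the repeated expansion can be approximated with a simplified sketch. Instead of expanding a polygon, it assumes that the distance of every barcode-spot to the border of the image annotation is already known. The vector dist_to_border is hypothetical toy data and the approach is a simplification, not the procedure SPATA2 uses internally.

```r
binwidth      <- 10  # width of every bin in pixel
n_bins_circle <- 3   # number of expansions

# Hypothetical distances (in pixel) of barcode-spots outside the
# image annotation to its border.
dist_to_border <- c(2, 9, 11, 25, 30, 42)

# A spot falls into the first expansion that encloses it.
bin <- ceiling(dist_to_border / binwidth)

# Spots beyond the last expansion are not part of the screening.
bin[bin > n_bins_circle] <- NA
```

In this toy example the first two spots end up in bin 1, the third in bin 2, the fourth and fifth in bin 3, and the last spot lies outside the screened area.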

Note that the function plotSurfaceIas() allows you to visually check whether your input results in the desired screening.

How the screening works: For every gene that is included in the screening process every bin's mean expression is calculated and then aligned in ascending bin order - mean expression of bin 1, mean expression of bin 2, ... up to the last bin, namely the bin with the barcode-spots that lie farthest away from the image annotation. This makes it possible to infer the gene expression changes in relation to the image annotation and to screen for genes whose expression changes resemble specific biological behaviors. The gene expression change is fitted to every model that is included. (Use showModels() to visualize the predefined models of SPATA2.) A gene-model-fit is evaluated twofold:

  • Residuals area over the curve (RAOC): The area under the curve (AUC) of the residuals between the inferred expression changes and the model is calculated, normalized against the number of bins and then subtracted from 1.

  • Pearson correlation: The inferred expression changes are correlated with the model. (Correlation as well as the corresponding p-value depend on the number of bins!)

Finally, the mean of the RAOC and the correlation is calculated for every gene-model-fit and stored as the IAS-Score.
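The two evaluation steps and the resulting score can be sketched in base R for a single gene-model-fit. Several details are assumptions made for this illustration, since they are not spelled out here: both vectors are rescaled to the range 0 to 1, the residuals are taken as absolute differences, and the AUC is computed with the trapezoidal rule. The helper rescale01() and all values are hypothetical.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1].
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Inferred expression changes of one gene across five bins (toy values).
expr <- rescale01(c(1.0, 2.1, 2.9, 4.2, 5.0))

# Toy "linear ascending" model evaluated for the same five bins.
model <- rescale01(seq_along(expr))

# Residuals area over the curve (RAOC), under the assumptions named above.
res  <- abs(expr - model)
auc  <- sum((head(res, -1) + tail(res, -1)) / 2)  # trapezoidal rule, bin spacing 1
raoc <- 1 - auc / length(expr)                    # normalize by number of bins

# Pearson correlation between expression changes and model.
ct       <- cor.test(expr, model, method = "pearson")
corr     <- unname(ct$estimate)
corr_pval <- ct$p.value

# Mean of both evaluations.
ias_score <- mean(c(raoc, corr))
```

Because both the correlation and its p-value are computed across bins, changing binwidth or distance changes the number of observations that enter cor.test().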