imageAnnotationScreening.Rd
Screens the sample for numeric variables that stand in a meaningful spatial relation to annotated structures/areas.
For a detailed explanation of how to define the parameters distance, n_bins_circle, binwidth, angle_span and n_bins_angle, see the details section.
imageAnnotationScreening(
object,
id,
variables,
distance = NA_integer_,
n_bins_circle = NA_integer_,
binwidth = getCCD(object),
angle_span = c(0, 360),
n_bins_angle = 1,
include_area = FALSE,
summarize_with = "mean",
normalize_by = "sample",
method_padj = "fdr",
model_subset = NULL,
model_remove = NULL,
model_add = NULL,
mtr_name = NULL,
bcsp_exclude = NA_character_,
verbose = NULL,
...
)
An object of class spata2.
Character value. The ID of the image annotation of interest.
Character vector. All numeric variables (meaning genes, gene-sets and numeric features) that are supposed to be included in the screening process.
Distance value. Specifies the distance from the border of the image annotation to the horizon in the periphery up to which the screening is conducted. (See details for more.) See details of ?is_dist for more information about distance values.
Numeric value or vector of length 2. Specifies how many times the area is buffered with the value denoted in binwidth. (See details for more.)
Distance value. The width of the circular bins to which the barcode-spots are assigned. We recommend setting it equal to the center-to-center distance: binwidth = getCCD(object). (See details for more.) See details of ?is_dist for more information about distance values.
Numeric vector of length 2. Confines the area screened by an angle span relative to the center of the image annotation. (See details for more.)
Numeric value. Number of bins that are created by angle. (See details for more.)
Character value. Either 'mean' or 'median'. Specifies the function with which the bins are summarized.
Character value. The method with which adjusted p-values are calculated. Use validPadjMethods() to obtain all valid input options.
Character value. Used as a regex to subset models. Use validModelNames() to obtain all model names that are known to SPATA2 and showModels() to visualize them.
Character value. Used as a regex to remove models that are not supposed to be included.
Named list. Every slot in the list must be either a formula containing a function that takes a numeric vector as input and returns a numeric vector of the same length, or a numeric vector with the same length as the input vector. Test models with showModels().
Character value containing name(s) of barcode-spots to be excluded from the analysis.
Logical. If set to TRUE, informative messages regarding the computational progress will be printed.
(Warning messages will always be printed.)
Used to absorb deprecated arguments or functions.
An object of class ImageAnnotationScreening. See documentation with ?ImageAnnotationScreening for more information.
In conjunction with argument id, which provides the ID of the image annotation of interest, the arguments distance, binwidth, n_bins_circle, angle_span and n_bins_angle can be used to specify the exact area that is screened as well as the resolution of the screening.
How the algorithm works: During the IAS algorithm the barcode-spots are binned according to their location relative to the image annotation. Every bin's mean expression of a given gene is then arranged in ascending bin order: mean expression of bin 1, mean expression of bin 2, and so on up to the last bin, which contains the barcode-spots that lie farthest away from the image annotation. This makes it possible to infer the gene expression changes in relation to the image annotation and to screen for genes whose expression changes resemble specific biological behaviors. Examples: linear ascending, where gene expression increases linearly with the distance to the image annotation; immediate descending, where gene expression is high in close proximity to the image annotation and declines logarithmically with the distance to the image annotation.
How circular binning works: To bin barcode-spots according to their location relative to the image annotation, three parameters are required:
distance: The distance from the border of the image annotation to the horizon in the periphery up to which the screening is conducted. Unit of the distance is pixel, as is the unit of the image.
binwidth: The width of every bin. Unit is pixel.
n_bins_circle: The number of bins that are created.
Regarding parameter n_bins_circle: The suffix _circle is used, for one thing, to emphasize that bins are created in a circular fashion around the image annotation (although the shape of the polygon that was created to encircle the image annotation is maintained). Additionally, the suffix is needed to distinguish it from argument n_bins_angle, which can be used to increase the resolution of the screening.
These three parameters stand in the following relation to each other:
n_bins_circle = distance / binwidth
distance = n_bins_circle * binwidth
binwidth = distance / n_bins_circle
Therefore, only two of the three arguments need to be specified, as the remaining one is calculated. We recommend sticking to the first option: specifying distance and binwidth and letting the function calculate n_bins_circle.
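The relation between the three parameters can be sketched in a few lines of base R. The helper complete_binning_params() is hypothetical and not part of SPATA2; it assumes plain numeric pixel values and that exactly one of the three inputs is left as NA. How SPATA2 itself handles non-integer results is not covered here.

```r
# Hypothetical helper (not part of SPATA2): derive the missing one of
# distance, binwidth and n_bins_circle from the other two.
complete_binning_params <- function(distance = NA_real_,
                                    binwidth = NA_real_,
                                    n_bins_circle = NA_real_) {
  n_missing <- sum(is.na(c(distance, binwidth, n_bins_circle)))
  if (n_missing != 1) {
    stop("Specify exactly two of distance, binwidth and n_bins_circle.")
  }
  if (is.na(n_bins_circle)) {
    n_bins_circle <- distance / binwidth
  } else if (is.na(distance)) {
    distance <- n_bins_circle * binwidth
  } else {
    binwidth <- distance / n_bins_circle
  }
  list(distance = distance, binwidth = binwidth, n_bins_circle = n_bins_circle)
}

# Recommended option: give distance and binwidth, derive n_bins_circle.
params <- complete_binning_params(distance = 200, binwidth = 20)
params$n_bins_circle
```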
Once the parameters are set and calculated, the polygon that is used to define the borders of the image annotation (the one you drew with createImageAnnotation()) is repeatedly expanded by the distance indicated by parameter binwidth. The number of times this expansion is repeated is equal to the parameter n_bins_circle. Every time the polygon is expanded, the newly enclosed barcode-spots are binned (grouped) and the bin is given a number that is equal to the number of the expansion. Thus, barcode-spots that are adjacent to the image annotation are binned into bin 1, barcode-spots that lie a distance of binwidth away are binned into bin 2, etc.
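The effect of the repeated expansion can be approximated with a simplified base R sketch. It assumes that the non-negative distance of each barcode-spot to the border of the image annotation is already known, whereas the actual procedure expands the polygon itself. The helper assign_circle_bin(), the made-up distances and the handling of spots that lie exactly on a bin boundary are assumptions for illustration.

```r
# Simplified sketch (not SPATA2 code): assumes the distance of each
# barcode-spot to the border of the image annotation is already known.
assign_circle_bin <- function(dist_to_border, binwidth, n_bins_circle) {
  # Spots closer than one binwidth fall into bin 1, the next ring is bin 2, ...
  bin <- floor(dist_to_border / binwidth) + 1
  # Spots beyond the last expansion are not part of the screening.
  bin[bin > n_bins_circle] <- NA
  bin
}

dist_to_border <- c(3, 19, 20, 45, 99, 150)  # made-up pixel distances
bins <- assign_circle_bin(dist_to_border, binwidth = 20, n_bins_circle = 5)
bins
```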
Note that the function plotSurfaceIas() lets you check visually whether your input results in the desired screening.
How the screening works: For every gene that is included in the screening process, every bin's mean expression is calculated and then arranged in ascending bin order: mean expression of bin 1, mean expression of bin 2, and so on up to the last bin, namely the bin with the barcode-spots that lie farthest away from the image annotation. This makes it possible to infer the gene expression changes in relation to the image annotation and to screen for genes whose expression changes resemble specific biological behaviors. The gene expression change is fitted to every model that is included. (Use showModels() to visualize the predefined models of SPATA2.)
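The summarizing step can be illustrated with made-up data in base R. The expression values, the bin assignment, the rescaling to the range 0-1 and the hand-built linear_ascending vector are assumptions for illustration; they are not the predefined SPATA2 models and not the package's internal code.

```r
# Illustrative sketch with made-up data (not SPATA2 code).
expr <- c(0.1, 0.3, 0.2, 0.6, 0.5, 0.9, 1.0, 0.8)  # expression per barcode-spot
bin  <- c(1,   1,   2,   2,   3,   3,   4,   4)    # circular bin per spot

# Mean expression per bin, ordered from bin 1 to the last bin.
bin_means <- as.numeric(tapply(expr, bin, mean))

# Rescale to 0-1 so the profile is comparable to a model (assumed step).
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))
profile <- rescale01(bin_means)

# A hand-built "linear ascending" model over the same number of bins.
linear_ascending <- seq(0, 1, length.out = length(profile))
```

Using summarize_with = "median" would correspond to replacing mean with median in the tapply() call of this sketch.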
A gene-model-fit is evaluated twofold:
Residuals area over the curve: The area under the curve (AUC) of the residuals between the inferred expression changes and the model is calculated, normalized against the number of bins and then subtracted from 1.
Pearson correlation: The inferred expression changes are correlated with the model. (The correlation as well as the corresponding p-value depend on the number of bins!)
Finally, the mean of the RAOC and the correlation is calculated for every gene-model-fit and stored as the IAS-Score.
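The two-part evaluation can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPATA2 implementation: both vectors are assumed to be scaled to the range 0-1, residuals are taken as absolute differences, the area is computed with the trapezoidal rule using unit spacing between bins, and evaluate_fit() is a hypothetical helper name.

```r
# Sketch of the two-part evaluation under stated assumptions (not SPATA2 code).
# Both vectors are assumed to be scaled to the range 0-1 and of equal length.
evaluate_fit <- function(profile, model) {
  n_bins <- length(profile)
  res <- abs(profile - model)
  # Trapezoidal area under the residual curve with unit spacing between bins.
  auc <- sum((res[-1] + res[-n_bins]) / 2)
  # Normalize against the number of bins and subtract from 1.
  raoc <- 1 - auc / n_bins
  ct <- stats::cor.test(profile, model, method = "pearson")
  list(
    raoc = raoc,
    corr = unname(ct$estimate),
    p_value = ct$p.value,
    ias_score = mean(c(raoc, unname(ct$estimate)))
  )
}

model <- seq(0, 1, length.out = 6)                          # linear ascending
asc   <- evaluate_fit(c(0, 0.3, 0.3, 0.7, 0.7, 1), model)   # roughly ascending
desc  <- evaluate_fit(c(1, 0.7, 0.7, 0.3, 0.3, 0), model)   # roughly descending
c(ascending = asc$ias_score, descending = desc$ias_score)
```

In this sketch the roughly ascending profile scores higher against the linear ascending model than the roughly descending one, which is the ranking behavior the IAS-Score is meant to provide.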