This function integrates large-scale copy number variation analysis using the infercnv package. For more detailed information about how infercnv works, visit https://github.com/broadinstitute/inferCNV/wiki.

runCnvAnalysis(
  object,
  ref_annotation = cnv_ref[["annotation"]],
  ref_mtr = cnv_ref[["mtr"]],
  ref_regions = cnv_ref[["regions"]],
  gene_pos_df = SPATA2::gene_pos_df,
  directory_cnv_folder = "data-development/cnv-results",
  directory_regions_df = NA,
  n_pcs = 30,
  cnv_prefix = "Chr",
  save_infercnv_object = TRUE,
  verbose = NULL,
  of_sample = NA,
  CreateInfercnvObject = list(ref_group_names = "ref"),
  require_above_min_mean_expr_cutoff = list(min_mean_expr_cutoff = 0.1),
  require_above_min_cells_ref = list(min_cells_per_gene = 3),
  normalize_counts_by_seq_depth = list(),
  anscombe_transform = list(),
  log2xplus1 = list(),
  apply_max_threshold_bounds = list(),
  smooth_by_chromosome = list(window_length = 101, smooth_ends = TRUE),
  center_cell_expr_across_chromosome = list(method = "median"),
  subtract_ref_expr_from_obs = list(inv_log = TRUE),
  invert_log2 = list(),
  clear_noise_via_ref_mean_sd = list(sd_amplifier = 1.5),
  remove_outliers_norm = list(),
  define_signif_tumor_subclusters = list(p_val = 0.05, hclust_method = "ward.D2",
    cluster_by_groups = TRUE, partition_method = "qnorm"),
  plot_cnv = list(k_obs_groups = 5, cluster_by_groups = TRUE, output_filename =
    "infercnv.outliers_removed", color_safe_pal = FALSE, x.range = "auto", x.center = 1,
    output_format = "pdf", title = "Outliers Removed")
)

Arguments

object

An object of class spata2.

ref_annotation

A data.frame in which the row names refer to the barcodes of the reference matrix provided in argument ref_mtr, with a column named sample that refers to the reference group names.

Defaults to the data.frame stored in slot $annotation of list SPATA2::cnv_ref.

If you provide your own reference, make sure that barcodes of the reference input do not overlap with barcodes of the spata-object (e.g. by suffixing, as exemplified in the default list SPATA2::cnv_ref).

ref_mtr

The count matrix that is supposed to be used as the reference. Row names must refer to the gene names and column names must refer to the barcodes. Barcodes must be identical to the row names of the data.frame provided in argument ref_annotation.

Defaults to the count matrix stored in slot $mtr of list SPATA2::cnv_ref.

If you provide your own reference, make sure that barcodes of the reference input do not overlap with barcodes of the spata-object (e.g. by suffixing, as exemplified in the default list SPATA2::cnv_ref).

ref_regions

A data.frame that contains information about chromosome positions.

Defaults to the data.frame stored in slot $regions of list SPATA2::cnv_ref.

If you provide your own regions reference, make sure that the data.frame has the same column names and row names as the default input.

gene_pos_df

Either NULL or a data.frame. If a data.frame, it replaces the output of CONICsmat::getGenePositions(). It must contain three character variables (ensembl_gene_id, hgnc_symbol, chromosome_name) and two numeric variables (start_position, end_position).

If NULL, the data.frame is created via CONICsmat::getGenePositions() using all gene names that appear in the count matrix and in the reference matrix.

Defaults to the SPATA2-internal data.frame SPATA2::gene_pos_df.
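
The structural requirements for a custom gene_pos_df can be checked before the call. The following is a minimal sketch in base R, assuming toy values; the helper check_gene_pos_df() and the gene identifiers are hypothetical and not part of SPATA2 or CONICsmat.

```r
# Toy gene position table with the five documented variables
# (all values are illustrative only).
custom_gene_pos_df <- data.frame(
  ensembl_gene_id = c("ENSG_TOY_1", "ENSG_TOY_2"),
  hgnc_symbol = c("GeneA", "GeneB"),
  chromosome_name = c("1", "7"),
  start_position = c(1000, 52000),
  end_position = c(4500, 60500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: checks variable names and types against the
# requirements listed for argument gene_pos_df.
check_gene_pos_df <- function(df) {
  chr_vars <- c("ensembl_gene_id", "hgnc_symbol", "chromosome_name")
  num_vars <- c("start_position", "end_position")
  all(c(chr_vars, num_vars) %in% colnames(df)) &&
    all(vapply(df[chr_vars], is.character, logical(1))) &&
    all(vapply(df[num_vars], is.numeric, logical(1)))
}

is_valid <- check_gene_pos_df(custom_gene_pos_df)
```

Note that chromosome_name is expected to be a character variable, so a numeric chromosome column would fail this check.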

directory_cnv_folder

Character value. The directory of the folder in which temporary files, the infercnv-object, and the output heatmap are stored.

cnv_prefix

Character value. The prefix of the feature variables that store the information about chromosomal gains and losses.

save_infercnv_object

Logical value. If set to TRUE, the infercnv-object is stored in the folder denoted in argument directory_cnv_folder under 'infercnv-object.RDS'.

verbose

Logical. If set to TRUE, informative messages regarding the computational progress will be printed.

(Warning messages will always be printed.)

of_sample

This argument is currently inactive. It might be reactivated when spata-objects can store more than one sample.

CreateInfercnvObject

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

require_above_min_mean_expr_cutoff

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

require_above_min_cells_ref

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

normalize_counts_by_seq_depth

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

anscombe_transform

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

log2xplus1

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

apply_max_threshold_bounds

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

smooth_by_chromosome

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

center_cell_expr_across_chromosome

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

subtract_ref_expr_from_obs

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

invert_log2

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

clear_noise_via_ref_mean_sd

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

remove_outliers_norm

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

define_signif_tumor_subclusters

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified.

plot_cnv

A list of arguments with which the function is supposed to be called. Make sure that your input does not conflict with downstream function calls. Input for argument infercnv_obj must not be specified. Input for argument out_dir is taken from argument directory_cnv_folder.
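
The list arguments above follow a common wrapper pattern: the user supplies a named list, and the wrapper adds the object it manages itself before calling the step. The following base R sketch illustrates that pattern under stated assumptions. toy_smooth() and call_step() are hypothetical stand-ins, not functions of SPATA2 or infercnv, and the sketch does not claim to reproduce how runCnvAnalysis() processes the lists internally.

```r
# Stand-in for a pipeline step; NOT an infercnv function.
toy_smooth <- function(infercnv_obj, window_length = 101, smooth_ends = TRUE) {
  list(
    obj = infercnv_obj,
    window_length = window_length,
    smooth_ends = smooth_ends
  )
}

# Hypothetical helper: the wrapper supplies 'infercnv_obj' itself, so the
# user-provided list must not contain it.
call_step <- function(fn, infercnv_obj, args = list()) {
  if ("infercnv_obj" %in% names(args)) {
    stop("Input for argument 'infercnv_obj' must not be specified.")
  }
  do.call(what = fn, args = c(list(infercnv_obj = infercnv_obj), args))
}

# Only window_length is overridden; smooth_ends keeps the step's own default.
result <- call_step(
  fn = toy_smooth,
  infercnv_obj = "placeholder",
  args = list(window_length = 51)
)
```

Arguments that are not named in the list keep the default of the called function, which is why an empty list() is a valid input for several of the arguments.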

Value

An updated spata-object containing the results in the respective slot.

Details

runCnvAnalysis() is a wrapper around all functions the infercnv pipeline is composed of. Argument directory_cnv_folder should lead to an empty folder: temporary files, the output heatmap, and the infercnv-object are stored there without asking for permission, which can overwrite existing files that share the same names.
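
Because files are written without confirmation, it can help to verify that the target folder is empty first. A minimal base R sketch, assuming a hypothetical folder inside the session's temporary directory; the call to runCnvAnalysis() is only indicated as a comment, since it requires an existing spata-object.

```r
# Hypothetical output location inside the session's temporary directory.
cnv_dir <- file.path(tempdir(), "cnv-results-example")

# Create the folder if it does not exist yet.
if (!dir.exists(cnv_dir)) {
  dir.create(cnv_dir, recursive = TRUE)
}

# List everything in the folder, including hidden files, but not "." and "..".
existing_files <- list.files(cnv_dir, all.files = TRUE, no.. = TRUE)
folder_is_empty <- length(existing_files) == 0

# Stop early instead of risking overwritten results.
if (!folder_is_empty) {
  stop("Folder is not empty: ", cnv_dir)
}

# The folder could then be passed on, e.g.:
# object <- runCnvAnalysis(object = object, directory_cnv_folder = cnv_dir)
```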

Results (including a PCA) are stored in the slot @cnv of the spata-object which can be obtained via getCnvResults(). Additionally, the variables that store the copy-number-variations for each barcode-spot are added to the spata-object's feature data. The corresponding feature variables are named according to the chromosome's number and the prefix denoted with the argument cnv_prefix.
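
To illustrate the naming of the feature variables, the following sketch builds the names that would be expected for the default prefix. It assumes that the prefix is followed directly by the chromosome number and that only chromosomes 1 to 22 are covered; both are assumptions for illustration and should be checked against the feature data of your own object.

```r
cnv_prefix <- "Chr"   # default of argument cnv_prefix
chromosomes <- 1:22   # assumption: autosomes only

# Assumed naming scheme: prefix immediately followed by the chromosome number.
cnv_feature_names <- paste0(cnv_prefix, chromosomes)

head(cnv_feature_names, 3)
```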

Regarding the reference data: In the list SPATA2::cnv_ref we offer reference data including a count matrix that results from stRNA-seq of healthy human brain tissue, an annotation data.frame, and a data.frame containing information regarding the chromosome positions. You can choose to provide your own reference data by specifying the ref_*-arguments. Check out the content of list SPATA2::cnv_ref and make sure that your own reference input is of similar structure regarding column names, row names, etc.
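
As a minimal sketch of preparing your own reference, the following base R code builds a toy count matrix and annotation data.frame, suffixes the reference barcodes, and checks the requirements described for ref_mtr and ref_annotation. The barcodes, gene names, counts, and the suffix "_ref" are assumptions made for illustration; the group name "ref" matches the default ref_group_names = "ref" shown in the usage above.

```r
# Toy barcodes standing in for those of the spata-object (hypothetical values).
object_barcodes <- c("AAACAAGT-1", "AAACACCA-1", "AAACAGAG-1")

# Toy reference count matrix: genes in rows, barcodes in columns.
# The first reference barcode deliberately collides with an object barcode.
custom_ref_mtr <- matrix(
  data = c(5L, 0L, 2L, 7L, 1L, 3L),
  nrow = 3,
  dimnames = list(
    c("GeneA", "GeneB", "GeneC"),
    c("AAACAAGT-1", "TTTGTTAG-1")
  )
)

# Suffix the reference barcodes so that they cannot overlap with the object.
colnames(custom_ref_mtr) <- paste0(colnames(custom_ref_mtr), "_ref")

# Annotation data.frame: row names are the reference barcodes,
# column 'sample' holds the reference group name.
custom_ref_annotation <- data.frame(
  sample = rep("ref", ncol(custom_ref_mtr)),
  row.names = colnames(custom_ref_mtr),
  stringsAsFactors = FALSE
)

# Checks mirroring the documented requirements.
no_overlap <- length(intersect(colnames(custom_ref_mtr), object_barcodes)) == 0
barcodes_match <- identical(
  rownames(custom_ref_annotation),
  colnames(custom_ref_mtr)
)
```

If both checks are TRUE, the two objects could be passed to arguments ref_mtr and ref_annotation. A custom ref_regions data.frame would additionally need the same column names and row names as the default input.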