Prerequisites

Make sure to be familiar with the following tutorials before proceeding:

1. Introduction

Once the data has been read in, there is only one function left to use before analysis can start: processData().

2. Process data

Using the function processData() is as simple as it gets.

cypro_object_new <-
  processData(
    object = cypro_object_new,
    summarize_with = c("min", "max", "mean", "median", "sd")
    )
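The strings passed to summarize_with match the names of ordinary R summary functions. As a rough illustration of what applying them to one variable of one cell yields, here is a base R sketch. The vector x is made up, and how cypro resolves the names internally is an assumption rather than something this tutorial shows.

```r
# Hypothetical values of one data variable for a single cell.
x <- c(2, 4, 4, 10)

# The same function names as in the processData() call above.
summarize_with <- c("min", "max", "mean", "median", "sd")

# Look up each function by name and apply it to x.
res <- sapply(summarize_with, function(fn) match.fun(fn)(x))
print(res)
```

The result is a named numeric vector with one entry per summary function.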

The function processData() does a variety of things:

1.) it splits the loaded data by its content (track data, grouping data, well plate data, etc.).

2.) it completes the track data in case of missing observations for certain cells due to poor coverage (e.g. a cell has left the image for some frames). Currently cypro simply adds NAs in this case. See the tutorial on data manipulation for options with which you can do the imputation yourself. Future cypro versions will offer built-in imputation techniques.

3.) depending on the modules used and the variables denoted it computes remaining data variables (e.g. Distance from origin from the migration module).

4.) in case of time lapse experiments it summarizes the track data by cell ID with the functions specified in the argument summarize_with, creating the stat data. This includes module-specific summaries (e.g. the migration module variable Distance from last point is summarized via base::sum() to Total distance travelled).

5.) it summarizes the data by variable, providing a summary not by cell ID but by variable. See slot @vdata.
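To make steps 2 and 4 of the list above more concrete, here is a minimal base R sketch on a toy data frame. It is not cypro's implementation. The column names cell_id, frame and dfo are hypothetical, and the real track data contains more variables.

```r
# Toy track data: cell "c2" was not detected in frame 2.
track <- data.frame(
  cell_id = c("c1", "c1", "c1", "c2", "c2"),
  frame   = c(1, 2, 3, 1, 3),
  dfo     = c(0, 2, 5, 0, 4),  # hypothetical data variable
  stringsAsFactors = FALSE
)

# Step 2: build every cell/frame combination and join the observed data,
# so that missing observations appear as rows filled with NA.
all_obs <- expand.grid(
  cell_id = unique(track$cell_id),
  frame   = sort(unique(track$frame)),
  stringsAsFactors = FALSE
)
completed <- merge(all_obs, track, by = c("cell_id", "frame"), all.x = TRUE)

# Step 4: summarize the variable by cell ID, ignoring the added NAs.
stat <- data.frame(
  cell_id  = sort(unique(completed$cell_id)),
  mean_dfo = as.numeric(tapply(completed$dfo, completed$cell_id, mean, na.rm = TRUE)),
  max_dfo  = as.numeric(tapply(completed$dfo, completed$cell_id, max, na.rm = TRUE)),
  stringsAsFactors = FALSE
)

print(completed)
print(stat)
```

The completed data frame has one row per cell and frame, and the stat data frame has one row per cell.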

The output is a cypro object that is ready for analysis and data extraction.

# print a summary of the cypro object's content
printSummary(object = cypro_object_new)
## An object of class 'cypro'.
## 
## Name: TMZ + LY34
## Type: Time Lapse
## Number of Cells: 8407
## Conditions:
##  First phase: 'Ctrl'
##  Second phase: 'Ctrl', 'LY(1uM)', 'TMZ(100uM)' and 'TMZ(100uM)+LY(1uM)'
## Cell Lines: '168', '233' and 'GSC'
## Well Plates: 'green'
## No variables sets have been defined yet.

Cypro objects carry the directory under which they are supposed to be stored. The default directory is assembled from the object's name and the storage folder specified in designExperiment(). You can change the storage directory at any time via setStorageDirectory().


getStorageDirectory(object = cypro_object_new)
## [1] "Data/TMZ+LY34.RDS"

cypro_object_new <- setStorageDirectory(object = cypro_object_new, directory = "data/tmz-ly-processed.RDS")

saveCyproObject(object = cypro_object_new)
## 23:32:38 Saving cypro object under 'data/tmz-ly-processed.RDS'.
## 23:32:39 Done.
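Whether saveCyproObject() creates missing folders is not stated here, so it may be safer to make sure the target folder exists before saving. A minimal base R sketch follows. The temporary folder is a hypothetical stand-in for your project directory.

```r
# Hypothetical target path; use the path you pass to setStorageDirectory().
target <- file.path(tempdir(), "data", "tmz-ly-processed.RDS")

# Create the parent folder if it is missing (no effect if it already exists).
dir.create(dirname(target), recursive = TRUE, showWarnings = FALSE)

# Confirm that the folder is now available.
folder_ready <- dir.exists(dirname(target))
print(folder_ready)
```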

3. Quality check

Missing values are by default represented with NAs. This applies to missing observations as well. If, for instance, a cell has not been detected for two frames, the observation (cell ID at a given point in time) is added, but the data variables contain only NAs for the missing frames. No imputation techniques are implemented in cypro yet. You can, however, simply discard cells that do not meet certain coverage requirements. Use subsetByQuality() for that purpose.

cypro_object_subsetted <-
  subsetByQuality(object = cypro_object_new, new_name = "complete_tracks")

Subset functions allow you to create data subsets of your cypro object based on different reasonings. The function subsetByQuality() opens a shiny application that displays histograms summarizing the coverage quality of all cells. By interactively brushing the columns you can set up the quality requirements.

Figure 1. Interface of subsetByQuality().

Brushing the columns sets up the requirements. When you click on ‘Apply Filter’, the filter you set up is used to subset the data, and a bar chart is displayed indicating how many cells of each condition would remain. As with any other cypro application, clicking on ‘Save & Proceed’ unlocks the next step, which is to close the app via ‘Return Cypro Object’.

You can use printSubsetHistory() to print information to the console about how and why you subsetted the cypro object.

## 
## First Subsetting:
## 
## By: Quality Check
## Reasoning: 
##  Keep cells with a total of 23 to 24 frames(s).
##  Keep cells that were detected first in frame(s) 1.
##  Keep cells that were detected last in frame(s) 24.
## Parent object: TMZ + LY34
## New object: complete_tracks
## Cells remaining: 7875
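The three requirements listed in the printed reasoning (total number of frames, first frame detected, last frame detected) can be expressed in a few lines of base R. The sketch below applies analogous requirements to a toy data set with 4 frames instead of 24. The column names are hypothetical, and this illustrates the idea only, not the code behind subsetByQuality().

```r
# Toy completed track data over 4 frames; NA marks frames without detection.
track <- data.frame(
  cell_id = rep(c("c1", "c2", "c3"), each = 4),
  frame   = rep(1:4, times = 3),
  value   = c(1, 2, 3, 4,    # c1: detected in all frames
              NA, 2, 3, 4,   # c2: first detected in frame 2
              1, NA, 3, 4),  # c3: one gap, detected in first and last frame
  stringsAsFactors = FALSE
)

# Only rows with an actual detection count towards coverage.
detected <- track[!is.na(track$value), ]

# Per-cell coverage: number of frames, first and last frame with a detection.
coverage <- data.frame(
  cell_id = sort(unique(detected$cell_id)),
  total   = as.numeric(tapply(detected$frame, detected$cell_id, length)),
  first   = as.numeric(tapply(detected$frame, detected$cell_id, min)),
  last    = as.numeric(tapply(detected$frame, detected$cell_id, max)),
  stringsAsFactors = FALSE
)

# Keep cells with 3 to 4 frames that were detected first in frame 1
# and last in frame 4.
keep <- coverage$cell_id[
  coverage$total >= 3 & coverage$first == 1 & coverage$last == 4
]
subsetted <- track[track$cell_id %in% keep, ]

print(coverage)
print(keep)
```

Cell c2 is dropped because its first detection is in frame 2, while c3 passes despite its single gap.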