scRNA-seq data integration

Modified on Tue, 15 Nov, 2022 at 1:03 PM

Data integration combines data from different sources, which may have been generated with a variety of methods and tools. It is widely used in bioinformatics to analyze omics data and unravel complex biological processes.



1 Algorithm settings   


1.1 Creating a plot



As a first step, create a plot by clicking on the create plot icon in your analysis track. This opens a section where the analysis of interest can be selected.



To keep your analyses organized, enter a name and a description in the corresponding fields. Then, under "Choose algorithm to run your analysis", select Integration.




1.2 Selecting data



In the field "Choose track element", select the input analyses. In the cell selection tab, choose the observations to use as input. For more information, see the section on Cell/sample selection. Note that subsetting at the pretreatment step is a "hard" subset: cells/samples excluded at this step will not be present in the downstream steps.




1.3 Setting parameters



In the set parameters field, you define how the integration is performed.


Algorithm parameters:


Algorithm sample integration:


    - SCTransform

Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression


    - seurat_integration

Uses the global-scaling normalization method “LogNormalize”, which divides the gene expression measurements for each cell by the total expression of that cell, multiplies the result by a scale factor, and log-transforms it.
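The LogNormalize steps described above can be sketched in a few lines of Python. This is only an illustration and not the platform's own code: the function name log_normalize is hypothetical, and the scale factor of 10,000 and the use of log(1 + x) are assumptions taken from Seurat's documented defaults.

```python
import numpy as np


def log_normalize(counts, scale_factor=10_000):
    """Illustrative LogNormalize: counts is a genes x cells matrix of raw counts.

    scale_factor=10_000 is an assumption (Seurat's documented default).
    """
    counts = np.asarray(counts, dtype=float)
    # Total expression per cell (one value per column).
    totals = counts.sum(axis=0, keepdims=True)
    # Guard against empty cells so we never divide by zero.
    totals[totals == 0] = 1.0
    # Divide by the cell total, multiply by the scale factor, then log(1 + x).
    return np.log1p(counts / totals * scale_factor)


# Toy matrix: 3 genes (rows) x 2 cells (columns).
counts = np.array([[1, 0],
                   [3, 5],
                   [0, 5]])
normalized = log_normalize(counts)
```

After normalization, undoing the log transform gives the same total (the scale factor) for every non-empty cell, which is what makes cells with different sequencing depths comparable.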


Split data matrix by:

    - values in chosen metadata column
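Conceptually, splitting by a metadata column groups the cells by the values found in that column, producing one sub-matrix per value (for example, one per sample or batch). A minimal sketch, assuming a genes x cells matrix and one label per cell; the function name split_by_metadata and the example labels are hypothetical:

```python
import numpy as np


def split_by_metadata(matrix, labels):
    """Split a genes x cells matrix into one sub-matrix per metadata value."""
    matrix = np.asarray(matrix)
    labels = np.asarray(labels)
    if matrix.shape[1] != labels.shape[0]:
        raise ValueError("need exactly one metadata label per cell (column)")
    # One entry per distinct value in the metadata column.
    return {str(value): matrix[:, labels == value] for value in np.unique(labels)}


# Toy data: 2 genes x 4 cells, with a hypothetical "sample" column.
matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8]])
sample = ["s1", "s2", "s1", "s2"]
parts = split_by_metadata(matrix, sample)
```

Each resulting sub-matrix is then treated as a separate dataset to be integrated.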



Integration anchors:


Algorithm sample integration:


    - cca

Canonical correlation analysis


    - rpca

Reciprocal PCA


    - rlsi

Reciprocal LSI



Method for nearest neighbor finding:


    - annoy

Approximate nearest neighbor search using the Annoy library


    - rann

Nearest neighbor search using the RANN package, a kd-tree based wrapper around the ANN library
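To show what the nearest neighbor step contributes, here is a minimal sketch that uses an exact brute-force search in place of annoy or rann. It assumes, as in Seurat-style integration, that anchors are pairs of cells from two datasets that are mutual nearest neighbors in the reduced space. The function names and toy coordinates are hypothetical, and the platform's actual implementation may differ.

```python
import numpy as np


def nearest_neighbors(query, reference, k):
    """Exact k nearest neighbors (Euclidean) of each query row in reference."""
    diff = query[:, None, :] - reference[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Indices of the k closest reference rows for every query row.
    return np.argsort(dist, axis=1)[:, :k]


def mutual_nearest_pairs(a, b, k=1):
    """Pairs (i, j) where b[j] is among a[i]'s neighbors and vice versa."""
    a_to_b = nearest_neighbors(a, b, k)
    b_to_a = nearest_neighbors(b, a, k)
    pairs = []
    for i in range(len(a)):
        for j in a_to_b[i]:
            if i in b_to_a[j]:
                pairs.append((i, int(j)))
    return pairs


# Toy embeddings (cells x dimensions) for two datasets.
dataset_a = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 0.0]])
dataset_b = np.array([[0.1, 0.0], [10.0, 9.9]])
anchors = mutual_nearest_pairs(dataset_a, dataset_b, k=1)
```

The brute-force search compares every cell with every other cell, which becomes slow for large datasets; annoy and rann exist to find the same kind of neighbors much faster.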




1.4 Running the analysis




Once you have chosen all the necessary settings, start the analysis by clicking the "Run" icon at the top right.




2 Accessing results of the analysis



You will be redirected back to the Tracks page where a new placeholder named "Integrated Data" will appear. Click "Select" to view the newly executed analysis.



