This tutorial covers Upload Mode 1 (Data Tables) for Statistical Integration in OmicsAnalyst 2.0. After uploading your data tables and metadata file, you will use the Multi-omics Data Harmonization page to review, edit, and prepare your data for downstream analysis.
| Format | Extension | Max Size |
|---|---|---|
| CSV | .csv | 50 MB (Genes/mRNAs), 25 MB (others) |
| TXT | .txt | 50 MB (Genes/mRNAs), 25 MB (others) |
| TSV | .tsv | 50 MB (Genes/mRNAs), 25 MB (others) |
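A quick local check before uploading can catch files that would be rejected. The sketch below encodes the limits from the table above; the function name `check_upload` is hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the exact definition used by the server is not stated here.

```python
from pathlib import Path

# Limits copied from the table above.
ALLOWED_EXTENSIONS = {".csv", ".txt", ".tsv"}
MAX_MB_GENES = 50   # Genes/mRNAs
MAX_MB_OTHER = 25   # all other omics types
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def check_upload(filename: str, size_bytes: int, omics_type: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    limit_mb = MAX_MB_GENES if omics_type == "Genes/mRNAs" else MAX_MB_OTHER
    if size_bytes > limit_mb * BYTES_PER_MB:
        problems.append(f"file exceeds {limit_mb} MB limit for {omics_type}")
    return problems
```

For a file on disk, the size can be read with `Path(filename).stat().st_size` and passed in as `size_bytes`.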
Each omics data table should follow this structure:
A metadata file describing sample information is required:
| Omics Type | Supported ID Formats |
|---|---|
| Genes/mRNAs | Entrez ID, Ensembl Gene ID, Official Gene Symbol, RefSeq ID |
| Proteins | UniProt Protein ID, Entrez ID, Ensembl Gene ID, Official Gene Symbol |
| miRNAs | miRBase mature ID, miRBase accession, miRBase ID (e.g., hsa-miR-21) |
| Metabolites | KEGG ID, PubChem ID, HMDB ID, Common Name |
| Microbiome | Taxonomy label, OTU ID (Phylum to Strain level) |
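To see which ID format a table uses before upload, a rough pattern check can help. The sketch below is a heuristic only: the patterns cover a few of the formats listed above, based on the commonly seen shapes of those identifiers. It is not how OmicsAnalyst itself detects IDs, and the function name `guess_id_format` is hypothetical. Gene symbols and common names have no fixed shape, so they are reported as unrecognized.

```python
import re

# Heuristic patterns (assumptions based on typical identifier shapes).
ID_PATTERNS = [
    ("Ensembl Gene ID", re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")),
    ("miRBase accession", re.compile(r"^MI(MAT)?\d{7}$")),
    ("HMDB ID", re.compile(r"^HMDB\d{5,7}$")),
    ("KEGG ID", re.compile(r"^C\d{5}$")),
    # Purely numeric IDs are ambiguous: Entrez and PubChem both use integers.
    ("numeric (Entrez ID or PubChem ID)", re.compile(r"^\d+$")),
]


def guess_id_format(identifier: str) -> str:
    """Return the label of the first matching pattern, or 'unrecognized'."""
    cleaned = identifier.strip()
    for label, pattern in ID_PATTERNS:
        if pattern.match(cleaned):
            return label
    return "unrecognized"
```

Running this over the first column of a data table and counting the labels gives a quick indication of whether the IDs are consistent within a file.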
After uploading your files, you will see the Metadata Overview tab where you can review and edit your metadata before proceeding to analysis.
Ensure the type (discrete or continuous) for each metadata column is correct:
The metadata editor provides several functions to prepare your metadata:
Click the "Edit metadata" button to access the following options:
The Omics Data Overview tab lets you apply data processing steps and view quality control plots.
Filter out low-quality or uninformative features:
| Option | Description |
|---|---|
| Dataset | Select which dataset to filter, or "Apply to all" for all datasets |
| Method | Filtering method (e.g., variance-based, mean-based) |
| Percentage to filter out | Percentage of features to remove (0-100%) |
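As an illustration of what a variance-based filter does, the sketch below removes the given percentage of features with the lowest variance. It is a minimal approximation under stated assumptions, not the tool's actual implementation: the function name `filter_low_variance` is hypothetical, each feature is assumed to be a list of numeric values across samples, and population variance is used for ranking.

```python
from statistics import pvariance


def filter_low_variance(table: dict[str, list[float]], percent: float) -> dict[str, list[float]]:
    """Drop the `percent` % of features with the lowest variance."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Number of features to remove, rounded down.
    n_remove = int(len(table) * percent / 100)
    # Rank feature names from lowest to highest variance.
    ranked = sorted(table, key=lambda name: pvariance(table[name]))
    removed = set(ranked[:n_remove])
    # Keep the surviving features in their original order.
    return {name: values for name, values in table.items() if name not in removed}
```

With four features and `percent=50`, the two least variable features are dropped and the other two are returned unchanged.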
Scale your data to make different omics layers comparable:
| Method | Description | Best For |
|---|---|---|
| None | No scaling applied | Data already normalized |
| Auto Scaling | Mean-center and divide by standard deviation | General purpose, most multi-omics analyses |
| Pareto Scaling | Mean-center and divide by square root of standard deviation | Metabolomics data |
| Range Scaling | Scale to 0-1 range | When absolute ranges matter |
Click the "Update" button to apply your filtering and scaling choices.
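The three scaling methods in the table can be written out for a single feature as follows. This is a sketch of the formulas as described above, not the tool's own code. It assumes the sample standard deviation (n − 1 denominator) and does not handle constant features, which would cause a division by zero.

```python
from statistics import mean, stdev


def auto_scale(values: list[float]) -> list[float]:
    """Mean-center and divide by the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pareto_scale(values: list[float]) -> list[float]:
    """Mean-center and divide by the square root of the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s ** 0.5 for v in values]


def range_scale(values: list[float]) -> list[float]:
    """Rescale so the minimum maps to 0 and the maximum maps to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```

Pareto scaling shrinks large values less aggressively than auto scaling, so features with high variance keep more of their original weight.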
Two key visualizations help you assess data quality:
The aggregated density plot compares the value distributions of the different omics layers. If the layers occupy very different ranges, scaling is advised.
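The same comparison can be made numerically by summarizing each layer's values. The sketch below reports the minimum, median, and maximum per layer. The function name `summarize_layers` and the flattened-list input are assumptions for illustration, and no cutoff for "very different" is implied.

```python
from statistics import median


def summarize_layers(layers: dict[str, list[float]]) -> dict[str, tuple[float, float, float]]:
    """Return (min, median, max) of all values in each omics layer."""
    return {name: (min(vals), median(vals), max(vals)) for name, vals in layers.items()}
```

If one layer spans roughly 0 to 10 and another spans thousands, their density curves will barely overlap, which is the situation where scaling helps.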
PCA plots are generated separately for each omics dataset to show patterns within each layer. Use these to visually check for:
After completing data harmonization, click "Proceed" to continue to Statistical Integration, which covers single-omics characterization, pairwise omics analysis, and multi-omics integration methods.