OmicsAnalyst

1. Introduction

OmicsAnalyst 2.0 is a comprehensive web-based platform designed to support end-to-end analysis of multi-omics data. The platform bridges the gap between exploratory statistics and mechanistic interpretation, enabling researchers to move seamlessly from pattern discovery to functional insight and causal hypothesis testing.

What's New in Version 2.0: Enhanced statistical methods, knowledge-based network integration with functional exploration, and new causal analysis tools, including mediation analysis and interaction modeling.

OmicsAnalyst 2.0 operates through three complementary analytical components that are integrated within a unified workflow:

Comprehensive Statistical Integration - Explore patterns and covariance through individual, pairwise, and multi-omics joint analysis
Interpretable Network Integration - Visualize omics signatures in biological context across omics layers
Causal Mechanistic Analysis - Elucidate regulatory mechanisms and driver features

2. System Requirements

OmicsAnalyst 2.0 is a web-based application that runs in your browser. For optimal performance, we recommend:

Component	Requirement
Browser	Chrome, Firefox, Safari, or Edge (latest versions)
Internet Connection	Stable broadband connection recommended
Screen Resolution	1280x800 or higher
JavaScript	Must be enabled

3. Supported Omics Types

OmicsAnalyst 2.0 supports the following five omics data types for multi-omics integration analysis. Up to five omics data types can be uploaded per analysis:

Genes/mRNAs

Gene expression data from RNA-seq or microarray experiments. Supported ID types: Entrez ID, Ensembl Gene ID, Ensembl Transcript ID, Official Gene Symbol.

Proteins

Protein abundance data from mass spectrometry-based proteomics. Supported ID types: UniProt Protein ID, Entrez ID, Ensembl Gene ID, Official Gene Symbol.

miRNAs

MicroRNA expression data. Supported ID types: miRBase mature ID, miRBase accession, miRBase ID (e.g., hsa-miR-21).

Metabolites

Metabolite concentration data from metabolomics studies. Supported ID types: KEGG ID, PubChem ID, HMDB ID, Common Name.

Microbiome

Taxonomic abundance data from 16S rRNA gene sequencing or metagenomics. Supported ID types: Taxonomy label, OTU ID. Taxon levels: Phylum, Class, Order, Family, Genus, Species, Strain.

Note: At least two omics data types are required for multi-omics integration analysis.

4. Input Data Formats

OmicsAnalyst 2.0 provides two upload modes for different analysis workflows:

Upload Mode 1: Data Tables (Statistical Integration)

Upload expression/abundance data tables with a metadata file for comprehensive statistical analysis.

Upload a single metadata file and at least two omics data tables
The metadata table should describe the sample IDs that are shared across all omics data tables
A small percentage of missing values is acceptable

Upload Mode 2: Feature Lists (Network Integration)

Upload pre-identified feature lists from external analysis tools (limma, DESeq2, edgeR, etc.) for network-based analysis.

Upload one or more feature lists (genes, proteins, metabolites, etc.)
Specify the model organism (Human, Mouse, or Other)
For microbiome data, specify its host (currently Human or Mouse only)

Supported File Formats

Format	Extension	Max Size
CSV	.csv	50 MB (Genes/mRNAs), 25 MB (others)
TXT	.txt	50 MB (Genes/mRNAs), 25 MB (others)
TSV	.tsv	50 MB (Genes/mRNAs), 25 MB (others)
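The limits in the table above can be checked locally before uploading. The following is a minimal sketch, not part of OmicsAnalyst itself: the function name `check_upload` is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption, since the table does not say how the platform counts megabytes.

```python
from pathlib import Path

# Limits copied from the table above.
# Assumption: 1 MB = 1024 * 1024 bytes; the platform may count differently.
MB = 1024 * 1024
ALLOWED_EXTENSIONS = {".csv", ".txt", ".tsv"}
GENE_LIMIT_MB = 50    # Genes/mRNAs
OTHER_LIMIT_MB = 25   # all other omics types


def check_upload(filename, size_bytes, omics_type):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    limit_mb = GENE_LIMIT_MB if omics_type == "Genes/mRNAs" else OTHER_LIMIT_MB
    if size_bytes > limit_mb * MB:
        problems.append(f"file exceeds {limit_mb} MB limit for {omics_type}")
    return problems


# A 40 MB file is fine for Genes/mRNAs but too large for Proteins.
print(check_upload("rna.csv", 40 * MB, "Genes/mRNAs"))
print(check_upload("prot.csv", 40 * MB, "Proteins"))
```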

Data Table Structure

Each omics data table should follow this structure:

First column: Feature identifiers (gene symbols, protein IDs, metabolite names, etc.)
Subsequent columns: Sample measurements (one column per sample)
First row: Header with sample names matching the metadata file
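As an illustration of this layout, the sketch below parses a small features-by-samples table using only the Python standard library. The table contents, the `Feature` label in the first header cell, and the use of an empty cell to mark a missing value are all assumptions made for the example; check your own export against the structure described above.

```python
import csv
import io

# Hypothetical three-sample table; feature IDs and values are made up.
EXAMPLE_TABLE = """Feature,S1,S2,S3
TP53,8.1,7.9,
BRCA1,5.2,5.5,5.1
"""


def read_data_table(text, delimiter=","):
    """Parse a features-by-samples table into (sample_names, {feature_id: values})."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    header, body = rows[0], rows[1:]
    samples = header[1:]  # the first header cell labels the feature column
    table = {}
    for row in body:
        feature, cells = row[0], row[1:]
        # Assumption: an empty cell marks a missing value; it is kept as None.
        table[feature] = [float(c) if c.strip() else None for c in cells]
    return samples, table


samples, table = read_data_table(EXAMPLE_TABLE)
print(samples)
print(table["TP53"])
```

For a tab-separated file, pass `delimiter="\t"`.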

Metadata File Structure

A metadata file describing sample information is required for Statistical Integration:

First column: Sample names (must match column headers in data files)
Second column: Primary study factor (group/condition) - no missing values allowed
Additional columns: Other sample attributes (batch, clinical variables, etc.)

Tips:

Ensure sample names are consistent across all data files and the metadata file
Avoid special characters in sample names and feature IDs
Missing values are not allowed in the primary study factor (the second column, immediately after the sample names)
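The tips above can be turned into a quick pre-upload check. This is a sketch under stated assumptions rather than the platform's own validation: `check_samples` is a hypothetical helper, "special characters" is taken to mean anything other than letters, digits, underscore, dot, and hyphen, and every omics table is assumed to contain exactly the samples listed in the metadata.

```python
import re

# Assumption: a "safe" name uses only letters, digits, underscore, dot, hyphen.
# The platform's actual rule may differ.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_.\-]+$")


def check_samples(metadata_rows, data_headers):
    """metadata_rows: list of (sample_name, primary_factor, ...) tuples.
    data_headers: {table_name: [sample names from that table's header row]}.
    Returns a list of human-readable problems (empty if none are found)."""
    problems = []
    meta_names = [row[0] for row in metadata_rows]
    for name, factor, *_ in metadata_rows:
        if not SAFE_NAME.match(name):
            problems.append(f"special characters in sample name: {name!r}")
        if factor is None or str(factor).strip() == "":
            problems.append(f"missing primary factor for sample: {name}")
    for table_name, headers in data_headers.items():
        missing = sorted(set(meta_names) - set(headers))
        extra = sorted(set(headers) - set(meta_names))
        if missing:
            problems.append(f"{table_name}: samples absent from table: {missing}")
        if extra:
            problems.append(f"{table_name}: samples absent from metadata: {extra}")
    return problems


# Hypothetical study: one sample name contains a space and has no group label,
# and the proteomics table lacks that sample.
metadata = [("S1", "control"), ("S2", "treated"), ("S 3", "")]
headers = {"transcriptomics": ["S1", "S2", "S 3"], "proteomics": ["S1", "S2"]}
for problem in check_samples(metadata, headers):
    print(problem)
```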

5. Key Features

OmicsAnalyst 2.0 bridges the gap between exploratory statistics and mechanistic interpretation through three key analytical components:

Statistical Integration

Comprehensive multi-omics statistical analysis including limma-based differential analysis, correlation analysis, and advanced integration methods (MCIA, MOFA, DIABLO) to identify significant features and patterns across data types.

Network Integration

Project significant features onto comprehensive molecular networks, including protein-protein interaction networks, metabolic pathways, and regulatory networks, for biological context and functional interpretation.

Causal Analysis

Test mechanistic hypotheses through mediation analysis and interaction modeling (IntLIM) to identify regulatory relationships and driver features across omics layers.
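For orientation, interaction modeling of the IntLIM type is usually written as a linear model with a feature-by-condition interaction term. The equation below is that general form, with symbols chosen here for illustration: $m$ and $g$ are the levels of two paired features from different omics layers, and $p$ is the condition (phenotype). Whether OmicsAnalyst uses exactly this parameterization is an assumption.

```latex
m = \beta_0 + \beta_1\, g + \beta_2\, p + \beta_3\,(g \times p) + \varepsilon
```

With two conditions coded $p \in \{0, 1\}$, the slope of $m$ on $g$ is $\beta_1$ in one condition and $\beta_1 + \beta_3$ in the other, so testing $\beta_3 = 0$ asks whether the relationship between the two features differs between conditions.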

Statistical Integration Methods

The Statistical Integration component provides a progressive analytical framework from individual omics characterization to full multi-omics integration:

Single-Omics Characterization - Analyze each omics layer individually:

Significant Features: Identify features significantly associated with experimental factors using linear models (limma)
Overall Patterns: Explore sample clustering, grouping patterns, and major sources of variation
Variance Partitioning: Decompose global and feature-level variance to identify features driven by specific factors
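To give a feel for what feature-level variance partitioning measures, the sketch below computes the fraction of one feature's variance attributable to a single grouping factor (between-group sum of squares divided by total sum of squares). This is a deliberately simplified one-factor calculation with hypothetical names and made-up values; the platform's method may fit several factors jointly and will generally give different numbers.

```python
def variance_explained(values, groups):
    """Fraction of a feature's variance attributable to one grouping factor
    (between-group sum of squares / total sum of squares)."""
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # constant feature: nothing to explain
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    return ss_between / ss_total


# Made-up feature measured in a balanced condition-by-batch design.
feature = [1.0, 1.2, 3.0, 3.2]
condition = ["ctrl", "ctrl", "case", "case"]
batch = ["b1", "b2", "b1", "b2"]
print(variance_explained(feature, condition))  # large: driven by condition
print(variance_explained(feature, batch))      # small: little batch effect
```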

Pairwise Omics Analysis - Discover relationships between two omics layers:

Clustering Heatmap: Hierarchical clustering to reveal global correlation patterns between two omics layers
Correlation Network: Network-based visualization of significant feature-to-feature correlations
Differential Chord Diagram: Compare correlation structures between conditions to identify network rewiring
Sparse CCA: Maximize correlation between two blocks and select key features driving the link
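The idea behind a cross-omics correlation network can be sketched in a few lines: correlate every feature in one layer with every feature in the other and keep the strong pairs as edges. In the example below, the feature names and values are invented, `cross_omics_edges` is a hypothetical helper, and a fixed cutoff on the absolute Pearson correlation stands in for the significance testing the platform performs.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant feature: correlation undefined, treated as 0 here
    return sxy / sqrt(sxx * syy)


def cross_omics_edges(layer_a, layer_b, min_abs_r=0.8):
    """Return (feature_a, feature_b, r) for pairs with |r| >= min_abs_r.
    Assumption: a plain cutoff replaces a proper significance test."""
    edges = []
    for fa, va in layer_a.items():
        for fb, vb in layer_b.items():
            r = pearson(va, vb)
            if abs(r) >= min_abs_r:
                edges.append((fa, fb, round(r, 3)))
    return edges


# Invented measurements for four shared samples.
genes = {"G1": [1, 2, 3, 4], "G2": [4, 1, 3, 2]}
metabolites = {"M1": [2, 4, 6, 8], "M2": [5, 5, 4, 1]}
edges = cross_omics_edges(genes, metabolites)
for edge in edges:
    print(edge)
```

Running the same function separately on the samples of each condition and comparing the resulting edge sets gives a rough picture of the rewiring that a differential view is meant to reveal.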

Multi-Omics Integration - Integrate all omics layers simultaneously:

Global Exploration: Consensus PCA and Multiple Co-inertia Analysis (MCIA) for visualizing shared trends
Latent Factor Discovery: Fast NMF, Semi-NMF, and MOFA for discovering biologically interpretable patterns
Feature Selection: DIABLO for finding discriminative and correlated features across omics

Network Integration Methods

For feature list inputs, build and explore comprehensive molecular networks:

Database Selection: Choose from protein-protein interaction, metabolic, regulatory, and other network databases
Network Builder: Construct multi-omics networks connecting genes, proteins, metabolites, and microbial taxa
Network Viewer: Interactive visualization with functional enrichment analysis

Causal Analysis Methods

Test mechanistic hypotheses and identify regulatory relationships:

Pairwise Linear Model (IntLIM): Test for significant linear relationships between paired features across conditions
Mediation Analysis: Identify mediating relationships where one omics layer influences another through an intermediate
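A mediation question of the form "does X influence Y through M?" is classically answered by fitting two regressions and multiplying coefficients. The sketch below implements that textbook product-of-coefficients decomposition with the standard library only. It is an illustration, not the platform's implementation: the function names are hypothetical, the data are constructed so the answer is known, and no significance testing or confidence intervals are included.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, *predictors):
    """Least-squares coefficients [intercept, slope_1, ...] via the normal equations."""
    cols = [[1.0] * len(y)] + [list(p) for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


def mediation(x, m, y):
    """Product-of-coefficients decomposition for X -> M -> Y."""
    a = ols(m, x)[1]              # effect of X on the mediator M
    _, direct, b = ols(y, x, m)   # direct effect of X, and effect of M given X
    return {"indirect": a * b, "direct": direct, "total": a * b + direct}


# Constructed data: M = 2*X + e and Y = 1*X + 3*M, with e uncorrelated with X.
x = [-2, -1, 0, 1, 2]
e = [1, -2, 2, -2, 1]
m = [2 * xi + ei for xi, ei in zip(x, e)]
y = [xi + 3 * mi for xi, mi in zip(x, m)]
result = mediation(x, m, y)
print(result)
```

In this constructed example the indirect effect is 2 * 3 = 6, the direct effect is 1, and the total effect is 7, up to floating-point rounding. With real omics data, X, M, and Y would be features from different layers measured on the same samples.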

6. Analytical Workflows

OmicsAnalyst 2.0 provides two entry points depending on your input data type:

Data Tables → Statistical Integration

Step 1. Upload Data Tables: Metadata + expression/abundance tables for 2-5 omics types
Step 2. Quality Control: Data filtering, normalization, and harmonization
Step 3. Analysis Hub: Single-omics, pairwise, and multi-omics methods
Step 4. Visualization: Interactive plots, heatmaps, and networks

Feature Lists → Network Integration

Step 1. Upload Feature Lists: Pre-identified features from external tools (DESeq2, limma, etc.)
Step 2. Database Selection: Choose network databases (PPI, metabolic, regulatory)
Step 3. Network Building: Construct integrated molecular networks
Step 4. Functional Analysis: Pathway enrichment and module identification

Example Analytical Pipelines

Based on these workflows, users can design their own analytical strategies to address specific biological questions. Below are two example pipelines demonstrating how the three key components work together:

Discovery Workflow: Statistics → Networks → Function

Step 1. Statistical Analysis: Upload multi-omics data and perform differential analysis
Step 2. Feature Selection: Identify significant features and cross-omics correlations
Step 3. Network Projection: Map features onto biological interaction networks
Step 4. Functional Analysis: Perform pathway enrichment and functional annotation

Validation Workflow: Networks → Statistics → Causality

Step 1. Network Analysis: Build networks from feature lists and identify hub genes
Step 2. Hypothesis Generation: Form mechanistic questions based on network topology
Step 3. Statistical Testing: Test candidate features with interaction modeling
Step 4. Causal Validation: Quantify direct and mediated effects via mediation analysis

Note: These workflows are flexible. You can choose any combination of methods based on your data characteristics to address your specific research questions.

7. Getting Started

To start using OmicsAnalyst 2.0, visit the Upload page and choose your data input type. Before uploading, ensure your data files are properly formatted as described in Section 4: Input Data Formats.

Which pathway should I choose?

Data Tables: Choose this if you have raw expression/abundance data and want to perform comprehensive statistical analysis from scratch
Feature Lists: Choose this if you have already identified significant features using external tools and want to focus on network-based interpretation

Next Steps: Continue with the detailed tutorials for each analytical component.