Transcriptomics and Expression Analysis in KBase

Expression Analysis

KBase’s transcriptomics and expression analysis tools enable researchers to build transcriptomes from RNA-seq reads, analyze patterns of gene expression, identify differentially expressed genes, and visualize expression data from microarray or RNA-seq platforms. Expression data are also integrated with KBase’s metabolic modeling tools, so that empirical gene expression data can be compared with metabolic models to explore differences in biological behavior and composition.

KBase Transcriptomics and Expression Analysis Capabilities

Align Reads to a Reference Genome

  • Align reads to prokaryotic genomes with Align Reads using Bowtie2
  • Align eukaryotic RNA-seq reads with Align Reads using TopHat2 or Align Reads using HISAT2
  • Download the alignment output object generated by aligner Apps for further analysis

Transcript Assembly

  • Assemble transcripts for a given sample or a sample set using Assemble Transcripts using Cufflinks or Assemble Transcripts using StringTie
  • View downloadable normalized full expression matrices in FPKM (fragments per kilobase of exon model per million mapped reads) and TPM (transcripts per million)
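The two normalized units named above can be illustrated with a minimal sketch. It assumes hypothetical per-gene fragment counts and gene lengths, and it treats the sum of the listed counts as the total number of mapped fragments; it is not the code used by the Cufflinks or StringTie Apps.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of exon model per million mapped fragments."""
    # Assumption: total mapped fragments = sum of the counts supplied here.
    total = sum(counts.values())
    return {g: counts[g] * 1e9 / (lengths_bp[g] * total) for g in counts}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = {g: counts[g] / lengths_bp[g] for g in counts}
    scale = sum(rates.values())
    return {g: rates[g] * 1e6 / scale for g in counts}


# Hypothetical example data (not from KBase).
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 3000}

fpkm_values = fpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

for gene in counts:
    print(gene, round(fpkm_values[gene], 1), round(tpm_values[gene], 1))
```

Unlike FPKM, TPM values always sum to one million within a sample, which makes proportions easier to compare across samples.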

Differential Analysis of Gene Expression

  • Calculate gene and transcript levels in multiple conditions and find significant changes in expression levels with Create Differential Expression Matrix using Cuffdiff, Create Differential Expression Matrix using Ballgown, and Create Differential Expression Matrix using DESeq2
  • Use the Interactive Volcano Plot visualization App to choose appropriate q-value and fold change cutoffs, which can then be supplied as input parameters to the differential expression analysis Apps
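As a rough illustration of how q-value and fold change cutoffs partition a result set, the sketch below filters a small table of hypothetical differential expression results. The field names, the default cutoffs (q ≤ 0.05, |log2 fold change| ≥ 1), and the data are assumptions made for this example, not the KBase Apps' parameters or output format.

```python
import math


def volcano_point(log2_fold_change, q_value):
    """Coordinates of one gene on a volcano plot: (log2 FC, -log10 q)."""
    return (log2_fold_change, -math.log10(q_value))


def select_genes(results, q_cutoff=0.05, log2fc_cutoff=1.0):
    """Split significant genes into up- and down-regulated lists."""
    up, down = [], []
    for row in results:
        if row["q_value"] > q_cutoff:
            continue  # not significant at the chosen q-value cutoff
        if row["log2_fold_change"] >= log2fc_cutoff:
            up.append(row["gene"])
        elif row["log2_fold_change"] <= -log2fc_cutoff:
            down.append(row["gene"])
    return up, down


# Hypothetical results table.
results = [
    {"gene": "geneA", "log2_fold_change": 2.5, "q_value": 0.001},
    {"gene": "geneB", "log2_fold_change": -1.8, "q_value": 0.01},
    {"gene": "geneC", "log2_fold_change": 0.4, "q_value": 0.0005},
    {"gene": "geneD", "log2_fold_change": 3.0, "q_value": 0.2},
]

up, down = select_genes(results)
print("up:", up, "down:", down)
```

Here geneC is significant but changes too little, and geneD changes strongly but is not significant, so neither passes both cutoffs.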

Downstream Analysis

  • Analyze patterns of gene expression by clustering data from the expression matrices using Cluster Expression Data – Hierarchical, Cluster Expression Data – K-Means and Cluster Expression Data – WGCNA
  • View the generated clusters as a heat map with the Interactive View HeatMap 
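To give a sense of what clustering an expression matrix means, here is a minimal k-means sketch in plain Python. The gene names, expression profiles, and the deterministic choice of the first k genes as starting centroids are assumptions for illustration; the KBase clustering Apps may use different implementations and parameters.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(profiles):
    """Column-wise mean of a list of equal-length profiles."""
    n = len(profiles)
    return [sum(col) / n for col in zip(*profiles)]


def kmeans(expression, k, max_iter=100):
    """Assign each gene to one of k clusters; returns {gene: cluster index}."""
    genes = list(expression)
    # Assumption: the first k genes seed the centroids (deterministic).
    centroids = [list(expression[g]) for g in genes[:k]]
    labels = {}
    for _ in range(max_iter):
        new_labels = {
            g: min(range(k),
                   key=lambda c: squared_distance(expression[g], centroids[c]))
            for g in genes
        }
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for c in range(k):
            members = [expression[g] for g in genes if labels[g] == c]
            if members:
                centroids[c] = mean_profile(members)
    return labels


# Hypothetical expression matrix: gene -> values across three conditions.
expression = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [1.1, 2.1, 2.9],
    "g3": [3.0, 2.0, 1.0],
    "g4": [2.9, 2.2, 1.0],
}

clusters = kmeans(expression, k=2)
print(clusters)
```

Genes whose expression rises across conditions (g1, g2) end up in one cluster and genes whose expression falls (g3, g4) in the other, which is the kind of grouping a heat map makes visible.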

RNA-seq Workflow

The RNA-seq analysis workflow in KBase typically consists of (i) mapping short sequence reads to the reference genome, (ii) assembling the reads into full-length transcripts and quantifying their expression, and (iii) differential analysis of gene expression.
KBase provides a suite of Apps that allow users to run the tools from the popular Tuxedo RNA-seq suites to obtain normalized full and differential expression matrices from reads generated on Illumina sequencing platforms, using a reference genome. The original Tuxedo suite uses TopHat2, Cufflinks, and Cuffdiff, whereas the new Tuxedo suite uses HISAT2, StringTie, and Ballgown for alignment of the reads to the reference genome, transcriptome profiling, and identification of differentially expressed genes (DEG), respectively. The RNA-seq Apps in KBase can be combined into multiple workflows, allowing users to choose their preferred read aligner and assembler for the differential gene expression analysis (see Figures 1, 2). Note, however, that Ballgown does not work for prokaryotes because it depends on introns.
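The mapping between the two Tuxedo suites and the three workflow steps can be written down as a small lookup table. The sketch below is only a restatement of the paragraph above in Python; the dictionary layout and function name are hypothetical, and it does not describe which App combinations KBase actually permits.

```python
# Tools per workflow step, as listed in the text above.
SUITES = {
    "original Tuxedo": {
        "alignment": "TopHat2",
        "transcript_assembly": "Cufflinks",
        "differential_expression": "Cuffdiff",
    },
    "new Tuxedo": {
        "alignment": "HISAT2",
        "transcript_assembly": "StringTie",
        "differential_expression": "Ballgown",
    },
}

# The text states that Ballgown depends on introns.
INTRON_DEPENDENT = {"Ballgown"}


def tools_unsuitable_for_prokaryotes(suite_name):
    """List the tools in a suite that the text flags as unusable for prokaryotes."""
    return sorted(t for t in SUITES[suite_name].values() if t in INTRON_DEPENDENT)


for name in SUITES:
    print(name, "->", tools_unsuitable_for_prokaryotes(name))
```

For a prokaryotic genome, the check flags Ballgown in the new suite, so a different differential expression App would be needed for that step.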

For detailed information about the expression analysis tools in KBase, see the app detail pages for the relevant apps. For instructions on using the RNA-Seq apps, look here.

Figure 1: The original and new Tuxedo RNA-seq analysis suites in KBase have modular Apps for building flexible prokaryotic (left) and eukaryotic (right) analysis workflows.


Expression Analysis Resources in KBase

Narrative Tutorials

Reads Management

Reads Aligners

Reads Assemblers

Differential gene identifiers

Filter expression matrix

Clustering algorithms