Analysis Packages

Bioinformatics services for research use only (RUO) are offered at three levels of data analysis and return, with cost increasing at each level. We also offer accredited assays for clinical trials.

Types of RUO analysis:

  • Raw data (FASTQ): we return only the sequence data in FASTQ format.
    Choose this if you have a non-standard or non-human reference (e.g. T2T or mouse), or if you have your own bioinformatics team.
  • Aligned sequence: we return sequence data aligned to the human reference (hg38) in BAM format.
    Choose this if you want to run your own bioinformatics analysis that differs from our analysis packages.
  • Analysis packages: these are predefined pipelines that vary with the sequencing library type. Deliverables vary by package; details are below.
    Choose this if an analysis package meets your research aims.

Most of our analysis packages are built as automated pipelines that can be configured to run as data comes off our sequencers. Each pipeline consists of one or more workflows implemented in WDL (Workflow Description Language), all of which are publicly available in the Genome Sequence Informatics GitHub repository. Each pipeline produces a predefined set of deliverables.

We also offer a variety of analysis procedures in which one of our analysts processes the data. These generally require additional consultation to discuss the input data and experimental design.

We return data to you in different ways depending on the service requested. We also provide data deposition services to EGA or cBioPortal.

We can also accept data sequenced elsewhere (“Bring your own data”) for analysis through one of our packaged services. Please contact us for a quote.

If you’re sequencing many samples at OICR and would like to have them run through automated analysis on our high performance cluster, but don’t see an analysis package that suits your needs, we can also develop custom workflows or analysis routines. Please contact us for further details.

Analysis Pipelines

These are pre-configured sets of workflows that launch in succession as the data becomes available.
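As a rough illustration of workflows launching in succession, the sketch below orders a small set of steps by their dependencies using Python's standard-library graphlib module. The step names and the dependency map are hypothetical, chosen for illustration only, and do not describe the actual pipeline configuration.

```python
from graphlib import TopologicalSorter

# Hypothetical steps (assumption): each key maps to the steps that must finish first.
DEPENDENCIES = {
    "fastq_generation": set(),
    "alignment": {"fastq_generation"},
    "merge_and_preprocess": {"alignment"},
    "variant_calling": {"merge_and_preprocess"},
}


def launch_order(dependencies):
    """Return one valid order in which the steps can be launched."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step in launch_order(DEPENDENCIES):
        print(step)
```

Because each step here depends only on the one before it, there is a single valid order. With branching dependencies, several orders could be valid and independent steps could run in parallel.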

Illumina FASTQ Generation

Generation of FASTQ files from Illumina run folders. This runs on all samples from Illumina run folders generated through the Translational Genomics Laboratory, and data return is included in the sequencing costs.

Alignment Only Pipelines

Alignment of each lane of sequencing against the hg38 genomic reference, using bwa mem (DNASeq) or STAR (RNASeq, splice-aware).
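To make the choice of aligner concrete, here is a minimal sketch that builds a command line for one lane of paired-end reads, depending on library type. The function name, file names, thread count, and the specific options shown are assumptions for illustration; they are standard bwa mem and STAR options but are not the settings used by our pipelines. The sketch only builds the command and does not run it.

```python
import shlex


def alignment_command(library_type, reference, fastq_r1, fastq_r2, threads=4):
    """Build an aligner command line for one lane of paired-end reads.

    For DNASeq, `reference` is a bwa-indexed FASTA file.
    For RNASeq, `reference` is a STAR genome index directory.
    """
    if library_type == "DNASeq":
        # bwa mem writes SAM records to standard output.
        return ["bwa", "mem", "-t", str(threads), reference, fastq_r1, fastq_r2]
    if library_type == "RNASeq":
        # STAR is splice-aware; zcat lets it read gzip-compressed FASTQ files.
        return [
            "STAR",
            "--runThreadN", str(threads),
            "--genomeDir", reference,
            "--readFilesIn", fastq_r1, fastq_r2,
            "--readFilesCommand", "zcat",
        ]
    raise ValueError(f"unsupported library type: {library_type}")


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = alignment_command("DNASeq", "hg38.fa", "lane1_R1.fastq.gz", "lane1_R2.fastq.gz")
    print(shlex.join(cmd))
```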

Alignment, Merging and Preprocessing Pipeline (DNASeq)

Multiple units of DNASeq data from the same sample are aligned, merged into a single file, then preprocessed to a call-ready state suitable for downstream analysis tools.

Whole Genome, Somatic Variant Calling Pipeline

Alignment of DNASeq data to a call-ready state, followed by variant calling: mutations (SNVs, short indels), structural rearrangements, and copy number variation.

Whole Transcriptome RNASeq Analysis Pipeline

Splice-aware alignment to the genomic reference, transcript quantification, and fusion detection.

WGTS Pipeline (Whole Genome, Transcriptome)

The Whole Genome, Somatic Variant Calling and Whole Transcriptome RNASeq Analysis pipelines, run together on DNA and RNA isolated from samples from a single donor/case.

Whole Exome, Somatic Variant Calling Pipeline

Alignment of DNASeq data to a call-ready state, followed by variant calling: mutations (SNVs, short indels) and copy number variation.

Whole Genome or Whole Exome Tumor-only Variant Calling Pipeline

Alignment of DNASeq data from tumor samples that have no matched normal sample, followed by variant calling: mutations (SNVs, short indels) and copy number variation.

Germline Variant Calling Pipeline

Alignment and genome-wide germline variant calling with GATK HaplotypeCaller.

Whole Transcriptome, Immune Analysis Pipeline

Analysis of whole transcriptome data with a variety of immune analysis tools.

Targeted Sequencing Panels Variant Calling Pipeline

Targeted sequencing libraries that incorporate unique molecular identifiers (UMIs) are aligned, then partitioned, collapsed, and error-corrected with Consensus Cruncher before mutations are called in single-sample mode.

UMI-Extraction, Alignment and Collapse Pipeline

Available for our targeted sequencing and cfMeDIP libraries, both of which contain unique molecular identifiers (UMIs).
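The general idea of UMI extraction and collapse can be sketched as follows. This is a toy illustration, not the method used by our pipeline or by any particular tool: it assumes the UMI is a fixed-length prefix of the read (the length of 4 is an arbitrary assumption), groups aligned reads into families by position and UMI, and takes a per-base majority vote within each family.

```python
from collections import Counter, defaultdict

UMI_LENGTH = 4  # assumed UMI length, for illustration only


def extract_umi(read, umi_length=UMI_LENGTH):
    """Split a raw read into (umi, insert), taking the UMI from the read start."""
    return read[:umi_length], read[umi_length:]


def collapse(aligned_reads):
    """Collapse reads that share an alignment position and UMI.

    `aligned_reads` is an iterable of (position, umi, insert) tuples.
    Returns a dict mapping (position, umi) to a consensus sequence built
    by per-base majority vote across the family.
    """
    families = defaultdict(list)
    for position, umi, insert in aligned_reads:
        families[(position, umi)].append(insert)

    consensus = {}
    for key, members in families.items():
        # zip stops at the shortest member, which is acceptable for a sketch.
        consensus[key] = "".join(
            Counter(bases).most_common(1)[0][0] for bases in zip(*members)
        )
    return consensus


if __name__ == "__main__":
    raw_reads = ["ACGTTTGACA", "ACGTTTGACA", "ACGTTTGCCA", "GGCCTTGACA"]
    # Pretend every read aligned to position 100 (hypothetical).
    aligned = [(100, *extract_umi(read)) for read in raw_reads]
    for family, sequence in collapse(aligned).items():
        print(family, sequence)
```

In the example, the three reads that share a UMI form one family, and the single mismatched base in the third read is outvoted by the other two.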

Shallow Whole Genome Analysis Pipeline

Shallow whole genome copy number calling.

Analysis Procedures

Analysis procedures are predefined protocols that can be configured to take in data produced by our pipelines or data provided to us. These generally involve running analysis across a set of samples, so the experimental design may need to be taken into account.

Differential Expression Analysis Protocol

Analysis of whole transcriptome expression calls to determine differences between experimental groups.
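As a toy illustration of comparing expression between two groups, the sketch below computes a log2 fold change and Welch's t statistic for a single gene. This is not our protocol: real differential expression analysis of count data typically involves normalization and count-aware statistical models. The function names, the pseudocount, and the expression values are assumptions for illustration.

```python
import math
from statistics import mean, variance


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean expression in group_b over group_a.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return math.log2((mean(group_b) + pseudocount) / (mean(group_a) + pseudocount))


def welch_t(group_a, group_b):
    """Welch's t statistic, which does not assume equal group variances.

    Each group needs at least two values, and the groups must not both
    have zero variance.
    """
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_b) - mean(group_a)) / standard_error


if __name__ == "__main__":
    control = [2, 3, 4]      # hypothetical expression values for one gene
    treated = [13, 15, 17]
    print(log2_fold_change(control, treated))
    print(welch_t(control, treated))
```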

Geneset Enrichment Analysis

Analysis of whole transcriptome expression calls to determine whether predefined sets of genes show statistically significant enrichment between two groups.
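One simple way to test a gene set for enrichment is an over-representation test based on the hypergeometric distribution, sketched below. This is an illustrative simplification rather than a description of our procedure; rank-based enrichment methods work differently. The function name and the example numbers are assumptions.

```python
from math import comb


def enrichment_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """Probability of seeing at least `n_overlap` gene-set members when
    `n_selected` genes are drawn at random, without replacement, from
    `n_universe` genes, of which `n_in_set` belong to the gene set.
    """
    upper = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    # math.comb returns 0 when k > n, so impossible draws contribute nothing.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 genes tested, a 100-gene set,
    # 500 differentially expressed genes, 10 of them in the set.
    print(enrichment_p_value(20000, 100, 500, 10))
```

A small p-value suggests the gene set appears among the selected genes more often than chance would predict. When many gene sets are tested, the p-values would also need multiple-testing correction.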

Germline Joint Genotyping

Cross-sample joint genotyping of germline genome-wide variant calls, followed by quality score recalibration.