In addition to our automated analysis pipelines, the GSI analysis team can process data through a variety of well-defined analysis procedures. This generally involves an analyst working directly with the project owner to collect the necessary information, then running the tools on our high-performance cluster to generate the analysis deliverables.
Our whole-transcriptome pipelines generate gene counts and normalized expression scores. Differential expression analysis identifies statistically significant differences between groups for each gene in the gene model. This typically requires replicate samples for each of your conditions.
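As a rough illustration of what "normalized expression" and a between-group comparison mean, the sketch below scales raw counts to counts per million (CPM) and computes a log2 fold change between two groups of replicates. CPM, the pseudocount and the function names are assumptions made for this example and are not necessarily what our pipelines use. A real differential expression analysis also models the variance across replicates to obtain p-values, which is why replicates are needed.

```python
import math


def counts_per_million(counts):
    """Scale one sample's raw gene counts to counts per million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counted reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def log2_fold_change(group_a, group_b, gene, pseudocount=1.0):
    """log2 ratio of mean expression (group_b over group_a) for one gene.

    Each group is a list of per-sample dicts mapping gene -> expression.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    mean_a = sum(sample[gene] for sample in group_a) / len(group_a)
    mean_b = sum(sample[gene] for sample in group_b) / len(group_b)
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


if __name__ == "__main__":
    # Toy counts for illustration only, not real data.
    control = [counts_per_million({"GENE1": 10, "GENE2": 90}),
               counts_per_million({"GENE1": 12, "GENE2": 88})]
    treated = [counts_per_million({"GENE1": 40, "GENE2": 60}),
               counts_per_million({"GENE1": 38, "GENE2": 62})]
    print(log2_fold_change(control, treated, "GENE1"))
```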
Geneset Enrichment Analysis
Germline Joint Genotype Calling
Our germline analysis pipeline generates genome-wide variant calls using GATK HaplotypeCaller, followed by genotyping at the per-sample level. Joint genotyping leverages information across multiple samples to provide more accurate calls.
Tools : GATK
Deliverables : joint genotyped germline calls (vcf)
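For orientation, a joint-genotyping run with GATK4 can be sketched as three stages: per-sample HaplotypeCaller in GVCF mode, CombineGVCFs, then GenotypeGVCFs. The helper below only assembles the command lines and does not run them. The function name, file names and output naming are hypothetical, and the sketch assumes the standard GVCF workflow rather than describing our exact procedure.

```python
def joint_genotyping_commands(reference, sample_bams, out_prefix):
    """Build GATK4 command lines for a GVCF-based joint-genotyping run.

    sample_bams maps a sample name to its BAM path. Returns a list of
    commands (each a list of arguments) in the order they would be run.
    """
    commands = []
    gvcfs = []
    for name, bam in sample_bams.items():
        gvcf = f"{name}.g.vcf.gz"
        gvcfs.append(gvcf)
        # Per-sample calling in GVCF mode keeps reference-confidence records.
        commands.append(["gatk", "HaplotypeCaller", "-R", reference,
                         "-I", bam, "-O", gvcf, "-ERC", "GVCF"])

    # Merge the per-sample GVCFs into one multi-sample GVCF.
    combined = f"{out_prefix}.g.vcf.gz"
    combine = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        combine += ["--variant", gvcf]
    combine += ["-O", combined]
    commands.append(combine)

    # Genotype all samples together to produce the joint call set.
    commands.append(["gatk", "GenotypeGVCFs", "-R", reference,
                     "-V", combined, "-O", f"{out_prefix}.vcf.gz"])
    return commands


if __name__ == "__main__":
    # Hypothetical inputs for illustration.
    bams = {"sampleA": "sampleA.bam", "sampleB": "sampleB.bam"}
    for cmd in joint_genotyping_commands("ref.fasta", bams, "cohort"):
        print(" ".join(cmd))
```

Each returned command could be passed to `subprocess.run` or written into a cluster job script.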
Somatic Calling with a panel of normals (no matched normal)
Our Mutect2 somatic calling pipeline relies on the availability of a matched normal sample, which is often unavailable. If unmatched normal samples are available, we can use them to generate a panel of normals (PON) against which somatic calls can be made for each tumour sample.
Tools : Mutect2/GATK
Deliverables : PON Somatic Calls (vcf)
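A PON-based run with GATK4 can be sketched as follows: call each normal with Mutect2 in tumour-only mode, import those calls with GenomicsDBImport, build the panel with CreateSomaticPanelOfNormals, then call each tumour with `--panel-of-normals`. The helper below only builds the command lines. The function name, workspace name and file names are hypothetical, and downstream steps such as filtering are omitted, so treat it as an outline under those assumptions rather than our exact workflow.

```python
def pon_calling_commands(reference, intervals, normal_bams, tumour_bams,
                         workspace="pon_db", pon_vcf="pon.vcf.gz"):
    """Build GATK4 command lines for Mutect2 calling against a PON.

    normal_bams and tumour_bams map sample names to BAM paths.
    """
    commands = []
    normal_vcfs = []
    for name, bam in normal_bams.items():
        vcf = f"{name}.normal.vcf.gz"
        normal_vcfs.append(vcf)
        # Tumour-only mode on each unmatched normal.
        commands.append(["gatk", "Mutect2", "-R", reference, "-I", bam,
                         "--max-mnp-distance", "0", "-O", vcf])

    # Collect the normal calls into a GenomicsDB workspace.
    db_import = ["gatk", "GenomicsDBImport", "-R", reference, "-L", intervals,
                 "--genomicsdb-workspace-path", workspace]
    for vcf in normal_vcfs:
        db_import += ["-V", vcf]
    commands.append(db_import)

    # Build the panel of normals from the workspace.
    commands.append(["gatk", "CreateSomaticPanelOfNormals", "-R", reference,
                     "-V", f"gendb://{workspace}", "-O", pon_vcf])

    # Call each tumour sample against the panel.
    for name, bam in tumour_bams.items():
        commands.append(["gatk", "Mutect2", "-R", reference, "-I", bam,
                         "--panel-of-normals", pon_vcf,
                         "-O", f"{name}.pon_somatic.vcf.gz"])
    return commands


if __name__ == "__main__":
    # Hypothetical inputs for illustration.
    normals = {"normal1": "normal1.bam", "normal2": "normal2.bam"}
    tumours = {"tumour1": "tumour1.bam"}
    for cmd in pon_calling_commands("ref.fasta", "targets.interval_list",
                                    normals, tumours):
        print(" ".join(cmd))
```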
Back Extraction of Aligned Sequence to fastq format
Aligned sequence reads sometimes need to be reverted to fastq format for use in a variety of tools. We have developed procedures and workflows to do this, using information stored in the bam files to assist in accurate recreation of the original fastq files.
Tools : Picard tools
Deliverables : fastq files (paired end)
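As an illustration, the paired-end extraction step can be expressed with Picard's SamToFastq tool, which writes first and second reads to separate files. The helper below only builds the command line. The function name, jar location and output names are hypothetical, and our workflows may apply additional steps that are not shown here.

```python
def sam_to_fastq_command(bam, out_prefix, picard_jar="picard.jar"):
    """Build a Picard SamToFastq command for a paired-end BAM file.

    Read 1 goes to <out_prefix>_R1.fastq and read 2 to <out_prefix>_R2.fastq.
    """
    return ["java", "-jar", picard_jar, "SamToFastq",
            f"I={bam}",
            f"FASTQ={out_prefix}_R1.fastq",
            f"SECOND_END_FASTQ={out_prefix}_R2.fastq"]


if __name__ == "__main__":
    # Hypothetical input for illustration.
    print(" ".join(sam_to_fastq_command("sample.bam", "sample")))
```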