Characterisation of the Genomic Landscape in Splenic Marginal Zone Lymphoma
The development and application of massively parallel sequencing have allowed large amounts of genomic information to be generated from patients with cancer. Splenic Marginal Zone Lymphoma (SMZL) remains less studied than more common blood cancers, and its heterogeneous diagnosis, combined with a superficial understanding of its molecular pathogenesis, makes it difficult to establish appropriate treatment and prognosis. Using next-generation sequencing, this project aims to expand the catalogue of mutated genes in SMZL while optimising a bioinformatics pipeline that can identify variants without paired (normal-tumour) samples.
Our biggest challenge is the lack of paired (control) samples, which are crucial for removing noise from the sequencing data; we will therefore use several computational tools to eliminate as many false-positive calls as possible. We will also create, annotate and filter a list of previously identified SMZL variants to establish a high-quality, up-to-date SMZL database that will serve as a reference point.
Tumour samples from different centres across Europe will undergo targeted high-throughput sequencing using the HaloPlex HS Target Enrichment System (Agilent), with a panel of 55 target genes relevant to B-cell malignancies. The raw data will be run through an in-house pipeline on iridis4, where all samples will be aligned to the latest reference genome. Variants in each sample will then be called using several tools (VarScan, GATK, Pisces, MuTect2) and annotated with predictive scores and information from known databases. Once a list of mutations is established, it will be linked to clinical data with the aim of improving diagnosis, developing staging systems and supporting the development of targeted drugs for this disease.
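One common way to reduce false positives when several callers are run on tumour-only data is to keep only variants reported by more than one tool, and then check the survivors against a curated list of known variants. The Python sketch below illustrates that idea only; it is not the project's in-house pipeline. The function names, the `min_callers` threshold and the variant coordinates in the example are all hypothetical, and the sketch assumes each caller's output has already been parsed into (chromosome, position, reference allele, alternate allele) tuples.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_variants(
    calls_by_tool: Dict[str, Iterable[Variant]],
    min_callers: int = 2,  # hypothetical threshold, to be tuned on real data
) -> Dict[Variant, List[str]]:
    """Keep variants reported by at least `min_callers` tools.

    Returns a mapping from each retained variant to the sorted list of
    tools that reported it.
    """
    support: Dict[Variant, Set[str]] = defaultdict(set)
    for tool, variants in calls_by_tool.items():
        for variant in variants:
            support[variant].add(tool)
    return {
        variant: sorted(tools)
        for variant, tools in support.items()
        if len(tools) >= min_callers
    }


def flag_known(
    variants: Iterable[Variant], known: Set[Variant]
) -> Dict[Variant, bool]:
    """Mark each variant as present or absent in a reference set of known variants."""
    return {variant: variant in known for variant in variants}


if __name__ == "__main__":
    # Made-up calls for illustration; these are not real SMZL variants.
    calls = {
        "VarScan": [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")],
        "GATK": [("chr1", 100, "A", "G")],
        "Pisces": [("chr1", 100, "A", "G"), ("chr3", 300, "G", "A")],
        "MuTect2": [("chr2", 200, "C", "T")],
    }
    retained = consensus_variants(calls, min_callers=2)
    known_smzl = {("chr1", 100, "A", "G")}  # stand-in for the curated database
    for variant, is_known in flag_known(retained, known_smzl).items():
        print(variant, retained[variant], "known" if is_known else "novel")
```

A stricter `min_callers` trades sensitivity for specificity, so the threshold would need to be checked against samples with validated calls before being relied on.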
Visualisation and data handling methods: Database
Software Engineering Tools: RStudio
Transdisciplinary tags: Quantitative Biology