Development and application of powerful methods for identifying selective sweeps
Around 10,000 years ago, humans started to domesticate cows. It was also around this time that humans started to drink cow’s milk, which gave an advantage to people carrying the genetic mutation that allows them to digest lactose into adulthood. Evidence of this selective advantage can still be found in the DNA of some human populations today. Other examples of selected traits found in some human populations include disease resistance and adaptation to living at high altitudes.
Natural selection leaves genetic signatures in the DNA of a population. As a selected trait sweeps through a population, generation by generation, stretches of DNA around the site of selection “hitchhike” along with it. This creates regions of the genome in which alleles are highly correlated: that is, if you carry one allele, you are very likely to carry another particular allele nearby. These signatures can be identified using mathematical models designed to locate regions of unusually high correlation.
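To make the idea of correlation between nearby alleles concrete, here is a minimal sketch in Python of one common summary of it, the squared correlation (r²) between two sites. It is an illustration only and not the method this project will develop. The function name `r_squared` and the toy haplotypes are hypothetical, and the sketch assumes phased haplotypes with alleles coded as 0 or 1.

```python
def r_squared(locus_a, locus_b):
    """Squared correlation (r^2) between alleles at two biallelic sites.

    Each argument is a list of 0/1 alleles, one entry per sampled
    haplotype, given in the same haplotype order (an assumed encoding).
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("loci must be non-empty and of equal length")

    n = len(locus_a)
    p_a = sum(locus_a) / n                      # frequency of allele 1 at site A
    p_b = sum(locus_b) / n                      # frequency of allele 1 at site B
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n  # both carried together

    d = p_ab - p_a * p_b                        # departure from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a site with no variation: r^2 is undefined, treated as 0 here
    return d * d / denom


# Toy data (made up for illustration): eight haplotypes, three sites.
focal = [1, 1, 1, 1, 0, 0, 0, 0]      # site under selection
linked = [1, 1, 1, 1, 0, 0, 0, 0]     # neighbour that hitchhiked with it
unlinked = [1, 0, 1, 0, 1, 0, 1, 0]   # site inherited independently

print(r_squared(focal, linked))    # 1.0
print(r_squared(focal, unlinked))  # 0.0
```

In this toy example the hitchhiking neighbour is perfectly correlated with the selected site, while the independent site shows no correlation. A scan for sweeps would look for runs of sites where such values are unusually high compared with the rest of the genome.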
This PhD project involves the development of models and their application to population genomic data to find evidence of selective sweeps in humans and in other species, such as chickens. This will involve interrogating huge datasets of whole-genome sequences for large populations (for example, the 100,000 Genomes Project), which requires the Iridis cluster to manage and process the data. In addition, developing and testing methods will necessitate large simulation runs and analyses, which will also require high-performance computing capabilities.
Algorithms and computational methods: Machine learning
Software Engineering Tools: RStudio
Programming languages and libraries: R
Computational platforms: Iridis
Transdisciplinary tags: IfLS