Self-Similarity and Biological Function in Genome Sequences
Mammalian genomes are riddled with transposons. Contemporary genomes can therefore be thought of as the outcome of transposon mutagenesis experiments carried out over evolutionary time. The absence of transposons from a region would then indicate that the region is sensitive to insertions and hence biologically functional. A region that results from transposition is characterized by high similarity to some other part of the genome. Conversely, regions without intra-genomic similarity have not transposed recently. Pirogov et al. (2018) published a method for rapidly calculating the local similarity of a genomic region to the rest of the genome. They found that in mouse and human, regions with low similarity are highly enriched for developmental genes. The aim of this PhD project is to extend this result by systematically exploring the relationship between intra-genomic sequence similarity and biological function in other mammals, in vertebrates, and finally across the tree of life.
The ideal candidate holds a Master’s degree in bioinformatics and enjoys genomics.
A. Pirogov, P. Pfaffelhuber, A. G. Börsch-Haubold, and B. Haubold. High-complexity regions in mammalian genomes are enriched for developmental genes.
Bioinformatics, page bty922, 2018. doi: 10.1093/bioinformatics/bty922.