Bernhard Haubold

Please contact Bernhard for further information on the project:

Surveying Local Sequence Complexity in Vertebrate Genomes

Vertebrate genomes are full of transposons. These are not distributed uniformly across the genome; instead, developmental genes like the Hox genes contain comparatively few transposons. The reason for this is presumably that most transposon insertions in such regions are highly deleterious. This suggests that vertebrate genomes can be read as the result of a transposon mutagenesis experiment carried out over evolutionary time. Because transposons are present in many copies, highly conserved regions that exclude them would then lack close homologues elsewhere in the genome.

To quickly quantify homology, Pirogov et al. (2019) developed and implemented the match complexity, which ranges from 0 in regions repeated exactly elsewhere to 1 in regions without a close homologue in the rest of the genome. As hypothesized, they found that in the human and mouse genomes, regions with maximal match complexity were highly enriched for developmental genes.
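To make the 0-to-1 scale concrete, the sketch below computes a much simpler stand-in: the fraction of k-mers in a window that occur nowhere outside that window. This is not the match complexity of Pirogov et al. (2019), whose definition is not given here; the function name kmer_uniqueness, the parameter k, and the toy sequence are all assumptions made for illustration only.

```python
from collections import Counter


def kmer_uniqueness(genome: str, start: int, end: int, k: int = 4) -> float:
    """Toy proxy (hypothetical, NOT the published match complexity).

    Returns the fraction of k-mers lying fully inside genome[start:end]
    that do not occur anywhere outside that window:
    0.0 if every k-mer is also found elsewhere, 1.0 if none is.
    """
    # Count every k-mer in the whole sequence.
    total = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    # Count the k-mers that lie completely inside the window.
    window = Counter(genome[i:i + k] for i in range(start, end - k + 1))
    n = sum(window.values())
    if n == 0:
        raise ValueError("window is shorter than k")
    # A k-mer is absent outside the window when all of its occurrences
    # in the sequence are accounted for by the window itself.
    unique = sum(c for kmer, c in window.items() if total[kmer] == c)
    return unique / n


# Toy sequence: an 8 bp element present twice, flanking a unique 10 bp stretch.
repeat = "ACGTTGCA"
genome = repeat + "GGATCCTAGC" + repeat

print(kmer_uniqueness(genome, 0, 8))   # window covering the first repeat copy
print(kmer_uniqueness(genome, 8, 18))  # window covering the unique stretch
```

The repeated element scores 0.0 and the unique stretch scores 1.0, mirroring the two ends of the scale described above. Exact k-mer counting does not detect diverged copies and would need far too much memory for a whole vertebrate genome, so this is useful only for building intuition.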

The aim of this project is to extend the analysis by Pirogov et al. (2019) to all published vertebrate genomes. The ideal candidate has a background in bioinformatics.

A. Pirogov, P. Pfaffelhuber, A. G. Börsch-Haubold, and B. Haubold. High-complexity regions in mammalian genomes are enriched for developmental genes. Bioinformatics, 35:1813–1819, 2019.
