CORDIS - EU research results

Modelling the genomic landscapes of selection and speciation

Periodic Reporting for period 4 - ModelGenomLand (Modelling the genomic landscapes of selection and speciation)

Reporting period: 2022-08-01 to 2024-07-31

Speciation, the process by which a population within one species separates and develops its own unique characteristics, is a slow evolutionary process that is difficult to observe directly. Often the only source of information we have about when and how speciation happened is sequence variation in present-day populations. While we now have genome sequence data for an ever-growing number of species, our ability to meaningfully interpret such data has remained fundamentally limited, both by a lack of powerful and efficient statistical methods for extracting information about the evolutionary process from sequence data and by a lack of meaningful comparative analyses of natural speciation processes.

Understanding how to most efficiently model gene flow and barriers to gene flow also has obvious practical applications: screening natural populations for variation that may be advantageous for the genetic improvement of domesticated animals and crop plants, as well as estimating the probability and genetic risks of introgression from domesticated populations and/or introduced species into wild populations.

ModelGenomLand combines theory with the development of new inference tools, and seeks to fundamentally improve both our understanding of speciation and selection and our ability to use sequence data to study population processes (be they selection, demography or both) in any system. The project has two principal aims:

WP1&2: to develop general statistical frameworks for making inferences about the joint action of past selection and demography from genome sequence data. This will be achieved using analytic calculations and approximations for the joint distribution of linked polymorphic sites. We will use these results to develop new methods to quantify the genome-wide rates of positive and background selection, and to scan for genomic outliers of divergence between species and of positive selection within species.

WP3: to apply the new inference approach to genome data for 20 species pairs of European butterflies and conduct a systematic comparison of the demographic and selective forces involved in speciation. This will reveal how repeatable speciation processes are both in terms of the demographic and selective events, and the genes and genomic architectures involved.

The ModelGenomLand team has developed new mathematical and computational tools for modelling species barriers (WP1) and past selective events (WP2) from the signatures they leave in genomic variation and has conducted systematic analyses and comparisons of speciation histories in European butterflies (WP3).

WP1 Demographically explicit scans for barriers to gene flow

A mathematical breakthrough was made by PhD student Gertjan Bisschop (published in PLoS Computational Biology), who developed a graph-based algorithm to speed up likelihood calculations for models of divergence and gene flow between species. This efficiency gain has enabled the project team to build an open-source tool (gIMble, published in PLoS Genetics) for demographically explicit genome scans for reproductive barriers. We have conducted a comprehensive example analysis on data from Heliconius butterflies, which showed that this method not only detects known major-effect loci that control barrier phenotypes (wing pattern differences), but also uncovers previously unknown genomic barriers.
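As a toy illustration of what it means to compute a genealogical quantity by walking a graph of sample states, the sketch below uses the standard single-population coalescent rather than a model of divergence and gene flow. It is not the algorithm from the publication and not part of gIMble; the function names and the graph layout are assumptions made for this example. Each state is the number of lineages still uncoalesced, each edge is a coalescence event with rate k(k-1)/2, and time is measured in coalescent units.

```python
from fractions import Fraction
from functools import lru_cache


def build_state_graph(n):
    """Map each state k (number of lineages) to its outgoing edges.

    Each edge is (next_state, rate). In the single-population coalescent
    the only event is a pairwise coalescence at rate k * (k - 1) / 2.
    """
    return {k: [(k - 1, Fraction(k * (k - 1), 2))] for k in range(2, n + 1)}


def expected_total_branch_length(n):
    """Expected total branch length of a genealogy of n sampled lineages,
    in coalescent time units, computed by recursion over the state graph."""
    graph = build_state_graph(n)

    @lru_cache(maxsize=None)
    def expected_from(state):
        edges = graph.get(state, [])
        if not edges:
            # Absorbing state: a single lineage adds no further branch length.
            return Fraction(0)
        total_rate = sum(rate for _, rate in edges)
        # 'state' lineages each accumulate length for the expected waiting time.
        accumulated = state / total_rate
        # Weight each successor by the probability of taking that edge.
        onward = sum(rate / total_rate * expected_from(nxt) for nxt, rate in edges)
        return accumulated + onward

    return expected_from(n)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, expected_total_branch_length(n))
```

The same recursion works unchanged when a state has several outgoing edges (for example coalescence and migration), which is the situation in models with gene flow; only `build_state_graph` would need to change.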

WP2 New coalescent results for selective sweeps

We have implemented two coalescent approximations for selective sweeps in the generating function framework, the star-like and the Yule approximation, and have quantified (using forward simulations) how well both approximate the distribution of genealogical branch lengths around a selective sweep target. We have applied this framework to both human and butterfly test data and showed that these analytic results enable more powerful likelihood inference that can pinpoint the targets of historic sweeps in population genomic data (published in Genetics).
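To give a feel for the star-like approximation, the sketch below uses one common textbook form in which each sampled lineage independently recombines away from the sweeping background with probability 1 - exp(-(r/s) ln(2N)), and all lineages that do not escape coalesce at the time of the sweep. This is an illustration only and not the project's implementation; the function names and the parameters `r` (recombination rate between the site and the sweep target), `s` (selection coefficient) and `two_n` (number of chromosomes in the population) are assumptions made for this example.

```python
import math


def escape_probability(r, s, two_n):
    """Probability that one lineage recombines off the sweeping background.

    Uses the textbook form 1 - exp(-(r / s) * ln(2N)); other
    parameterisations of the sweep duration exist.
    """
    return 1.0 - math.exp(-(r / s) * math.log(two_n))


def prob_k_lineages_escape(n, k, p):
    """Under the star-like approximation lineages escape independently,
    so the number escaping out of n is binomial with success probability p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


if __name__ == "__main__":
    s, two_n, n = 0.01, 20_000, 4  # hypothetical values for illustration
    for r in (0.0, 1e-5, 1e-4, 1e-3):
        p = escape_probability(r, s, two_n)
        dist = [prob_k_lineages_escape(n, k, p) for k in range(n + 1)]
        print(f"r={r:g}  p_escape={p:.4f}  P(no lineage escapes)={dist[0]:.4f}")
```

Close to the sweep target almost no lineage escapes, so the local genealogy collapses towards a star shape; further away the signal decays, which is the pattern that likelihood inference uses to locate the target.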

WP3 Speciation histories and species barriers in European butterflies

Speciation researchers have largely focused efforts on a small number of model systems, including fruit flies and other insects, or on taxa where speciation happened rapidly or involved conspicuous changes such as colour pattern differences in butterflies. To investigate speciation processes comparatively and in a less biased way, the ModelGenomLand team collaborated with Roger Vila (Barcelona) and butterfly researchers across Europe to generate whole-genome resequencing data for 20 sister pairs of butterfly species. The ModelGenomLand team also collaborated with the Darwin Tree of Life project and Alex Hayward (Exeter) to assemble chromosome-complete reference genomes for these species.

Population genomic analyses of these data have shown that most butterfly species that originated in Europe did not form during the major ice age cycles, as had been assumed previously, but in most cases predate the start of the Pleistocene 2.6 million years ago (published in Molecular Ecology). Nevertheless, we have found that many of these species pairs continue to exchange genetic variation. Another surprising finding of population genomic analyses led by PhD student Alexander Mackintosh (using tools developed in WP1&2) is that chromosome fusions have acted as long-term barriers to gene flow, thus promoting species divergence.
The ability to scan genomes for barriers to gene flow in a quantitative framework is a major advance in evolutionary/population genomics. gIMble, the computational tool developed by the ModelGenomLand team, has already been used to infer species barriers by several labs in a wide range of systems including flowering plants, ants and Drosophila. These analyses have made it possible to intersect inferences of barriers with other sources of information on species barriers (e.g. data on expression differences and incompatibilities). For example, the ModelGenomLand team has shown that long-term barriers (acting over millions of generations) overlap with regions of reduced introgression in a hybrid zone (100s of generations; published in PLoS Genetics). Previous attempts to demonstrate the congruence of barriers across timescales have largely yielded negative results.

While theoretical models have shown that inter-chromosomal rearrangements can act as triggers of speciation processes, our speciation genomic studies in Brenthis butterflies (published in Molecular Biology and Evolution, G3 and Molecular Ecology) provide the first evidence supporting a direct role for chromosomal fusions in speciation. Alexander Mackintosh, the PhD student leading this work, has been awarded the International Birnstiel Prize for Doctoral Research in Molecular Life Sciences.

Another surprising result is the discovery of a new sex chromosome system in Melanargia (Marbled White) butterflies. Analyses led by RA Decroly have demonstrated that the decay of butterfly W chromosomes can be reversed through rare recombination in females, and that this leads to plateaus of sex chromosome divergence (published in Molecular Biology and Evolution).

Analyses of the speciation genomic data generated by the project are ongoing. A key next step is to understand how ‘leaky’ species barriers are, and to characterise the beneficial variation that continues to be shared between emerging species.
Figure: Several butterfly sister taxa form contact zones, natural laboratories for speciation research.
Figure: The genealogical history of a sample of genomes can be represented as a graph.