The algorithm developed by the international team is based on a simple idea: using the branching pattern of a genealogical tree as a reference, it infers an organism’s ability to survive, i.e. its biological fitness. Fitter lineages have more offspring, which is why their part of the genealogical tree comprises more branches. Heavily branched parts of the tree therefore represent lineages that are expected to prevail in the future. “Predicting evolution is the ultimate test for our understanding of evolution,” says Richard Neher from the Max Planck Institute for Developmental Biology. Such predictions could also help scientists produce vaccines against rapidly evolving pathogens such as influenza viruses.
The method is based on two key assumptions: the organism population is under persistent directional selection, and the fitness of individuals changes in small steps due to mutations. The input the algorithm needs is the genealogical tree derived from the genetic analysis of the organism’s various lineages.
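The core intuition, that more heavily branched lineages are the fitter ones, can be illustrated with a minimal Python sketch. This is not the researchers' algorithm: it merely counts the branches below each lineage in a small, made-up tree and ranks the lineages by that count. The tree, the node names and the function names are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy genealogy: each key is a node, each value lists its children.
# Nodes that do not appear as keys are sampled tips (leaves).
TREE: Dict[str, List[str]] = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "A1": ["a", "b", "c"],
    "A2": ["d", "e"],
    "B": ["f", "g"],
}


def count_tips(tree: Dict[str, List[str]], node: str) -> int:
    """Number of sampled tips descending from `node`."""
    children = tree.get(node, [])
    if not children:
        return 1
    return sum(count_tips(tree, child) for child in children)


def count_branches(tree: Dict[str, List[str]], node: str) -> int:
    """Number of branches (edges) in the subtree below `node`."""
    children = tree.get(node, [])
    return sum(1 + count_branches(tree, child) for child in children)


def rank_lineages(tree: Dict[str, List[str]], root: str = "root") -> List[Tuple[str, int]]:
    """Rank the lineages directly below `root`, most heavily branched first."""
    scores = {child: count_branches(tree, child) for child in tree[root]}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_lineages(TREE)
for lineage, branches in ranking:
    print(lineage, branches, count_tips(TREE, lineage))
```

In this toy tree, lineage A has more branches below it than lineage B, so the crude proxy would pick A as the lineage more likely to prevail. A raw branch count ignores branch lengths and sampling effects, so it should be read only as an illustration of the idea, not as a usable predictor.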
Validation tests have already demonstrated the reliability of this method. The researchers tested it on the evolution of the A/H3N2 influenza virus in Asia and North America from 1995 to 2013. They used one year’s genetic data for the surface protein haemagglutinin 1 to reconstruct a genealogical tree, which the program then used to predict the fittest virus lineages of the upcoming flu season. “In 30% of all cases, our algorithm was able to determine the virus type that would bring forth the dominant type the next year. For 16 of the 19 years analyzed in this time period, it made informative predictions regarding the virus type that would circulate in the upcoming season. This indicates that the fitness of the influenza virus is mainly determined by mutations that individually have a small effect but accumulate over time,” says Neher.
The researchers in Tübingen also compared the evolutionary trajectory of A/H3N2 with the predictions published by researchers from Cologne and New York in the spring of 2014. That algorithm, which is designed specifically for influenza, uses long time series of genetic data from influenza viruses to predict which virus type will be dominant in the upcoming year. It turned out that the Tübingen method makes predictions of similar reliability, even though its underlying algorithm is much simpler and can be applied to many different organisms.
Combining this approach with models of the spread and transmission of pathogens could increase the algorithm’s predictive power even further. “Our method works without historical data and does not require detailed knowledge of how an organism’s genome influences its fitness. This makes the method much more versatile, so that it can also be applied to other virus types as well as bacteria and cancer cells,” says Neher. As a next step, the scientists plan to apply it to HIV and noroviruses.
Predicting evolution from the shape of genealogical trees
Richard A. Neher, Colin A. Russell, and Boris I. Shraiman
eLife, November 11, 2014. DOI: http://dx.doi.org/10.7554/eLife.03568