The colon bacillus Escherichia coli is one of the best studied model organisms in the life sciences. However, the reference organism for this species, its so-called type strain, has been overlooked in microbial genomics until now. In the “Genomic Encyclopedia of Bacteria and Archaea” (GEBA) project, the DNA of type strain DSM 30083T has now been sequenced and compared to that of close relatives of the strain. This study not only allows an entirely new view of the numerous E. coli strains that play relevant roles in medicine and biotechnology, including the EHEC pathogen and Shigella, but has also yielded a generally applicable method for determining the subspecies of any bacterial species. The research was conducted at the Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany, and at the Joint Genome Institute, Walnut Creek, CA, USA.
To microbiologists and biotechnologists, the colon bacterium Escherichia (E.) coli is something of a “pet bacterium”, and it looks back on an exciting history. The species was initially described as “Bacterium coli commune” by bacteriologist Theodor Escherich in 1886, but the original isolate was lost at the beginning of the 1920s. It was not until 1941 that it was isolated again, this time by Fritz Kaufmann at the State Serum Institute in Copenhagen, Denmark, who also deposited it in several collections of microbial strains and provided a scientific description. Today, E. coli is likely the best understood microorganism in the world and serves as an important indicator for the quality of drinking and recreational waters.
“It seems strange that the number one, the type strain of a bacterium that has entire scientific conferences dedicated to it as a model organism, had not been fully sequenced until now,” said Christine Rohde, Head of the E. coli strain collection at DSMZ, Braunschweig, Germany. “Initially, scientists primarily sequenced the genomes of pathogenic strains of E. coli, or of genetically modified strains of biotechnological relevance. In addition, physicians and hygienists in their daily practice use serotypes that are quickly determined by antibody tests in order to differentiate between different strains of E. coli.”
Markus Göker, a bioinformatics scientist at DSMZ, added: “Complete bacterial genomes are of fundamental importance for diagnostics in humans, for biotechnology, and for the search for antimicrobial agents. Today, this is truer than ever, as some strains of E. coli have developed into dangerous pathogens such as EHEC or EAHEC. The E. coli type strain was sequenced as part of the GEBA project that focuses on type strains exhibiting an unusual physiology or occupying a key place in the phylogenetic tree. This is the only microorganism in the project that was included based on its importance as a model organism.”
A genome with pathogenic potential
There are major physiological and genomic differences between the E. coli type strain and the harmless laboratory strain K-12. “Due to its serotype, the type strain had been grouped into biological containment level 2, and its genome sequence has now confirmed its pathogenic potential,” said Jörn Petersen, an expert in plasmid biology at the DSMZ. “Unlike laboratory strain K-12, the E. coli type strain harbors an additional circular plasmid of 131,289 base pairs in its genome of 5,038,133 base pairs; this plasmid exhibits a sequence identity of 99% with plasmids from pathogenic E. coli isolates. These strains cause, e.g., colibacillosis in poultry and meningitis in newborns, with the horizontally transferable plasmid being responsible for their virulence,” explained Petersen.
Sophisticated computer-aided phylogenetic analysis
Thanks to the complete genome sequence of the E. coli type strain, the Braunschweig scientists were able to use modern taxonomic techniques to examine whether the huge number of previously sequenced isolates of E. coli actually belong to the same species. “To this end, we analyzed more than 250 strains of E. coli and also verified their published taxonomic classification in subgroups, the ‘phylotypes’. This bioinformatics-based analysis was performed with the state-of-the-art GGDC method. This technique is analogous to classical DNA-DNA hybridization in the laboratory, but yields significantly more exact results,” Markus Göker explained.
The analysis confirmed that all sequenced strains of E. coli belong to the same species. What is new, however, is the realization that E. coli should be divided into several subspecies. One of these subspecies includes all strains of the genus Shigella, known to cause shigellosis. “However, the name Shigella has historically been established in medicine, so we were not striving for taxonomic changes in this case,” Markus Göker added. “What is much more important is that the techniques tested in E. coli can now be used to classify bacterial species into subspecies in general.”
Meier-Kolthoff JP et al. (2014). Complete genome sequence of DSM 30083T, the type strain (U5/41T) of Escherichia coli, and a proposal for delineating subspecies in microbial taxonomy. Stand Genomic Sci 9: 2 (http://www.standardsingenomics.com/content/pdf/1944-3277-9-2.pdf).
Peigne C et al. (2009). The plasmid of Escherichia coli strain S88 (O45:K1:H7) that causes neonatal meningitis is closely related to avian pathogenic E. coli plasmids and is associated with high-level bacteremia in a neonatal rat meningitis model. Infect Immun 77: 2272-2284 (http://dx.doi.org/10.1128/IAI.01333-08).
Meier-Kolthoff JP et al. (2013). Genome sequence-based species delimitation with confidence intervals and improved distance functions. BMC Bioinformatics 14: 60 (http://dx.doi.org/10.1186/1471-2105-14-60).
The Genome-to-Genome Distance Calculator (GGDC) provides a bioinformatics-based approach for calculating the distances and similarities of genome sequences. These can be used to create phylogenetic trees, but they also allow a mathematical transformation into values that replace those of traditional DNA-DNA hybridization techniques. Using this method, bacteria can be classified into species (and, thanks to the E. coli study, into subspecies as well). GGDC is available as a web-based service at http://ggdc.dsmz.de.
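For readers who want to see how such a classification step might look in practice, the following is a minimal Python sketch, not the GGDC implementation. It assumes that a digital DNA-DNA hybridization value (in percent) for a pair of genomes has already been obtained, for example from the web service. The two cut-offs are assumptions that are not stated in this text: 70% is the conventional species boundary of laboratory DNA-DNA hybridization, and 79% is taken here as the subspecies boundary. The function name and constants are hypothetical.

```python
# Illustrative sketch only; not the GGDC code.
# Both thresholds are assumptions and are not given in this press release.
SPECIES_THRESHOLD = 70.0     # conventional DNA-DNA hybridization species boundary (%)
SUBSPECIES_THRESHOLD = 79.0  # assumed subspecies boundary (%)


def classify_pair(dddh_percent: float) -> str:
    """Classify a genome pair from a precomputed digital DDH value (0-100 %)."""
    if not 0.0 <= dddh_percent <= 100.0:
        raise ValueError("digital DDH value must lie between 0 and 100 percent")
    if dddh_percent >= SUBSPECIES_THRESHOLD:
        return "same subspecies"
    if dddh_percent >= SPECIES_THRESHOLD:
        return "same species, different subspecies"
    return "different species"


if __name__ == "__main__":
    # Made-up example values, not results from the study.
    for value in (92.5, 74.0, 41.3):
        print(f"{value:5.1f} % -> {classify_pair(value)}")
```

Comparing each strain with the type strain in this way would sort strains into the three categories; the distance calculation itself is left to the web service.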
The GEBA (Genomic Encyclopedia of Bacteria and Archaea) project and its successor projects aim at using genome sequencing to systematically close the gaps that still exist in the microbial branches of the phylogenetic tree of life. DSMZ is working on this project in close collaboration with the Joint Genome Institute in California, USA. At DSMZ, Markus Göker heads up these projects. Annotated genomes are deposited via the “GenBank” portal (http://www.ncbi.nlm.nih.gov/genbank/) and can be interactively accessed at https://img.jgi.doe.gov/cgi-bin/w/main.cgi.
Head of Public Relations
Leibniz-Institut DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
Inhoffenstraße 7 B
Mobile phone: ++49151-140-925-14
susanne.thiele dsmz de
About Leibniz Institute DSMZ
The Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures GmbH is an institute of the Leibniz Association. Offering comprehensive scientific services and a wide range of biological materials, it has been a partner for research and industry organizations worldwide for decades. DSMZ is one of the largest biological resource centers of its kind to be compliant with the internationally recognized quality norm ISO 9001:2008. As a patent depository, DSMZ currently offers the only option in Germany of accepting biological materials according to the requirements of the Budapest Treaty. The second major function of DSMZ in addition to its scientific services is its collection-related research. The collection, based in Braunschweig (Brunswick), Germany, has existed for 42 years and holds more than 48,000 cultures and biomaterials. DSMZ is the most diverse collection worldwide: in addition to fungi, yeasts, bacteria, and archaea, it is home to human and animal cell cultures, plant viruses, and plant cell cultures that are archived and studied there. www.dsmz.de
The Leibniz Association connects 89 independent research institutions that range in focus from the natural, engineering and environmental sciences via economics, spatial and social sciences to the humanities. Leibniz institutes address issues of social, economic and ecological relevance. They conduct knowledge-driven and applied basic research, maintain scientific infrastructure and provide research-based services. The Leibniz Association identifies focus areas for knowledge transfer to policy-makers, academia, business and the public. Leibniz institutions collaborate intensively with universities – in the form of “WissenschaftsCampi” (thematic partnerships between university and non-university research institutes), for example – as well as with industry and other partners at home and abroad. They are subject to an independent evaluation procedure that is unparalleled in its transparency. Due to the importance of the institutions for the country as a whole, they are funded jointly by the Federation and the Länder, employing some 17,500 individuals, including 8,800 researchers. The entire budget of all the institutes is approximately 1.5 billion EUR.