Modern life sciences generate a constantly growing amount of data in shorter and shorter cycles. Making such data controllable and suitable for evaluation is the objective of Dr. Dr. Alexander Wolf and his colleagues at the Helmholtz Zentrum München’s Institute of Computational Biology (ICB). With this in mind, the researchers are attempting to develop software that handles this evaluation. But of course there are various hurdles to clear.
“In the current study, we dealt with the problem that software cannot assign image data to continuous processes,” explains study leader Wolf. “For example, it is possible to classify image information according to clearly defined categories, but in disease progression and developmental biology, the limits are quickly reached because the processes are continuous and not individual steps.”
In order to take this into account, the Helmholtz team employed methods from so-called Deep Learning* (i.e. machine learning processes). “Using artificial neural networks, we can now combine individual pictures into processes and additionally display them in a way that humans understand,” say Philipp Eulenberg and Niklas Köhler, former Master’s students at the ICB and the study’s first authors.
Blood cells and retinas as sparring partners
In order to demonstrate the method’s capability, the scientists selected two examples. In the first approach, the software reconstructed the continuous cell cycle of white blood cells using images from an imaging flow cytometer (a device that produces fluorescence microscopy images of individual cells). “A further advantage of this examination is that our software is so fast that it is possible to extract the cell development on the fly, meaning while the analysis in the cytometer is still running,” explains Wolf. “In addition, our software makes six times fewer errors than previous approaches.”
In the second experiment, the researchers reconstructed the progress of diabetic retinopathy.** “We did this by feeding our software 30,000 individual images of retinas as sparring partners, so to speak,” explains Niklas Köhler. “Since it automatically compiles these data into a continuous process, the software allows us to predict the disease progression on a continuous scale.”
And if the data are not part of a continuous biological process? “In such a case, the software recognizes that individual categories are involved and assigns the measured data to individual clusters,” Wolf explains. In addition to further applications for the method, in the future Wolf and his colleagues want to solve other problems involving the evaluation of biological data using machine learning.
* Deep Learning algorithms simulate the learning processes in people using artificial neural networks. The approach works particularly well when large quantities of data (Big Data) are available for training. Image recognition is one of Deep Learning’s strengths. More decision layers are placed between the input and the output than are usually found in neural networks, which is why the term “deep” is used.
** Diabetic retinopathy is the main cause of early vision loss in the Western world. The diagnosis is usually made by an expert, who assigns each case to one of four stages: healthy, mild, medium, or severe. Working with 30,000 images, the software was able to describe the progression or increasing severity of the disease without being provided with the ordering information.
Alex Wolf and the team recently took one of the top places in the Data Science Bowl, one of the world’s most highly endowed competitions in Big Data. For their entry, the team programmed an algorithm that recognizes lung cancer on the basis of 300 slices from a three-dimensional computed tomography scan within a few milliseconds, a process that can take a radiologist several hours in the worst case.
The ICB also deals with the subject of Deep Learning in other contexts: The scientists recently introduced an algorithm in ‘Nature Methods’ that predicts hematopoietic stem cell development. In the video “Deep Learning Predicts Stem Cell Development”, they explain how this works.
Eulenberg, P. et al. (2017): Reconstructing cell cycle and disease progression using deep learning. Nature Communications, DOI: 10.1038/s41467-017-00623-3
The Helmholtz Zentrum München, the German Research Center for Environmental Health, pursues the goal of developing personalized medical approaches for the prevention and therapy of major common diseases such as diabetes and lung diseases. To achieve this, it investigates the interaction of genetics, environmental factors and lifestyle. The Helmholtz Zentrum München is headquartered in Neuherberg in the north of Munich and has about 2,300 staff members. It is a member of the Helmholtz Association, a community of 18 scientific-technical and medical-biological research centers with a total of about 37,000 staff members.
The Institute of Computational Biology (ICB) develops and applies methods for the model-based description of biological systems, using a data-driven approach by integrating information on multiple scales ranging from single-cell time series to large-scale omics. Given the fast technological advances in molecular biology, the aim is to provide and collaboratively apply innovative tools with experimental groups in order to jointly advance the understanding and treatment of common human diseases.
Contact for the media:
Department of Communication, Helmholtz Zentrum München – German Research Center for Environmental Health, Ingolstädter Landstr. 1, 85764 Neuherberg – Tel. +49 89 3187 2238 – Fax: +49 89 3187 3324 – E-mail:
Dr. Dr. Alexander Wolf, Helmholtz Zentrum München – German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstr. 1, 85764 Neuherberg – Tel. +49 89 3187 4217, E-mail: