12 October 2020

Putting puzzles together

Bioinformaticians at St. Petersburg State University have created a new assembler for reconstructing the genomes of microbial communities

Researchers at the Center for Algorithmic Biotechnology of St. Petersburg State University, working as part of a group of Russian and American scientists, have developed metaFlye, an assembler that specializes in reconstructing genomes from DNA samples of microbial communities. It can help with a wide range of fundamental and applied problems, including monitoring the course of medical treatment and even developing new drugs.

An article about the assembler was published in the prestigious scientific journal Nature Methods (Kolmogorov et al., metaFlye: scalable long-read metagenome assembly using repeat graphs).

Scientists have access to several dozen different assemblers, developed in leading bioinformatics laboratories around the world. This diversity exists because the algorithms underlying assemblers have to be adapted to different types of input data, produced by different types of sequencers, and to different organisms. For example, approaches for assembling a bacterial genome may be completely unsuitable for assembling the human genome, and vice versa. In addition, developers of genome assemblers constantly strive to improve their tools so that the programs run faster and use less memory, and so that the resulting assemblies are longer and more accurate than those of competitors.

The new metaFlye assembler is used to assemble metagenomes, that is, DNA samples of microbial communities taken from various environments, such as the depths of the ocean, the soil in a park or the human gut. Once such a sample has been assembled, it is possible to determine which organisms it contains and in what quantities. Further analysis of the assembly can often reveal what these organisms feed on, how they interact and what substances they synthesize. All this information can then be used, for example, to search for new medicines of natural origin, to determine why a particular soil is unusually fertile, to monitor the course of a patient's treatment, and for many other fundamental and applied problems.

The metaFlye assembler is designed for data produced by the most modern sequencing technology available today: long-read sequencing. For short-read metagenomic data (short-read sequencing, or next-generation sequencing, NGS) from the Illumina platform, several assemblers are already in use worldwide. These include metaSPAdes, developed at the Center for Algorithmic Biotechnology of St. Petersburg State University in 2016. Programs for assembling individual genomes from long reads also already exist. The new metaFlye tool brings the advantages of the new technology to complex metagenomic data. It is the first specialized metagenome assembler that works with Oxford Nanopore and PacBio technologies.

"The incentive to create metaFlye was the lack of a specialized metagenomic assembler for long-read technology. This technology has already radically changed modern genomics: we have learned to obtain much more complete assemblies. For example, with its help, many missing fragments of the human genome were recently read and localized (using the original Flye tool, and with the participation of members of our laboratory). But for metagenomes, such data has only just begun to appear, and of course it required special tools," notes Mikhail Rayko, one of the authors of the project and a senior researcher at the Center for Algorithmic Biotechnology of St. Petersburg State University.

Work on metaFlye began about two years ago. Counting from the creation of its predecessor, the genome assembler Flye, on which the new project is based, the figure is twice that: four years.

"In our study published in the journal Nature Methods, we used metaFlye and other assemblers to analyze several simulated (that is, computer-generated, without sequencing real DNA) and real metagenomic samples from the gastrointestinal tracts of humans, cows and sheep," says Alexey Gurevich, another author of the assembler and a senior researcher at the Center for Algorithmic Biotechnology of St. Petersburg State University. "Perhaps the most interesting is the sheep microbiome sample, since it was first obtained and studied in this work, while the raw sequencing data for the other two samples were taken from the work of third-party authors. With metaFlye, we were able to assemble an order of magnitude more viral genomes and one and a half times more plasmids from this sample than with the best of the existing comparable programs. The metaFlye assembler is a tool for a wide range of problems, and it will be available to all researchers working with such data. As for specific projects in our laboratory, we are using the assembler to study the composition of the soil of the black taiga, a unique biocenosis of Western Siberia with abnormally high productivity."

Another interesting result was that genomes not only of bacteria and archaea but also of eukaryotes were assembled from the sheep sample. Bioinformatic analysis showed that almost half of the eukaryotic genomic fragments belong to nematodes, or roundworms. This result agrees fully with the necropsy report for the animal, which recorded signs of a parasitic infection.

The metaFlye publication is the result of a collaboration among 11 Russian and American scientists representing St. Petersburg State University, the University of California, San Diego (UCSD), the Institute of Bioinformatics (St. Petersburg) and American research centers for dairy and meat products. The metaFlye assembler itself was developed mainly at UCSD. Its creator and the first author of the publication is Mikhail Kolmogorov, a UCSD postdoc. The scientific director of the project is Pavel Pevzner, a UCSD professor and chief scientific consultant of the Center for Algorithmic Biotechnology of St. Petersburg State University.

Portal "Eternal youth" http://vechnayamolodost.ru

