30 January 2018

Small but mighty

A pocket sequencer has read the human genome

Daria Spasskaya, N+1

Scientists have read a human genome using the "pocket" MinION sequencer, a device the size of a smartphone. The article by Jain et al., "Nanopore sequencing and assembly of a human genome with ultra-long reads," published in Nature Biotechnology, also describes a record set with the same device for the longest continuous read of a single DNA molecule: 882 thousand base pairs.

Although the genomes of dozens of animal species have been read to date, and the cost of sequencing has fallen by orders of magnitude since the first such experiments, determining the complete DNA sequence of a eukaryotic organism is still a non-trivial task. The main reason is that a significant part of the genome consists of repetitive sequences: microsatellite DNA, tandem repeats, retroelements, and so on.

The most common high-throughput DNA sequencing technology today involves breaking the DNA molecule into small pieces of several hundred base pairs, amplifying (copying) them, and reading them. The complete genome sequence is then reconstructed from these small pieces using mathematical algorithms, a process called assembly. Many sections of DNA, especially those containing repeats, drop out of the assembly entirely, or researchers cannot be sure of their exact sequence. Even the human reference genome, first published in 2001, still contains gaps.
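Why repeats defeat short reads can be shown with a toy example. The sketch below is only an illustration under simplifying assumptions: the "genome" is a made-up string, the reads are idealized (error-free, one read starting at every position), and the names `reads` and `make_genome` are hypothetical. Real assemblers also use read depth and paired reads, which this sketch ignores.

```python
def reads(genome, length):
    """Return the set of distinct error-free reads of a given length."""
    return {genome[i:i + length] for i in range(len(genome) - length + 1)}


def make_genome(repeat_copies):
    """Build a toy genome: unique flanks around a tandem repeat of 'CAG'."""
    return "ACGTTGCA" + "CAG" * repeat_copies + "TTGACCGA"


# Two toy genomes that differ only in the number of repeat copies.
genome_a = make_genome(5)
genome_b = make_genome(8)

# Short reads (6 bases) never span the repeat, so both genomes
# produce exactly the same set of distinct reads.
short_same = reads(genome_a, 6) == reads(genome_b, 6)

# Long reads (30 bases) span the repeat together with its flanks,
# so the two genomes can be told apart.
long_same = reads(genome_a, 30) == reads(genome_b, 30)

print("short reads identical:", short_same)
print("long reads identical:", long_same)
```

In this toy case the short reads carry no information about how many copies of the repeat are present, while a read long enough to cover the repeat and both flanks resolves it directly.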

To avoid this, engineers have focused on sequencing technologies that can read the longest possible DNA molecules, preferably without amplification. To date, the most popular solution for reading and assembling large genomes has been PacBio's technology, which can continuously read several tens of thousands of base pairs. For small genomes, such as bacterial ones, Oxford Nanopore Technologies introduced the "pocket" MinION sequencer in 2014. It can also read long sequences continuously, but its throughput is limited.


MinION is a smartphone-sized device that connects to a computer via a USB cable. It works by measuring changes in electrical conductivity as a DNA molecule is threaded through a pore in the device's membrane. The device and a starter kit of reagents cost one thousand dollars, which is quite cheap compared to other existing technologies. The developers position it as a field sequencer that can be used "in the jungle, in the Arctic, on the space station." As confirmation, several DNA sequences, including the mouse mitochondrial genome, were recently read using MinION on the ISS.

Researchers from several American and Canadian institutes, including the University of California and the National Human Genome Research Institute (USA), have shown that MinION can successfully read the human genome. Moreover, the scientists optimized the sequencing protocol to read ultra-long DNA fragments hundreds of thousands of base pairs in length.

The human genome is approximately 3 billion base pairs (three gigabases) in size. The DNA of the human cell line GM12878 was sequenced by staff at five laboratories, using 39 MinION flow cells and an optimized sample preparation protocol. As a result, the scientists obtained 91.2 gigabases of data, which corresponds to roughly 30-fold coverage of the genome. More than half of the DNA fragments read were 100 thousand base pairs or longer. In addition, the researchers showed that the optimized protocol can determine DNA sequences of up to 882 thousand base pairs. In practice, the maximum read length is limited only by the quality of DNA isolation.
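The coverage figure quoted above follows from simple arithmetic: total sequenced bases divided by genome size. The sketch below checks it and also shows N50, a statistic commonly used to summarize read lengths. The function names and the example read lengths are hypothetical, and the article does not say whether its "more than half" figure refers to N50 or to a plain count of reads, so the N50 part is an illustration only.

```python
def fold_coverage(total_bases, genome_size):
    """Average number of times each genome position is covered by reads."""
    return total_bases / genome_size


def read_n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no reads given")


# Figures from the article: 91.2 gigabases of data, 3-gigabase genome.
coverage = fold_coverage(91.2e9, 3.0e9)
print(f"coverage: {coverage:.1f}x")

# Hypothetical read lengths, for illustration only.
example_lengths = [100, 200, 300, 400]
print("N50:", read_n50(example_lengths))
```

The computed coverage of about 30.4 matches the article's rounded figure of 30-fold. N50 weights reads by the number of bases they contain, so it is usually higher than the median read length.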

To combine the reads into a single sequence, the authors also had to optimize the assembly algorithms, since most existing programs are designed for short fragments. Comparison with the reference genome of the GM12878 line, which has been read many times by more traditional methods, showed that the resulting sequence covers 85.8 percent of the genome and that the assembly accuracy approaches 100 percent.

Reading such long DNA fragments allowed the authors to fill in gaps in the human genome sequence and to simplify the analysis of many of its regions. For example, the locus of the major histocompatibility complex (HLA) has a complex structure and many repeats, so its sequence is very difficult to determine. In this case, all of its genes fell within one continuous read sequence, which saved the researchers from having to painstakingly assemble it from pieces. In addition, the scientists demonstrated that this sequencing technology can recognize epigenetic marks, in particular DNA methylation, which was impossible with other technologies.

Sequencing the human genome is a kind of benchmark for judging a technology's performance. This work promises to further simplify and reduce the cost not only of determining the DNA sequence itself, but also of analyzing it.

Portal "Eternal youth" http://vechnayamolodost.ru

