07 April 2016

CRISPR/Cas9: a short encyclopedia

Genome Editing with CRISPR/Cas9

Konstantin Severinov, PostNauka

CRISPR/Cas9 is a new technology for editing the genomes of higher organisms, based on the immune system of bacteria. That immune system relies on special sections of bacterial DNA: clustered regularly interspaced short palindromic repeats, or CRISPR. Between the identical repeats lie DNA fragments called spacers, which differ from one another; many of them correspond to sections of the genomes of viruses that parasitize the bacterium. When a virus enters a bacterial cell, it is detected by specialized Cas (CRISPR-associated) proteins bound to CRISPR RNA. If a fragment of the virus is "recorded" in the spacer of the CRISPR RNA, the Cas proteins cut the viral DNA and destroy it, protecting the cell from infection.
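The repeat-and-spacer layout described above can be illustrated with a small sketch. The sequences are invented for the example and far shorter than real repeats and spacers, the function names are hypothetical, and only an exact match on one DNA strand is checked, which is a simplification of what happens in the cell.

```python
def extract_spacers(cassette: str, repeat: str) -> list[str]:
    """Split a CRISPR cassette on its repeat and return the spacers in between."""
    pieces = cassette.split(repeat)
    # The pieces before the first repeat and after the last one are flanking
    # DNA, not spacers, so only the inner pieces are kept.
    return [piece for piece in pieces[1:-1] if piece]


def matching_spacers(spacers: list[str], viral_genome: str) -> list[str]:
    """Return the spacers that occur verbatim in the viral genome."""
    return [spacer for spacer in spacers if spacer in viral_genome]


# Toy data, invented for this illustration.
REPEAT = "GTTCCA"
CASSETTE = "AAA" + REPEAT + "ACGTACGGTA" + REPEAT + "TTGACCATGC" + REPEAT + "CCC"
VIRUS = "GGGTTGACCATGCAAATTT"

spacers = extract_spacers(CASSETTE, REPEAT)
print("spacers:", spacers)
print("spacers matching the virus:", matching_spacers(spacers, VIRUS))
```

In this toy cassette the second spacer is "recorded" from the virus, so a cell carrying it would, in the terms of the text, recognize this virus.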

At the beginning of 2013, several groups of scientists showed that CRISPR/Cas systems can work not only in bacterial cells but also in the cells of higher organisms. This means that CRISPR/Cas systems make it possible to correct faulty gene sequences and thus to treat hereditary human diseases.

Discovery of the bacterial immune system

No one could have imagined that a practical possibility of treating human genetic diseases would appear "thanks" to bacteria. In the late 1980s, Japanese scientists partially sequenced the E. coli genome and found a curious region that did not encode anything. It contained repeating DNA sequences separated by variable spacer regions. The presence of an extended non-coding region surprised the researchers, since bacteria are economical with their DNA and usually do not carry superfluous sequences. Later, similar "cassettes" of repeats and spacers were found in a large number of bacteria and archaea and were given the name CRISPR.

A single bacterial species typically includes numerous strains, which often differ greatly from one another. In a sense, strains are analogous to races or breeds of animals. One strain of a species can be completely harmless, while another is a dangerous pathogen. Different bacterial strains turned out to vary (to be polymorphic) in the presence, absence or order of spacers in their CRISPR cassettes. This property, whose meaning was completely unknown at the time, became widely used for strain typing and epidemiological analysis. In particular, the company Danisco, which produces starter cultures for the dairy industry, began to use it to classify its commercial strains. It was also convenient for patent reasons, because Danisco could easily identify unauthorized use of its typed strains and sue the violators.
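As a rough illustration of how strains can be told apart by their spacers, the sketch below compares two strains by the presence, absence and order of spacers. The strain contents and spacer labels are hypothetical, and real typing schemes are more elaborate than this.

```python
def compare_strains(a: list[str], b: list[str]) -> dict:
    """Summarize how two strains differ in spacer presence, absence and order."""
    set_a, set_b = set(a), set(b)
    shared = set_a & set_b
    # Order is compared only over the spacers that both strains carry.
    order_a = [spacer for spacer in a if spacer in shared]
    order_b = [spacer for spacer in b if spacer in shared]
    return {
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
        "same_order": order_a == order_b,
    }


# Hypothetical strains; each label stands for one spacer sequence.
strain_1 = ["sp1", "sp2", "sp3", "sp4"]
strain_2 = ["sp1", "sp3", "sp2", "sp5"]

print(compare_strains(strain_1, strain_2))
```

Two strains with identical spacer lists would give empty difference lists and the same order, so any deviation serves as a distinguishing mark.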

In the early 2000s, several scientists independently compared the sequences of known CRISPR spacers with DNA sequences deposited in public databases. It turned out that spacer sequences were quite often similar to sequences of viruses. This suggested that CRISPR cassettes might have a protective function. The results were published in journals at the bottom of the scientific "table of ranks" and, on the whole, interested few people. At the same time, cas genes were discovered, often located next to CRISPR cassettes. The bioinformatics group of Eugene Koonin proposed a rather detailed hypothetical scheme of how CRISPR/Cas systems work. According to their model, when a virus enters a cell it is detected by a Cas protein that uses an RNA copy transcribed from the CRISPR cassette. If any fragment of the viral genome matches one recorded in a spacer, Cas cuts the viral DNA and triggers a chain of reactions that ends with the DNA being destroyed entirely.

Figure: Schematic representation of the CRISPR/Cas system (Annual Review of Genetics)

In the industrial production of fermented milk products, bacteriophages (viruses that infect bacteria) that accidentally get into fermenters, huge vats of milk inoculated with lactic acid bacteria, disrupt fermentation and cause enormous losses. To avoid this, virus-resistant bacteria are needed. Obtaining such bacteria is easy: it is enough to select, under laboratory conditions, clones capable of growing in the presence of the virus. This procedure was carried out at Danisco, and during the selection the researchers noticed one important feature: new spacers corresponding to sections of the viral genome had appeared in the CRISPR cassettes of the clones that had become resistant. The scientists then ran a direct experiment and, using molecular genetics, inserted a spacer carrying a viral DNA sequence into the CRISPR cassette of a bacterium. The genetically modified bacterium did indeed turn out to be resistant to the virus. These results were published in the journal Science in 2007 and were the first experimental confirmation of the protective effect of CRISPR/Cas systems mediated by spacer sequences. A little later, Science published an article by the group of John van der Oost, "Small CRISPR RNAs Guide Antiviral Defense in Prokaryotes", which showed that the system does work through small CRISPR RNAs. The article was written in collaboration with the Koonin group.

Development of CRISPR/Cas9 technology

The first systems, predicted by the Koonin group and other scientists, encoded a large number of Cas proteins needed to protect the bacterium. But a class of systems was also discovered that encodes only one Cas protein, albeit a very large one. The protective effect of such systems was demonstrated by the French researcher Emmanuelle Charpentier. In standard systems, several proteins assemble into an elaborate complex that binds CRISPR RNA; this complex then recognizes the viral DNA target and recruits another protein that cuts the viral DNA. In the system that Charpentier was lucky enough to study, a single protein, called Cas9, performs all of these functions: it binds the CRISPR RNA, recognizes the target and cuts it. In 2012, the groups of Charpentier and of Jennifer Doudna of the University of California, Berkeley, published a joint article in Science in which they proposed a way to reprogram the CRISPR/Cas system so that it cuts DNA at sites deliberately chosen by the researcher. In nature, CRISPR RNA is encoded in a CRISPR cassette, binds to proteins and then recognizes the target. It turned out that a non-native CRISPR RNA can be produced by chemical or enzymatic synthesis, with the spacer's place in the RNA taken by a sequence of the researcher's choosing. The Cas9 protein is able to recognize and bind such a synthetic CRISPR RNA (called a "guide") and thereby becomes programmed to recognize and cut the matching site in DNA. The Charpentier and Doudna groups demonstrated that this approach works in vitro, that is, in a test tube.

Almost at the same time, the groups of George Church and of his former trainee Feng Zhang, of the Broad Institute at MIT, showed that the bacterial Cas9 protein and a guide RNA are able to work, that is, to recognize and cut DNA, in the cells of higher organisms, including human cells. MIT managed to apply for a patent a day earlier than Berkeley. Since then, the two institutions have been fighting a patent war that continues to this day.

Mechanism of genomic editing using CRISPR/Cas9

We are diploid. This means that we have a double set of chromosomes, one from each parent. If one of the parental chromosomes is "wrong", that is, the DNA sequence of some important gene in it is altered, the person may become a carrier of a genetic disease; if both copies are faulty, the disease itself develops. A classic example is the hemophilia of Tsarevich Alexei Romanov. His great-grandmother, Queen Victoria, passed a faulty copy of a gene on the X chromosome down the family line to him. She did not suffer from hemophilia herself, because she had two X chromosomes and the healthy one worked in place of the defective one. Alexei was unlucky, because as a boy he had only one X chromosome.

To cure a genetic disease, the genetic information damaged by the mutation has to be corrected. Hemophilia, like most genetic diseases, is caused by a change in just one letter of DNA, while our genome contains 6 billion letters in total. That is thousands of books the size of "War and Peace". We have to find a single "typo" and fix it in place without changing anything else. This is the task of genomic medicine.

To correct the "wrong" gene, we need a very precise molecular "scalpel" that can find the mutant nucleotide sequence and cut it out of the DNA. Cas9 is such a scalpel. With the help of a guide RNA whose sequence matches the desired location, it can make a break at the right place in the genome. Target recognition takes place over a site 20-30 nucleotides long. On average, a sequence of this length occurs only once in the human genome, which is what makes the cut accurate. The cell does not die from the break in its DNA, because the break is repaired from the healthy copy on the paired chromosome through the natural process of DNA repair. If there is no paired chromosome, as in the case of hemophilia, a section of the "correct" gene can be introduced into the cell together with Cas9 and the guide RNA and used as a template for repairing the break.
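The statement that a site of this length is, on average, unique in the genome can be checked with back-of-the-envelope arithmetic. The sketch assumes an idealized random genome in which the four letters are equally likely and independent, which real genomes are not, and it uses the figure of 6 billion letters quoted above. The function name is hypothetical.

```python
GENOME_LETTERS = 6_000_000_000  # genome size quoted in the text


def expected_chance_matches(site_length: int,
                            genome_letters: int = GENOME_LETTERS) -> float:
    """Expected number of purely accidental exact matches to one given site.

    Assumption: a random sequence with four equally likely letters.
    Both DNA strands are counted as possible match positions.
    """
    positions = 2 * genome_letters
    return positions / 4 ** site_length


for length in (10, 15, 20, 30):
    print(f"{length} letters: {expected_chance_matches(length):.3g} chance matches expected")
```

Under these assumptions a 10-letter site would be expected to turn up by chance thousands of times, whereas for 20 letters the expected number of accidental matches is roughly 0.01, so apart from the intended target such a site is unlikely to occur again. Real genomes contain repeats and related genes, which is one reason this estimate is only an idealization.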

Using CRISPR/Cas9, several faulty genes can be edited at once (multiplex editing). To do this, it is enough to introduce the Cas9 protein together with several different guide RNAs. Each of them will direct Cas9 to its own target, and together they will eliminate the genetic problem.

In general, the mechanism described works through the principle of complementarity, first proposed by James Watson and Francis Crick in their famous model of double-stranded DNA. The strands of the DNA double helix "recognize" each other according to the rules of complementarity. CRISPR RNA recognizes its targets in double-stranded DNA in the same way. In the process an unusual structure is formed: a double-stranded section made of the RNA and the complementary strand of the target DNA, with the other DNA strand displaced.
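The rules of complementarity can be written down in a few lines. The following is a minimal sketch with hypothetical function names and a made-up eight-letter sequence; it covers only the pairing rules (A with T or U, G with C) and nothing else about how Cas9 works.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, written in the 5'->3' direction."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def guide_for(target_strand: str) -> str:
    """Return the RNA sequence that pairs with the given DNA strand.

    The RNA is the complement of the strand it binds, with U in place of T.
    """
    return reverse_complement(target_strand).replace("T", "U")


# Toy sequence, invented for this illustration.
bound_strand = "GATTACAG"                            # DNA strand the RNA pairs with
displaced_strand = reverse_complement(bound_strand)  # the other strand of the helix
guide = guide_for(bound_strand)

print("bound strand:    ", bound_strand)
print("displaced strand:", displaced_strand)
print("guide RNA:       ", guide)
```

The guide has the same letters as the displaced strand, apart from U in place of T: the RNA takes the place of that strand in the pairing, which is why it is pushed aside.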

ZFN and TALEN systems

In parallel with the CRISPR/Cas9 system, other approaches to genome editing were developed, based on TALEN proteins and on proteins with so-called zinc fingers. These are genetically engineered proteins that can cut DNA. Scientists tried to teach them to recognize a specific DNA sequence, ideally any given one. Sometimes this worked, but each sequence required its own separate protein to be created, which is long and painstaking work. Editing the genome with CRISPR/Cas9 uses a single protein, and a guide RNA can be made quickly in any decent laboratory or simply bought. This is a whole new level of editing, cheap and accurate. Its main advantage is that it rests on the simple principle of complementarity, in which one nucleic acid sequence is recognized by another, complementary nucleic acid.

CRISPR/Cas9 in the treatment of hereditary diseases

First of all, CRISPR/Cas9 will let us treat "simple", monogenic genetic diseases: hemophilia, cystic fibrosis, leukemia. In these cases it is clear what exactly needs to be edited. But there are also diseases with high heritability whose genetic nature is very complex: they result from the interaction of many different genes and their variants. For example, many scientists are searching for the genes behind schizophrenia and alcoholism. Every year they find new ones, and every year some of the previously discovered genes turn out to have nothing to do with the disease. How to treat such complex diseases with CRISPR/Cas9 is unclear; multiplex approaches will evidently be required.

It should be understood that the practical use of CRISPR/Cas9 in medicine is still a fairly distant prospect. A great deal of work remains to improve the technology, its reliability and its safety. The situation is generally better for blood diseases, since the faulty gene is needed only for hematopoiesis and cell therapy technologies for such diseases are well developed. Suppose a person has leukemia. Today, to eliminate the disease, the patient is irradiated, a suitable donor is found and bone marrow is transplanted. Finding a donor takes a long time, and even then the immunological match is never complete.

Using the CRISPR/Cas9 system, we could take a sample of the patient's bone marrow and repair his own hematopoietic stem cells by changing the wrong letter. The patient would then be irradiated to kill the affected hematopoietic cells and given back his own edited cells: not the cells of a close relative or of a complete stranger, but his own, fully compatible ones. They will begin to divide and produce healthy blood cells. If we are talking about editing, say, a liver tumor, everything is much more complicated. The main medical problem will have to be solved first: delivering the components of the CRISPR/Cas9 system to the affected cells.

Experiment with human embryos

In 2015, Chinese scientists attempted to correct the genome of a human embryo. They took fertilized human eggs carrying a faulty gene that causes the blood disease beta-thalassemia. The Cas9 protein and a guide RNA were introduced into the cell; they were supposed to find and cut the wrong copy of the gene, after which the break would be repaired from a healthy template. As a result, in 5-10% of the embryos the mutation responsible for the disease in adults was indeed corrected. This is good news.

The bad news was that all the cells of the treated embryos carried a large number of mutations in places where none were supposed to appear. The technology is thus not accurate enough and needs to be improved. Precise editing happens when a stretch of target DNA slightly more than 20 nucleotides long pairs complementarily with a guide RNA that matches it exactly. But the genome may contain many variants of the target sequence that differ from it by only one letter, even more variants that differ by two, and so on. Each of these variant targets interacts less well than a perfectly matching target, say 10 times less well. But since there are many such sequences, it is very hard to avoid incorrect recognition, and with it incorrect cutting and editing. How to deal with this is still unclear. Clearly, the specificity of the Cas9 protein needs to be improved and guides must be chosen very carefully.
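The growth in the number of near-matches is easy to quantify. For a target of 20 letters there are 60 possible sequences that differ from it in exactly one letter and 1,710 that differ in exactly two. The sketch below computes these counts and scans a toy sequence for sites within a given number of mismatches. The function names and the toy data are hypothetical, and the scan is a plain letter-by-letter comparison on one strand, not a model of how Cas9 actually tolerates mismatches.

```python
from math import comb


def variants_at_distance(site_length: int, mismatches: int) -> int:
    """Number of sequences differing from a given site in exactly `mismatches` letters."""
    # Choose which positions differ, then one of 3 other letters at each of them.
    return comb(site_length, mismatches) * 3 ** mismatches


def count_mismatches(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_sites(genome: str, target: str, max_mismatches: int) -> list[tuple[int, int]]:
    """Return (position, mismatches) for every window close enough to the target."""
    n = len(target)
    hits = []
    for start in range(len(genome) - n + 1):
        distance = count_mismatches(genome[start:start + n], target)
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits


# Toy data, invented for this illustration.
TARGET = "ACGTAC"
GENOME = "TTACGTACTTACCTACTT"

for k in (1, 2, 3):
    print(f"20-letter variants with {k} mismatches:", variants_at_distance(20, k))
print("sites within one mismatch:", find_sites(GENOME, TARGET, 1))
```

Even in this tiny example the scan finds, besides the exact site, two further sites that differ by a single letter. Choosing a guide carefully amounts to running this kind of search over the whole genome and rejecting guides with too many close relatives.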

Prospects for studying CRISPR systems

CRISPR is one of the most popular technologies today. Many young people and students dream of working with CRISPR. But this research is now becoming largely technological, and few fundamental questions remain. There are several well-established, strong groups in the world, and the competition is fierce. It is more promising to work on something that could take off in 5-10 years and repeat the success of CRISPR/Cas. But it is impossible to predict where the next breakthrough will come from. That is the beauty of science, and it is what makes all kinds of foresight exercises pointless and drives the various scientific forecasters and "experts" into a frenzy. Interestingly, this as yet unknown area of the future breakthrough need not be in the mainstream or be considered "important" at all. After all, even 10 years ago none of the "serious" scientists were working on CRISPR. Incidentally, CRISPR/Cas is already the second case in which work on the interaction of bacteria and their viruses has led to a revolution in biomedicine. The first revolution took place in the 1970s, when restriction enzymes were discovered, without which molecular cloning and genetic engineering would be impossible.

Among the unsolved problems in the biology of CRISPR/Cas systems, the following stand out. We do not know where most spacers come from: only a few percent of spacers are of viral origin, resembling sections of DNA from known viruses, while the vast majority resemble nothing known. The evolutionary origin of CRISPR/Cas systems is another interesting question. Koonin has proposed the hypothesis that they are related to transposons, DNA segments that encode special proteins whose job is to move the very DNA segments that encode them: unusual "jumping genes". Attempts are now being made to confirm this hypothesis experimentally. In addition, the search for new, still unknown CRISPR/Cas systems remains very relevant. Until recently, three different types were known, one of which, type II, turned out to be suitable for editing. Recently our laboratory at Skoltech and Rutgers, in collaboration with the Koonin and Zhang groups, predicted and experimentally confirmed the existence of three additional types of these systems, which means that we do not yet know their full diversity. Among the unknown systems there may be some that are promising from a practical point of view and free of the disadvantages of Cas9-based systems.

Another interesting research goal, one of obvious practical interest, is to understand the molecular mechanism of target recognition and to learn how to control this process. This is a special case of the general problem of specific interaction between macromolecules. Scientists do not understand very well how the protein and nucleic acid molecules in a cell find their "right" partners and avoid "wrong" interactions.

About the author:
Konstantin Severinov – Doctor of Biological Sciences, Head of the Laboratory of Regulation of Gene Expression of Prokaryotic Elements of the Institute of Molecular Genetics of the Russian Academy of Sciences, Head of the Laboratory of Molecular Genetics of Microorganisms of the Institute of Gene Biology of the Russian Academy of Sciences, Professor at Rutgers University (USA), Professor at the Skolkovo Institute of Science and Technology (SkolTech).

Portal "Eternal youth" http://vechnayamolodost.ru  07.04.2016
