02 March 2016

Bioinformatics in questions and answers

The submarine of computer science in the steppes of biology

Chico_Fernandez, Geektimes

Bioinformatics is rapidly gaining popularity, turning from a haven for geeks into a well-known, established discipline. I think most Geektimes readers can say with confidence that a rabbit is not only valuable fur and 3-4 kilograms of dietary meat, but also 44 chromosomes, a variety of proteins, transcription and translation mechanisms, and whatnot. Nor am I likely to surprise anyone by saying that all this can be studied and analyzed not only standing in a white coat at a microscope in a sterile laboratory, but also lying on the couch with a laptop, sipping Scotch on the rocks. However, most people's knowledge usually ends there. I decided to try to fill this annoying gap and give a short tour of what bioinformatics looks like from the inside, from a practical point of view, based on my own experience.

In this article I have collected the questions that I myself was asking three years ago, when I was still a student at the Faculty of Mathematics, and I will try to answer them.

Why is bioinformatics needed?

The task of bioinformatics, informally speaking, is to find logic in biological data. These data are obtained in experiments, and while for a biologist the data may look like a glowing fish or a beautiful multicolored spot in a photo, for a bioinformatician the data take the form of:

  • strings (sequences of characters describing DNA/RNA/proteins);
  • three-dimensional and two-dimensional coordinates (microscopy data);
  • arrays of real numbers (for example, each number can be an experimentally measured mass of a protein or of a part of it);
  • vectors of non-negative integers (for example, the depth of coverage by discrete objects, the so-called reads);
  • matrices of zeros and ones (for example, whether different types of bacteria can get along with each other);

and many other possible representations of real biological phenomena using mathematical objects.
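
To make these representations more concrete, here is a minimal Python sketch. All of the data in it are made up for illustration and do not come from any real experiment, and the helper name coverage is hypothetical. It shows a DNA sequence as a string, read coverage as a vector of non-negative integers, and the compatibility of bacteria as a matrix of zeros and ones.

```python
# Toy data, invented for illustration only.

# A DNA sequence is a string over the alphabet A, C, G, T.
genome = "ACGTACGTAC"

# Reads: short fragments of the genome, stored as (start position, sequence).
reads = [(0, "ACGT"), (2, "GTAC"), (3, "TACG"), (6, "GTAC")]


def coverage(genome_length, reads):
    """Return, for each genome position, the number of reads covering it."""
    depth = [0] * genome_length
    for start, seq in reads:
        # Clip the read at the end of the genome, just in case.
        for pos in range(start, min(start + len(seq), genome_length)):
            depth[pos] += 1
    return depth


# A vector of non-negative integers, one per position of the genome.
depth = coverage(len(genome), reads)

# A matrix of zeros and ones: coexist[i][j] == 1 means that
# (hypothetical) bacteria i and j can live together.
bacteria = ["A", "B", "C"]
coexist = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]

print(depth)
```

Real projects would normally rely on specialized file formats and libraries rather than plain lists, but the underlying mathematical objects are the same.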

Do biologists have more interesting data?

Undoubtedly. But bioinformaticians do not need to run to the laboratory on weekends (cell cultures, for example, know nothing about weekends and tend to die without proper care). Research in biology also often lasts for years (depending on the properties of the model organisms), while in bioinformatics progress depends mainly on the ability to solve algorithmic problems and write "smart" code. And the possibility of working remotely from anywhere in the world is another undoubted plus in favor of bioinformatics.

How much is bio in bioinformatics, and how much is computer science?

It very much depends on the specific research center and research group. You need to understand biology at least at a minimal level: no one will break a scientific project down for you to the level of a school math problem. You will have to model the situation yourself based on your understanding of biology. However, a really deep understanding is not expected, so the fact that all you remember is pistils and stamens will not be an obstacle if you decide to take up this particular science. It is not difficult to learn the necessary basics of biology while already working on a bioinformatics project.

What is really useful and necessary for future bioinformaticians "from computer science" is knowledge of biotechnology, that is, of how your data were obtained and what problems could have arisen during the experiment. In my opinion, it is enough to gallop through some course in molecular biology, but to spend real time on seriously understanding how the modern instruments used in experiments work.

I would advise future bioinformaticians "from biology" to skip the proofs and descriptions of methods and algorithms at first and to study them as "black boxes", that is, in a purely applied way: "A at the input, B at the output". Otherwise there is a risk of "drowning" in theory for several years. After skipping the theory and learning something in practice, it will not be difficult to come back and look at it with different eyes.

But if I become a bioinformatician, will I then know bioinformatics?

Unfortunately, no. Bioinformatics in its current state is a set of rather extensive subfields, as in any other science. Compare it with physics, for example: a specialist in theoretical mechanics is likely to have some difficulty understanding the latest articles on quantum physics, and moreover will most likely not have time to read them.

And bioinformatics has many subfields, for every taste:

  • Evolution (and not only in the form of "first there were pithecanthropus", but also lesser-known issues, such as evolution occurring in a cancerous tumor);
  • Search for genetic variants that lead to diseases;
  • Design and selection of drugs that bind to certain types of "dangerous for the body" proteins;
  • Study of the functions of genes, their annotation;
  • Structural bioinformatics (manipulations with 2D and 3D structures, such as, for example, proteins or RNA);
  • Assembling genomes;
  • Building maps of how this whole mess of proteins / RNA / DNA / fats / smart thoughts / gym classes / the Kremlin diet and other things interacts (approximately as in the video below, but even more interesting and more complicated);
  • Modeling of complex systems (such as the development of an organism from the embryo);
  • Neurobiology (or rather, the analysis of data obtained by neuroscientists);

and much more (may the bioinformaticians whose field I forgot to mention forgive me).

The last three points are often referred to as systems biology, but these sciences are, as they say, "at the junction", and you can jump back and forth with minimal effort.

Does it make sense to choose bioinformatics as your profession?

To answer this question, rank the following statements by how important they are to you (assign rank 6 to the most important statement and 1 to the least important one), and then add up the ranks, each taken with the sign shown.

+ I have always wanted to be a researcher and feel that I am making a certain contribution to the future of humanity.
+ I am interested in the life sciences and would like to learn something new about biology every day, but my university studies were not related to biology (or: I am a biologist, but I am tired of monotonous manipulations with pipettes and want to better understand what kind of data I have obtained and be able to work with them).
+ Bioinformatics interests me as a subfield of computer science; it seems to me that it has many problems worth thinking about.

– I want to get a big salary right after graduation from the university.
– I would like to wear a white coat all the time, like a real scientist.
– I like to think about problems and read interesting articles about biology, but I don't like programming.

If your result is less than 0, you definitely should not go into bioinformatics. Does it pain you how loose and non-universal this test is, and yet you understand its idea and even like it in some way? Add 3 points to your result.
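
For those who prefer code to mental arithmetic, the test can be sketched in a few lines of Python. The function name quiz_score and its arguments are assumptions made for illustration; the only rules taken from the text are the ranks from 1 to 6, the plus and minus signs, and the 3-point bonus.

```python
def quiz_score(plus_ranks, minus_ranks, likes_the_test=False):
    """Sum the ranks of the three '+' statements, subtract the ranks of the
    three '-' statements, and add the 3-point bonus if it applies."""
    ranks = list(plus_ranks) + list(minus_ranks)
    # Each rank from 1 to 6 must be used exactly once.
    if sorted(ranks) != [1, 2, 3, 4, 5, 6]:
        raise ValueError("ranks must be the numbers 1..6, each used once")
    total = sum(plus_ranks) - sum(minus_ranks)
    if likes_the_test:
        total += 3
    return total


# Example: the three '+' statements got ranks 6, 5 and 3,
# the three '-' statements got ranks 1, 2 and 4.
example = quiz_score((6, 5, 3), (1, 2, 4))
print(example)
```

Since the six ranks always add up to 21, the result before the bonus is always an odd number between -9 and 9, so it can never be exactly 0.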

What does the career ladder look like for a bioinformatician?

"If you really want to, you can fly into space," but if you are 2 meters tall and weigh 150 kg, you are unlikely to be taken into the cosmonaut squad. And what about bioinformatics?

Basic education

A career starts with higher education. Your bachelor's degree can be in anything, though preferably not in the humanities: economics, physics, chemistry, or mathematics, not to mention computer science and biology.

The best choice for a master's degree is either a master's in bioinformatics or a "complement" to your bachelor's degree, so that after these two stages you have both something biological and something computational. However, getting into a master's program in a completely different field is not an easy task.

As for getting the first stage of higher education (bachelor/specialist) with a specialization in bioinformatics right away, I have mixed feelings about it.

Bioinformatics should be a conscious choice, and making such a choice right after school looks rather difficult, but if you are sure that this is your vocation, then why not. I prefer the approach of "get a general education and then choose a specialization" to starting work in a narrow field right away. I am not sure it is easy to retrain as a specialist in another field after 4-6 years of training, but there are successful examples.

Additional education

For a brief acquaintance with bioinformatics, many online courses have already been created (for example, on the Russian Stepic.org). Some of these online courses are very useful (I would recommend a course from UCSD on algorithms in bioinformatics and a course on evolution from Duke University). Sign up and try one: if it gets boring or difficult, you can calmly drop it without wasting anyone else's time or your own nerves. After all, one should come to full-time education motivated, figuratively speaking, in a three-piece suit, with a bouquet in your hands and a carnation in your buttonhole, so that bioinformatics immediately understands that this relationship is serious for you.

Additional education is a wonderful thing with almost the same advantages: classes on weekends or in the evenings (so they do not interfere with your main studies or job), an enthusiastic team, and often even no tuition fees. But the selection for such programs is quite tough, the courses are extensive, and the pace is fast. That is why, if you are still only trying to understand whether to continue with bioinformatics, it is better to find that out beforehand: take online courses, talk to people in the profession, read some popular science (from what people I know personally have written: an article on Habrahabr, an article on Geektimes, and the review "I would go to bioinformatics – let them teach me" on Biomolecule).

As far as I know, there are two such programs in Russia: the Institute of Bioinformatics (IB) in St. Petersburg and, in the capital, the Moscow School of Bioinformatics (MSHB). In my opinion, in terms of the level of knowledge gained in the specialty they are roughly equivalent to a master's degree, but "a rare bird will fly to the middle of the Dnieper": many students drop out after attending a dozen classes. Oh, it's not an easy job to assemble the hippo genome.

I myself graduated from the Institute of Bioinformatics as part of the master's program of the Academic University (SPbAU), so I will tell you more about IB (I know practically nothing about MSHB after they parted ways with Yandex). The program lasts a year, and classes are held on Saturdays. I liked almost all the seminars and lectures, but the most wonderful part of the training was the research projects. The supervisors come from the leading research centers of Russia and of the insidious foreign lands. In theory, the projects should be educational first of all, but most often they are real science. It was a glorious time: sleepless nights full of the Arabian tales "1000 and 1 script" (in fact, at first the tales were Indian), fierce project defenses, and a sense of belonging to the very cutting edge that produces the scientific articles whose translations can often be found on Geektimes. Oh, by the way, there is a buffet there. And admissions are open right now. At the same time, both the advantage and the disadvantage of IB is the lack of fundamental disciplines: only bioinformatics, and nothing else.

If you want more subjects and fundamental training, then holders of a technical bachelor's or specialist degree can, as I did, enroll directly in a 2-year master's program in algorithmic bioinformatics. The admission process is standard: online applications until mid-summer, then an interview. Applications for 2016/18 are already open. But it makes absolutely no sense for biologists to go there.

To complete the story, I had to use my network of agents. On the eve of publication, one of the scouts finally broke radio silence and sent a radiogram about the MSHB to headquarters. The main points of the deciphered message about the learning process at the MSHB were: a) the possibility of obtaining an official diploma from HSE; b) the presence of fundamental disciplines such as "matan", that is, mathematical analysis (in my opinion this is a mockery, but matan is useful because it puts the mind in order); c) research projects are carried out under the guidance of leading bioinformatics specialists in Moscow; d) because of the abundance of homework, students have to band together and think about problems collectively; e) students nevertheless squeal with delight and ask for more bioinformatics. Admission to the MSHB will begin in May.

Summer schools

Another type of additional education. From what I know: for schoolchildren there is the School of Molecular and Theoretical Biology (SMTB; more about biology, but the benefits for future bioinformaticians are undoubted); for students and "novice" graduate students there is the Summer School of the Institute of Bioinformatics (LSHB); and abroad there is the Research Summer School in Statistical Omics (RSSSO). To be very brief about the schools I attended: LSHB is ideal for a short, intensive introduction to bioinformatics, while RSSSO is for those who have already understood what computational biology is and want to "pump up" their statistical background. At LSHB and RSSSO you can (and have to) take part in interesting research projects, during which you can painlessly feel like a real researcher for a short time. They are also a wonderful way to have fun in the summer in great company. LSHB is held alternately in Moscow and St. Petersburg, RSSSO in Split, Croatia. The SMTB will be in Barcelona.

A bioinformatician's career

Then the career proper begins: after the master's degree, you can get a job as a bioinformatician (yes, yes, I hear the indignant voices, you can do it after the bachelor's degree, after school, and after kindergarten, but let's agree that the end of the master's degree is the best starting point in many respects). This can be done both in Russia (a database of vacancies is collected on the website blastim.ru) and abroad. The second option is to go on to a Ph.D. or an equivalent degree. Finding a graduate school (in almost any country, be it Russia or Costa Rica) is quite simple, provided that you are a good specialist. The grades in your diploma play a role, but not a decisive one. Where is it better, abroad or at home? Let's leave this question open. Perhaps by the time you are ready to enter graduate school, you will have already decided for yourself. In any case, during graduate school you will most likely spend several months on an internship in another country, once or several times.

After the Ph.D. there are three options:

The first is to realize that life is rotten, quit science altogether, and go to the Yamalo-Nenets Autonomous Okrug to breed reindeer. We will not dwell on this option, since it no longer has anything to do with the topic of the article (but I would advise you to beware of wolves and not to anger the reindeer, their antlers look quite dangerous).

The second option is to continue an academic career, and the third is to go into industry (many companies are now looking for specialists with this profile). An academic career involves several temporary positions, called postdocs for short. Postdoc salaries are several times higher than those of graduate students but, as a rule, lower than the salaries of specialists who go into industry. Finding a job in industry after obtaining a Ph.D. and (optionally) doing several postdocs is much easier. After that you can get a permanent position as a researcher or try to set up and head your own laboratory. This is a complicated matter and, frankly, I know nothing about what happens "beyond the postdoc".

Instead of a conclusion

I will continue to answer your questions in the comments to this article. Also, if there is interest, I can write a separate article about what I do (studying the links between genetic and epigenetic variability and diseases).

I hope this text was informative for you.

About the author:
Specialist, MSU Mehmat, 2013; Master's degree (Bioinformatics), MiIT SPbAU, 2015; currently – postgraduate student at CRG, Barcelona, group "Genomic and Epigenomic Variation in Disease".

P.S. Before I submitted this article, my colleagues read it and said that it came out unnecessarily pessimistic. I can assure readers that when I asked these questions a few years ago, I received much darker answers.

Portal "Eternal youth" http://vechnayamolodost.ru 02.03.2016
