Illustration by Armine Shahbazyan.
“I’ve got some data from the lab. Could you please help me analyze it?” This type of request is coming up more and more often in conversations between biologists and bioinformaticians around the world.
Recent technological developments have enabled the collection of huge amounts of data, and brought data analysis to the forefront of discovery and innovation in life sciences, biomedicine and biotechnology. It is estimated that the amount of data produced in the field of genomics alone will soon surpass that of astronomy, YouTube and Twitter combined[1]. And the amount of data is not the only challenge: it is the complexity and diverse nature of the data that trouble many bioinformaticians. We can obtain information about the sequence of letters encoded in our DNA (genomics). We can have data about which parts of it are active or turned off (epigenomics). We can gather data about how active each of our genes is in health and disease (transcriptomics), and how much protein is synthesized in our cells (proteomics).
Bioinformatics is the organization, annotation and analysis of multiple layers of biological data for knowledge inference. Graphic by author.
We can also investigate the variety of microorganisms living in our bodies. It is becoming evident that all these layers of data should be analyzed together in an integrative manner for knowledge inference. And this is where you need to be math savvy, a computer scientist, a biologist and a creative thinker all at the same time, in order to be able to put all these layers of data together, visualize, communicate and support inferences regarding, e.g. the role of a certain gene in the rapid growth of tomatoes, mechanisms of disease development, the side effects of an antibiotic or the need to personalize drug prescriptions.
Genomics, including bioinformatics, is currently among the fastest-developing fields not only in science, but also in biotechnology, molecular medicine and pharma.[2] Development of vaccines against COVID-19 by Moderna and Pfizer-BioNTech within record time was only possible by applying novel genomic technologies in well-prepared companies. Similar opportunities arise beyond academia in research-oriented and routine applications, for example, in cancer diagnostics and targeted drug development.[3]
The growth in the number of life science articles that include bioinformatics data analysis over the years. The colors indicate different types of data that arise through measurements of various layers of biological information, from genomes to proteomes to microbiomes. Data retrieved from PubMed in February 2021. Graphic by author.
“I’ve got some data from the lab. Could you please help me analyze it?”
“I’m sorry. Unfortunately I am overwhelmed with other projects at the moment.” This type of answer is coming up more and more often in these conversations. Educational and academic programs around the world are not preparing enough bioinformaticians to deal with the exponential data growth we are observing in the 21st century.[2] This is because genome bioinformatics is a demanding STEM discipline that combines informatics and data science with the life sciences, and requires solid skills and knowledge across a wide spectrum of disciplines, from mathematics and programming to biology and medicine. Education and training here follow the bachelor’s-to-master’s track, but working independently, even at the lowest expert level, usually also requires a PhD and postdoctoral experience.
It is estimated that nearly 50% of open positions for bioinformatics specialists in academia and in the pharma/biotech industry are currently unfilled.[2] And while the world as a whole needs to double the number of bioinformaticians, we in Armenia need around 20 times more specialists in genomics and bioinformatics than we have now (see below). While this presents a great challenge, it is also an opportunity to make our unique contribution to the developing world of data-driven life sciences, biotechnology and biomedicine.
Genome Bioinformatics in Armenia
Bioinformatics for genomic research has a short history of only about 12 years in Armenia. It all started with the establishment of the Bioinformatics Group (BIG) at the Institute of Molecular Biology of the National Academy of Sciences by Dr. Arsen Arakelyan. The group has since produced and applied various computational tools for the analysis of big genomic data to study complex human disorders, such as a tool to assess how cancer cells regulate their telomere length, a tool to estimate drug target processes and for drug repurposing, as well as a tool to predict genetic predisposition to non-infectious human disorders. Since its establishment, the group has had various collaborators from around the world, including a long-standing and fruitful relationship with the University of Leipzig, and has also provided data analysis services to academic and industrial entities abroad. In total, the team has attracted nearly $500,000 in funding. This example has been a great success and has planted the seeds for the further development of genome bioinformatics in Armenia. However, the main challenge that the group has faced has been the lack of human capital, impeding the capacity to leverage the myriad of opportunities to attract academic collaborations, funding and, importantly, to supply biotech companies with bioinformatics specialists to establish their branches in Armenia.
Many life science research labs in the country are struggling to find bioinformatics specialists to analyze their data, and the Bioinformatics Group itself doesn’t have enough people to help those labs boost the data-driven aspect of their studies. Since its establishment in 2011, BIG has grown from a team of two (one BSc student and a group leader) to a team of three (two PhD students and a group leader) in 2021. Its alumni include one PhD, three MSc and one BSc graduate who have taken up educational and academic positions abroad. It is becoming evident that this pace of organic growth of research entities within state-funded institutions, however successful their academic track record, will not solve the problem in a reasonable timeframe. Genomics and bioinformatics are currently the bottleneck of progress in several research institutions and hospitals in Armenia, and the shortage of specialists is hindering the establishment of biotech startups and companies.
There are 30 entities in Armenia that need genomics/bioinformatics specialists. The educational programs at the few Armenian universities that cover the field offer only a handful of bioinformatics-related courses, and do not train bioinformaticians at a sufficient level. There is an urgent need to establish programs that are more specialized in genome bioinformatics; however, the lack of experts, as well as the lack of industry opportunities for graduates of such programs, hinders their establishment. Therefore, boosting research in bioinformatics is a promising starting point for creating the human capital needed to gradually fill the educational gaps and to support the establishment of related companies.
Establishment of the Armenian Bioinformatics Institute
In many western countries, national bioinformatics institutes and infrastructures were established over 20-30 years ago, such as the European Bioinformatics Institute (EMBL-EBI), the National Center for Biotechnology Information in the U.S. and the Swiss Institute of Bioinformatics (SIB).
Today, states, as well as private foundations, are continuously injecting more and more funds for further development of data-driven life sciences.[4] Armenia is lagging behind. The number of people with a PhD degree in genome bioinformatics in the country is just two, but at least 30 are needed in the academic, medical and industrial sectors. Even more troubling is the fact that educational programs in Armenia do not prepare genomics/bioinformatics specialists at a solid level and in sufficient numbers to satisfy the needs that academic, medical and industrial entities will have in the near future. The only way to overcome this problem is to initiate a combined educational/research program by extending local expertise with international partners to build a minimum foundation of the required human capital.
With this in mind, in February 2021, a team of bioinformatics specialists from Armenia and abroad established the Armenian Bioinformatics Institute (ABI) as a non-profit scientific educational foundation. The goal is to develop human capital in bioinformatics, to boost data-driven research in life sciences, and to accelerate developments in biomedicine and biotechnology. In order to reach this goal, ABI serves as a platform to unite experts from around the world, and to recruit students to the exciting field of bioinformatics, connect them to their potential remote mentors and supervisors, and provide a fruitful environment for learning and networking. ABI’s ultimate goal is to host world-class research labs to boost innovation in life sciences in the world, and to prepare human capital to support the establishment of biotech/biomedical companies in Armenia.
The first large event by ABI was the summer school in genome bioinformatics OMICSS-2021, where 19 participants from Armenia received extensive training in biology, statistics, programming and bioinformatics for seven hours a day over 11 weeks. The school involved self-learning activities, peer-to-peer discussions, individual meetings with mentors, and invited lectures delivered by 36 speakers from 11 countries.
OMICSS-2021 served its purpose well. At its start in June, some of the participants had little to no knowledge in biology, others were new to programming, and none had worked in bioinformatics. But by the end of August, all of the participants were able to analyze the sequencing data of coronavirus genomes and identify the variants prevalent in Armenia.
Today, about 10 graduates of OMICSS-2021 and other students have found their supervisors and mentors in Armenia and abroad, and started conducting research projects. Some of the students are receiving research scholarships to fully devote their time to their projects. The projects span a wide range of topics in molecular biology, including human genome variations, biomolecular networks and pathways in health and disease, machine learning applications to study big biological data, neuroscience, ageing, microbiome, blood cancer biomarkers, gene editing, etc.
ABI plans to formally host several independent research labs established by prominent scientists from around the world. It will soon open its first international lab, supervised by Prof. Hans Binder, the managing director of the Centre for Bioinformatics at Leipzig University in Germany, and the chair of the ABI board. He will spend a few months a year in Armenia to supervise the ongoing projects and to support teaching efforts at ABI. The lab will develop methods and address bioinformatics applications in the fields of molecular medicine and biotechnology.
ABI also provides support to life scientists abroad who need bioinformatics training and data analysis in their research projects. Through its Mentor & Mentee program, ABI pairs a postdoctoral researcher (the Mentor) with a student (the Mentee), whom ABI mentors to acquire the data analysis skills needed for the postdoc’s project. The postdoc acquires bioinformatics skills through interaction with the student and, in turn, engages the student in their own research. ABI has piloted the first Mentor & Mentee program with a postdoctoral researcher at Harvard University and a data science student at the American University of Armenia (AUA). The student has received a scholarship from the ARPA Institute.
Finally, ABI collaborates with industry in several ways, including training specialists for companies’ needs and providing data analysis services for specific projects. These are “baby steps” supporting the entry of biotech and biomedical companies into the country. Eventually, these companies will drive further development of human capital, ultimately incubating spin-off companies in the long run.
ABI has an ambitious vision that will open many doors for the development of the biotechnology and biomedical sectors in Armenia. However, it is crucial to mobilize resources from Armenia and abroad for the successful implementation and growth of the institute. PhD students, postdoctoral researchers and principal investigators in the life and data sciences from around the world can join the initiative, and contribute their time and expertise to scientific discussions, meetups, and educational and research activities. In addition, even though ABI is a non-governmental entity, the government can offer state-funded research and capacity-building grants, as well as join international bioinformatics networks and associations.
It is through the involvement of life science researchers, data scientists, students, donors and supporters from around the world, that we can achieve a transformation of the life science sector to the data-heavy realities of the 21st century.