Great scientific discoveries, including those that result in solutions for medicine or technology after many years, always begin with basic research. In the case of the first complete sequence of the human genome, applications will also follow, says geneticist Professor Paweł Golik.
In the first days of April 2022, an international group of scientists working in the Telomere-to-Telomere (T2T) consortium published the first fully complete sequence of the human genome. This is a breakthrough achievement; while the Human Genome Project revealed the large majority (92 percent) of the human DNA sequence 20 years ago, 8 percent of it remained a mystery until now.
Professor Paweł Golik, a geneticist and director of the Institute of Genetics and Biotechnology at the University of Warsaw, talks about the importance of this discovery, its potential applications, and why great science begins with the desire to know the world rather than to implement specific solutions.
PAP: Professor, why is the discovery recently described in Science making such waves? The Human Genome Project revealed almost all (92 percent) of the human DNA sequence 20 years ago. Does the remaining 8 percent contribute that much to our knowledge, or will it translate into new applications?
Professor Paweł Golik: I do not agree with the approach that something is important only when it directly and immediately translates into practical applications. Science does not work like that. All great scientific discoveries, including those that after many years actually produce practical solutions for medicine, technology, etc., always begin with basic research. And the main motivation for this research is simply a desire to understand the world, explore it.
This is also the case here, with the human genome project. We do have various future applications in mind, but above all we should focus on the fact that this project began and continued because it was interesting and fascinating, and because it expanded our knowledge about the world. Only from this curiosity, this desire for knowledge, do we sooner or later move to practical applications.
For example, when more than half a century ago Rosalind Franklin, James Watson and Francis Crick began to explore the structure of DNA, they were not thinking about gene therapy, diseases or biotechnology. They were simply interested in what DNA looks like. Everything else came later.
Here, too, we should start from the fact that this is basic research that will translate into various practical solutions in the future, but that is not its direct purpose. And when we report on scientific discoveries to the public, we should not focus only on practical aspects, because this can lead to disappointment. People expect quick development of new drugs or technologies and ask us where they are. That was the case, for example, a decade after the first stage of the human genome project ended in 2001. Colloquially speaking, possible but very distant applications were used to 'sell' that project. As a result, years later people asked where the innovative drugs and the personalized medicine were. All of this does happen, but it takes time.
PAP: Does that mean that the discovery of the missing 8 percent of the genome is so important for cognitive reasons?
Professor Paweł Golik: In principle, this project did not reveal any genes whose counterparts we had not known before, or anything innovative. Despite that, it is an extremely important discovery for human genomics, both cognitively and practically. For many reasons.
PAP: What are those reasons?
Professor Paweł Golik: Firstly, the previous methods of exploring the genome focused on reading the sequence, that is, the order of nucleotides in the DNA molecule, in quite short fragments. From these fragments, researchers then tried to 'put together' whole genomes. What was overlooked and insufficiently understood was the fact that the genomes of multicellular organisms, and thus also of humans, contain a very large number of regions consisting of many identical (or similar) repetitions of certain fragments. Previous methods of genome study did not allow these repeated fragments to be investigated.
This can be illustrated in this way: if we divide something into tiny pieces and then want to put them back together, it turns out that one small piece can fit in many different places, because it repeats in many copies. And in the beginning we do not know how many such copies there are or how they are arranged. The computer algorithms that were used to put together gene sequences from these small fragments could not handle such repetitions. That is why the genome announced 20 years ago contained a lot of gaps in places of repeated sequences.
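The point about small pieces fitting in many places can be shown with a toy sketch. It is not the method used by any genome project: the sequences, the repeat unit and the read lengths below are invented for illustration, and the model ignores read depth and sequencing errors. It only shows that two sequences differing in the number of repeat copies can produce exactly the same set of distinct short fragments, while fragments long enough to span the repeated region tell them apart.

```python
def distinct_reads(genome: str, length: int) -> set[str]:
    """Return every distinct substring ("read") of the given length."""
    return {genome[i:i + length] for i in range(len(genome) - length + 1)}


# Hypothetical toy sequences: unique flanks around a repeated unit.
LEFT, UNIT, RIGHT = "ACGTTGCA", "CATTC", "GGATCCTA"
genome_a = LEFT + UNIT * 4 + RIGHT  # 4 copies of the repeat
genome_b = LEFT + UNIT * 7 + RIGHT  # 7 copies of the repeat

# Short reads (8 letters) never span the whole repeated region.
short_a = distinct_reads(genome_a, 8)
short_b = distinct_reads(genome_b, 8)

# Long reads (30 letters) can reach from a unique flank across the repeats.
long_a = distinct_reads(genome_a, 30)
long_b = distinct_reads(genome_b, 30)

print("short reads identical:", short_a == short_b)
print("long reads identical:", long_a == long_b)
```

In this toy case the short-read sets are identical, so nothing in them reveals whether there are four or seven copies, whereas the long-read sets differ.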
PAP: What is the role of such sequences in the human genome?
Professor Paweł Golik: They are characteristic of, for example, the regions of chromosomes called centromeres. These are the fragments that are needed for the chromosomes to be distributed evenly between daughter cells during cell division.
PAP: It's a kind of chromosome waist?
Professor Paweł Golik: Yes, it is the narrow part. That is where the cytoskeleton fibres that take part in cell division attach and pull the chromosome apart, so that it is distributed to the new daughter cells.
Only now have we fully understood these regions of the genome. Thanks to this, we will be able to better study how exactly the process of cell division works. This, in turn, can be important for our understanding of certain genetic disorders, because there are disorders that are the result of uneven chromosome division. Most of them are very serious defects that occur at the stage of gamete formation and are associated with an abnormal number of chromosomes. But such phenomena also take place in cancer cells. And that is another aspect that we will finally be able to explore.
This genome that we have now obtained is complete, free of gaps.
PAP: What else is groundbreaking in this discovery?
Professor Paweł Golik: Another very important thing is that the earlier model sequence of the human genome was a kind of composite. It was established on the basis of different cells from different people. Consequently, it did not represent any particular genotype of any particular human. This time we have a single, specific genotype. Maybe not from a person, because it comes from a molar pregnancy that occurs when the process of fertilization and embryonic development goes wrong, but it is a fully human genome.
PAP: Why did it take so many years to describe the last 8 percent of DNA sequences?
Professor Paweł Golik: Only in the last decade have techniques emerged that make it possible to sequence DNA not only in the tiny fragments we talked about earlier, but also in much longer ones. This allowed us to discover regions composed of many repeating pieces of DNA.
PAP: Perhaps you could mention the possible implications of this discovery?
Professor Paweł Golik: For example, a consortium has even been established as the next stage of the T2T project. It will study the complete genomes of very different people. We will obtain complete sequences of representatives of different populations and ethnic groups inhabiting different regions of our planet. This is extremely important because although we already have quite extensive databases of genetic variation on Earth, including fossil DNA samples from tens of thousands of years ago, the part that eluded us so far was precisely the variability within these repeated areas. And they also have a very important role in evolution. With these new tools, we will be able to better recreate the history of our species.
PAP: Does that mean that people also differ in these repeated fragments?
Professor Paweł Golik: Yes, we know that there is a lot of variation in these repetitions and their number. Until now, these regions were inaccessible. And now that we have a complete pattern and the techniques to study it, we can understand the entirety of human genetic variation.
I am convinced that it is only in these next stages of the human genome project that spectacular discoveries will come. They will be based on the articles that have just appeared in Science.
PAP: Will we also learn something about the genetic background of various diseases?
Professor Paweł Golik: Not only diseases, but also the human traits that make each of us different. They, too, result from a combination of genes and environmental factors, in roughly equal parts. The matter is additionally complicated by the fact that, in the great majority of cases, one gene does not determine one trait. Many people have a problem understanding this, because at school we learn genetics on a very simple Mendelian inheritance model, where, for example, we have traits such as the colour of pea flowers and one gene whose variants determine whether these flowers are white or red.
And, unfortunately, we then transfer this understanding of genetics to humans and think that there should be one gene that determines, for example, our intellectual or athletic ability, or the tendency to develop diabetes, heart disease, etc.
Meanwhile, it is almost never like that. Yes, humans have monogenic traits, but there are relatively few of them. They are, for example, associated with very rare genetic diseases such as cystic fibrosis and Duchenne muscular dystrophy. The rest of our traits are the result of the action of many genes. Take such a seemingly simple trait as height. We know that it is influenced by at least 1,000 different genes that interact with environmental factors.
We've had a big problem with that so far: even when we were able to identify all the changes in the gene sequence that correlate with a specific trait, we did not even come close to accounting for the share, for example 50 percent, that depends on genetics. There was always a missing piece. We knew it was some genetic variation, but we couldn't find it. Perhaps it is the variability of the repeated sequences that we have now come to know. It eluded us completely because we did not have good tools to study it.
PAP: And now we have a complete model that can be used in further research?
Professor Paweł Golik: And this model is very important. I am convinced that in the coming years we will see many publications in which new genetic factors responsible for various human traits will be revealed, resulting from changes in the number of repeated fragments.
For example, when we study the genome of a representative of an ethnic group to study its prehistory, or the genome of a sick person to compare it with the genome of people who do not suffer from this condition, first we need to find out how a person differs from this model. It can be compared to a jigsaw puzzle: it is much easier for us to solve it when we have a photo of the end result on the box. Then we can compare our pieces with this picture and put them in the right places. The same is true in the case we're talking about: when we don't have a model, this reference sequence, it's like solving a jigsaw puzzle without a picture.
Thanks to this discovery, we have obtained the picture that we can now use to solve our genetic puzzles. There are no more gaps or blank spots. We can now understand human genetic variability much better and more easily, and only once we fully understand it will we be able to think about possible treatments, disease prediction, etc. These things are much further in the future. So we are dealing with a very important discovery that advances our knowledge enormously, but we should not count on its applications appearing in pharmacies in a year or five.
PAP: Are the repeated fragments we are talking about mostly non-coding parts of DNA, or do they contain protein-coding genes?
Professor Paweł Golik: Most of the non-coding DNA fragments are indeed repetitive. These include the aforementioned centromeres, as well as telomeres, the ends of chromosomes. But repetitions also happen in areas that contain genes. There aren't many of them: maybe a hundred out of 20,000 protein-coding genes, but they could turn out to be very important. And again: only now will we be able to find out.
PAP: You mentioned the study of variability between populations living in different regions of the world. I understand that this is important not only for medicine, but also for the history of mankind, anthropogenesis?
Professor Paweł Golik: Yes, because our genomes are known to be different. The genetic material varies from one generation to the next. It reflects the kinship between people, and the history of kinship is the history of migration: from the first people in Africa, through successive generations spreading to all continents.
I should also emphasise that in modern biology we do not use the term 'race', because it is impossible to clearly divide people into 3 or 5 groups.
PAP: Does our genome contain a record of ancestral migration and mixing?
Professor Paweł Golik: One of the articles recently published in Science is even devoted to the evolutionary variability of centromere regions. Until now, they have been completely ignored in analyses of human origins. This publication confirmed something that had been suggested previously by the study of DNA sequences: that humans came from Africa and over time settled on other continents. Thanks to this discovery, we will certainly be able to solve a lot more historical puzzles.
PAP: To sum up: even though 8 percent is not much, it brings a lot of knowledge? Is that how we should think about it?
Professor Paweł Golik: In science, we don't begin with thinking about something practical. We always begin with the will to know. Only with time do specific applications come. Such applications will come here as well: initially in the form of a better understanding of how certain genetic diseases arise, and over time, based on this knowledge, also in better diagnosis and prediction of these diseases. The way to treatment is further still.
PAP - Science in Poland, Katarzyna Czechowicz
tr. RL