The human genome has now been fully sequenced, and the enormous quantity of data contained in DNA molecules makes it possible to analyse relationships between species. Researchers at the Bioinformatics Research Centre (BiRC) are developing methods to deal with these large quantities of data.
A broad perspective and the right methods are needed to obtain reliable results from the data contained in the human genome. Asger Hobolth, Assoc. Prof. at the Department of Mathematical Sciences at Aarhus University and affiliated with BiRC, explains:
“The methods that we have developed are able to scan through the whole genome and in this way give very reliable indications of the relationship between humans, chimpanzees, gorillas and orang-utans, for example. By comparing the DNA from the different species we can identify a number of patterns that can be used in our statistical analyses”.
However, dealing with the large quantities of data is not an easy task:
“The algorithm itself is very simple, otherwise we wouldn’t be able to analyse so much data. The genetics and the statistics are the more complicated parts. The input to the algorithm is a table with 36 states and 2 billion positions, and the output is a number of very large tables that can be used to define the probability of a relationship between two species”, explains Thomas Mailund, Assoc. Prof. at BiRC.
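The article does not spell out the algorithm. As a minimal sketch, assuming it behaves like the forward pass of a hidden Markov model (an assumption, not something stated here), the simplicity Mailund describes could look like this: a single sweep along the positions, with a fixed amount of work per position. The function name, the toy two-state model and all the probabilities below are hypothetical; the real analysis works with 36 states and about 2 billion positions.

```python
import math


def forward_log_likelihood(observations, initial, transition, emission):
    """Scaled forward pass of a hidden Markov model (illustrative only).

    observations: list of observed symbol indices, one per position
    initial[s]: probability of starting in state s
    transition[s][t]: probability of moving from state s to state t
    emission[s][o]: probability that state s emits symbol o
    Returns the log-likelihood of the whole observation sequence.
    """
    if not observations:
        raise ValueError("need at least one observed position")
    n_states = len(initial)

    # First position: combine the start probabilities with the first symbol.
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    scale = sum(alpha)
    alpha = [a / scale for a in alpha]
    log_likelihood = math.log(scale)

    # One sweep over the remaining positions; the cost per position depends
    # only on the number of states, so the total cost grows linearly with
    # the length of the sequence.
    for obs in observations[1:]:
        alpha = [
            emission[t][obs]
            * sum(alpha[s] * transition[s][t] for s in range(n_states))
            for t in range(n_states)
        ]
        scale = sum(alpha)
        # Rescale at every position so the numbers do not underflow
        # on very long sequences.
        alpha = [a / scale for a in alpha]
        log_likelihood += math.log(scale)
    return log_likelihood


# Hypothetical toy model: 2 states and 2 symbols instead of 36 states.
toy_initial = [0.6, 0.4]
toy_transition = [[0.9, 0.1], [0.2, 0.8]]
toy_emission = [[0.7, 0.3], [0.1, 0.9]]
print(forward_log_likelihood([0, 1, 1, 0], toy_initial, toy_transition, toy_emission))
```

The point of the sketch is only the shape of the computation: the loop itself is short, and the difficulty lies in choosing the states and probabilities, which is where the genetics and the statistics come in.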
Running the algorithm on BiRC’s own machines can take up to 100 CPU days. To speed this up, the centre borrows computing power from other research institutes, which cuts the running time to “only” a week.
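To put those figures in perspective, here is a rough back-of-the-envelope calculation. It assumes the work can be split evenly across processors with no overhead, which the article does not claim; the function name is made up for illustration.

```python
import math


def cores_needed(cpu_days, wall_clock_days):
    """Smallest number of processors that finishes `cpu_days` of work
    within `wall_clock_days`, assuming perfect parallel scaling."""
    return math.ceil(cpu_days / wall_clock_days)


# 100 CPU days of work squeezed into one week of elapsed time.
print(cores_needed(100, 7))
```

In practice the speed-up is rarely perfect, so the number of borrowed processors would need to be somewhat higher than this idealised estimate.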
This is why people are different
The original aim of the method was to establish a date for speciation, for example when humans and chimpanzees separated from their common ancestor into independent species. Other methods are available, but the method developed at BiRC can be used to corroborate conclusions reached by other means. The most striking result the method has produced so far is that the human species is very probably only 4-5 million years old, and not 5-7 million as many people have assumed until now.
Mikkel Schierup, biologist at BiRC, explains:
“Using DNA sequencing, we can now say something about why people are different. And we can also say something about the risk of developing different diseases. Similarly, we can look at the DNA information of related species and, in a historical perspective, say something about the original species’ “risk” of developing into a human. And this actually puts us in a position to say something about how natural selection functioned around the time of speciation.”
Once the probability of a relationship between species has been established, the possibilities are wide-ranging, according to Mikkel Schierup. As an example, he mentions a theory that researchers at the Broad Institute, which is a collaboration between MIT and Harvard, have put forward:
“Researchers at the Broad Institute believe that during the phase in which the human and chimpanzee species separated there occurred a certain degree of “hybridisation”, where the two future species continued to interbreed. Eventually, as additional data becomes available, we will be able to use our method to verify this claim.”