Scientists mapped out the complete genetic blueprint of humans in 2001 as part of the Human Genome Project. Surprisingly, they found only around 20,000 genes that produce proteins – just twice the number of genes in a common fly. In the twenty years since, advances in technology have allowed researchers to take a closer look at stretches of DNA that could code for proteins. Proteins are the workhorses of the body, and enable our cells and organs to function properly and stay healthy.
Possible proteins
Now, researchers from 20 institutions worldwide have brought together more than 7,200 unrecognized gene segments that potentially code for new proteins. For the first time, the study catalogues stretches of DNA that have been discovered using Ribo-seq, a relatively new technology that looks in detail at the protein-producing machinery in cells to find possible proteins in humans.
Huge step forward
The study was published today in the prestigious journal Nature Biotechnology and was co-led by Dr. Sebastiaan van Heesch, group leader at the Princess Máxima Center, working with colleagues from Germany, the United Kingdom and the United States. Van Heesch: ‘Our research marks a huge step forward in understanding the genetic make-up and complete number of proteins in humans. It’s tremendously exciting to enable the research community with our new catalog. Our work could pave the way for brand-new research avenues into human health and disease, including childhood cancer. It’s too soon to say whether all of the unexplored sections of DNA truly represent proteins, but we can clearly see that something unexplored is happening across the human genome and that the world should be paying attention.’
Enabling further research
The stretches of DNA brought together in the new catalogue are known as open reading frames (ORFs), for the way the genetic information is decoded to make proteins. Until now, newly discovered ORFs were not made easily accessible for other researchers to further study their relevance for human health and disease. The research consortium collected thousands of such ORFs from previous studies and integrated the data into the major human genome and protein databases. The team encourages the wider scientific community to build on their efforts by considering the ORFs in their future research.
What makes us human
Traditionally, scientists have identified protein-coding regions in genes by comparing DNA sequences from multiple species. But this method has a drawback: coding regions that arose relatively recently in evolution have been overlooked, and are therefore absent from research reference databases.
‘It is especially remarkable that most of these 7,200 ORFs are exclusive to primates and might represent evolutionary innovations unique to our species,’ says Jorge Ruiz-Orera, co-author of the new study and evolutionary biologist in the Hübner lab at the Max Delbrück Center in Germany. ‘This shows how these elements can provide important hints of what makes us human.’
Childhood cancer research
The scientists expect that many of the ORFs collected in the catalog will turn out to code for proteins that play a role in human traits and diseases. That also makes the work valuable for research into childhood cancer, says Van Heesch, whose research group focuses on immunotherapy development, particularly for children with cancer. ‘Many of the human-specific proteins are produced during early development, exactly the phase when most childhood cancers arise. And now we have a fuller picture of human-specific proteins, we can distinguish better between ‘normal’ proteins and those unique to childhood cancer – an essential step in developing new immunotherapies.’
Van Heesch co-led the new research together with Dr. Jorge Ruiz-Orera from the Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC) in Germany, Dr. Jonathan Mudge from the European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI) in the United Kingdom, and Dr. John Prensner from the Broad Institute of MIT and Harvard in the United States. The study was supported by a wide range of funding bodies and charities in the participating countries.