NEW YORK — A colossal international effort has yielded the first comprehensive look at how our DNA works, an encyclopedia of information that will rewrite the textbooks and offer new insights into the biology of disease.
The findings, reported Wednesday by more than 500 scientists, reveal extraordinarily complex networks that tell our genes what to do and when, with millions of on-off switches.
"It's this incredible choreography going on, of a modest number of genes and an immense number of ... switches that are choreographing how those genes are used," said Dr. Eric Green, director of the National Human Genome Research Institute, which organized the project.
The work also shows that at least 80 percent of the human genetic code, or genome, is active. That's surprisingly high and a sharp contrast to the idea that the vast majority of our DNA is junk.
Most people know that DNA contains genes, which hold the instructions for life. But scientists have long known those genetic blueprints take up only about 2 percent of the genome, and their understanding of what's going on in the rest has been murky.
Similarly, they have known that the genome contains regulators that control the activity of genes, so that one set of genes is active in a liver cell and another set in a brain cell, for example. But the new work shows how that happens on a broad scale.
It's "our first global view of how the genome functions," sort of a Google Maps that allows both bird's-eye and close-up views of what's going on, said Elise Feingold of the genome institute.
While scientists already knew the detailed chemical makeup of the genome, "we didn't really know how to read it," she said in an interview. "It didn't come with an instruction manual to figure out how the DNA actually works."
One key participant, Ewan Birney of the European Molecular Biology Laboratory in Hinxton, England, compared the new work to a first translation of a very long book.
"The big surprise is just how much activity there is," he said. "It's a jungle."
The trove of findings was released in 30 papers published by three scientific journals, while related papers appear in some other journals. In all, the 30 papers involved more than 500 authors. The project is called ENCODE, for Encyclopedia of DNA Elements.
The human genome is made up of about 3 billion "letters" along strands that make up the familiar double helix structure of DNA. Particular sequences of these letters form genes, which tell cells how to make proteins. People have about 20,000 genes, but the vast majority of DNA lies outside of genes.
So what is it doing? In recent years, scientists have uncovered uses for some of that DNA, so it was clearly not all junk, but overall it has remained a mystery.
Scientists found that at least three-quarters of the genome is involved in making RNA, a chemical cousin of DNA. Within genes, making RNA is a first step toward creating a protein, but that's not how RNA is used across most of the genome. Instead, it appears to help regulate gene activity.
Scientists also mapped more than 4 million sites where proteins bind to DNA to regulate genetic function, sort of like a switch. "We are finding way more switches than we were expecting," Birney said.
The discovery of so many switches may help scientists in their search for the biology of disease.
In recent years, researchers have scanned the genome and found thousands of particular DNA sequences that seem to raise the risk of disease. But many of these lie outside of genes, raising the question of how they could have any effect.
The new work found that many of these sequences fall within or near regulatory regions identified by the ENCODE project, suggesting a way they could meddle with gene activity.
Another finding raises questions about just how best to define a gene, researcher Thomas Gingeras of the Cold Spring Harbor Laboratory in New York and colleagues suggest in their report in the journal Nature. The common notion that genes are specific regions of DNA that are separated from other genes "is simply not true," he said.
He and colleagues said it would make more sense to define a gene as a collection of RNA molecules instead of a particular location on the DNA.
Birney said that with the finding of widespread activity across a person's DNA, scientists will be debating how much of it is really crucial to life.
Still, "it's worth reminding ourselves that we are very, very complex machines," Birney said. "It shouldn't be so surprising that the instruction manual is really pretty fearsomely complicated."
Journal Nature: http://www.nature.com/nature