For nearly two decades, researchers at HudsonAlpha Institute for Biotechnology, along with collaborators at Caltech, Penn State and UC Irvine, have been trying to understand how the human genome functions. As members of the Encyclopedia of DNA Elements (ENCODE) Project, a worldwide effort, they have been performing experiments and data analysis to elucidate the importance of the 98% of the genome that does not code for proteins.
During phase three of ENCODE, consortium researchers drew closer to their goal of developing a comprehensive map of the functional elements of human and mouse genomes by adding to the database millions of candidate DNA switches that regulate when and where genes are turned on.
Results, including a manuscript from the Myers and Mendenhall labs at HudsonAlpha, were published July 30 as a nine-manuscript compendium in the scientific journal Nature, accompanied by 21 additional in-depth studies published in other major journals.
All of our cells (with a few exceptions) contain the same two full copies of our genome. However, for the more than 200 different cell types in our body to perform different functions, not all of the genes can be active in every cell type. Transcription factor proteins control gene expression by binding to short stretches of DNA that serve as on/off switches for genes.
As part of the ENCODE 3 Project, HudsonAlpha researchers, along with their collaborators, performed the largest study to date of transcription factors expressed at normal levels.
“Understanding the genomic targets of transcription factors is vitally important to understand many aspects of biology, including gene regulation, development, and to help identify the biological mechanisms of many diseases and disorders,” said Chris Partridge, senior scientist at HudsonAlpha and co-first author of the study, along with Ph.D. student Surya Chhetri.
By analyzing such a large group of transcription factors, the researchers were able to identify novel associations, elaborate on their spatial interactions on DNA, and distinguish between those that interact with promoters and those that interact with enhancers in the genome.
A hallmark of the project is the complete and rapid open-access availability of data generated by members of the ENCODE consortium, which has led to more than 2,000 publications from researchers who used ENCODE data.
Rick Myers, whose lab has been a member of the ENCODE Consortium since its inception in 2003, said he is proud of the long-term and wide-reaching collaborative nature of the project.
“One thing we learned working on the Human Genome Project is that huge endeavors like the ENCODE Project work much more efficiently when research groups coordinate their efforts, particularly because ENCODE’s charge is to generate a resource of data, materials and results that are meant to be used by the entire research and biomedical community,” said Myers, who is president, science director and M.A. Loya Chair in Genomics at HudsonAlpha. “Another thing we recognized early on is that making the data freely available to everyone on a weekly basis, prior to publication and with no strings attached, allows researchers everywhere to make advances in their research much faster than would otherwise be possible.”
This collaborative effort included HudsonAlpha researchers Mark Maciewicz, Kim Newberry, Sarah Meadows, Laurel Brandsmeier, Megan McEown, Amy Nesmith, Rosy Nguyen, Luke Messer, Ryne Ramaker, Andrew Hardigan and Dan Savic.
To learn more about ENCODE, listen to Myers talk on a podcast released by Nature.