A universal framework for regulatory element discovery across all genomes and data types.

Submitted by als2076 on August 12, 2020 - 10:18am

Title	A universal framework for regulatory element discovery across all genomes and data types.
Publication Type	Journal Article
Year of Publication	2007
Authors	Elemento O, Slonim N, Tavazoie S
Journal	Mol Cell
Volume	28
Issue	2
Pagination	337-50
Date Published	2007 Oct 26
ISSN	1097-2765
Keywords	Animals; Cluster Analysis; Databases, Genetic; DNA, Fungal; DNA, Protozoan; Gene Expression Profiling; Gene Expression Regulation; Gene Expression Regulation, Fungal; Humans; Mice; MicroRNAs; Nucleic Acid Conformation; Oligonucleotide Array Sequence Analysis; Plasmodium falciparum; Regulatory Elements, Transcriptional; Regulatory Sequences, Nucleic Acid; Reproducibility of Results; RNA, Fungal; RNA, Protozoan; Saccharomyces cerevisiae; Sequence Analysis, DNA; Sequence Analysis, RNA; Sequence Homology, Nucleic Acid; Software; Time Factors; Transcription Factors; Untranslated Regions
Abstract	Deciphering the noncoding regulatory genome has proved a formidable challenge. Despite the wealth of available gene expression data, there currently exists no broadly applicable method for characterizing the regulatory elements that shape the rich underlying dynamics. We present a general framework for detecting such regulatory DNA and RNA motifs that relies on directly assessing the mutual information between sequence and gene expression measurements. Our approach makes minimal assumptions about the background sequence model and the mechanisms by which elements affect gene expression. This provides a versatile motif discovery framework, across all data types and genomes, with exceptional sensitivity and near-zero false-positive rates. Applications from yeast to human uncover putative and established transcription-factor binding and miRNA target sites, revealing rich diversity in their spatial configurations, pervasive co-occurrences of DNA and RNA motifs, context-dependent selection for motif avoidance, and the strong impact of posttranscriptional processes on eukaryotic transcriptomes.
DOI	10.1016/j.molcel.2007.09.027
Alternate Journal	Mol. Cell
PubMed ID	17964271
PubMed Central ID	PMC2900317
Grant List	R01 HG003219 / HG / NHGRI NIH HHS / United States; R01 HG003219-04 / HG / NHGRI NIH HHS / United States; P50 GM071508 / GM / NIGMS NIH HHS / United States
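
Illustrative note: the abstract describes scoring candidate motifs by the mutual information between sequence and gene expression measurements. The sketch below shows one simple way such a score could be computed. It is not the authors' implementation. It assumes that expression has already been discretized into cluster labels, that a motif is a plain substring, and that significance is estimated with a basic label-shuffling test. The function names (`mutual_information`, `motif_presence`, `permutation_p_value`) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def motif_presence(sequences, motif):
    """Assumption: a motif is an exact substring; returns 1/0 per sequence."""
    return [1 if motif in seq else 0 for seq in sequences]


def permutation_p_value(xs, ys, n_shuffles=1000, seed=0):
    """Fraction of label shuffles whose score is at least the observed score.

    Assumption: a plain shuffling test, used here only as an illustration.
    """
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if mutual_information(xs, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Hypothetical toy data: promoter sequences and expression cluster labels.
    promoters = ["TTACGTAA", "GGACGTCC", "TTTTTTTT", "GGGGCCCC"]
    clusters = ["up", "up", "flat", "flat"]
    presence = motif_presence(promoters, "ACGT")
    print(mutual_information(presence, clusters))
    print(permutation_p_value(presence, clusters, n_shuffles=200))
```

Counting joint and marginal frequencies directly keeps the score free of any explicit background sequence model, in keeping with the abstract's emphasis on minimal assumptions. A realistic analysis would need many more genes than this toy example for the shuffling test to be informative.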