Title | Large-scale discovery and characterization of protein regulatory motifs in eukaryotes. |
Publication Type | Journal Article |
Year of Publication | 2010 |
Authors | Lieber DS, Elemento O, Tavazoie S |
Journal | PLoS One |
Volume | 5 |
Issue | 12 |
Pagination | e14444 |
Date Published | 2010 Dec 29 |
ISSN | 1932-6203 |
Keywords | Algorithms; Amino Acid Motifs; Computational Biology; Databases, Protein; Eukaryota; Humans; Mitochondria; Phosphorylation; Protein Processing, Post-Translational; Protein Structure, Tertiary; Proteins; Proteome; Proteomics; Saccharomyces cerevisiae; Schizosaccharomyces |
Abstract | The increasing ability to generate large-scale, quantitative proteomic data has brought with it the challenge of analyzing such data to discover the sequence elements that underlie systems-level protein behavior. Here we show that short, linear protein motifs can be efficiently recovered from proteome-scale datasets such as sub-cellular localization, molecular function, half-life, and protein abundance data using an information theoretic approach. Using this approach, we have identified many known protein motifs, such as phosphorylation sites and localization signals, and discovered a large number of candidate elements. We estimate that ~80% of these are novel predictions in that they do not match a known motif in both sequence and biological context, suggesting that post-translational regulation of protein behavior is still largely unexplored. These predicted motifs, many of which display preferential association with specific biological pathways and non-random positioning in the linear protein sequence, provide focused hypotheses for experimental validation. |
DOI | 10.1371/journal.pone.0014444 |
Alternate Journal | PLoS ONE |
PubMed ID | 21206902 |
PubMed Central ID | PMC3012054 |
Grant List | P50 GM071508 / GM / NIGMS NIH HHS / United States; R01 HG003219 / HG / NHGRI NIH HHS / United States; 1DP10D003787-01 / DP / NCCDPHP CDC HHS / United States; 2R01HG003219 / HG / NHGRI NIH HHS / United States |