Genome-Wide Identification of Coding Small Open Reading Frames: The Unknown Transcriptome

(a. Key Laboratory of Systems Biomedicine (Ministry of Education); b. School of Biomedical Engineering, Shanghai Jiaotong University, Shanghai 200240, China)

Published online: 2014-12-08

Abstract

Identifying the complete repertoire of functional peptides in a cell is essential for a systems-wide understanding of its behavior. Many studies have pursued this goal. However, they routinely overlook a potentially significant portion of their data: sequences that might encode peptides shorter than 100 amino acids. The main reason is technical: it is difficult to distinguish, with statistical significance, a coding sequence of this length from a non-coding one. Recently, a growing number of studies have shown that many peptides encoded by small open reading frames (sORFs) play important roles in a wide range of biological processes. As a result, there is now significant interest in methods for identifying this largely neglected portion of the cellular proteome. In this review, we introduce the currently annotated sORFs and describe the new strategies that have been used to identify coding sORFs genome-wide.

Cite this article

LI Hong-mei^a (李红梅), HU Chuan-sheng^a (胡传圣), BAI Ling^b* (白玲). Genome-Wide Identification of Coding Small Open Reading Frames: The Unknown Transcriptome [J]. Journal of Shanghai Jiaotong University (Science), 2014, 19(6): 663-668. DOI: 10.1007/s12204-014-1563-x

