Nature Reviews Genetics | AOP, published online 28 January 2014; doi:10.1038/nrg3645

PERSPECTIVES A P P L I C AT I O N S O F N E X T- G E N E R AT I O N S E Q U E N C I N G — I N N OVAT I O N

Ribosome profiling: new views of translation, from single codons to genome scale Nicholas T. Ingolia

Abstract | Genome-wide analyses of gene expression have so far focused on the abundance of mRNA species as measured either by microarray or, more recently, by RNA sequencing. However, neither approach provides information on protein synthesis, which is the true end point of gene expression. Ribosome profiling is an emerging technique that uses deep sequencing to monitor in vivo translation. Studies using ribosome profiling have already provided new insights into the identity and the amount of proteins that are produced by cells, as well as detailed views into the mechanism of protein synthesis itself. Gene expression patterns underlie the diverse fates that are adopted by genetically identical cells in an organism, and regulated expression changes allow these cells to maintain homeostasis and respond to the environment. Genome-wide expression profiling has thus emerged as a key tool for examining cellular physiology and for studying the molecular basis of the regulation of gene expression1. Most expression profiling is carried out either by microarray or, more recently, by mRNA sequencing (RNA-seq), both of which report on the transcripts that are present in cells2. Analyses of mRNA abundance have revealed diverse transcriptional gene expression programmes that have informed about almost every area of biology. Transcript abundance is often a proxy for the synthesis of a protein that actually provides a molecular function. However, the correlation between mRNA and protein levels is frequently poor, which is partly due to the regulation of translation. Direct analyses of translation would provide a more accurate and complete measure of gene expression in the cell than the picture that is presented by analysing mRNA levels alone. These measurements would reveal post-transcriptional gene expression

programmes and permit the dissection of underlying regulatory mechanisms. Microarrays and RNA-seq can be adapted to specifically profile ribosome-bound mRNAs, which provide some information about their translation, but none of these approaches indicate the regions within a transcript that are actually translated. Obtaining this information requires an experimental approach to catalogue the full range of polypeptides that are expressed from the genome. Such an approach would also provide a clearer view of how the process of translation, which has been studied extensively in reconstituted systems, works in living cells. Ribosome profiling has emerged as a technique that takes advantage of the remarkable recent advances in sequencing technology to provide global measurements of translation that are as precise and detailed as data derived from RNA-seq3. In this Innovation article, I first describe the experimental basis of ribosome profiling and the interpretation of the sequencing data that it generates. I then survey results from recent studies (BOX 1). Ribosome profiling has been used in three broad ways: to identify translated sequences within the complicated mammalian transcriptome, to

NATURE REVIEWS | GENETICS

monitor the process of translation and the maturation of nascent polypeptides in vivo, and to measure profiles of cellular protein synthesis. I finish by highlighting a few of the most promising future applications and adaptations of this technique. Genome-wide ribosomal profiling The emergence of mRNA profiling by microarray permitted early global analyses of translation, and such profiling is carried out by comparing ribosome-bound mRNAs, which are undergoing active translation, with the total mRNA pool in the cell4. This separation between translated and non-translated mRNAs is qualitative, but it can be made more quantitative by using ultracentrifugation to separate transcripts into several fractions on the basis of the number of ribosomes bound to each of them5,6. Broad application of this polysome profiling approach is hindered by the technical difficulty of polysome fractionation, which does not cleanly resolve the fractions that have more than a few ribosomes per transcript, as well as by the need to collect and analyse many fractions per sample. Furthermore, analyses of polysomal RNA cannot indicate the exact position of ribosomes on an mRNA; at best, they count the typical number of ribosomes that are bound anywhere on the transcript. Ribosome profiling emerged to address these limitations in analysing translation from intact polysomes.

Mapping ribosomes on mRNAs by footprinting. Nuclease footprinting has long been recognized as a way to determine the positions of ribosomes on the mRNAs that they are translating 7. Cycloheximidetreated ribosomes physically enclose 28–30 nucleotides of the transcript, which shields this region from nuclease digestion8. These protected RNA fragments indicate the exact location of the ribosome. In early studies, neither microarrays nor RNA-seq was available, and the resulting fragments were analysed either by direct internal radiolabelling of the mRNA that is being translated or by primer extension, both of which could be applied only to the footprints from homogeneous in vitro translation of a single ADVANCE ONLINE PUBLICATION | 1

© 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES transcript. Despite this limitation, ribosome footprinting enabled important early studies to map the locations of translation initiation sites for the first time and to identify the positions where the translation of secreted proteins stalled before membrane translocation7,8. Ribosome profiling uses high-throughput sequencing to characterize the complex pool of footprints that result from ribosomes translating in vivo3 (FIG. 1). Deep sequencing experiments currently read the nucleic acid sequences of tens of millions of individual template molecules in parallel and can measure the abundance of different templates in a complex mixture by counting the number of times that each individual sequence is read. Such read counting underlies RNA-seq expression measurements, in

which the abundance of an mRNA determines the number of sequencing reads that align to it 2. In ribosome profiling, the template molecules each reflect the position of one ribosome rather than the arbitrary fragments that constitute the template molecules in RNA-seq; therefore, ribosome profiling measures the number of ribosomes that are translating the mRNA in vivo instead of the abundance of the transcript in the cell. Analysing individual ribosomes avoids many of the challenges of polysome fractionation and provides the positions of ribosomes on the transcript. The similarities between ribosome profiling and RNA-seq for analysing gene expression allow the use of well-established bioinformatic tools with ribosome profiling data. Ribosome footprint sequences can

Box 1 | Major observations from diverse applications of ribosome profiling Annotation of translated sequences • Extensive upstream initiation at non-AUG codons in yeast3 • Extensive upstream initiation at non-AUG codons in mammalian cells and translation at the 5ʹ end of some long interspersed non-coding RNAs (lincRNAs)14,21,22 • Many upstream open reading frames (uORFs) that start at AUG and non-AUG codons in yeast, the translation of which is induced by meiosis19 • Diverse short reading frames in human cytomegalovirus29 Mechanisms of protein synthesis • Slow elongation on the first 30–40 codons of most genes in yeast3 • Slow decoding and/or slow tRNA pairing at wobble codons in humans and Caenorhabditis elegans41 • Trigger factor binds at ~120 codons downstream from the translational start site1 • Internal Shine–Dalgarno sequences stall bacterial elongation39 • Oxidative stress induces an arrest at ~50 codons downstream from the translational start site in yeast49 • Heat shock induces an arrest at ~50 codons downstream from the translational start site in mouse cells48 • Protein misfolding induces an arrest at ~50 codons downstream from the translational start site in human cells50 • Sequential and coordinated co‑translational protein folding52 • Codon adaptation, mRNA structure and amino acid charge all correlate with elongation speed42 • Slowed translation downstream of clusters of positive residues45 • Translation initiation is rate-limiting except in amino acid starvation15 Translational control of gene expression • MicroRNAs (miRNAs) affect both mRNA abundance and translation of target genes61 • Translational repression of ribosome biogenesis and increased translation of membrane proteins as embryonic stem cells differentiate into embryoid bodies14 • miRNAs first repress translation and then promote mRNA turnover62 • Active site mammalian target of rapamycin (mTOR) inhibitors translationally repress ribosome biogenesis57 • Just‑in‑time translational induction in yeast meiosis and sporulation; translational control by transcriptional changes in mRNA 5ʹ leaders19 • In C. elegans, miRNAs simultaneously decrease mRNA abundance and translation63 • Only uORFs that start at AUG codons repress translation16 • Out‑of‑frame internal initiation induces nonsense-mediated mRNA decay16

2 | ADVANCE ONLINE PUBLICATION

be mapped onto the transcriptome using splicing-aware alignment tools, such as TopHat 9, after removing linker sequences that are added during library generation. Statistical models that have been developed for RNA-seq data, which are implemented in tools such as DESeq10 and edgeR11, can test for differential expression in translational data. However, ribosome profiling data do differ from RNA-seq data in their distribution across the transcript (see below) and contain additional information that cannot be extracted by tools that are designed for RNA-seq. Bioinformatic tools that are specifically designed for ribosome profiling data have started to emerge12,13, and their further development is an important frontier for the technique. Interpreting ribosome occupancy profiles. Ribosome profiling data (FIG. 2a) contain detailed information about in vivo translation and delineate the regions of the transcriptome that are actually translated. Not all parts of an mRNA yield ribosomeprotected fragments3. Footprints mainly cover coding DNA sequences (CDSs; also known as protein-coding sequences)) within the mRNA and are almost entirely absent from the 3ʹ untranslated regions (UTRs) further downstream. The patterns of footprints around the start and stop codons — which are derived from ribosomes initiating and terminating translation, respectively — calibrate the exact position of the active codon within the larger footprint. Ribosome positions can be determined with subcodon, single-nucleotide precision, although individual footprints show some variation in length, which probably results from incomplete nuclease digestion. Much of the information in a ribosome profiling experiment is contained in the footprint density, which varies between different genes and across a single mRNA. The average ribosome density on an mRNA species provides an estimate of the synthesis level of the encoded protein, provided that the speed of translation elongation is similar across the genes and conditions that are being investigated. Each translation initiation event results in the production of one protein molecule, but the time required to synthesize that molecule is roughly proportional to the length of the CDS. Longer CDSs will have more ribosomes engaged in their translation than shorter CDSs that produce the same amount of protein, but the density on these CDSs will be the same. Furthermore, footprint density is proportional to the total number of ribosomes that www.nature.com/reviews/genetics

© 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES translate the mRNAs of a gene, and ribosome profiling alone can thus account for changes in total protein synthesis that result from differences in either mRNA abundance or translation (FIG. 2b) provided that the speed of elongation does not change. Ribosomes spend different lengths of time at different positions within a reading frame, which leads to variation in footprint density across a transcript 14. As discussed below, measurements of elongation speed can provide mechanistic insights into translation in vivo. However, differences in elongation speed may also confound expression measurements by uncoupling ribosome density from the initiation rate. Fortunately, independent lines of evidence suggest that averaging occupancy across an entire gene corrects for the consequences of local variation14. Detailed models of elongation, which are partly constructed from profiling data, indicate that initiation rate rather than elongation speed is the major determinant of protein production, at least in log-phase yeast 15. Furthermore, changes in elongation speed between different conditions are likely to manifest as shifts in ribosome density that can be detected by profiling (see below). Limitations of ribosome profiling. Ribosome profiling data comprise the sequences of RNA fragments that are protected by the ribosome and, similarly to other sequencingbased assays, such data therefore depend on the mapping of these fragments to the genome2. As ribosome-protected fragments are fairly short, it is particularly challenging to deconvolve repetitive sequences or alternative transcripts in these data, and longer or paired-end sequencing reads cannot be used. Furthermore, by focusing on individual ribosomes and their footprints rather than on entire transcripts, ribosome profiling does discard some information that is found in polysome profiles. Nuclease digestion degrades the 5ʹ and 3ʹ UTRs of transcripts, which may contain regulatory information. There are cases in which transcript variants with different 5ʹ and 3ʹ UTRs affect translation, and ribosome profiling cannot deconvolve these transcripts when multiple variants are present simultaneously 16. Footprinting also separates the ribosomes that constitute a polysome and obscures the presence of distinct mRNA subpopulations that are translated at different levels and are therefore occupied by different numbers of ribosomes. Ribosome footprints are generated by nuclease digestion in cell lysates. Similarly

to many other molecular profiling techniques, ribosome profiling data can therefore be distorted by changes in cellular physiology that arise during harvesting and lysis. As translational activity can change rapidly relative to the total abundance of mRNA or protein, ribosome profiling may be particularly sensitive to artefactual and biological differences. Ribosome profiling measurements of translation should be comparable to mRNA abundance profiling, which is a more familiar measure of gene expression. However, both of these techniques focus on the current rate of protein production and not on the total abundance of a protein. Translation and proteomics. Unbiased proteomic measurements have advanced remarkably in recent years and provide information that complement the strengths and the limitations of ribosome profiling. Mass spectrometry is the easiest way to measure total protein abundance, whereas translation directly corresponds to protein synthesis, which is the change in protein abundance over time17. Metabolic pulse labelling permits measurements of synthesis and decay rates, but these experiments are even more demanding than protein abundance profiling. Perhaps more importantly, mass spectrometry cannot currently match the depth or the breadth of coverage that is possible in nucleic acid sequencing experiments. Annotating translated sequences Ribosome-protected fragments indicate the translated regions within a complex mammalian transcriptome. Comprehensive ribosome profiling can both reveal the translation of unanticipated protein products and reflect translation that has a mainly regulatory rather than biosynthetic purpose (FIG. 3).

Translation initiation and reading frames. The results of profiling studies so far suggest that translation elongation and termination rarely deviate from the triplet genetic code, although previously known exceptions — including ribosomal frameshifting and stop codon readthrough — have been seen from profiling data3,14,18. By contrast, the position of translation initiation seems to be much more heterogeneous than previously appreciated. Profiling has revealed that ribosome occupancy, particularly on the 5ʹ leaders of transcripts, cannot be explained by initiation that always takes place at the first AUG codon on an mRNA or even by initiation

NATURE REVIEWS | GENETICS

Cytoplasm Nucleus In vivo translation

Ribosome

RNase digestion

mRNA

Nascent polypeptides

Footprint recovery Ribosome footprints Linker ligation and reverse transcription

Circularization PCR

Deep sequencing

Figure 1 | Ribosome footprint profiling.  RNase digestion of polysomes thatReviews are carrying out Nature | Genetics translation in vivo yields ribosome-protected mRNA fragments, which are known as ribosome footprints. These footprints are recovered and converted into a DNA library through ligation of a linker followed by reverse transcription and circularization PCR. cDNA libraries are then analysed by deep sequencing. Figure is reproduced, with permission, from REF. 65 © (2012) Macmillan Publishers Ltd. All rights reserved.

that occurs exclusively at AUG codons3. Similar patterns of upstream translation and non-canonical initiation have been found in yeast, in zebrafish embryos and in cultured mammalian cells3,14,19,20. Furthermore, initiation is modulated over the course of meiosis in yeast, probably in response to other changes in cellular physiology; some of these changes in upstream initiation probably contribute to translational regulation during the meiotic time course19. As translation seems to be processive and predictable following initiation, ADVANCE ONLINE PUBLICATION | 3

© 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES mapping the sites where translation starts is informative for predicting translated reading frames. The pattern of ribosome occupancy on a transcript often implies the exact sites of translation initiation3; however, multiple overlapping reading frames and substantial initiation at non-AUG codons complicate this analysis. Adaptations of ribosome profiling using drugs that target the ribosome have been developed to overcome these issues. Treatment with

either harringtonine14 or lactimidomycin21 immobilizes initiating ribosomes without affecting elongation, thereby causing ribosomes to accumulate specifically at start sites. Ribosome profiling detects footprint enrichment at these sites, which indicates the initiation codons on mRNAs and thereby reveals the translated reading frames. Similarly, high doses of puromycin cause rapid, premature translation termination, which leaves ribosome footprints only

a

Polysomes

Ribosome footprints

Ribosome profiling data

mRNA

Protein-coding region

b

Fewer mRNAs

Fewer ribosomes per mRNA

Or

Fewer ribosome footprints

Figure 2 | Analysis of ribosome occupancy data.  a | The positions of ribosomes on many copies of an mRNA in a cell population are converted into nuclease-protected footprints that correspond Nature Reviews | Genetics to ribosome positions in vivo. Ribosome profiling data indicate the density of ribosomes at each position on the transcript; the thick line represents the protein-coding sequence and the thin lines represent the untranslated regions. b | The amount of protein synthesized corresponds to ribosome density across the protein-coding sequence of a transcript. Ribosome footprints indicate the total number of ribosomes that are engaged in synthesizing a protein, which is determined by the number of mRNAs in the cell and by the number of ribosomes that translate each transcript. Changes in either mRNA abundance (for example, having fewer mRNAs) or ribosome loading (for example, having fewer ribosomes per mRNA) will affect the number of ribosome footprints that are recovered from the gene.

4 | ADVANCE ONLINE PUBLICATION

at and after sites of initiation22. Each of these three drugs, which act in mechanistically distinct ways, indicates a similar pattern of widespread, but often inefficient, initiation at a range of non-AUG codons in cultured mammalian cells. Initiation is strongly biased towards the 5ʹ ends of transcripts and results in the translation of known reading frames of protein-coding genes, extended and truncated forms of these CDSs, and short upstream open reading frames (uORFs) that are likely to have regulatory roles. Different studies have attributed initiation at non-AUG codons either to imprecise start codon recognition by the initiator tRNA23 or to distinct, non-canonical initiation mechanisms24. Ribosome profiling provides a powerful tool for studying the global effect of these possibilities. Alternative protein isoforms. When two translation initiation sites lie in the same reading frame, they yield an extended isoform and a truncated isoform of the same protein. These alternative protein isoforms can either localize to different cellular compartments or show distinct and even opposing molecular functions. Initiation profiling has revealed hundreds of genes with extended or truncated protein products that start at AUG or non-AUG codons14,21,22. The alternative amino termini of many of these predicted protein isoforms were subsequently detected by mass spectrometry in yeast and in mammalian cells25,26. These probably reflect a combination of standard initiation at the first AUG codon of an alternative mRNA variant 16,19,27 and atypical initiation at other codons or positions on an mRNA19,28,29. Regardless of whether diversity is produced from differences in transcription, RNA processing or translation initiation, ribosome profiling data indicate the proteins that are produced by the cell, which are the primary functional products of gene expression. Novel translated sequences. Much of the translational heterogeneity that has been revealed by ribosome profiling occurs in entirely distinct, previously unrecognized reading frames, rather than in alternative versions of annotated protein-coding genes3,14,19,21,22,29. These reading frames lie in 5ʹ transcript leaders, in alternative reading frames that overlap known genes and in some long interspersed non-coding RNAs (lincRNAs) that have no conserved protein-coding potential. This surprising observation raises the question of whether the protected RNA fragments www.nature.com/reviews/genetics

© 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES that are detected in these regions are truly footprints of 80S ribosomes or whether they are artefacts, such as background that arises from other RNAs. However, apparent ribosome footprints outside coding sequences are strongly biased towards the 5ʹ ends of RNAs and are associated with AUG codons, both of which suggest a link to translation14,19,29. Further studies show that footprints of both 5ʹ leaders and lincRNAs physically co‑purify with an affinity-tagged 80S ribosome, and that these footprints otherwise resemble ribosome footprints on CDSs rather than background non-coding RNAs (N.T.I. and J. Weissman, unpublished observations). Peptides have recently been identified from some of these reading frames by mass spectrometry, which provides an entirely independent line of evidence for productive translation29–31. However, the functions of the proteins that are synthesized from these reading frames remain unclear. Small proteins that are under the ~100-amino-acid cutoff (which is often used to delineate coding sequences) can adopt specific structures and carry out cellular functions; genetic evidence supports the function of proteins as short as 11 amino acids, and some of these other sequences probably also encode functional micropeptides32. However, some lincRNAs seem to be translated to some extent despite strong evidence that they do not encode functional proteins. Consistent with the idea that this translation generally does not produce proteins with molecular activities that are useful to the cell, translation of non-coding RNAs occurs in several overlapping reading frames, whereas protein expression requires specific and efficient translation of one single product 20,33. The pattern of translation that is seen on some lincRNAs, which can be detected by metrics such as the ribosome release score20,33, may reflect the default level of translation that occurs near the 5ʹ ends of cytosolic RNAs. Indeed, patterns of ribosome occupancy on many zebrafish lincRNAs resemble those on the 5ʹ leaders of coding transcripts20, and these lincRNA regions have been described as lincRNA uORFs34. However, even when translated sequences show no indication of specific functions, elongating ribosomes will synthesize the encoded polypeptide. These non-functional polypeptides are likely to be degraded quickly, but residual products may aggregate and contribute to the burden of misfolded proteins in the cell35. Furthermore, in vertebrates, protein turnover is the source of cellular antigens

a

AUG

Stop

Protein-coding region

uORF AUG AUG Stop Overlapping uORF

A UGG CUA ... ACC AUG GCG ... CUU GAG ...

AUG AUG

Stop

Alternative reading frame ..A UGG CUA ... CUU GAG ...

AUG AUG Truncation AUG GCG ... ACC AUG GCG ...

AUG

AUG

Extension AUG GCG ... ACC AUG GCG ...

b

AUG

AUG

AUG

AUG

Alternative translation of a canonical mRNA Canonical initiation on an alternative mRNA isoform

Figure 3 | Alternative reading frames.  a | Alternative sites of translation initiation may be found Nature Reviews | Genetics in the 5ʹ transcript leader and may initiate translation of a short upstream open reading frame (uORF, shown in orange). These uORFs can overlap the start codon of the protein-coding gene, which leads to out‑of‑frame translation that extends into the coding sequence. Alternative reading frame translation can also begin within the protein-coding gene. An in‑frame start codon within a gene can lead to a truncated protein, whereas an in‑frame start codon upstream of a gene without an intervening stop codon can produce an extended protein product. The shaded regions represent the alternative mRNA sequences that are being translated. b | Alternative translation start sites may result either from non-canonical initiation, such as ‘leaky scanning’ past the first start codon, or from canonical initiation on alternative mRNA variants.

that are displayed to the adaptive immune system; therefore, non-functional and noncanonical products may still contribute to immunity and autoimmunity 24. Finally, over the course of evolution, one protein product in a transcript may acquire a more specific molecular function, which may be a path for the birth of novel genes36. Regulatory upstream translation. Translation can also exert direct regulatory effects that are independent of protein synthesis. By capturing scanning ribosomes, uORFs decrease the translation levels of downstream protein-coding genes23,37. Ribosome profiling data confirmed that the presence of upstream translation correlated with lower levels of translation on proteincoding sequences across endogenous transcripts16,27. Translation of alternative reading frames also correlated with lower levels

NATURE REVIEWS | GENETICS

of mRNA stability, probably as a result of nonsense-mediated decay 16 that is triggered by termination after translation of these reading frames, which are typically short and are located near the 5ʹ ends of mRNAs. Destabilizing out‑of‑frame translation can be detected on protein-coding transcripts that have inefficient initiation at the canonical start codon16. Protein synthesis in vivo Ribosome profiling also provides insights into the mechanism of protein synthesis and co‑translational processing as they occur in living cells. Analysis of translation by ribosome profiling is conceptually similar to the study of in vivo transcription by chromatin immunoprecipitation (ChIP). Profiles of ribosome occupancy across mRNAs indicate global and gene-specific features of translational speed (FIG. 4a), similarly to how ADVANCE ONLINE PUBLICATION | 5

© 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES RNA polymerase ChIP reveals transcriptional pausing. Subpopulations of ribosomes can also be purified (for example, on the basis of the binding of specific accessory factors) and profiled. The footprints from these ribosomes provide information about a specific time in translation, including

a

the identity of the mRNAs that were being translated, the parts of the transcript that were being decoded and the portion of protein that has been synthesized (FIG. 4b). These data link molecular changes in the translational machinery with the production of specific proteins.

mRNA

Ribosome

Pausing

Time

Ribosome footprints

Ribosome profiling data

b Nascent polypeptide

Folding chaperone

Total footprints Pulldown footprints Total ribosome profiling data Pulldown ribosome profiling data

Figure 4 | Ribosomal pausing and co‑translational processes.  a | Ribosomal stalling causes an Nature Genetics accumulation of ribosome footprints at a specific position within a gene and thusReviews a peak of| ribosome footprint density. b | Footprints of ribosomes can be purified by pulldown assays through their association with co‑translational protein biogenesis factors, such as folding chaperones, which indicate the time during translation and the position in the protein at which these factors act. 6 | ADVANCE ONLINE PUBLICATION

Ribosome stalling and the speed of elongation. Transcript structure, tRNA abundance and protein sequence all have the potential to affect the speed of translation, but it has proved challenging to understand their effects in vivo. The effect of synonymous codons has been particularly controversial38. Ribosome profiling now allows direct, highresolution measurements of ribosome speed on endogenous genes in living cells, as the density of footprints at different points in the same coding sequence indicates the relative time that is spent by the ribosome at each codon. Slow translation manifests as increased ribosome density. Using ribosome profiling, ribosome occupancy and thus decoding speed in bacteria varied by only twofold across codons and did not correlate with genomic codon preferences39. Similar results were obtained in yeast, and the authors argued that slow tRNA recycling could explain constant decoding times in the face of large, but matched, codon usage biases and tRNA abundance40. Profiling in Caenorhabditis elegans and human cells indicated that decoding time is not affected by tRNA levels but is instead slowed by wobble pairing (that is, G·U pairing) in codon– anticodon recognition41. One topic of current debate is the possible presence of a region of slow translation at the beginning of genes42,43. Profiling data seem to suggest the existence of such a translational ‘ramp’ (REF. 43), but other models of translation, which are partly based on ribosome profiling data, suggest that the data are better explained by higher levels of translation of short genes, which contribute disproportionately to averages near the beginning of the ramp15. Future profiling will no doubt continue to illuminate both the factors that control the speed of elongation and the effect of codon choice on translation. In fact, recent work shows that footprinting can be adapted to distinguish between ribosomes in different stages of the translation elongation cycle and thereby determine more precisely how the ribosome spends its time during protein synthesis (L. Lareau, D. Hite and P. Brown, personal communication). The ribosome is capable of synthesizing a wide range of polypeptides that differ markedly in their biophysical properties. However, certain nascent proteins cause marked pauses in translation, which can regulate expression and promote the proper folding and targeting of nascent proteins8. Ribosome profiling provides a broad survey of these pauses, independent of their function, that can complement detailed dissection of individual stalling peptides with known cellular www.nature.com/reviews/genetics

© 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES

Measuring translational regulation Gene expression profiling is used pervasively in studying the molecular basis of cellular physiology, and perhaps the broadest application of ribosome profiling will be in measuring the ultimate levels of protein synthesis

More mRNA mRNA abundance (RNA-seq)

More protein made

Protein synthesis (ribosome profiling)

More protein made

Translation and protein biogenesis. Newly synthesized proteins must fold correctly to function in the cell. Protein folding is aided by a network of chaperone proteins, which is intimately linked to the synthesis of new proteins47. Environmental stresses can compromise folding of new and existing proteins, and cells respond to reduced folding capacity by decreasing protein synthesis levels. Recent profiling studies revealed that heat shock48, oxidative stress49 and chemically induced misfolding 50 all attenuate translation by restricting elongation. Yeast and mammalian cells respond to folding stress by stalling translation at 30–60 codons from the start of every gene. Intriguingly, the arrest occurs at the point in translation when the nascent peptide first emerges from the ribosome, which could either interact with co‑translational chaperones or recognize their absence. Ribosome profiling has been adapted to monitor the normal action of co‑ translational chaperones and to measure the consequence of their disruption. One study 51 profiled the complexes of ribosomes and nascent polypeptides that crosslinked to the bacterial chaperone trigger factor. Footprints from these ribosomes began to appear after ~120 codons, which showed that trigger factor engaged nascent proteins well after their emergence from the ribosome. Ribosomes that translate outer membrane proteins were strongly enriched by trigger factors, which led to the discovery of trigger factor phenotypes that are related to compromise of the bacterial outer membrane. Another study 52 took a similar approach to measure folding of the nascent chain itself, which identified some structures that formed progressively in emerging proteins and others that formed in a concerted manner. Further ribosome profiling of these co‑translational processes promises to greatly sharpen the resolution of the picture that has emerged from bulk mRNA profiling.

Translational induction

Transcriptional induction

Protein synthesis (ribosome profiling)

roles14,39. Many pause sites in bacteria and in mammalian cells are associated with prolinerich motifs44, and positively charged amino acids can also slow the ribosome42,45. Other programmed pause sites probably reflect more complicated and idiosyncratic interactions between the nascent peptide and the exit tunnel, which is ~30 amino acids long 46.

Same mRNA abundance mRNA abundance (RNA-seq)

Figure 5 | Deconvolving transcription and translation.  Matched ribosome profiling and mRNA sequencing (RNA-seq) data distinguish between alternative modes of gene expression Nature Reviews regulation. | Genetics Transcriptional induction results in matched increases in mRNA abundance, as measured by RNA-seq, and protein synthesis, as measured by ribosome profiling. Translational induction manifests as an increase in ribosome profiling measurements without a corresponding change in mRNA abundance.

after all layers of transcriptional, posttranscriptional and translational control. Clearly, the accurate assessment of translation is important for studying the mechanisms of its regulation. It is also valuable for mapping gene expression programmes, as it captures a more complete picture of the cellular response under investigation than transcript levels from standard gene expression profiling do. Translational expression programmes. The logic underlying many profiling experiments is that the coordinated expression changes that happen in the course of a biological process will reveal functionally important genes and pathways. However, in most cases, the real effect of a gene expression programme results from changes in protein levels, which are often due to coordinated regulation of transcription, mRNA processing and translation. Ribosome profiling accounts for all of these layers of regulation and reveals how the cell spends its limited biosynthetic budget. The power of ribosome profiling to identify expression programmes is exemplified by studies of meiosis time course in yeast, in which analyses of co‑regulated genes revealed novel meiotic factors and showed a substantial role for translational regulation in ensuring the expression of meiotic proteins at the specific times that they are needed19. Similar analyses should shed light on genes that are involved in many other normal and pathological responses in which translational control features prominently, which range from cellular stress responses to learning and memory 53–55.

NATURE REVIEWS | GENETICS

Comprehensive analyses of gene expression programmes can also provide a starting point for identifying the regulators that mediate the response, which may in turn lead to further information about upstream signalling pathways. Regulation must result from some features of the target transcripts that distinguish them from unaffected ones. Translational control may depend on the presence of sequence motifs, such as the 5ʹ terminal oligopyrimidine (5ʹ‑TOP) tract that has long been recognized in mRNAs that are involved in ribosome biogenesis56,57. Ribosome profiling studies revealed that all transcripts with translation that responds to the activity of the mammalian target of rapamycin (mTOR) kinase pathway — a key regulator of cell growth and proliferation — contain some variant of this 5ʹ‑TOP motif 56,57. In these studies, the role of the 5ʹ-TOP motif was already suspected; one challenge in future studies of this kind is in determining the mRNA features to consider and how these features correspond to potential regulators. Mechanisms of regulation. Ribosome profiling also offers new insights into the mechanisms of translational control that may help to address our limited understanding of the types of mRNA sequence features that specify regulation. Measuring translational changes that result from the disruption of RNA-binding proteins can reveal their targets, and cross-referencing these affected mRNAs with data from emerging techniques for mapping sites of occupancy of RNA-binding proteins will more specifically ADVANCE ONLINE PUBLICATION | 7

© 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES indicate direct regulatory interactions58. Recent work has identified a wide range of RNA-binding proteins that associate with specific subsets of mRNAs, which hints at a broader world of translational regulators to be explored59,60. These regulators may affect the stability and the translation of their target transcripts. Parallel ribosome profiling and RNA-seq measurements (FIG. 5) can disentangle changes in mRNA abundance from translational control, which has already provided insights into the magnitude and the timing of gene repression, as well as into mRNA decay in microRNA-mediated regulation61–63. Conclusions Our understanding of the full effect of translational control, the true diversity of proteins that are encoded by the transcriptome and the scope of ribosomal pausing has been limited by the methods that are available to study them. Ribosome profiling has emerged as a global, quantitative technique to study translation, which had previously resisted the high-resolution, genome-scale analyses that have swept the study of transcription. This technique has already provided an overview of protein synthesis in vivo and has challenged assumptions about the initiation and elongation of translation. These early studies have posed questions — for example, about the importance of widespread non-canonical initiation — and further application and refinement of ribosome profiling should provide answers to these questions. Novel and unexpected phenomena such as these are exciting ‘windfalls’ from the development and the emergence of new techniques. However, the long-term dividends of ribosome profiling will probably come from comprehensive gene expression measurements.

These measurements may provide important new insights into disease, as translation is a sensitive ‘barometer’ of cellular health. Diverse stresses that range from viral infection to hypoxia induce changes in protein synthesis, and profiling these translational expression programmes will identify proteins that could be therapeutic targets to either support protective stress responses or block maladaptive ones53. Translational control of gene expression is also an essential part of long-term potentiation in neurons and thus of learning and memory 54. Neuronal profiling is difficult partly because neurons are themselves diverse and are embedded in an extremely complex organ (that is, the brain) with many other cell types. Whole mRNAs have been isolated from specific cell types by translating ribosome affinity purification (TRAP) in genetically engineered model organisms64. In TRAP, intact polysomes are purified from the subpopulation of cells that express an epitope-tagged ribosome, which yields a pool of transcripts that are present and that are translated in these cells. The combination of TRAP with ribosome footprinting, which has not yet been described, should provide more specific and quantitative measurements of translation without a loss in tissue specificity. These measurements could address important questions beyond the brain, for example, by profiling expression within the tumour microenvironment of stromal, endothelial and infiltrating immune cells. Ribosome profiling clearly has more to tell us about the mechanisms of translational control and its role in the cell and the organism. Nicholas T. Ingolia was at the Department of Embryology, Carnegie Institution, 3520 San Martin Drive, Baltimore, Maryland 21218, USA. Present address: Department of Molecular and Cell Biology, University of California, Berkeley, Barker Hall 422, Berkeley, California 94720, USA. e‑mail: [email protected] doi:10.1038/nrg3645 Published online 28 January 2014

Glossary Codon usage biases Enrichment for certain synonymous codons, particularly in well-expressed genes.

Polysome Several translating ribosomes that are held together by a single mRNA transcript.

Polysome profiling The analysis of polysomal mRNAs that are fractionated by the number of ribosomes on each transcript.

Reading frame The sequence within an mRNA that begins at a start codon and that extends to the next in‑frame stop codon.

Wobble pairing Non-Watson–Crick base pairing that permits a single tRNA anti-codon to recognize multiple codons.

Brown, P. O. & Botstein, D. Exploring the new world of the genome with DNA microarrays. Nature Genet. 21, 33–37 (1999). 2. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nature Rev. Genet. 10, 57–63 (2009). 3. Ingolia, N. T., Ghaemmaghami, S., Newman, J. R. & Weissman, J. S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223 (2009). 4. Johannes, G., Carter, M. S., Eisen, M. B., Brown, P. O. & Sarnow, P. Identification of eukaryotic mRNAs that are translated at reduced cap binding complex eIF4F concentrations using a cDNA microarray. Proc. Natl Acad. Sci. USA 96, 13118–13123 (1999). 5. Arava, Y. et al. Genome-wide analysis of mRNA translation profiles in Saccharomyces cerevisiae. Proc. Natl Acad. Sci. USA 100, 3889–3894 (2003). 6. Hendrickson, D. G. et al. Concordant regulation of translation and mRNA abundance for hundreds of targets of a human microRNA. PLoS Biol. 7, e1000238 (2009). 1.

8 | ADVANCE ONLINE PUBLICATION

Steitz, J. A. Polypeptide chain initiation: nucleotide sequences of the three ribosomal binding sites in bacteriophage R17 RNA. Nature 224, 957–964 (1969). 8. Wolin, S. L. & Walter, P. Ribosome pausing and stacking during translation of a eukaryotic mRNA. EMBO J. 7, 3559–3569 (1988). 9. Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013). 10. Anders, S. & Huber, W. Differential expression analysis for sequence count data. Genome Biol. 11, R106 (2010). 11. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010). 12. Olshen, A. B. et al. Assessing gene-level translational control from ribosome profiling. Bioinformatics 29, 2995–3002 (2013). 13. Michel, A. M. et al. GWIPS-viz: development of a ribo-seq genome browser. Nucleic Acids Res. 42, D859–D864 (2013). 14. Ingolia, N. T., Lareau, L. F. & Weissman, J. S. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell 147, 789–802 (2011). 15. Shah, P., Ding, Y., Niemczyk, M., Kudla, G. & Plotkin, J. B. Rate-limiting steps in yeast protein translation. Cell 153, 1589–1601 (2013). 16. Arribere, J. A. & Gilbert, W. V. Roles for transcript leaders in translation and mRNA decay revealed by transcript leader sequencing. Genome Res. 23, 977–987 (2013). 17. Vogel, C. & Marcotte, E. M. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nature Rev. Genet. 13, 227–232 (2012). 18. Michel, A. M. et al. Observation of dually decoded regions of the human genome using ribosome profiling data. Genome Res. 22, 2219–2229 (2012). 19. Brar, G. A. et al. High-resolution view of the yeast meiotic program revealed by ribosome profiling. Science 335, 552–557 (2012). 20. Chew, G. L. et al. Ribosome profiling reveals resemblance between long non-coding RNAs and 5ʹ leaders of coding RNAs. Development 140, 2828–2834 (2013). 21. Lee, S., Liu, B., Huang, S. X., Shen, B. & Qian, S. B. Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution. Proc. Natl Acad. Sci. USA 109, E2424–E2432 (2012). 22. Fritsch, C. et al. Genome-wide search for novel human uORFs and N‑terminal protein extensions using ribosomal footprinting. Genome Res. 22, 2208–2218 (2012). 23. Sonenberg, N. & Hinnebusch, A. G. Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell 136, 731–745 (2009). 24. Starck, S. R. et al. Leucine-tRNA initiates at CUG start codons for protein synthesis and presentation by MHC class I. Science 336, 1719–1723 (2012). 25. Menschaert, G. et al. Deep proteome coverage based on ribosome profiling aids mass spectrometry-based protein and peptide discovery and provides evidence of alternative translation products and near-cognate translation initiation events. Mol. Cell. Proteomics 12, 1780–1790 (2013). 26. Fournier, C. T. et al. Amino termini of many yeast proteins map to downstream start codons. J. Proteome Res. 11, 5712–5719 (2012). 27. Pelechano, V., Wei, W. & Steinmetz, L. M. Extensive transcriptional heterogeneity revealed by isoform profiling. Nature 497, 127–131 (2013). 28. Sonenberg, N. & Hinnebusch, A. G. New modes of translational control in development, behavior and disease. Mol. Cell 28, 721–729 (2007). 29. Stern-Ginossar, N. et al. Decoding human cytomegalovirus. Science 338, 1088–1093 (2012). 30. Schwaid, A. G. et al. Chemoproteomic discovery of cysteine-containing human short open reading frames. J. Am. Chem. Soc. 135, 16750–16753 (2013). 31. Slavoff, S. A. et al. Peptidomic discovery of short open reading frame-encoded peptides in human cells. Nature Chem. Biol. 9, 59–64 (2013). 32. Kondo, T. et al. Small peptide regulators of actin-based cell morphogenesis encoded by a polycistronic mRNA. Nature Cell Biol. 9, 660–665 (2007). 33. Guttman, M., Russell, P., Ingolia, N. T., Weissman, J. S. & Lander, E. S. Ribosome profiling provides evidence that large noncoding RNAs do not encode proteins. Cell 154, 240–251 (2013). 7.

www.nature.com/reviews/genetics © 2014 Macmillan Publishers Limited. All rights reserved

PERSPECTIVES 34. Ulitsky, I. & Bartel, D. P. lincRNAs: genomics, evolution, and mechanisms. Cell 154, 26–46 (2013). 35. Drummond, D. A. & Wilke, C. O. The evolutionary consequences of erroneous protein synthesis. Nature Rev. Genet. 10, 715–724 (2009). 36. Carvunis, A. R. et al. Proto-genes and de novo gene birth. Nature 487, 370–374 (2012). 37. Calvo, S. E., Pagliarini, D. J. & Mootha, V. K. Upstream open reading frames cause widespread reduction of protein expression and are polymorphic among humans. Proc. Natl Acad. Sci. USA 106, 7507–7512 (2009). 38. Plotkin, J. B. & Kudla, G. Synonymous but not the same: the causes and consequences of codon bias. Nature Rev. Genet.12, 32–42 (2011). 39. Li, G. W., Oh, E. & Weissman, J. S. The antiShine–Dalgarno sequence drives translational pausing and codon choice in bacteria. Nature 484, 538–541 (2012). 40. Qian, W., Yang, J. R., Pearson, N. M., Maclean, C. & Zhang, J. Balanced codon usage optimizes eukaryotic translational efficiency. PLoS Genet. 8, e1002603 (2012). 41. Stadler, M. & Fire, A. Wobble base-pairing slows in vivo translation elongation in metazoans. RNA 17, 2063–2073 (2011). 42. Dana, A. & Tuller, T. Determinants of translation elongation speed and ribosomal profiling biases in mouse embryonic stem cells. PLoS Comput. Biol. 8, e1002755 (2012). 43. Tuller, T. et al. An evolutionarily conserved mechanism for controlling the efficiency of protein translation. Cell 141, 344–354 (2010). 44. Woolstenhulme, C. J. et al. Nascent peptides that block protein synthesis in bacteria. Proc. Natl Acad. Sci. USA 110, E878–E887 (2013). 45. Charneski, C. A. & Hurst, L. D. Positively charged residues are the major determinants of ribosomal velocity. PLoS Biol. 11, e1001508 (2013).

46. Ito, K. & Chiba, S. Arrest peptides: cis-acting modulators of translation. Annu. Rev. Biochem. 82, 171–202 (2013). 47. Pechmann, S., Willmund, F. & Frydman, J. The ribosome as a hub for protein quality control. Mol. Cell 49, 411–421 (2013). 48. Shalgi, R. et al. Widespread regulation of translation by elongation pausing in heat shock. Mol. Cell 49, 439–452 (2013). 49. Gerashchenko, M. V., Lobanov, A. V. & Gladyshev, V. N. Genome-wide ribosome profiling reveals complex translational regulation in response to oxidative stress. Proc. Natl Acad. Sci. USA 109, 17394–17399 (2012). 50. Liu, B., Han, Y. & Qian, S. B. Cotranslational response to proteotoxic stress by elongation pausing of ribosomes. Mol. Cell 49, 453–463 (2013). 51. Oh, E. et al. Selective ribosome profiling reveals the cotranslational chaperone action of trigger factor in vivo. Cell 147, 1295–1308 (2011). 52. Han, Y. et al. Monitoring cotranslational protein folding in mammalian cells at codon resolution. Proc. Natl Acad. Sci. USA 109, 12467–12472 (2012). 53. Spriggs, K. A., Bushell, M. & Willis, A. E. Translational regulation of gene expression during conditions of cell stress. Mol. Cell 40, 228–237 (2010). 54. Wang, D. O., Martin, K. C. & Zukin, R. S. Spatially restricting gene expression by local translation at synapses. Trends Neurosci. 33, 173–182 (2010). 55. Hotamisligil, G. S. Endoplasmic reticulum stress and the inflammatory basis of metabolic disease. Cell 140, 900–917 (2010). 56. Thoreen, C. C. et al. A unifying model for mTORC1‑mediated regulation of mRNA translation. Nature 485, 109–113 (2012). 57. Hsieh, A. C. et al. The translational landscape of mTOR signalling steers cancer initiation and metastasis. Nature 485, 55–61 (2012).

NATURE REVIEWS | GENETICS

58. Cho, J. et al. LIN28A is a suppressor of ER‑associated translation in embryonic stem cells. Cell 151, 765–777 (2012). 59. Castello, A. et al. Insights into RNA biology from an atlas of mammalian mRNA-binding proteins. Cell 149, 1393–1406 (2012). 60. Baltz, A. G. et al. The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts. Mol. Cell 46, 674–690 (2012). 61. Guo, H., Ingolia, N. T., Weissman, J. S. & Bartel, D. P. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466, 835–840 (2010). 62. Bazzini, A. A., Lee, M. T. & Giraldez, A. J. Ribosome profiling shows that miR‑430 reduces translation before causing mRNA decay in zebrafish. Science 336, 233–237 (2012). 63. Stadler, M., Artiles, K., Pak, J. & Fire, A. Contributions of mRNA abundance, ribosome loading, and post- or peri-translational effects to temporal repression of C. elegans heterochronic miRNA targets. Genome Res. 22, 2418–2426 (2012). 64. Heiman, M. et al. A translational profiling approach for the molecular characterization of CNS cell types. Cell 135, 738–748 (2008). 65. Ingolia, N. T., Brar, G. A., Rouskin, S., McGeachy, A. M. & Weissman, J. S. The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nature Protoc. 7, 1534–1550 (2012).

Acknowledgements

The author thanks N. McGlincy and his laboratory members more generally for comments on the manuscript, as well as L. Lareau for her suggestions and for sharing unpublished results. This work was partly supported by the Searle Scholars Program.

Competing interests statement

The author declares competing interests: see Web version for details.

ADVANCE ONLINE PUBLICATION | 9 © 2014 Macmillan Publishers Limited. All rights reserved

Ribosome profiling: new views of translation, from single codons to genome scale.

Genome-wide analyses of gene expression have so far focused on the abundance of mRNA species as measured either by microarray or, more recently, by RN...
1MB Sizes 2 Downloads 0 Views