Next-Generation Sequencing RNA-Seq Library Construction

UNIT 4.21

Jessica Podnar,1 Heather Deiderick,1 Gabriella Huerta,1 and Scott Hunicke-Smith1

1 Genomic Sequencing and Analysis Facility, University of Texas at Austin, Austin, Texas

ABSTRACT

This unit presents protocols for construction of next-generation sequencing (NGS) directional RNA sequencing libraries for the Illumina HiSeq and MiSeq from a wide variety of input RNA sources. The protocols are based on the New England Biolabs (NEB) small RNA library preparation set for Illumina, although similar kits exist from different vendors. The protocol preserves the orientation of the original RNA in the final sequencing library, enabling strand-specific analysis of the resulting data. These libraries have been used for differential gene expression analysis and small RNA discovery and are currently being tested for de novo transcriptome assembly. The protocol is robust and applicable to a broad range of RNA input types and RNA quality, making it ideal for high-throughput laboratories. Curr. Protoc. Mol. Biol. 106:4.21.1-4.21.19. © 2014 by John Wiley & Sons, Inc.

Keywords: RNA-Seq • NGS • library construction • transcriptome • strand-specific • gene expression

INTRODUCTION

This unit provides a coordinated set of protocols for the construction of a strand-specific RNA-Seq library from total RNA (with or without ribosomal RNA removal), poly(A) mRNA, or small RNA. Strand-specific (or directional) means that the orientation of the original RNA transcript is preserved in the final library and thus in the direction of the final sequences (Fig. 4.21.1). The individual protocols are written so that the user can choose those best suited to the type of RNA used to construct the library and the type of transcriptome analysis required.

The methods provided here are designed to produce a strand-specific library by converting RNA into cDNA with adaptors suitable for NGS on the Illumina HiSeq or MiSeq. The first step, Basic Protocol 1, fragments either total RNA or previously isolated mRNA. This step can be skipped if the user intends to sequence small RNA (e.g., miRNA) or has already isolated RNA of appropriate size (e.g., lncRNA). Basic Protocol 2, library construction, covers ligation of 5′ and 3′ adaptors, reverse transcription, and a PCR amplification step. Finally, the user may choose one of several size-selection methods appropriate to the experimental goals. Basic Protocols 3 and 4 describe AMPure bead-based size selection, while Basic Protocol 5 is a gel-based size-selection method using the Blue Pippin system from Sage Science.

Strategic Planning

The experimental goals must be considered carefully in the context of the complex role of RNA in biological processes before conducting any RNA-Seq experiment. Choices regarding positive or negative selection as well as fragmentation and size separation before and during library construction will profoundly alter the population of sequences represented in final data. For example, selection of poly(A) mRNA will necessarily remove noncoding regulatory RNA and RNA fragments, prohibiting subsequent analysis of noncoding RNA (ncRNA), but will provide a clear view of protein-coding genes. Ribosomal RNA removal will leave not only poly(A) mRNA and ncRNA, but also a host of RNA biogenesis intermediates and degradation products that will decrease the sensitivity of mRNA detection and may complicate downstream analysis (for example, an intronic miRNA transcript should not be counted as expression of the surrounding gene). Since neither method is 100% efficient, projects with limiting amounts of RNA for analysis may be better treated via negative selection (ribosomal removal).

Preparation and Analysis of RNA. Current Protocols in Molecular Biology 4.21.1-4.21.19, April 2014. Published online April 2014 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/0471142727.mb0421s106. Copyright © 2014 John Wiley & Sons, Inc.

Figure 4.21.1 (A) Elevated-temperature fragmentation with Mg2+ is performed to produce random ends, (B) T4 PNK treatment restores proper 5′ phosphate and 3′ hydroxyl groups, (C) ligation of a portion of the 3′ Illumina-compatible adaptor, (D) annealing of the RT primer to the 3′ adaptor, (E) ligation of a portion of the 5′ Illumina-compatible adaptor, (F) reverse transcription, and (G) PCR to incorporate the remaining Illumina adaptor sequences, including index in the reverse primer.


Similarly, the choice of fragmentation size and size selection during or after library preparation will impact the relative representation of RNA species. For example, tRNAs (70 to 90 nt long) are similar in size to miRNA precursors (60 to 90 nt long). If RNA fragmentation of mRNA results in many fragments in this same range, and size selection includes them, much of the data will be tRNA and/or miRNA precursor sequence rather than mRNA. As another example, piRNAs (piwi-interacting RNA, 26 to 31 nt) are only marginally larger than miRNA (22 nt). Since the sequencing adaptors add 120 bp to the length of the final library, it is difficult to cleanly separate these fractions by agarose or polyacrylamide gel electrophoresis, and so final sequencing libraries may contain both species.
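The size arithmetic in this paragraph can be made concrete with a short sketch. It relies only on figures stated in the text (insert size ranges and the 120 bp added by the adaptors); the function and variable names are illustrative and are not part of the protocol.

```python
# Sketch: expected final library length for several RNA classes,
# assuming (as stated in the text) that the adaptors add 120 bp.
ADAPTOR_BP = 120

# Insert size ranges (nt) quoted in the text; miRNA is given as a single value.
RNA_CLASSES = {
    "miRNA": (22, 22),
    "piRNA": (26, 31),
    "miRNA precursor": (60, 90),
    "tRNA": (70, 90),
}


def library_range(insert_range, adaptor_bp=ADAPTOR_BP):
    """Return the (min, max) final library length in bp for an insert range."""
    low, high = insert_range
    return low + adaptor_bp, high + adaptor_bp


def ranges_overlap(a, b):
    """True if two closed (min, max) ranges share at least one length."""
    return a[0] <= b[1] and b[0] <= a[1]


library_sizes = {name: library_range(r) for name, r in RNA_CLASSES.items()}
for name, (low, high) in library_sizes.items():
    print(f"{name}: {low}-{high} bp")
```

Under these assumptions the miRNA and piRNA libraries differ by only 4 to 9 bp out of roughly 150 bp, and the tRNA and miRNA precursor ranges overlap, which is consistent with the separation difficulty described above.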


Given the large number of RNA subtypes, we cannot make exhaustive recommendations for every type of experiment. However, we can provide broad guidelines for common experimental questions, categorized by how well the organism’s genome is annotated and by the available resources.

In well-annotated genomes, if budget is not limiting, we suggest sequencing the entire population of non-ribosomal RNA, since the cost of generating primary sequence data is low and steadily decreasing, and the choice of RNA subpopulation can be made later during analysis. This is particularly true in experiments with small genomes such as those of bacteria, where it may be less expensive to sequence total RNA and remove ribosomal RNA sequences during analysis than to remove the ribosomal RNA during library construction. These data can then be used to measure differential expression of genes and ncRNA, subject to the size range chosen during library preparation. For example, if the final library size chosen was 250 bp, meaning that the template RNAs were approximately 130 nt, small RNA (such as miRNA) would have been heavily excluded during size selection, rendering differential expression analysis of small RNA difficult and possibly biased if size selection varied somewhat between samples.

For well-annotated genomes, if the budget is limiting and only a measure of differential protein-coding gene expression in eukaryotes is desired, then poly(A) mRNA should be isolated first. For prokaryotes, digestion and/or capture of rRNA should be considered, but this may or may not be cost effective depending on the preparation kit cost relative to sequencing cost at the time.

In systems lacking well-annotated genomes, the additional noise and complexity of the entire transcriptome (Linsen et al., 2009) may not be desired if the primary goal is determination of protein-coding genes for either genome annotation or differential expression analysis. We suggest poly(A) mRNA isolation in this case, regardless of budget. If differential protein-coding expression is the primary purpose, this dataset should be sufficient, but more complete genome annotation would include ncRNA. In that case, we recommend preserving significant fractions of the original total RNA for later processing, with ribosomal removal and sequencing for longer ncRNA and/or processing for miRNA.

BASIC PROTOCOL 1

RNA FRAGMENTATION

The fragmentation step requires either total RNA (with or without ribosomal RNA removal) or previously isolated mRNA. The NEBNext Magnesium RNA Fragmentation Module has been optimized to fragment RNA using divalent metal ions along with heat. The time that the RNA is held at 94°C may need to be adjusted to account for variability between labs, sample types, and purity, and for the desired type of analysis. We recommend running a test sample to ensure these conditions will yield the desired fragmentation results.

Materials

Purified RNA (2 to 50 μg total RNA or 50 to 250 ng mRNA; see Chapter 4)
NEBNext Magnesium RNA Fragmentation Module (NEB, cat. no. E6150S) containing:
   10× RNA Fragmentation Buffer
   Nuclease-free water
   10× RNA Fragmentation Stop Solution
RNeasy MinElute Cleanup Kit (Qiagen, cat. no. 74204)
T4 Polynucleotide Kinase Kit (NEB, cat. no. M0201), including:
   T4 polynucleotide kinase (PNK)
   T4 polynucleotide kinase (PNK) buffer
10 mM ATP (Invitrogen, cat. no. AM8110G)
RiboMinus Concentration Module (Invitrogen, cat. no. K155005) including:
   Binding buffer
   100% ethanol
   Wash buffer (W5)
Thermal cycler

Fragment RNA

1. On ice, mix the following components (from NEBNext Magnesium RNA Fragmentation Module) in a sterile 0.2-ml PCR tube (total volume, 20 μl):
   1 to 18 μl RNA
   2 μl 10× RNA Fragmentation Buffer
   Nuclease-free water up to 20 μl.
2. Mix gently by pipetting up and down, and then pulse spin in a microcentrifuge to bring the liquid to the bottom of the tube.
3. Incubate the sample in a preheated thermal cycler set to 94°C for 4 min.
4. Immediately snap-cool in an ice/water slurry and proceed to the next step.
5. Add 2 μl of 10× Fragmentation Stop Solution (from NEBNext Magnesium RNA Fragmentation Module) to the sample on ice, vortex gently, then place the sample back on ice.
6. Purify the sample with an RNeasy MinElute column according to the manufacturer’s protocol, but elute using nuclease-free water into a 1.5-ml microcentrifuge tube with a final elution volume of 18 μl, so the sample is ready for the next step.
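The volumes for step 1 follow from the stock RNA concentration. The sketch below works them out for a single 20-μl reaction; the RNA concentration and target mass are hypothetical user inputs, and the function name is illustrative rather than part of the protocol.

```python
# Sketch: component volumes for the 20-ul fragmentation reaction in step 1.
# The stock concentration and target mass passed in are hypothetical inputs.
TOTAL_UL = 20.0
BUFFER_UL = 2.0                      # 10x RNA Fragmentation Buffer
MIN_RNA_UL, MAX_RNA_UL = 1.0, 18.0   # RNA volume range allowed in step 1


def fragmentation_mix(rna_ng_per_ul, target_ng):
    """Return volumes (ul) of RNA, buffer, and water for one reaction."""
    if rna_ng_per_ul <= 0:
        raise ValueError("RNA concentration must be positive")
    rna_ul = target_ng / rna_ng_per_ul
    if not MIN_RNA_UL <= rna_ul <= MAX_RNA_UL:
        raise ValueError(
            f"RNA volume {rna_ul:.1f} ul is outside the 1 to 18 ul range"
        )
    water_ul = TOTAL_UL - BUFFER_UL - rna_ul
    return {
        "rna_ul": round(rna_ul, 2),
        "buffer_ul": BUFFER_UL,
        "water_ul": round(water_ul, 2),
    }


# Example: 5 ug of total RNA from a 500 ng/ul stock.
mix = fragmentation_mix(rna_ng_per_ul=500.0, target_ng=5000.0)
print(mix)
```

A stock that is too dilute to deliver the target mass in 18 μl raises an error here; in practice that sample would need to be concentrated or the input mass reduced.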

Phosphorylate the fragmented RNA

7. Add the following components to the 1.5-ml tube from step 6 above (total volume, 24 μl):
   18 μl fragmented RNA (already in the tube)
   2 μl T4 PNK Buffer
   2 μl T4 PNK
   2 μl 10 mM ATP.
8. Flick the tube gently to mix and pulse spin in a microcentrifuge to bring the liquid to the bottom of the tube.
9. Incubate at 37°C for 30 min.
10. Snap-cool the sample on ice.

Purify the sample using RiboMinus concentration module

11. Add 76 μl nuclease-free water to the sample.
12. Add 100 μl Binding Buffer from RiboMinus Concentration Module and 300 μl 100% ethanol. Mix well.
13. Load the entire amount onto the column and spin 1 min at 12,000 × g, room temperature. Discard the flow-through and return the column to the same tube.
14. Add 500 μl Wash Buffer (W5) to the column.
   Make sure that ethanol has been added to the Wash Buffer.
15. Microcentrifuge 1 min at 12,000 × g, room temperature.
16. Discard the flow-through and place the column in a clean wash tube (supplied with the kit). Centrifuge the column for 2 min at maximum speed to dry it.
17. Place the column in a clean 1.5-ml recovery tube (supplied with the kit).
18. Elute RNA by adding 12 μl RNase-free water to the center of the column. Let stand for 1 min, then microcentrifuge 1 min at maximum speed.
19. Repeat the elution with the eluate. Recovery will be approximately 10 μl.
20. Store the fragmented sample at −80°C until ready to construct the RNA-Seq library. The RNA is safe to use for at least 1 week post fragmentation.
   At this point, analysis on the Agilent BioAnalyzer is recommended to verify the amount of mRNA and the size distribution. See Figure 4.21.2C for an example of fragmented, ribosomal-depleted RNA.

Figure 4.21.2 These figures are a collection of electropherograms from the Agilent BioAnalyzer 2100. They track the progress of an RNA-Seq library constructed from 5 μg of Arabidopsis total RNA that went through a ribosomal removal treatment prior to fragmentation. Panel (A) illustrates the quality of RNA prior to the ribosomal removal step. Panel (B) is a BioAnalyzer trace of the RNA after completing the ribosomal removal treatment. The strong peak represents a large abundance of small RNAs; it is not necessarily degraded RNA, as demonstrated in the next panel. Panel (C) represents the ribosomal-depleted RNA after fragmentation. Panel (D) is a BioAnalyzer trace of the final library after 10 cycles of PCR, size-selected with AMPure Beads.

BASIC PROTOCOL 2


LIBRARY CONSTRUCTION

These steps ligate the necessary 5′- and 3′-specific adaptors and incorporate indexes compatible with sequencing on the Illumina HiSeq or MiSeq platforms. This protocol is designed to start with the fragmented RNA from Basic Protocol 1 or with a variety of RNAs, including total RNA, small RNA, or mRNA, that have not been fragmented. A variety of alternate protocols for size selection may be used on the final library product.

NOTE: Dilute all adaptors and primers 1:2 for each reaction if starting with 100 ng or less of RNA.


Materials

RNA (100 ng to 1 μg typically required; e.g., from Basic Protocol 1)
NEBNext Small RNA Library Prep Set for Illumina (Multiplex Compatible; NEB, cat. no. E7300; comes with barcodes) including:
   Nuclease-free water
   3′ SR Adaptor for Illumina
   3′ Ligation Reaction Buffer (2×)
   3′ Ligation Enzyme Mix
   SR RT Primer for Illumina
   5′ SR Adaptor for Illumina
   5′ Ligation Reaction Buffer (10×)
   5′ Ligase Enzyme Mix
   Deoxynucleotide Solution Mix
   Murine RNase inhibitor
   Long Amp Taq 2× Master Mix
   Multiplex SR primer
   Index (X) Primer-Index 1-12
SuperScript III Reverse Transcriptase (Invitrogen, cat. no. 18080-044)
10 mM dNTP mix (NEB; cat. no. N0447L)
5× First Strand Buffer (supplied with SuperScript III)
0.1 M DTT (supplied with SuperScript III)
Thermal cycler
Additional reagents and equipment for purification/size selection (Basic Protocol 3, 4, or 5)

Ligate the 3′ SR adaptor

1. On ice, combine the following in a PCR tube (total volume, 7 μl):
   1 to 6 μl input RNA (100 ng to 1 μg)
   1 μl Multiplex 3′ SR Adaptor
   Nuclease-free water up to 7 μl.
   The Multiplex 3′ SR Adaptor and nuclease-free water are provided with the NEBNext Small RNA Library Prep Set for Illumina.
2. Pipet up and down gently to mix, then pulse spin in a microcentrifuge to bring the liquid to the bottom of the tube.
3. Incubate in a preheated thermal cycler for 2 min at 70°C with the heated lid on.
4. Transfer the tube to ice, then add the following components from the NEBNext kit to the tube (total volume, 20 μl):
   10 μl 3′ Ligation Reaction Buffer (2×)
   3 μl 3′ Ligation Enzyme Mix.
5. Vortex gently to mix and spin briefly in a microcentrifuge to bring the liquid to the bottom of the tube.
6. Incubate in the thermal cycler for 1 hr at 25°C with the heated lid off.
   Optional: 18 hr at 16°C may increase ligation efficiency for certain RNA species, such as methylated RNAs; however, extended incubation may also result in concatamerization.

7. Prepare the 5′ SR Adaptor for the next ligation if not already completed. Resuspend the 5′ SR Adaptor in 120 μl of nuclease-free water. Prepare aliquots based on the typical number of samples being processed at a time, 1.1 × N μl for each aliquot, with N being the number of samples to process (1 μl of adaptor is used per sample in step 14). Store the aliquots with the kit reagents.
8. Transfer the tube from step 6 to ice after the incubation has completed.
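The aliquot sizing in step 7 can be sketched as follows. The 1 μl per sample is taken from step 14, and reading "1.1 × N" as a volume in microliters is an assumption; the function name is illustrative.

```python
import math

# Sketch: sizing 5' SR Adaptor aliquots as described in step 7.
RESUSPENSION_UL = 120.0        # resuspension volume from step 7
ADAPTOR_PER_SAMPLE_UL = 1.0    # volume added per sample in step 14
OVERAGE = 1.1                  # 10% excess per aliquot (the 1.1 x N rule)


def aliquot_plan(samples_per_batch):
    """Return (aliquot volume in ul, number of full aliquots available)."""
    if samples_per_batch < 1:
        raise ValueError("samples_per_batch must be at least 1")
    aliquot_ul = OVERAGE * ADAPTOR_PER_SAMPLE_UL * samples_per_batch
    # Small tolerance guards against floating-point error before flooring.
    n_aliquots = math.floor(RESUSPENSION_UL / aliquot_ul + 1e-9)
    return round(aliquot_ul, 2), n_aliquots


for n in (8, 12):
    volume, count = aliquot_plan(n)
    print(f"{n} samples per batch: {volume} ul per aliquot, {count} aliquots")
```

This ignores pipetting losses during aliquoting, so the real number of usable aliquots may be slightly lower.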

Hybridize the reverse transcription primer

9. To the tube on ice, add the following components (total volume, 25.5 μl):
   4.5 μl nuclease-free water
   1 μl Multiplex SR RT Primer.
10. Flick the tube gently to mix, then spin briefly to gather the liquid at the bottom of the tube.
11. Place the tube in the thermal cycler with the heated lid on and run the following program:
   5 min at 75°C
   15 min at 37°C
   15 min at 25°C.
12. With 5 min of the RT Primer hybridization remaining, heat the prepared and aliquotted 5′ SR Adaptor dilution from step 7 at 70°C for 2 min.
13. Immediately snap-cool on ice. Make sure the denaturation and cooling are performed immediately prior to setting up the 5′ ligation reaction.
14. When the RT primer hybridization is complete, transfer the PCR tube to ice and add the following components (total volume, 30 μl):
   1 μl 5′ SR Adaptor (denatured)
   1 μl 10× 5′ Ligation Reaction Buffer
   2.5 μl 5′ Ligation Enzyme Mix.
15. Vortex gently and spin briefly.
16. Incubate in the thermal cycler with the heated lid off for 1 hr at 25°C.

Perform reverse transcription

17. Add the following components to a sterile, nuclease-free tube (total volume, 40 μl):
   24 μl 3′- and 5′-ligated RNA (from step 16)
   2 μl 10 mM dNTP mix
   8 μl 5× first strand buffer (supplied with SuperScript III)
   4 μl 0.1 M DTT (supplied with SuperScript III)
   1 μl Murine RNase Inhibitor (from NEBNext Small RNA Library Prep Set)
   1 μl SuperScript III Reverse Transcriptase.
18. Vortex gently and briefly, and then microcentrifuge briefly to gather the liquid in the bottom of the tube.
19. Incubate in the thermal cycler for 1 hr at 50°C with the heated lid off.
20. Store the remaining 6 μl of ligated RNA at −80°C.
   This is useful for troubleshooting in case there is a problem downstream with library preparation.
21. Either proceed immediately to PCR amplification or heat-inactivate the reaction at 70°C for 15 min. After the sample has been heat-inactivated, it may be safely stored at −20°C overnight.
   We have not evaluated the effect of storing the samples longer than 18 hr.
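The total volumes quoted in steps 1 to 17 can be checked with a short bookkeeping sketch. All volumes come from the steps above; the labels and variable names are illustrative.

```python
# Sketch: running reaction volume (ul) through steps 1 to 17 of this protocol.
additions = [
    ("RNA + 3' adaptor + water (step 1)", 7.0),
    ("3' ligation reaction buffer (step 4)", 10.0),
    ("3' ligation enzyme mix (step 4)", 3.0),
    ("nuclease-free water (step 9)", 4.5),
    ("SR RT primer (step 9)", 1.0),
    ("5' SR adaptor (step 14)", 1.0),
    ("5' ligation reaction buffer (step 14)", 1.0),
    ("5' ligation enzyme mix (step 14)", 2.5),
]

running = []          # (label, cumulative volume) after each addition
total = 0.0
for label, volume in additions:
    total += volume
    running.append((label, total))
    print(f"{label}: {total} ul")

ligated_total = total                      # volume after the 5' ligation
used_for_rt = 24.0                         # ligated RNA carried into step 17
reserve = ligated_total - used_for_rt      # stored at -80 C in step 20

# dNTP mix, first strand buffer, DTT, RNase inhibitor, reverse transcriptase
rt_additions = [2.0, 8.0, 4.0, 1.0, 1.0]
rt_total = used_for_rt + sum(rt_additions)
print(f"reserve: {reserve} ul, RT reaction: {rt_total} ul")
```

The running totals reproduce the 20, 25.5, 30, and 40 μl volumes given in steps 4, 9, 14, and 17, and the 6 μl reserve in step 20.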

PCR amplify cDNA

22. Add the following components to the RT reaction mix from step 21 (total volume, 100 μl):
   40 μl RT Reaction Mixture (from step 21)
   50 μl Long Amp Taq 2× Master Mix (from NEBNext Small RNA Library Prep Set)
   5 μl Multiplex SR Primer (from NEBNext Small RNA Library Prep Set)
   5 μl Index (X) Primer-Index 1-12 (the NEBNext kit comes with 12 indexes; only one index is needed per sample).
   Each index contains a unique six-base sequence. During base calling, the Illumina software on the MiSeq or HiSeq will separate sequences based on their associated index sequences into sample-specific fastq data files.
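The index-based separation described above can be illustrated conceptually. This is only a sketch of the idea, not how the Illumina software is implemented: the six-base index sequences and sample names below are hypothetical placeholders (not the actual NEBNext index sequences), and reads are represented as simple (index, sequence) pairs rather than real instrument output.

```python
from collections import defaultdict

# Hypothetical six-base indexes; NOT the actual NEBNext index sequences.
SAMPLE_BY_INDEX = {
    "AAACCC": "sample_1",
    "GGGTTT": "sample_2",
}


def demultiplex(reads, sample_by_index=SAMPLE_BY_INDEX):
    """Group (index, sequence) pairs by sample using exact index matches."""
    bins = defaultdict(list)
    for index, sequence in reads:
        # Reads whose index is not in the table go to a catch-all bin.
        sample = sample_by_index.get(index, "undetermined")
        bins[sample].append(sequence)
    return dict(bins)


reads = [
    ("AAACCC", "ACGTACGT"),
    ("GGGTTT", "TTGACCAA"),
    ("AAACCC", "GGCATTCA"),
    ("CCCCCC", "NNNNACGT"),
]
binned = demultiplex(reads)
for sample, sequences in binned.items():
    print(sample, len(sequences))
```

Because each sample receives a single index primer in step 22, every read carrying that index is assigned to that sample's bin.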

23. Mix gently and spin briefly, then put the reaction in the thermal cycler with the following program:

   1 cycle:     30 sec       94°C   (initial denaturation)
   10 cycles:   15 sec       94°C   (denaturation)
                30 sec       62°C   (annealing)
                15 sec       70°C   (extension)
   1 cycle:     5 min        70°C   (final extension)
   1 cycle:     indefinite   4°C    (hold).
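The step 23 program can also be written down as data, which makes it easy to vary the cycle number. The sketch below uses the times and temperatures from step 23; the function names are illustrative, the indefinite 4°C hold is left out, and ramp times between temperatures are not included, so the actual run will take longer than the figure computed here.

```python
# Sketch: the step-23 thermal cycler program as data, with its programmed time.
def pcr_program(cycles=10):
    """Return stages as (repeats, [(label, temp_C, seconds), ...])."""
    return [
        (1, [("initial denaturation", 94, 30)]),
        (cycles, [
            ("denaturation", 94, 15),
            ("annealing", 62, 30),
            ("extension", 70, 15),
        ]),
        (1, [("final extension", 70, 300)]),
    ]


def programmed_seconds(program):
    """Sum of all hold times in seconds, excluding ramping and the final hold."""
    return sum(
        repeats * sum(seconds for _, _, seconds in steps)
        for repeats, steps in program
    )


total_10 = programmed_seconds(pcr_program(10))
print(f"10 cycles: {total_10} s ({total_10 / 60:.1f} min)")
```

Keeping the cycle count as a parameter reflects that the number of cycles depends on the amount of input RNA.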

The number of cycles will be determined by the starting input of RNA; we suggest 10 PCR cycles when starting with at least 250 ng of input RNA. For input RNA
