Abstract
High-throughput RNA sequencing (RNA-seq) greatly expands the potential for genomics discoveries, but the wide variety of platforms, protocols and performance capabilities has created the need for comprehensive reference data. Here we describe the Association of Biomolecular Resource Facilities next-generation sequencing (ABRF-NGS) study on RNA-seq. We carried out replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (poly-A–selected, ribo-depleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies PGM and Proton, Pacific Biosciences RS and Roche 454). The results show high intraplatform (Spearman rank R > 0.86) and interplatform (R > 0.83) concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms. For intact RNA, gene expression profiles from rRNA depletion and poly-A enrichment are similar. In addition, rRNA depletion enables effective analysis of degraded RNA samples. This study provides a broad foundation for cross-platform standardization, evaluation and improvement of RNA-seq.
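The concordance figures in the abstract are Spearman rank correlations. As a minimal sketch of how such a value can be computed for two expression profiles, the following Python uses only the standard library. The function names and the example counts are hypothetical and are not taken from the study; the authors' actual analysis pipeline is not described in this section.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-gene read counts for the same six genes at two sites.
site_a = [120, 15, 300, 0, 47, 890]
site_b = [101, 22, 280, 3, 20, 950]
rho = spearman(site_a, site_b)
```

Because the statistic depends only on rank order, it is insensitive to monotonic differences in scale between platforms, which is one plausible reason to prefer it over a Pearson correlation on raw counts for cross-platform comparisons.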
Change history
10 October 2014
In the version of this article initially published, author Jeffrey Rosenfeld's middle initial “A” was omitted. The error has been corrected in the HTML and PDF versions of the article.
Acknowledgements
We greatly appreciate the contribution and distribution of reference sample RNA from L. Shi (FDA) and his valuable interactions to assist in the planning of this study. This work was supported with funding from the National Institutes of Health (NIH), including R01HG006798, R01NS076465, R24RR032341, as well as funds from the Irma T. Hirschl and Monique Weill-Caulier Charitable Trusts and the STARR Consortium (I7-A765).
We thank the following contributors for their technical wisdom, including laboratory expertise, data analysis and bioinformatics contributions, and technical design guidance and consultation. Without their help, this study would not have been possible: D. Stopka (Memorial Sloan-Kettering Cancer Institute), G. Grove (Penn State Univ.), D. Hannon (Penn State Univ.), K. Jones (NIH/NCI/SAIC), C. Raley (NIH/NCI/SAIC), H. O'Geen (UC Davis), D. Zheng (Univ. Illinois-Urbana), O. Nguyen (UC Davis), Z.-W. Lu (UC Davis), J. Spisak (Cornell Univ.), D. Lin (NIH/NIAID), J. Pillardy (Cornell Univ.), P.-Y. Wu (Georgia Institute of Technology), J. Phan (Emory Univ.), D. Oschwald (New York Genome Center), H. Arnold (PerkinElmer), S. Tyndale (Univ. Southern California), H. Truong (Univ. Southern California), Y. Zhang (Univ. Florida), N. Panayotova (Univ. Florida), D. Moraga (Univ. Florida), S. Shanker (Univ. Florida), and N. Barker (US Army Environmental Quality Research Program).
We would also like to thank the platform vendors, Illumina, Life Technologies, Pacific Biosciences and Roche Life Sciences, for their support of this study, and their distinguished scientists for providing technical expertise and assistance in study designs, protocols, new methods development and significant contributions of reagents and sequencing kits. In particular, alphabetically by vendor: G. Schroth (Illumina); M. Gallad, J. Smith, T. Bittick, R. Setterquist and G. Scott (Life Technologies); J. Korlach, S. Turner and E. Tseng (Pacific Biosciences); and K. Fredrickson and C. Teiling (Roche Life Sciences).
We are sincerely appreciative of the Association of Biomolecular Resource Facilities (ABRF) for supporting this study and the contributing ABRF Research Groups. Special thanks to our ABRF executive board liaison A. Perera (Stowers Institute for Medical Research).
Author information
Contributions
All authors are members of the Association of Biomolecular Resource Facilities Next-Generation Sequencing (ABRF-NGS) Consortium. S.W.T., C.M.N., D.A.B., G.S.G. and C.E.M. managed the project. S.W.T., C.M.N., D.G., S.L., W.F., A.V., C.W., P.A.S., Y.G., D.K., J.B., B.H., R.K., N.J., N.R., J.G., N.G.-R., C.H., D.R., J.R., T.S., J.G.U., C.E.M. and P.Z. performed sequencing. S.L., S.W.T., C.M.N., D.A.B., G.S.G. and C.E.M. designed the analyses. S.L., P.A.S., J.G.U., P.Z., C.E.M. and D.K. performed the data analyses. S.L., P.Z., M.W., D.K., J.G.U. and C.E.M. made the figures. S.L., S.W.T., C.M.N., D.A.B., G.S.G. and C.E.M. wrote and revised the manuscript. The ABRF-NGS Consortium members contributed to the design and execution of the study.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–39 and Supplementary Tables 1–8 (PDF 10745 kb)
About this article
Cite this article
Li, S., Tighe, S., Nicolet, C. et al. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study. Nat Biotechnol 32, 915–925 (2014). https://doi.org/10.1038/nbt.2972
DOI: https://doi.org/10.1038/nbt.2972