  • Perspective

Normalizing single-cell RNA sequencing data: challenges and opportunities

Abstract

Single-cell transcriptomics is becoming an important component of the molecular biologist's toolkit. A critical step when analyzing data generated using this technology is normalization. However, normalization is typically performed using methods developed for bulk RNA sequencing or even microarray data, and the suitability of these methods for single-cell transcriptomics has not been assessed. Here, we discuss commonly used normalization approaches and illustrate how they can produce misleading results. Finally, we present alternative approaches and provide recommendations for single-cell RNA sequencing users.
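To make the concern about bulk-style normalization concrete, the following is a minimal sketch of global scaling by library size, one common bulk-derived approach. It is not taken from the article: the toy counts, the function names and the assumption that both cells were captured and sequenced with equal efficiency (so that raw counts reflect true molecule numbers) are all hypothetical and chosen only for illustration.

```python
def library_size_factors(counts):
    """Return one scaling factor per cell, proportional to the cell's
    total count and rescaled so that the factors average to 1."""
    totals = [sum(cell) for cell in counts]
    mean_total = sum(totals) / len(totals)
    return [total / mean_total for total in totals]


def normalize(counts):
    """Divide every count in a cell by that cell's library-size factor."""
    factors = library_size_factors(counts)
    return [[c / f for c in cell] for cell, f in zip(counts, factors)]


# Hypothetical toy data: two cells, four genes.
# Genes 0-2 have identical counts in both cells; only gene 3 differs.
cell_a = [100, 100, 100, 100]
cell_b = [100, 100, 100, 700]

factors = library_size_factors([cell_a, cell_b])
normalized = normalize([cell_a, cell_b])

for name, cell in zip(("cell_a", "cell_b"), normalized):
    print(name, [round(value, 1) for value in cell])
```

Under these assumptions, the single highly expressed gene in the second cell inflates that cell's scaling factor, so the three unchanged genes appear lower in the second cell after normalization even though their raw counts are identical. This composition effect is one way in which a global scaling factor can mislead when total mRNA content differs between cells.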

Figure 1: Cell- and gene-specific effects in RNA-seq experiments.
Figure 2: Comparison of bulk-based normalization methods in real and simulated data sets.
Figure 3: ERCC spike-ins can be used to estimate mRNA content.

Acknowledgements

We thank several members of the Marioni laboratory (European Molecular Biology Laboratory - European Bioinformatics Institute, EMBL-EBI; Cancer Research UK - Cambridge Institute, CRUK-CI) for support and discussions throughout the preparation of this manuscript. In particular, we are grateful to A. Lun (CRUK-CI) for constructive comments on an earlier version of the manuscript. We are also grateful to UC Berkeley collaborator J. Ngai and his group members. C.A.V., A.S., and J.C.M. acknowledge core EMBL funding. C.A.V. was supported by core MRC funding (MRC MC UP 0801/1) and by The Alan Turing Institute under the EPSRC grant no. EP/N510129/1. J.C.M. acknowledges core support from CRUK. A.S. acknowledges funding from the Wellcome Trust Strategic Award 105031/D/14/Z, “Tracing early mammalian lineage decisions by single-cell genomics.” D.R. and S.D. are supported by the US National Institutes of Health BRAIN Initiative grant no. U01 MH105979 (PI, J. Ngai).

Author information

Contributions

C.A.V., D.R., and A.S. performed analyses. C.A.V., D.R., A.S., S.D., and J.C.M. wrote the manuscript. S.D. and J.C.M. supervised the study.

Corresponding authors

Correspondence to Sandrine Dudoit or John C Marioni.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Data 1–3. (PDF 35324 kb)

About this article

Cite this article

Vallejos, C., Risso, D., Scialdone, A. et al. Normalizing single-cell RNA sequencing data: challenges and opportunities. Nat Methods 14, 565–571 (2017). https://doi.org/10.1038/nmeth.4292

  • DOI: https://doi.org/10.1038/nmeth.4292
