Chapter 6 References

1. Love M, Anders S, Kim V, Huber W. 2015. RNA-seq workflow: Gene-level exploratory analysis and differential expression. F1000Research. 4(1070)

2. Love M, Soneson C, Patro R. 2018. Swimming downstream: Statistical analysis of differential transcript usage following Salmon quantification. F1000Research. 7(952)

3. Van den Berge K, Hembach KM, Soneson C, Tiberi S, Clement L, et al. 2019. RNA sequencing data: Hitchhiker’s guide to expression analysis. Annual Review of Biomedical Data Science. 2(1):139–73

4. Ewels P, Magnusson M, Lundin S, Käller M. 2016. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 32(19):3047–8

5. King HW, Klose RJ. 2017. The pioneer factor OCT4 requires the chromatin remodeller BRG1 to support gene regulatory element function in mouse embryonic stem cells. eLife. 6:e22631

6. Patro R, Duggal G, Love M, Irizarry R, Kingsford C. 2017. Salmon provides fast and bias-aware quantification of transcript expression. Nature Methods. 14:417–19

7. Köster J, Rahmann S. 2012. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics. 28(19):2520–2

8. Huber W, Carey VJ, Gentleman R, Anders S, Carlson M, et al. 2015. Orchestrating high-throughput genomic analysis with Bioconductor. Nature Methods. 12(2):115–21

9. Love MI, Soneson C, Hickey PF, Johnson LK, Pierce NT, et al. 2019. Tximeta: reference sequence checksums for provenance identification in RNA-seq. bioRxiv

10. Srivastava A, Malik L, Smith TS, Sudbery I, Patro R. 2019. Alevin efficiently estimates accurate gene abundances from dscRNA-seq data. Genome Biology. 20(65)

11. Frankish A, GENCODE-consortium, Flicek P. 2018. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Research

12. Soneson C, Love MI, Robinson M. 2015. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Research. 4(1521)

13. Lawrence M, Huber W, Pagès H, Aboyoun P, Carlson M, et al. 2013. Software for Computing and Annotating Genomic Ranges. PLoS Computational Biology. 9(8):e1003118

14. Love MI, Huber W, Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 15(12):550

15. Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 26(1):139

16. McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research. 40:4288–97

17. Law CW, Chen Y, Shi W, Smyth GK. 2014. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology. 15(2):29

18. Wu H, Wang C, Wu Z. 2012. A new shrinkage estimator for dispersion improves differential expression detection in RNA-seq data. Biostatistics

19. Ignatiadis N, Klaus B, Zaugg J, Huber W. 2016. Data-driven hypothesis weighting increases detection power in genome-scale multiple testing. Nature Methods

20. Dudoit S, Yang YH, Callow MJ, Speed TP. 2002. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica. 12(1):111–39

21. Roberts CJ, Nelson B, Marton MJ, Stoughton R, Meyer MR, et al. 2000. Signaling and circuitry of multiple MAPK pathways revealed by a matrix of global gene expression profiles. Science. 287(5454):873–80

22. Cox DR, Reid N. 1987. Parameter orthogonality and approximate conditional inference. Journal of the Royal Statistical Society, Series B. 49(1):1–39

23. Tibshirani R. 1988. Estimating transformations for regression via additivity and variance stabilization. Journal of the American Statistical Association. 83:394–405

24. Anders S, Huber W. 2010. Differential expression analysis for sequence count data. Genome Biology. 11:R106

25. Witten DM. 2011. Classification and clustering of sequencing data using a Poisson model. The Annals of Applied Statistics. 5(4):2493–2518

26. Townes FW, Hicks SC, Aryee MJ, Irizarry RA. 2019. Feature Selection and Dimension Reduction for Single Cell RNA-Seq based on a Multinomial Model. bioRxiv

27. Zhu A, Ibrahim JG, Love MI. 2018. Heavy-tailed prior distributions for sequence count data: Removing the noise and preserving large differences. Bioinformatics

28. Stephens M. 2016. False discovery rates: A new deal. Biostatistics. 18(2)

29. Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B. 57:289–300

30. Soneson C, Matthes KL, Nowicka M, Law CW, Robinson MD. 2016. Isoform prefiltering improves performance of count-based methods for analysis of differential transcript usage. Genome Biology. 17(1):12

31. Anders S, Reyes A, Huber W. 2012. Detecting differential usage of exons from RNA-seq data. Genome Research. 22(10):2008–17

32. Nowicka M, Robinson M. 2016. DRIMSeq: a Dirichlet-multinomial framework for multivariate count outcomes in genomics. F1000Research. 5(1356)

33. Van den Berge K, Soneson C, Robinson MD, Clement L. 2017. stageR: a general stage-wise method for controlling the gene-level false discovery rate in differential expression and differential transcript usage. Genome Biology. 18(1):151

34. Alasoo K, Rodrigues J, Mukhopadhyay S, Knights A, Mann A, et al. 2018. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nature Genetics. 50:424–31

35. Love MI, Hogenesch JB, Irizarry RA. 2016. Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation. Nature Biotechnology. 34(12):1287–91

36. Glaus P, Honkela A, Rattray M. 2012. Identifying differentially expressed transcripts from RNA-seq data with biological variation. Bioinformatics. 28(13)

37. Turro E, Astle WJ, Tavaré S. 2013. Flexible analysis of RNA-seq data using mixed effects models. Bioinformatics. 30(2):180–88

38. Al Seesi S, Temate-Tiagueu Y, Zelikovsky A, Măndoiu II. 2014. Bootstrap-based differential gene expression analysis for RNA-Seq data with and without replicates. BMC Genomics. 15(Suppl 8)

39. Pimentel H, Bray NL, Puente S, Melsted P, Pachter L. 2017. Differential analysis of RNA-seq incorporating quantification uncertainty. Nature Methods. 14(7):687–90

40. Bray NL, Pimentel H, Melsted P, Pachter L. 2016. Near-optimal probabilistic RNA-seq quantification. Nature Biotechnology. 34(5):525

41. Zhu A, Srivastava A, Ibrahim J, Patro R, Love M. 2019. Nonparametric expression analysis using inferential replicate counts. Nucleic Acids Research

42. Li J, Tibshirani R. 2011. Finding consistent patterns: A nonparametric approach for identifying differential expression in RNA-Seq data. Statistical Methods in Medical Research. 22(5):519–36

43. Turro E, Su S-Y, Gonçalves Â, Coin LJ, Richardson S, Lewin A. 2011. Haplotype and isoform specific expression estimation using multi-mapping RNA-seq reads. Genome Biology. 12(2):R13

44. Storey J, Tibshirani R. 2003. Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences. 100(16):9440–5

45. Amezquita RA, Lun ATL, Becht E, Carey VJ, Carpp LN, et al. 2020. Orchestrating single-cell analysis with Bioconductor. Nature Methods. 17(2):137–45

46. Soneson C, Robinson MD. 2018. Bias, robustness and scalability in single-cell differential expression analysis. Nature Methods. 15(4):255–61

47. Sun S, Zhu J, Ma Y, Zhou X. 2019. Accuracy, Robustness and Scalability of Dimensionality Reduction Methods for Single Cell RNAseq Analysis. bioRxiv

48. Duo A, Robinson M, Soneson C. 2018. A systematic performance evaluation of clustering methods for single-cell RNA-seq data. F1000Research. 7(1141)

49. Van den Berge K, Perraudeau F, Soneson C, Love MI, Risso D, et al. 2018. Observation weights unlock bulk RNA-seq tools for zero inflation and single-cell applications. Genome Biology. 19(24)

50. Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 34(18):3094–3100

51. Soneson C, Yao Y, Bratus-Neuenschwander A, Patrignani A, Robinson MD, Hussain S. 2019. A comprehensive examination of Nanopore native RNA sequencing for characterization of complex transcriptomes. Nature Communications. 10(1):3359

52. Cruz-Garcia L, O’Brien G, Sipos B, Mayes S, Love M, et al. 2019. Generation of a transcriptional radiation exposure signature in human blood using long-read nanopore sequencing. Radiation Research

53. Amarasinghe SL, Su S, Dong X, Zappia L, Ritchie ME, Gouil Q. 2020. Opportunities and challenges in long-read sequencing data analysis. Genome Biology. 21(1):30

54. Castel SE, Levy-Moonshine A, Mohammadi P, Banks E, Lappalainen T. 2015. Tools and best practices for data processing in allelic expression analysis. Genome Biology. 16(1):195

55. Raghupathy N, Choi K, Vincent MJ, Beane GL, Sheppard KS, et al. 2018. Hierarchical analysis of RNA-seq reads improves the accuracy of allele-specific expression. Bioinformatics. 34(13):2177–84

56. Srivastava A, Malik L, Sarkar H, Zakeri M, Almodaresi F, et al. 2019. Alignment and mapping methodology influence transcript abundance estimation. bioRxiv

57. Fernandes AD, Reid JN, Macklaim JM, McMurrough TA, Edgell DR, Gloor GB. 2014. Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis. Microbiome. 2(1):15

58. Calgaro M, Romualdi C, Waldron L, Risso D, Vitulo N. 2020. Assessment of single cell RNA-seq statistical methods on microbiome data. bioRxiv

59. Callahan B, Sankaran K, Fukuyama J, McMurdie P, Holmes S. 2016. Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses. F1000Research. 5(1492)

60. Sankaran K, Holmes SP. 2018. Latent variable modeling for the microbiome. Biostatistics. 20(4):599–614

61. Willis AD. 2019. Rarefaction, alpha diversity, and statistics. Frontiers in Microbiology. 10:2407