Michael I Love
michaelisaiahlove at gmail dot com
I use statistical models to infer biologically meaningful patterns from high-throughput sequencing data, and develop open-source statistical software for the Bioconductor Project. One of my main research efforts is in authoring and maintaining the DESeq2 package for statistical analysis of RNA-seq experiments.
From 2013 to 2016, I was a Postdoctoral Fellow in the group of Rafael Irizarry in the Biostatistics and Computational Biology Department at the Dana-Farber Cancer Institute and the Harvard TH Chan School of Public Health. I completed a PhD in Computational Biology and Scientific Computing (2013) in the Vingron Department at the Max Planck Institute for Molecular Genetics in Berlin and the Mathematics and Informatics Department of the Freie Universität, Berlin. I completed a Statistics M.S. (2010) and Mathematics B.S. (2005) at Stanford University.
My latest work studies the effect of fragment sequence bias (e.g., fragment-level GC content bias) on RNA-seq transcript abundance estimation. I have developed a novel method and software for correcting fragment sequence bias in RNA-seq transcript quantification, which greatly reduces technical artifacts and batch effects compared to state-of-the-art methods.
In addition, I have collaborated with the authors and developers of the Salmon software to incorporate fragment sequence bias correction into fast, lightweight transcript abundance estimation methods. A new version of the Salmon manuscript describing the bias correction methods was posted to bioRxiv (8/2016).
There is a connection between the work in bias correction, transcript abundance estimation, and gene-level differential expression. Because the biases estimated by Salmon are incorporated as effective transcript lengths (following the method introduced by Roberts et al. 2011), and because effective transcript lengths are passed along to the statistical models when using the tximport pipeline, any biases estimated and corrected for by Salmon (or Sailfish or kallisto) will be propagated to differential expression tools such as DESeq2. DESeq2 analyzes the estimated counts, but takes into account factors affecting those counts, such as differences in sequencing depth, as well as technical or biological changes in genes’ effective lengths.
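
To make the idea concrete, here is a minimal toy sketch in Python of how effective lengths and sequencing depth can be combined into a single per-gene, per-sample normalization factor. It is an illustration only, not the DESeq2 or tximport implementation (both are R packages). The function names, the toy numbers, and the assumption that every count and length is strictly positive are all hypothetical choices made for this example.

```python
import math
from statistics import median


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalization_factors(counts, eff_lengths):
    """Combine effective length and sequencing depth into one factor per gene and sample.

    counts, eff_lengths: lists of rows (one row per gene), each row a list
    with one value per sample. Toy assumption: all entries are > 0.
    """
    # Step 1: divide each gene's effective lengths by their geometric mean
    # across samples, so that only relative length changes remain.
    rel_len = [[l / geometric_mean(row) for l in row] for row in eff_lengths]

    # Step 2: remove the length effect from the estimated counts.
    adjusted = [
        [c / r for c, r in zip(count_row, len_row)]
        for count_row, len_row in zip(counts, rel_len)
    ]

    # Step 3: median-of-ratios size factors (sequencing depth) computed on
    # the length-adjusted counts.
    gene_ref = [geometric_mean(row) for row in adjusted]
    n_samples = len(counts[0])
    size = [
        median(adjusted[g][j] / gene_ref[g] for g in range(len(adjusted)))
        for j in range(n_samples)
    ]

    # Step 4: the final factor is relative length times sequencing depth.
    return [[r * s for r, s in zip(len_row, size)] for len_row in rel_len]


# Hypothetical data: 3 genes x 2 samples. Sample 2 is sequenced twice as
# deeply, and gene 0 also doubles its effective length in sample 2.
counts = [[100, 400], [50, 100], [200, 400]]
eff_lengths = [[1000, 2000], [500, 500], [1500, 1500]]

factors = normalization_factors(counts, eff_lengths)
normalized = [
    [c / f for c, f in zip(count_row, factor_row)]
    for count_row, factor_row in zip(counts, factors)
]
for row in normalized:
    print([round(x, 2) for x in row])
```

In this toy data, gene 0 has four times as many estimated counts in sample 2, but the whole difference is explained by doubled depth and doubled effective length. After dividing by the combined factors, each gene has the same normalized value in both samples, so none of them would look differentially expressed.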