
David Knowles


450 Computer Science Building
Mail Code: 0401

Research Interests

probabilistic graphical models, deep learning, convex optimization, genomics, neurological disease, human genetics

David Knowles uses machine learning approaches—probabilistic graphical models, deep learning and convex optimization—to address challenges in understanding large genomic datasets.

Technological innovations in sequencing, genome editing and imaging enable modern biologists to collect massive datasets to address diverse scientific questions, from how cells differentiate into different types during development to the molecular mechanisms underlying genetic disease. However, obtaining biological understanding from these varied data presents substantial computational and statistical challenges.

He is particularly interested in data analysis challenges in genomics, with the aim of better understanding the role of transcriptomic dysregulation across the spectrum from rare to common genetic disease. This entails better characterization of the genetic and environmental factors contributing to mRNA expression and splicing variation. The lab works with diverse research groups that collect large-scale genomics datasets in the context of neurological disease and develop novel genomic technologies, including single-cell methods, forward genetic screens and long-read transcriptomics.

Knowles studied Natural Sciences and Information Engineering at the University of Cambridge before obtaining an MSc in Bioinformatics and Systems Biology at Imperial College London. During his PhD studies in the Cambridge University Engineering Department Machine Learning Group under Zoubin Ghahramani, he worked on Bayesian nonparametric models for factor analysis, hierarchical clustering and network analysis, as well as on variational inference. He was a postdoctoral researcher at Stanford University with Daphne Koller (Computer Science), Sylvia Plevritis (Center for Computational Systems Biology/Radiology) and Jonathan Pritchard (Genetics/Biology). At Columbia he is an Assistant Professor of Computer Science, an Interdisciplinary Appointee in Systems Biology and an Affiliate Member of the Data Science Institute. He is also a Core Faculty Member at the New York Genome Center.


  • Core Faculty Member, New York Genome Center, 2019 -
  • Assistant Professor, Computer Science, Columbia University, 2019 -
  • Interdisciplinary Appointee, Systems Biology, Columbia University, 2019 -
  • Affiliate Member, Data Science Institute, Columbia University, 2019 -


  • Brielin C. Brown and David A. Knowles. “Phenome-scale causal network discovery with bidirectional mediated Mendelian randomization”. bioRxiv (2020)
  • Andrew Stirn, Tony Jebara, and David A. Knowles. “A New Distribution on the Simplex with Auto-Encoding Applications”. Advances in Neural Information Processing Systems (2019)
  • David A Knowles, Joe R Davis, Hilary Edgington, Anil Raj, Marie-Julie Favé, Xiaowei Zhu, James B Potash, Myrna M Weissman, Jianxin Shi, Doug Levinson, Philip Awadalla, Sara Mostafavi, Stephen B Montgomery, and Alexis Battle. “Allele-specific expression reveals interactions between genetic variation and environment”. Nature Methods (2017)
  • David A Knowles*, Courtney K Burrows*, John D Blischak, Kristen M Patterson, Daniel J Serie, Nadine Norton, Carole Ober, Jonathan K Pritchard, and Yoav Gilad. “Determining the genetic basis of anthracycline-cardiotoxicity by molecular response QTL mapping in induced cardiomyocytes”. eLife (2018)
  • Yang I. Li*, David A. Knowles*, Jack Humphrey, Alvaro N. Barbeira, Scott P. Dickinson, Hae Kyung Im, and Jonathan K. Pritchard. “Annotation-free quantification of RNA splicing using LeafCutter”. Nature Genetics (2017)
  • Tim Salimans and David A. Knowles. “Fixed-form variational posterior approximation through stochastic linear regression”. Bayesian Analysis (2013)