David A Knowles

Assistant Professor of Computer Science; Interdisciplinary Appointee in Systems Biology; Core Faculty Member, New York Genome Center

David Knowles uses machine learning approaches (probabilistic graphical models, deep learning and convex optimization) to address challenges in understanding large genomic datasets.

Technological innovations in sequencing, genome editing and imaging enable modern biologists to collect massive datasets to address diverse scientific questions, from how cells differentiate into distinct types during development to the molecular mechanisms underlying genetic disease. However, extracting biological understanding from these varied data presents substantial computational and statistical challenges.

He is particularly interested in data analysis challenges in genomics, with the aim of better understanding the causes and consequences of transcriptomic variation across the spectrum from rare to common genetic disease. This entails better characterization of the genetic and environmental factors that contribute to mRNA expression and splicing. The lab works with diverse research groups that collect large-scale genomics datasets in the context of neurological disease and develop novel genomic technologies, including single-cell methods, forward genetic screens and long-read transcriptomics.

Knowles studied Natural Sciences and Information Engineering at the University of Cambridge before obtaining an MSc in Bioinformatics and Systems Biology at Imperial College London. During his PhD studies in the Cambridge University Engineering Department Machine Learning Group under Zoubin Ghahramani, he worked on Bayesian nonparametric models for factor analysis, hierarchical clustering, and network analysis, as well as on variational inference. He was a postdoctoral researcher at Stanford University with Daphne Koller (Computer Science), Sylvia Plevritis (Center for Computational Systems Biology/Radiology), and Jonathan Pritchard (Genetics/Biology). At Columbia, he is Assistant Professor of Computer Science, an Interdisciplinary Appointee in Systems Biology, and an Affiliate Member of the Data Science Institute. He is also a Core Faculty Member at the New York Genome Center.

Research Areas


  • Computational Biology
  • Artificial Intelligence (AI) and Machine Learning (ML)

Additional information


  • Professional Experience
    • Core Faculty Member, New York Genome Center, 2019 -
    • Assistant Professor, Computer Science, Columbia University, 2019 -
    • Interdisciplinary Appointee, Systems Biology, Columbia University, 2019 -
    • Affiliate Member, Data Science Institute, Columbia University, 2019 -
  • Education
    • PhD, Information Engineering, University of Cambridge
    • MSc, Bioinformatics and Systems Biology, Imperial College London
    • BA & MEng, Information Engineering, University of Cambridge