Eugene Wu


421 S.W. Mudd

Tel: (212) 853-8475

Eugene Wu develops systems and algorithms for modern interactive data analysis. His research focuses on the full interactive data analysis stack: from data cleaning and preparation, to scalable systems for interactive exploration interfaces, to automatic interface generation, to tools that explain anomalies encountered during data analysis. His current project is the Data Visualization Management System, which integrates concepts from database research, such as declarative languages, query optimization, and lineage, with interactive visualizations, making it easier to design, architect, build, and scale rich visual data exploration systems.

Research Interests

Databases, crowdsourcing, data provenance, data visualization, data explanation, data cleaning and preparation.

Wu’s research spans the areas of core database optimization, stream processing systems, crowdsourcing, data visualization, data cleaning, and HCI. His work includes SASE, one of the first high-performance complex event processing systems for high-throughput data streams; Scorpion, which introduced a novel analysis feedback system that explains anomalies that analysts find in data visualizations; ActiveClean, the first interactive data cleaning algorithm designed for data science; and Precision Interfaces, the first large-scale automatic interface generation system. His current work in data visualization management systems draws connections between data visualizations and data processing systems and unifies them under a single system abstraction. The interdisciplinary nature of his research leads Wu to work closely with researchers in information visualization, perception, theory, and machine learning.

Wu received a BS in electrical engineering and computer science from UC Berkeley in 2006 and a PhD in electrical engineering and computer science from MIT in 2015, and was a Postdoctoral Fellow at UC Berkeley in 2015.


  • Postdoctoral fellow, U.C. Berkeley, 2015


  • Assistant Professor, Columbia University, 2015-


  • Best Demo Award, SIGMOD 2016
  • Best of Conference Citation, ICDE 2013
  • Best of Conference Citation, VLDB 2013


  • Eugene Wu, Fotis Psallidas, Zhengjie Miao, Haoci Zhang, Laura Rettig, Yifan Wu, Thibault Sellam    Combining Design and Performance in a Data Visualization Management System    CIDR 2017
  • Xiaolan Wang, Alexandra Meliou, Eugene Wu    QFix: Diagnosing errors through query histories    SIGMOD 2017
  • Sanjay Krishnan, Jiannan Wang, Eugene Wu, Michael J. Franklin, Ken Goldberg    ActiveClean: Interactive Data Cleaning for Statistical Modeling    VLDB 2016
  • Eugene Wu, Leilani Battle, Samuel Madden    The Case for Data Visualization Management Systems    VLDB 2014
  • Eugene Wu, Samuel Madden    Scorpion: Explaining Away Outliers in Aggregate Queries    VLDB 2013
  • Eugene Wu, Samuel Madden, Michael Stonebraker    SubZero: a Fine-Grained Lineage System for Scientific Databases    ICDE 2013 
  • Adam Marcus, Eugene Wu, David Karger, Samuel Madden, Robert Miller    Human-powered Sorts and Joins    VLDB 2012
  • Michael Cafarella, Alon Halevy, Daisy Wang, Eugene Wu, Yang Zhang    WebTables: Exploring the Power of Tables on the Web    VLDB 2008
  • Eugene Wu, Yanlei Diao, Shariq Rizvi    High-performance complex event processing over streams    SIGMOD 2006