Jiannan Wang

Assistant Professor
School of Computing Science
Simon Fraser University

Postdoc in the AMPLab at UC Berkeley (2015)
Ph.D. at THU (2013), B.Sc. at HIT (2008)

Research Areas: Database Systems, Big Data Science

Office: TASC 1 9237
Phone: 1-778-782-4288
Email: jnwang@sfu.ca

8888 University Drive
Burnaby, BC

Open positions: If you would like to work with me on big data research in the beautiful Greater Vancouver area, please email me your resume, a copy of your undergraduate/graduate transcripts, and a short statement of research interests, with the following subject line: [PhD/Master/Visiting Application] Name+Major+School.

Research Interests

The mission of our lab is to speed up data science. We develop innovative technologies and open-source tools that help data scientists turn raw data into actionable insights more efficiently. Our current research topics include:
  • Data Cleaning for Machine Learning
  • Crowdsourced Data Cleaning
  • Data Enrichment with Deep Web
  • Interactive Analytics Over Big Data
Our lab's research is generously supported in part by NSERC, Simba, and Vancity.


Want to analyze big data interactively? Please check out our recent paper on interactive analytics, entitled "AQP++: Connecting Approximate Query Processing With Aggregate Precomputation for Interactive Analytics", in SIGMOD 2018!
Our paper, entitled "Preference-driven Similarity Join", won a Best Student Paper Award at the IEEE/WIC/ACM WI 2017 conference.
I visited the UBC database group and gave a talk entitled "Speeding Up Data Science: From a Data Management Perspective".
Received an NSERC CRD Grant for our proposal: "Entity Augmentation and Data Cleaning for Machine Learning" (PI).
We gave a tutorial entitled "Crowdsourced Data Management: Overview and Challenges" at the ACM SIGMOD 2017 conference. The slides can be downloaded from here.
Welcome new MSc students, Changbo Qu and Young Wu, to our lab.
Want to improve query performance for your big data systems? Please check out our recent paper on data skipping, entitled "Skipping-oriented Partitioning for Columnar Layouts", in VLDB 2017!
We are happy to announce the first release of Reprowd! Reprowd facilitates the use of crowdsourcing for data labeling and active learning. The system was recently demonstrated at HCOMP 2016 and covered by Reproducible Science.

Recent Publications [DBLP] [Google Scholar]







Graduate Students

Undergraduate Students

Visiting Students

  • Nathan Yan (HKU, 2017.05-)
  • Yongjun He (NJU, 2017.07-)




Professional Activities

Program Committee

  • WWW (2017)
  • SIGMOD (2018, 2017, 2016), SIGMOD Demo (2016)
  • VLDB (2018), VLDB Demo (2017)
  • CIKM (2017)
  • HCOMP (2016)
  • ICDE/TKDE poster (2017, 2016)
  • WAIM (2017, 2016, 2015, 2014)
  • APWeb (2017, 2016)


  • SIGMOD 2017 Registration Chair
  • WISE 2017 Publicity Co-Chair

