Jiannan Wang's Archived News

  • 2017/11/03   
  • Want to analyze Big Data interactively? Please check out our recent paper on interactive analytics, entitled "AQP++: Connecting Approximate Query Processing With Aggregate Precomputation for Interactive Analytics", in SIGMOD 2018!
  • 2017/08/26   
  • Our paper, entitled "Preference-driven Similarity Join", won a Best Student Paper Award at the IEEE/WIC/ACM WI 2017 conference.
  • 2017/07/24   
  • I visited the UBC database group and gave a talk entitled Speeding Up Data Science: From a Data Management Perspective.
  • 2017/06/07   
  • Received an NSERC CRD Grant for our proposal: "Entity Augmentation and Data Cleaning for Machine Learning" (PI).
  • 2017/05/14   
  • We gave a tutorial entitled "Crowdsourced Data Management: Overview and Challenges" at the ACM SIGMOD 2017 conference. The slides can be downloaded from here.
  • 2017/05/01   
  • Welcome new MSc students, Changbo Qu and Young Wu, to our lab.
  • 2017/01/04   
  • I am teaching CMPT 843: Traditional vs. Modern Database Systems and CMPT 733: Big Data Programming II for Spring Semester 2017.
  • 2016/11/20   
  • Want to improve query performance for your big data systems? Please check out our recent paper on data skipping, entitled "Skipping-oriented Partitioning for Columnar Layouts", in VLDB 2017!
  • 2016/11/10   
  • We are happy to announce the first release of Reprowd! Reprowd facilitates the use of crowdsourcing for Data Labeling and Active Learning. The system was recently demonstrated at HCOMP 2016 and covered by the Reproducible Science
  • 2016/09/06   
  • Welcome Pei Wang, Mohamad Dolatshah, Jinglin Peng, and Mathew Teoh to our lab! Thrilled to be able to work with such a group of talented students!
  • 2016/09/01   
  • I am teaching CMPT 884: Human-in-the-loop Data Management and CMPT 732: Big Data Programming for Fall Semester 2016.
  • 2016/07/15    
  • Want to know how the ActiveClean system works? Please check out our latest paper titled "ActiveClean: Interactive Data Cleaning For Statistical Modeling" in VLDB 2016.
  • 2016/06/30   
  • Our ActiveClean system won the Best Demonstration Award at the ACM SIGMOD 2016 conference. The SIGMOD attendees were excited to see that the system helps data scientists train a more reliable machine-learning model in much less time.
  • 2016/06/26    
  • We gave a tutorial entitled "Data Cleaning: Overview and Emerging Challenges" at the ACM SIGMOD 2016 conference. The slides can be downloaded from here.
  • 2016/05/11   
  • Want to extract new insights from graph data? Please check out our paper titled "Finding Gangs in War from Signed Networks" in KDD 2016!
  • 2016/04/15   
  • "Analysis of Vancouver's Housing price market", a student project from my "Big Data Programming" course, was featured in the Globe and Mail.
  • 2016/04/07   
  • Received an NSERC Discovery Grant for my proposal: "Crowdsourced Data Cleaning" (PI).
  • 2016/04/06   
  • One research paper, one demo paper, and one tutorial were accepted by SIGMOD 2016!
  • 2016/02/16   
  • Received an NSERC RTI Grant for our proposal: "Computational Infrastructure for Online Big Data Analytics" (Co-PI).
  • 2016/02/16   
  • Want to know how crowdsourcing can help with data management? Please check out our latest survey on crowdsourced data management.
  • 2016/01/26   
  • I am teaching CMPT 733: Big Data Programming for Spring Semester 2016.
  • 2016/01/19   
  • I joined the School of Computing Science at Simon Fraser University as an Assistant Professor.
  • 2016/01/15   
  • Completed my postdoc journey at UC Berkeley! Cannot believe how much I learnt in this period. Thanks to all the AMPLab folks!
  • 2015/12/04   
  • I gave a talk entitled My Research Journey on 'Crowdsourced Data Cleaning' at the UW database group meeting.
  • 2015/11/13   
  • I was invited to write a trip report on VLDB 2015 by the Communications of the CCF (China Computer Federation).
  • 2015/11/05   
  • An overview paper of our SampleClean project was published in the latest issue of the IEEE Data Engineering Bulletin.
  • 2015/10/27   
  • A new paper entitled "CLAMShell: Speeding up Crowds for Low-latency Data Labeling" got accepted by VLDB 2016! If you are complaining that "the crowd is so slow", you will find a solution in our paper.
  • 2015/06/09   
  • One research paper and one demo paper from our SampleClean project got accepted by VLDB 2015!
  • 2015/05/16   
  • We are happy to announce the release of SampleClean 0.1!
  • 2015/03/06   
  • I wrote an AMPLab blog post: When Data Cleaning Meets Crowdsourcing.
  • 2015/03/05   
  • Our paper entitled "QASCA: A Quality-Aware Task Assignment System for Crowdsourcing Applications" got accepted by SIGMOD 2015.
  • 2014/11/20   
  • Our SampleClean system was demonstrated at AMPCamp5 [slides] [video]. The vision of SampleClean is to bring data cleaning and crowdsourcing into the BDAS stack.
  • 2014/11/14   
  • I gave a talk to introduce the SampleClean system at UCI.
  • 2014/10/09   
  • I visited the Database Groups at Brown and MIT, and gave talks on our SampleClean project.
  • 2014/08/29   
  • Since 2011, the research topic of crowdsourced query processing has been gaining increasing attention in the database community. To help people better understand the research progress on this topic, I created a spreadsheet for tracking recent papers published on it. If you want to contribute to the list or if you find some interesting papers missing from the list, please feel free to drop me an email.
  • 2014/04/16   
  • A new paper entitled "A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data" got accepted by SIGMOD 2014. The paper presented SampleClean, a novel framework that marries data cleaning with sampling-based approximate query processing. This framework enables us to achieve accurate query results on dirty data, at significantly reduced cleaning cost. Please visit sampleclean.org for more details.
  • 2014/04/16   
  • A new paper entitled "Towards Dependable Data Repairing with Fixing Rules" was accepted by SIGMOD 2014. We proposed Fixing Rules, a new class of cleaning rules designed for automated and dependable data repairing. The paper shows that we can perform more reliable data repairing using fixing rules than other automated repairing approaches.
  • 2014/01/18   
  • I received the China Computer Federation (CCF) Distinguished Dissertation Award for my PhD work in crowdsourcing entity resolution. [News]
  • 2013/10/08   
  • A new paper entitled "Extending String Similarity Join to Tolerant Fuzzy Token Matching" was accepted by the ACM Transactions on Database Systems (TODS).
  • 2013/08/01   
  • Started a new journey at UC Berkeley!
  • 2013/06/22   
  • Yu Jiang, Jian He, Dong Deng and I (advised by Prof. Guoliang Li and Jianhua Feng) participated in the SIGMOD 2013 Programming Contest. We were selected as one of five finalists, and presented our methods at SIGMOD 2013.
  • 2013/06/03   
  • Defended my PhD dissertation! :)
  • 2013/03/22   
  • Yu Jiang, Dong Deng and I (advised by Prof. Guoliang Li and Jianhua Feng) participated in the String Similarity Search/Join Competition at EDBT 2013. The competition consisted of four tasks; we won 1st place in three of them and 2nd place in the fourth. In particular, our programs ran 10~100x faster than the second-best team in the two similarity-join tasks. [Results] [Paper] [News]
  • 2013/02/10   
  • Our joint paper with Brown University and UC Berkeley AMPLab was accepted by SIGMOD 2013. The paper deeply investigated the effect of transitive relations on crowdsourced joins, and presented a hybrid labeling framework that achieved a 95%+ reduction in cost and time over the state-of-the-art approach. [Paper]
  • 2013/02/08   
  • Completed a three-month internship at Qatar Computing Research Institute (QCRI). A lot of fun!