CMPT 882 - Statistical Machine Translation: Fall 2011

  • CMPT 882 (Special Topics in AI) - Statistical Machine Translation
  • Semester: Fall 2011 (1117)
  • Instructor: Anoop Sarkar
  • Meeting Information:
    • Tu 10:30AM - 11:20AM in WMC 2268
    • Th 09:30AM - 11:20AM in WMC 2268
  • Dates: Sep 6, 2011 - Dec 1, 2011

About this course

  • Class Mailing List: cmpt-882@sfu.ca (always use the 'cmpt-882:' prefix on the Subject line)
  • Location: Burnaby Mountain Campus
  • Class number: 12577
  • Section: D100
  • About the course:

    This special topics course focuses on statistical machine translation: data-driven approaches that translate speech or text from one human language to another. Three major paradigms will be covered: word-based translation, phrase-based translation, and syntax-based translation. Students will gain hands-on experience building translation systems and working with real-world data, and will learn how to formulate and investigate research questions in machine translation.

  • Course Outline: on CS Portal
  • Grading for the course:
    • 3 homeworks: 10% each (total of 30%)
    • Class participation (in-class + email): 15%
    • 2 in-class presentations: 5% each (total of 10%)
    • Project proposal: 5%
    • Final project write-up: 15%
    • Final project results: 25%

Textbook and References

Textbook:

  1. Statistical Machine Translation by Philipp Koehn. Hardcover, 488 pages. Publisher: Cambridge University Press. ISBN-10: 0521874157. ISBN-13: 978-0521874151

    The book also has a webpage; visit it in particular for the Errata.
    We will follow the material in this textbook closely, though not in every respect. We will also read the research papers listed in the Syllabus.

Readings

  1. Introduction to Statistical Machine Translation
  2. Decoding for phrase-based SMT
  3. Evaluation
  4. Minimum Error Rate Training of Log-linear models for SMT
  5. Word Alignment
  6. Hierarchical Phrase-based SMT and LR Decoding
  7. Discriminative Re-ranking
  8. Syntactic Re-ordering for SMT
  9. Discriminative Learning for MT
  10. Discriminative Tuning for MT
  11. Syntax-based SMT
  12. Miscellaneous Topics in SMT

Homeworks

  1. Homework #1. Sep 08 - Sep 29. 10%
  2. Homework #2. Sep 29 - Oct 13. 10%
  3. Homework #3. Oct 14 - Oct 28. 10%

Homework Submission

  • Submit your homework electronically through the department-provided submission server at https://courses.cs.sfu.ca/ (your homework grades will also be tracked on the same web page).
  • All homeworks are due by 11:45 PM on the homework due date.