Email: anoop at cs.sfu.ca
CMPT 882-3: Fall 2002
Monday: 4:00p - 5:50p in SCB 8662
Wednesday: 3:30p - 4:20p in SCB 8662
Office Hrs: by appointment (send me email)
In this course we will study basic algorithms that produce state-of-the-art results on tasks involving natural language text. For each of these tasks, we will compare knowledge-rich approaches, which rely on a lot of human supervision, with knowledge-poor techniques, which use parameter re-estimation or bootstrapping algorithms. We will also compare generative models (models that maximize the likelihood of the training data) with discriminative models (models that minimize the classification error rate).
For more details: Course Description for CMPT 882-3
Notes: Lecture #01 pdf
Unsupervised Word Sense Disambiguation Rivaling Supervised Methods (1995). David Yarowsky. Proceedings of ACL-95. pp. 189-196
Notes: Lecture #02 pdf
de_VBP training_VBG new_JJ Ukrainian_JJ plant_NN operators_NNS to_TO replace_VB Russi
_NNPS who_WP are_VBP leaving_VBG the_DT plants_NNS in_IN Ukraine_NNP and_CC improving
N and_CC safety_NN procedures_NNS at_IN plants_NNS in_IN both_DT countries_NNS ,_, sa
iet-designed_JJ reactors_NNS at_IN a_DT plant_NN in_IN the_DT Czech_NNP Republic_NNP
er_JJ to_TO pay_VB to_TO make_VB the_DT plant_NN safer_JJR ._.
er_JJ to_TO pay_VB to_TO make_VB the_DT plant_NN safer_JJR ._.
,_, ''_'' she_PRP said_VBD ,_, are_VBP plant_NN moratoriums_NNS ._.
_NNS at_IN the_DT Orange_NNP County_NNP plant_NN ._.
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval (1998). David Lewis. Proceedings of ECML-98 (10th meeting). pp. 4-15.
Notes: Lecture #03 pdf
WebKB datasets available from the CMU Text Learning page.
bayes.tar.gz. Use the data files that come with the package.
Chapters 3 and 4 of Statistical Language Learning. Eugene Charniak. MIT Press. 1993.
Notes: Lecture #04 pdf
Does Baum-Welch Re-estimation help taggers? (1994). David Elworthy. Proceedings of 4th ACL Conf on ANLP, Stuttgart. pp. 53-58.
Notes: Lecture #05 pdf
Nymble: a High-Performance Learning Name-finder (1997). Daniel M. Bikel, Scott Miller, Richard Schwartz, Ralph Weischedel. Proceedings of ANLP-97.
Notes: Lecture #06 pdf
Other Applications of HMMs:
EM for hybrid models
We will look at the use of the EM algorithm (a generalization of the forward-backward algorithm for HMMs) and apply it to the problem of finding the appropriate interpolation weights between a word-based and a part-of-speech-based language model.
mixture.pl is a simple Perl script that implements this idea.
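As a rough illustration of the idea (this is not mixture.pl, whose contents are not reproduced here), the following Python sketch estimates a single interpolation weight by EM. It assumes two component models have already assigned a probability to each token of some held-out text. The function names and the input lists `p_word` and `p_pos` are hypothetical, chosen only for this example.

```python
import math


def em_interpolation_weight(p_word, p_pos, lam=0.5, iterations=50, tol=1e-10):
    """Estimate lam in p(w) = lam * p_word(w) + (1 - lam) * p_pos(w) by EM.

    p_word[i] and p_pos[i] are the probabilities that the word-based and the
    part-of-speech-based model assign to the i-th held-out token. These inputs
    are hypothetical; the sketch assumes that, for every token, at least one
    of the two probabilities is non-zero.
    """
    if len(p_word) != len(p_pos) or not p_word:
        raise ValueError("need two equal-length, non-empty probability lists")
    for _ in range(iterations):
        # E-step: posterior probability that each token was generated
        # by the word-based component under the current weight.
        posteriors = [
            lam * a / (lam * a + (1.0 - lam) * b)
            for a, b in zip(p_word, p_pos)
        ]
        # M-step: the new weight is the average posterior.
        new_lam = sum(posteriors) / len(posteriors)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam


def log_likelihood(p_word, p_pos, lam):
    """Log-likelihood of the held-out tokens under the interpolated model."""
    return sum(
        math.log(lam * a + (1.0 - lam) * b) for a, b in zip(p_word, p_pos)
    )
```

Each EM iteration cannot decrease the held-out log-likelihood, so comparing `log_likelihood` before and after estimation is a simple sanity check. With more than two component models the same E-step and M-step apply, with one posterior per component.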
Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging (1995). Eric Brill. Computational Linguistics, volume 21, number 4, pp. 543-565.
Notes: Lecture #07 pdf
A simple introduction to maximum entropy models for natural language processing (1997). Adwait Ratnaparkhi. Technical Report 97-08, Institute for Research in Cognitive Science, University of Pennsylvania.
A Maximum Entropy Model for Part-of-Speech Tagging (1996). Adwait Ratnaparkhi. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. pp. 133-142.
Notes: Lecture #08 pdf
Automatic Extraction of Subcategorization from Corpora (1997). Ted Briscoe and John Carroll. In Proceedings of the 5th Conference on Applied Natural Language Processing (ANLP-97).
Notes: Lecture #09 pdf
Minimally supervised morphological analysis by multimodal alignment (2000). David Yarowsky and Richard Wicentowski. In Proceedings of ACL-2000, pages 207-216.
Unsupervised Learning of the Morphology of a Natural Language (2001). John Goldsmith. Computational Linguistics, Volume 27, Number 2.
Notes: Lecture #10 pdf
Supertagging: An Approach to Almost Parsing (1999). Srinivas Bangalore and Aravind K. Joshi. Computational Linguistics, volume 25, number 2, pages 237-265.
Notes: Lecture #11 pdf
Coping with syntactic ambiguity or how to put the block in the box on the table (1982). Kenneth Church and Ramesh Patil. Computational Linguistics 8:139-49.
Prepositional Phrase Attachment through a Backed-Off Model (1995). Michael Collins and James Brooks. Proceedings of the Third Workshop on Very Large Corpora WVLC-95.
Notes: Lecture #12 pdf
Structural Ambiguity and Lexical Relations (1993). Donald Hindle and Mats Rooth. Computational Linguistics. Volume 19, Number 1, March 1993, Special Issue on Using Large Corpora: I.
Statistical Models for Unsupervised Prepositional Phrase Attachment (1998). Adwait Ratnaparkhi. In Proceedings of COLING-ACL 1998.
Notes: Lecture #13 pdf
Head-Driven Statistical Models for Natural Language Parsing. Michael Collins. PhD Dissertation, University of Pennsylvania, 1999. Read chapters 2 and 3, pages 31-102
Statistical parsing with an automatically-extracted tree adjoining grammar (2000). David Chiang. In Proceedings of ACL 2000, Hong Kong, October 2000, pages 456-463.
Notes: Lecture #14 pdf and additional slides (from Michael Collins' thesis presentation)
Inside-Outside Reestimation from partially bracketed corpora. Fernando Pereira and Yves Schabes. In 30th Annual Meeting of the Association for Computational Linguistics, pages 128-135, Newark, Delaware, 1992.
Applications of stochastic context-free grammars using the Inside-Outside algorithm. K. Lari and S. J. Young. Computer Speech and Language, 4:35-56, 1990.
Combining Labeled and Unlabeled Data with Co-training. Avrim Blum and Tom Mitchell. In Proc. of the Workshop on Computational Learning Theory (COLT98). 1998.
Analyzing the Effectiveness and Applicability of Co-training. Kamal Nigam and Rayid Ghani. In Ninth International Conference on Information and Knowledge Management (CIKM-2000), pp. 86-93. 2000.
Committee-Based Sample Selection for Probabilistic Classifiers. Shlomo Argamon-Engelson and Ido Dagan. Journal of Artificial Intelligence Research, 1999.
On Minimizing Training Corpus for Parser Acquisition. Rebecca Hwa. In Proc. of Workshop on Computational Natural Language Learning. 2001.
A short introduction to boosting. Y. Freund and R. Schapire. Journal of the Japanese Society for Artificial Intelligence. 14(5), pages 771-780, 1999.
Boosting Applied to Tagging and PP Attachment. Steven Abney, Robert E. Schapire, and Yoram Singer. Proceedings of the 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pp. 38-45. 1999.
perl perlsh.pl and copy/paste code from the tutorial into the session).