Prospective student? Please also read this page for prospective students. If you do not have a significant background in natural language processing, read this high-level introduction to my research (pdf).
My research focuses on machine learning algorithms applied to natural language processing, particularly natural language parsing and statistical machine translation. I am interested in unsupervised and weakly supervised learning algorithms, and in structured outputs such as tag sequences or parse trees.
I have worked with stochastic tree-adjoining grammars, a generalization of context-free grammars that allows for a computationally constrained and linguistically sophisticated analysis of natural language. I am interested in formal language theory, grammar formalisms, and their probabilistic variants. I would like to work on basic algorithms based on formal grammars and on finite-state and tree transducers that can be useful for NLP tasks.
Important things to consider when deciding to spend time doing research:
Contemporary machine translation relies on unsupervised learning of an alignment between the source and target language. I am interested in extending this approach to syntax-aware statistical machine translation, which can capture the content of the source sentence while also producing a more fluent target sentence. My current interests in this area include learning more complex syntax-based transfer rules from large amounts of data, machine translation for resource-poor languages, domain adaptation in translation, and translating from English into other languages. I also work on the formal foundations of transducers and synchronous grammars.
I expect that most of my students will work on machine translation as the target application of their research.
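The unsupervised alignment learning mentioned above can be illustrated with a small sketch. The code below is a minimal version of IBM Model 1 trained with EM, a standard textbook word-alignment model chosen here purely for illustration, not a description of any particular system. The function name `train_ibm_model1`, the `<null>` token, and the toy German-English sentence pairs are all assumptions made for the example.

```python
from collections import defaultdict


def train_ibm_model1(bitext, iterations=10):
    """Estimate lexical translation probabilities t(w | s) with EM.

    bitext: list of (source_tokens, target_tokens) sentence pairs.
    Returns a dict-like table keyed by (target_word, source_word).
    """
    # Start from a uniform distribution over the target vocabulary.
    target_vocab = {w for _, tgt in bitext for w in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (w, s) links
        total = defaultdict(float)  # expected count of links from s

        # E-step: distribute each target word over the source words
        # of its sentence, in proportion to the current t(w | s).
        for src, tgt in bitext:
            src = ["<null>"] + src  # hypothetical empty-word token
            for w in tgt:
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac

        # M-step: renormalise the expected counts per source word.
        t = defaultdict(
            float, {(w, s): c / total[s] for (w, s), c in count.items()}
        )
    return t


# Toy usage: three sentence pairs, no alignment annotation.
toy_bitext = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
table = train_ibm_model1(toy_bitext)
for w in ("the", "house", "book"):
    print(w, "given das:", round(table[(w, "das")], 3))
```

No word-level alignments are supplied; the model only sees which sentences are translations of each other, and the co-occurrence statistics across sentences are enough for EM to concentrate probability on consistent word pairs. Syntax-aware models replace the word-to-word table with richer transfer rules, but the same learn-from-sentence-pairs setting applies.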
How can we learn from very few examples? How can a machine continue learning forever? Perhaps the learner has not observed all the features needed for future predictions? I am interested in the study of bootstrapping algorithms known to perform well on various natural language processing tasks, and methods for the unsupervised or semi-supervised learning of parsers and taggers.
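As a rough illustration of what bootstrapping from very few examples looks like, here is a minimal self-training sketch. It is only one simple member of the bootstrapping family, paired with a tiny naive Bayes classifier for convenience. The names `train_nb` and `self_train`, the confidence threshold, and the toy word-sense data are assumptions introduced for the example.

```python
import math
from collections import Counter


def train_nb(labeled):
    """Fit a tiny multinomial naive Bayes model.

    labeled: list of (tokens, label) pairs.
    Returns a function mapping tokens to {label: probability}.
    """
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for tokens, label in labeled:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)

    def posterior(tokens):
        scores = {}
        for label, n in label_counts.items():
            # Add-one smoothing, with one extra slot for unseen words.
            denom = sum(word_counts[label].values()) + len(vocab) + 1
            score = math.log(n)
            for w in tokens:
                score += math.log((word_counts[label][w] + 1) / denom)
            scores[label] = score
        # Normalise the log scores into probabilities.
        top = max(scores.values())
        z = sum(math.exp(s - top) for s in scores.values())
        return {lab: math.exp(s - top) / z for lab, s in scores.items()}

    return posterior


def self_train(seed, unlabeled, threshold=0.8, max_rounds=10):
    """Grow the labeled set by repeatedly trusting confident predictions."""
    labeled = list(seed)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        posterior = train_nb(labeled)
        confident, rest = [], []
        for tokens in pool:
            probs = posterior(tokens)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:
                confident.append((tokens, label))
            else:
                rest.append(tokens)
        if not confident:
            break  # nothing new is trusted, so stop early
        labeled.extend(confident)
        pool = rest
    return train_nb(labeled), labeled


# Toy usage: two seed examples for two senses of "plant".
seed = [(["plant", "life"], "living"),
        (["plant", "manufacturing"], "factory")]
unlabeled = [["plant", "life", "animal"],
             ["plant", "manufacturing", "equipment"],
             ["animal", "species"],
             ["equipment", "worker"]]
model, grown = self_train(seed, unlabeled, threshold=0.6)
for tokens, label in grown:
    print(label, tokens)
```

The examples containing only words never seen in the seeds cannot be labeled confidently in the first round; they become reachable only after earlier rounds have added new features such as "animal" and "equipment" to the model. This is the sense in which a learner may not have observed all the features it needs at the start.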
For more information about my recent research, read some of my recent research papers.
XTAG project, a wide-coverage grammar and parser for English. He received his BS in Computer Science from the University of Poona, India in 1991. From 1991 to 1993, he worked as a research associate at the Centre for Development of Advanced Computing in Poona, India.