About Me (Last Update: February 2012)

I have worked on a number of topics in different areas: computational learning theory, computation theory, inductive inference, philosophy of science, belief revision, game theory: foundations, algorithms, evolutionary analysis, and machine learning for particle physics. The aim of this note is a high-level, informal introduction to my work that does not assume background in computer science. For detailed descriptions, you can look at the annotations on my publications. See also the thesis topics I've listed for graduate students.

Biography

I was born in Toronto. My family moved to Germany when I was six. After finishing high school there, I came back to the University of Toronto to do a Bachelor's Degree in Cognitive Science. The Cognitive Science program in Toronto combines computer science, philosophy and psychology. For my Ph.D. in Logic and Computation, I went to the philosophy department at Carnegie Mellon University, in Pittsburgh, Pennsylvania. Carnegie Mellon is a private university founded by Andrew Carnegie.

Research Interests

My work is mostly concerned with an issue that receives different formulations and labels in different areas: how to use observations to adapt to an external world. In computer science, we talk about machine learning and optimal design for learning systems. In philosophy of science, this leads to questions about scientific reasoning and method. Epistemology considers how to form beliefs on the basis of observations, especially inductive generalizations. In biology, the topic is adaptation. Mostly I've worked in the context of machine learning, scientific reasoning and epistemology, which aim to find optimal ways of learning and adapting, rather than to describe how humans or animals actually do learn. My guiding principle is that good reasoning methods or algorithms are those that lead us towards our cognitive aims, especially towards theories that get the observations right. Though not everybody agrees with this, it's hardly a new idea. What's new in my work is a systematic attempt to work out the details. For example, what are relevant cognitive goals? What are the most powerful reasoning methods like? How hard can it be to attain some goal? Are some learning aims more difficult to realize than others? Which empirical questions are easy, which are hard, which are impossible, and what makes them so? Together with several other people working on these questions, we have found some precise, systematic and often surprising answers.

In considering the virtues and vices of learning methods, I often resort to general principles of rational choice. The idea is to think of an inference-method as something that you can choose, and you can apply rational choice principles to help you figure out how to do so. After applying ideas from the analysis of rational choice, I've become interested in the foundations of that theory. I've also worked on methods for representing rational choice models in logic and using methods from computational logic to carry out game theoretic reasoning.


For the last five years or so, I have been working on learning for relational data. The structure of relational data is best represented in logic, so learning with such data raises the question of how to combine logic and probability. Logic and probability are the two most powerful formal theories of reasoning that our civilization has developed so far, so combining them is an important foundational topic for Artificial Intelligence and Cognitive Science. It is also a very practical problem because most enterprises actually store their data in relational databases, and because networks generate linked data.

Applications and Projects

There is a traditional German saying "nothing is as practical as a good theory". That's my motto too. So if my theories about learning and inductive inference and empirical inquiry are on the right track, they ought to have concrete applications.

  • My group has been working hard on scalable Bayes net learning for relational data. We usually evaluate these methods on standard benchmark databases. Fairly concrete applications that I've worked on recently include the following.
    • Predicting Link Strength in social networks. The goal is to predict the strength of a friendship link from observed communication patterns (Cecile sends Keertharna 10 email messages a day), and from other known strengths (if Maite is great friends with Olivia, and Olivia is great friends with Jennifer, then Maite is perhaps great friends with Jennifer as well.)
    • Predicting Database Frequencies. In databases you can often observe statistical patterns like "men tend to rate action movies highly". If you can quantify the strength of these correlations, it can be used as a statistical estimate in a query optimization algorithm.
  • I have worked a lot on an application in scientific method and discovery: a computer program that postulates conservation principles and hidden particles for elementary particles. My analysis says that the program is in principle optimal for this problem. We have implemented the program and run it on the current data from particle physics. The program matches exactly the conservation laws that physicists have found, and it rediscovers the need for an unobserved neutrino, the electron anti-neutrino, which is one of the main concerns in current particle research.
  • Another learning problem that I have worked on, together with my student Wei Luo, is the learning-theoretically optimal way to infer Bayes nets (aka causal graphs) from information about statistical correlations between a potentially large set of variables. You can think of Bayes nets as a compact graphical way to represent a probability distribution over a set of variables that makes explicit which variables directly depend on which others. The learning-theoretic analysis has led to a new method for learning Bayes nets. The way Bayes net researchers would describe it is that it is a hybrid method that combines correlation testing with model selection. Generally our method finds Bayes nets that are closer to the true causal model and make more accurate probabilistic predictions. However, it takes more computation time than previous methods.