Matthias Schonlau

I am a professor of statistics at the University of Waterloo in Ontario, Canada. I am spending the 2015/2016 academic year on sabbatical at the University of Auckland. Prior to becoming a professor, I was a statistician at RAND and head of the RAND Statistical Consulting Service. Here are my colleagues in the RAND Statistics group. You can email me at schonlau at uwaterloo dot ca. I spent the academic year 2009/2010 on sabbatical at the German Institute for Economic Research (DIW) in Berlin, Germany. The DIW hosts the longest-running household panel in Germany, SOEP. The sabbatical was made possible in cooperation with the Max Planck Institute for Human Development (MPIB). Before joining RAND's Pittsburgh office in 2005, I worked at RAND's Santa Monica headquarters. From 1997 to 1999, I held a joint appointment with the National Institute of Statistical Sciences and AT&T Labs - Research. In 1997, I graduated from the University of Waterloo.

Research Interests:
Application of text mining to open-ended questions, occupation coding, survey methodology in general, statistical software tools (C++/Stata)

Current Research Projects:
Semi-automatic analysis of open-ended questions. Text data from open-ended questions in surveys are difficult to analyze and are frequently ignored. Yet open-ended questions are important because they do not constrain respondents' answer choices. Where open-ended questions are necessary, multiple human coders sometimes hand-code answers into one of several categories. At the same time, computer scientists have made impressive advances in text mining that may allow automation of such coding. However, automated algorithms do not achieve an overall accuracy high enough to replace humans entirely. We therefore categorize open-ended questions using text mining for easy-to-categorize answers and humans for the remainder, using expected accuracies to guide the choice of the threshold that separates "easy" from "hard". This 5-year project is funded by the Canadian Social Sciences and Humanities Research Council.
Statistical Software. When the opportunity arises, I enjoy programming. Much of my early programming was in C/C++ (e.g. software for the analysis of computer experiments). Because of time constraints, I have lately focused on add-on programs in Stata that integrate seamlessly with existing Stata commands. One highlight was a Stata plugin for the machine learning technique "boosting" (programmed from scratch in C++). More recently, my Stata program for respondent-driven sampling has been very popular.
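As a rough illustration of the threshold idea in the open-ended questions project above, here is a minimal Python sketch. It is not the project's actual method. It assumes that a text-mining classifier has already produced, for each answer, a predicted probability that its top category is correct, and that these probabilities are reasonably calibrated, so that their average over a set of answers approximates the expected accuracy on that set. The function names `choose_threshold` and `route`, and the example data, are hypothetical.

```python
def choose_threshold(confidences, target_accuracy):
    """Return the lowest confidence cutoff such that the answers at or above
    it have expected accuracy (mean confidence) >= target_accuracy.
    Returns None if no cutoff reaches the target.

    Assumption: confidences are calibrated probabilities of a correct code.
    """
    ranked = sorted(confidences, reverse=True)
    best = None
    running_sum = 0.0
    for i, conf in enumerate(ranked):
        running_sum += conf
        # Only evaluate a cutoff once every answer tied at this
        # confidence value has been included in the running mean.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1] < conf
        if last_of_tie and running_sum / (i + 1) >= target_accuracy:
            best = conf
    return best


def route(answers, confidences, threshold):
    """Split answers into those coded automatically ("easy") and those
    sent to human coders ("hard")."""
    auto, manual = [], []
    for answer, conf in zip(answers, confidences):
        if threshold is not None and conf >= threshold:
            auto.append(answer)
        else:
            manual.append(answer)
    return auto, manual


if __name__ == "__main__":
    # Hypothetical answers and classifier confidences, for illustration only.
    answers = ["a1", "a2", "a3", "a4", "a5"]
    confidences = [0.98, 0.60, 0.95, 0.55, 0.90]
    cutoff = choose_threshold(confidences, target_accuracy=0.90)
    auto, manual = route(answers, confidences, cutoff)
    print("cutoff:", cutoff)
    print("auto-coded:", auto)
    print("sent to humans:", manual)
```

Raising the target accuracy sends more answers to human coders, and lowering it automates more of the work. In practice the trade-off would also depend on coding costs and on how well the classifier's probabilities are calibrated.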

Current Master's Students at Waterloo:
I have several topics for research essays for Master's level students. These topics are all related to my current research. Some topics require data analysis, others may require implementing an algorithm; all are fun (in my opinion). Please contact me directly to learn more.

Prospective Master's Students:
I have no influence on who gets admitted. Please get admitted first, and then I am happy to talk about supervision.

Prospective Ph.D. Students:
If you are interested in working with me, please mention this in your application. Please feel free to contact me directly; this does not increase your chances of being admitted, but it may help you decide whether you would like to work with me. Regular admission procedures apply.
