Matthias Schonlau

I am a professor of statistics at the University of Waterloo in Ontario, Canada. I am spending the 2015/2016 academic year on sabbatical at the University of Auckland. Prior to becoming a professor, I was a statistician at RAND and head of the RAND Statistical Consulting Service. Here are my colleagues in the RAND Statistics group. You can email me at schonlau at uwaterloo dot ca. I spent the 2009/2010 academic year on sabbatical at the German Institute for Economic Research (DIW) in Berlin, Germany. The DIW hosts SOEP, the longest-running household panel in Germany. The sabbatical was made possible in cooperation with the Max Planck Institute for Human Development (MPIB). Before joining RAND's Pittsburgh office in 2005, I worked at RAND's Santa Monica headquarters. From 1997 to 1999, I held a joint appointment with the National Institute of Statistical Sciences and AT&T Labs - Research. In 1997, I graduated from the University of Waterloo.

NISS distinguished alumni award
I am thrilled to announce I won the 2018 NISS distinguished alumni award. NISS is the US-based National Institute of Statistical Sciences. The citation reads: "Honoring his distinguished career as a research statistician in both industry and academia, especially his contributions in the area of survey methodology." Full announcement.

Research Interests:
survey methodology, application of text mining to open-ended questions, occupation coding, statistical software (C++/Stata)

Current Research Projects:
Semi-automatic analysis of open-ended questions. Text data from open-ended questions in surveys are difficult to analyze and are frequently ignored. Yet open-ended questions are important because they do not constrain respondents' answer choices. Where open-ended questions are necessary, multiple human coders sometimes hand-code answers into one of several categories. At the same time, computer scientists have made impressive advances in text mining that may allow automation of such coding. However, automated algorithms do not achieve an overall accuracy high enough to replace humans entirely. We therefore categorize answers to open-ended questions using text mining for easy-to-categorize answers and human coders for the remainder, using expected accuracies to guide the choice of the threshold that separates "easy" from "hard" answers. This 5-year project is funded by the Canadian Social Sciences and Humanities Research Council.
Statistical Software. When the opportunity arises, I enjoy programming. Much of my early programming was in C/C++ (e.g., software for the analysis of computer experiments). Because of time constraints, I have lately focused on add-on programs in Stata that integrate seamlessly with existing Stata commands. These include plugins for gradient boosting, support vector machines, random forests, and n-gram variables. Possibly my most popular Stata program is my implementation of respondent-driven sampling.
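
The threshold idea in the semi-automatic coding project above can be illustrated with a minimal Python sketch. It is illustrative only and is not the project's actual method. It assumes that a text classifier has already produced one confidence score per answer (the estimated probability that its predicted category is correct) and that these scores are reasonably well calibrated, so that their mean approximates the expected accuracy of the automatically coded set. The function names `choose_threshold` and `route` and the example scores are hypothetical.

```python
from typing import List, Tuple


def choose_threshold(confidences: List[float], target_accuracy: float) -> float:
    """Return the lowest confidence cutoff such that the answers at or above
    it have a mean confidence (expected accuracy) of at least target_accuracy.

    Returns a value above 1.0 if no answer can be coded automatically.
    """
    threshold = 1.1  # sentinel: no answer qualifies for automatic coding
    # Try cutoffs from the most to the least confident distinct score.
    for cutoff in sorted(set(confidences), reverse=True):
        auto = [c for c in confidences if c >= cutoff]
        if sum(auto) / len(auto) >= target_accuracy:
            threshold = cutoff
        else:
            # Lowering the cutoff further can only lower the mean confidence.
            break
    return threshold


def route(confidences: List[float], threshold: float) -> Tuple[List[int], List[int]]:
    """Split answer indices into 'easy' (automatic) and 'hard' (human) sets."""
    auto = [i for i, c in enumerate(confidences) if c >= threshold]
    human = [i for i, c in enumerate(confidences) if c < threshold]
    return auto, human


if __name__ == "__main__":
    # Hypothetical confidence scores for five survey answers.
    scores = [0.99, 0.95, 0.90, 0.70, 0.55]
    cutoff = choose_threshold(scores, target_accuracy=0.90)
    easy, hard = route(scores, cutoff)
    print("cutoff:", cutoff)
    print("coded automatically:", easy)
    print("sent to human coders:", hard)
```

Raising `target_accuracy` sends more answers to human coders, and lowering it automates more of the coding, so the threshold makes the trade-off between coding cost and accuracy explicit.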

Current Master's Students at Waterloo:
I have several topics for research essays for Master's level students. These topics are all related to my current research. Some topics require data analysis, others may require implementing an algorithm, and all are fun (in my opinion). Please contact me directly to learn more.

Prospective Master's Students:
I have no influence on who gets admitted. Please get admitted first, and then I am happy to talk about supervision.

Prospective Ph.D. Students:
If you are interested in working with me, please mention this in your application. Please feel free to contact me directly. (This does not increase your chances of getting admitted, but it may help you decide whether you would like to work with me.) Regular admission procedures apply.
