I am a professor of statistics at the University of Waterloo in Ontario, Canada. I am spending the 2015/2016 academic year on sabbatical at the University of Auckland. Prior to becoming a professor I was a statistician at RAND and head of the RAND Statistical Consulting Service. Here are my colleagues in the RAND Statistics group. You can email me at schonlau at uwaterloo dot ca. I spent the academic year 2009/2010 on sabbatical at the German Institute for Economic Research (DIW) in Berlin, Germany. The DIW hosts the longest-running household panel in Germany, SOEP. The sabbatical was made possible in cooperation with the Max Planck Institute for Human Development (MPIB). Before joining RAND's Pittsburgh office in 2005, I worked at RAND's Santa Monica headquarters. From 1997 to 1999 I held a joint appointment with the National Institute of Statistical Sciences and AT&T Labs - Research. In 1997, I graduated from the University of Waterloo.
Elected Fellow of the American Statistical Association
I am thrilled to announce I was elected Fellow of the American Statistical Association (2020). Each year 0.3% of the membership is honored in this way. The citation reads: "For notable contributions to survey methodology in both industry and academia, for serving as a connector between statistics and the social sciences via accessible publications, education, and software, and for service to the profession."
Research Interests: survey methodology, application of natural language processing to open-ended questions, occupation coding, statistical software (Python/Stata)
Current Research Projects:
Open-ended Questions in Surveys. Text data from open-ended questions in surveys are difficult to analyze and are frequently ignored. Yet open-ended questions are important because they do not constrain respondents' answer choices. Where open-ended questions are necessary, sometimes multiple human coders hand-code answers into one of several categories. At the same time, computer scientists have made impressive advances in natural language processing that allow automation of such coding. Past work includes semi-automatic categorization of open-ended questions where automated algorithms do not achieve an overall accuracy high enough to entirely replace humans, intercoder disagreements in statistical learning algorithms when training data are double coded, occupation coding ("What is your job?"), analysis of final comments at the end of surveys ("Do you have any other comment?"), and some active learning for text data. Different aspects of this work are continuing, including voice captures of open-ended answers. My research has been funded by the Canadian Social Sciences and Humanities Research Council (SSHRC).
Statistical Software. When the opportunity arises I enjoy programming. Much of my early programming was in C/C++ (e.g. software for the analysis of computer experiments). Because of time constraints I have lately focused on add-on programs in Stata that seamlessly integrate with existing Stata commands. These include plugins for gradient boosting, support vector machines, random forests, and n-gram variables. Possibly my most popular Stata program is my implementation of respondent-driven sampling.
Prospective Master's Students:
I have no influence on who gets admitted. Please get admitted first, and then I am happy to talk about supervision.
Prospective Ph.D. Students:
If you are interested in working with me, please mention this in your application. Please feel free to contact me directly (this does not increase your chances of getting admitted, but may help you decide whether you would like to work with me). Regular admission procedures apply.