Office: 8103 Gates & Hillman Centers
I focus on the use of machine learning to solve real-world problems. I am particularly drawn to problems involving natural language, forecasting, or health. I have worked for several decades on statistical modeling of natural language, with direct application to virtually all areas of language technology, including speech recognition, machine translation, information extraction, summarization, and question answering. I released the very first publicly available language modeling toolkit in 1997; it has been used by more than 100 industrial and academic NLP R&D centers in more than 20 countries.
My most recent work is on forecasting epidemics. Our Delphi research group (delphi.midas.cs.cmu.edu) is the only group to have participated (and done very well) in all epidemic forecasting challenges organized by the U.S. government to date (influenza, dengue, chikungunya). The U.S. Centers for Disease Control and Prevention (U.S. CDC) named us “Most Accurate Forecaster” for the most recent flu season.
I also have an interest in data science education. I have been teaching machine learning for 21 years to very diverse groups of students and other audiences. I have also developed, and continue to teach, a course on the statistics of natural language and the specific challenges of modeling in sparse and categorical domains. See case studies.