News Archives

Inference of, for and by the Web - Machine Learning Challenges at Google

October 19, 2004

Date: Tuesday October 19, 2004
Time: 11am-12:15pm
Location: Woodward 149

David Cohn <david.cohn@somerandom.com>
Senior Research Scientist at Google (joint work with 1000 other Googlers)

Abstract: The web is one of the largest and most lucrative data sets in the world. It is also, remarkably, free - anyone can access it, and anyone can add to it. These attributes give rise to unique challenges and opportunities for anyone trying to organize and deliver web-based information. I will discuss a few such challenges faced by Google, including adversarial information retrieval and spelling correction without a "correct" answer. I'll describe how we're applying machine learning and statistics to solve them, what ongoing challenges we face, and what it's like being in the heart of a company searching terabytes of data to serve over 200 million queries a day.

Bio: David Cohn is a senior research scientist at Google, where he works on link analysis and machine learning problems related to search quality. He holds a Ph.D. in Computer Science from the University of Washington. Prior to joining Google, David studied as a postdoc at MIT, served as visiting faculty at the University of Oregon and CMU, and worked for a variety of startups doing everything from workflow management to digital music recommendation. But not all at the same time.