Excerpt from an overview of Google’s big data strategy (via ACM TechNews; free New Scientist subscription required):
Norvig is convinced that speech recognition will fall to the "big data, simple algorithms" approach. The problem is finding enough data, as the spoken word is not represented online as comprehensively as text and images. As we discuss this issue, Norvig makes a revealing admission about the launch of Google Voice, which among other things transcribes phone messages and sends them to your email inbox: "One of the reasons we had this phone service is that we wanted to capture lots of interactions; hear different accents and different voices saying different things."
No human is listening to your messages. Norvig simply means that computers are using the data to improve their ability to transcribe speech. But it's this type of routine processing of personal information that makes some people uneasy about Google's reach into our lives - and helps explain the company's clashes with campaigners for online privacy.