Computational Modelling Group

Seminar: 21st January 2014, noon, Building 7, Room 3027 - Highfield Campus

Understanding Video and Audio at Google

Tom Walters
Google

Submitter
Luke Goater

The next seminar of the Hearing and Balance Centre in ISVR will be given by Tom Walters from Google.

Title: Understanding Video and Audio at Google

Abstract:

Google’s mission is to organise the world’s information and make it universally accessible and useful. An enormous chunk of the world’s information is in the form of video and audio, so systems that can efficiently search, index and understand these forms of content are crucial. In this talk, I’ll discuss research on video and audio understanding. Starting with some low-level hand-engineered audio features designed for sound search and cover-song detection, I’ll talk about how such features are used in a large-scale video tag suggestion task. I’ll also present some recent work done at Google on distributed training of deep neural networks, and talk about how such systems might make the hand-engineering of audio and video features obsolete.

Bio: Tom is a Research Scientist at Google. He currently works in Zurich on improving YouTube’s Content ID system. Previously he was part of the Machine Perception group in Google Research in Mountain View, CA, where he worked on content-based audio analysis for tasks like auditory scene understanding and music recommendation. Tom received a BA and MSci in Natural Sciences from the University of Cambridge, and completed a PhD on Computational Auditory Models at the Centre for the Neural Basis of Hearing in the Department of Physiology, Development and Neuroscience in Cambridge.