Keynote Speakers

We are pleased to welcome three distinguished keynote speakers at WASPAA 2015:

  • Emmanuel Vincent, INRIA Nancy, France
  • Masataka Goto, National Institute of Advanced Industrial Science and Technology (AIST), Japan
  • Nima Mesgarani, Electrical Engineering Department, Columbia University, USA

Vincent

Is audio signal processing still useful in the era of machine learning?

(slides)

Emmanuel Vincent, INRIA Nancy, France

 

Audio signal processing has long been the obvious approach to problems such as microphone array processing, active noise control, or speech enhancement. Yet it is increasingly being challenged by black-box machine learning approaches based on, e.g., deep neural networks (DNNs), which have already achieved superior results on certain tasks. In this talk, I will try to convince you that machine learning approaches shouldn’t be disregarded, but that black boxes won’t solve these problems either. There is hence an opportunity for signal processing researchers to join forces with machine learning researchers and solve these problems together. I will provide examples of this multidisciplinary approach for audio source separation and robust automatic speech recognition.

Biography

Emmanuel Vincent is a Research Scientist with Inria (Nancy, France). He received the PhD degree in music signal processing from IRCAM (Paris, France) in 2004 and worked as a Research Assistant with the Centre for Digital Music at Queen Mary, University of London (London, U.K.) from 2004 to 2006. His research focuses on statistical machine learning for speech and audio signal processing, with applications to audio source localization and separation, noise-robust speech recognition, and music language processing. He is a founder of the Signal Separation Evaluation Campaign (SiSEC) series and of the CHiME Speech Separation and Recognition Challenges.


Goto

Frontiers of Music Technologies

Masataka Goto, National Institute of Advanced Industrial Science and Technology (AIST), Japan

 

Music technologies will open up new ways of enjoying music, in terms of both music appreciation and music creation. In this keynote speech, I will introduce the frontiers of music technologies, showing practical examples of how end users can benefit from music signal processing, music understanding technologies, singing synthesis technologies, and music interfaces. For example, a web service for active music listening, “Songle” (http://songle.jp), has analyzed more than 830,000 songs on music- or video-sharing services and facilitates deeper understanding of music as well as music-synchronized control of robot dancers. A web service for large-scale music browsing, “Songrium” (http://songrium.jp), allows users to explore music while seeing and utilizing various relations among more than 680,000 music video clips on video-sharing services. Singing synthesis technologies are opening up new possibilities in music creation. I will conclude by discussing grand challenges.

Biography

Masataka Goto received the Doctor of Engineering degree from Waseda University in 1998. He is currently a Prime Senior Researcher and the Leader of the Media Interaction Group at the National Institute of Advanced Industrial Science and Technology (AIST), Japan. In 1992 he was one of the first to start working on automatic music understanding, and he has since been at the forefront of research in music technologies and music interfaces based on those technologies. Over the past 23 years, he has published more than 200 papers in refereed journals and at international conferences and has received 40 awards, including several best paper and best presentation awards, the Tenth Japan Academy Medal, the Tenth JSPS PRIZE, and the Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology (Young Scientists’ Prize). He has served on the committees of over 90 scientific societies and conferences, including as General Chair of the 10th and 15th International Society for Music Information Retrieval Conferences (ISMIR 2009 and ISMIR 2014). In 2011, as Research Director, he began a 5-year research project on music technologies, the OngaCREST Project, funded by the Japan Science and Technology Agency (CREST, JST).

Mesgarani

Reverse engineering the neural mechanisms involved in robust speech processing

Nima Mesgarani, Electrical Engineering Department, Columbia University, USA

 

The brain empowers humans with remarkable abilities to navigate their acoustic environment in highly degraded conditions. This task, seemingly trivial for normal-hearing listeners, is extremely challenging for individuals with auditory pathway disorders and has proven very difficult to model and implement algorithmically in machines. In this talk, I will present the results of an interdisciplinary research effort in which reverse-engineering methodologies are used to determine the computation and organization of neural responses in the human auditory cortex, leading to new, biologically informed models that incorporate the functional properties of key neural mechanisms. The neural responses are recorded invasively from electrodes surgically implanted on the cortical surface of epilepsy patients, providing a highly detailed view of the neural activity. A better understanding of the neural mechanisms involved in speech processing could greatly impact current models of speech perception and lead to human-like automatic speech processing technologies.

Biography

Nima Mesgarani is an assistant professor of Electrical Engineering at Columbia University. He received his Ph.D. from the University of Maryland, where he worked on neuromorphic speech technologies and the neurophysiology of the mammalian auditory cortex. He was a postdoctoral scholar in the Center for Language and Speech Processing at Johns Hopkins University and in the neurosurgery department of the University of California, San Francisco, before joining Columbia in fall 2013.
