
Technical Program & Schedule

7:00 – 8:00
  • 10/23 Mon: Breakfast (Location: Founders)
  • 10/24 Tue: Breakfast (Location: Founders)
  • 10/25 Wed: Breakfast (Location: Founders)

8:00 – 8:50
  • 10/23 Mon: Keynote by Emily Provost (Location: Conference House)
  • 10/24 Tue: Keynote by Kristen Grauman (Location: Conference House)
  • 10/25 Wed: Open Panel Discussion (Location: Conference House)

8:50 – 10:10
  • 10/23 Mon: Oral Session 1: Spatial Audio (Location: Conference House)
  • 10/24 Tue: Oral Session 3: Music Signal Processing (Location: Conference House)
  • 10/25 Wed: Oral Session 5: Perception and Coding (Location: Conference House)

10:10 – 10:30
  • 10/23 Mon: Coffee Break (Location: Parlor)
  • 10/24 Tue: Coffee Break (Location: Parlor)
  • 10/25 Wed: Coffee Break (Location: Parlor)

10:30 – 12:30
  • 10/23 Mon: Poster Session 1: Music and Audio Signal Processing (Location: Parlor)
  • 10/24 Tue: Poster Session 2: Microphone Arrays and Audio Coding (Location: Parlor); Demo Session (Location: Sunset Lounge)
  • 10/25 Wed: Poster Session 3: Signal Enhancement and Separation (Location: Parlor)

12:30 – 14:00
  • 10/23 Mon: Lunch (Location: Founders)
  • 10/24 Tue: Lunch (Location: Founders)
  • 10/25 Wed: Lunch / Closing (Location: Founders)

14:00 – 16:00
  • 10/23 Mon: Afternoon Break
  • 10/24 Tue: Afternoon Break (AASP TC Meeting at Conference House)

16:00 – 18:00
  • 10/22 Sun: Registration/check-in (Location: Mountain View)
  • 10/23 Mon: Oral Session 2: Source Separation and Enhancement (Location: Conference House)
  • 10/24 Tue: Oral Session 4: Synthesis and Classification (Location: Conference House)

18:00 – 18:15
  • 10/22 Sun: Break
  • 10/23 Mon: Break
  • 10/24 Tue: Break

18:15 – 20:00
  • 10/22 Sun: Opening Remarks & Dinner (Location: Founders), sponsored by Adobe
  • 10/23 Mon: Dinner (Location: Founders), sponsored by Eventide
  • 10/24 Tue: Award Ceremony & Dinner (Location: Founders), sponsored by Reality Labs (Meta)

20:00 – 22:00
  • 10/22 Sun: Welcome Reception (Location: Parlor)
  • 10/23 Mon: Cocktails (Location: Parlor)
  • 10/24 Tue: Cocktails (Location: Parlor)

10/23 Monday

Keynote Speech 1 (8:00 am—8:50 am)

Speaker: Emily Provost
Title: “From Speech to Emotion to Mood: Mental Health Modeling in Real-World Environments”
Abstract: Emotions provide critical cues into our health and wellbeing.  This is of particular importance in the context of mental health, where changes in emotion may signify changes in symptom severity.  However, information about emotion and how it varies over time is often accessible only using survey methodology (e.g., ecological momentary assessment, EMA), which can become burdensome to participants over time.  Automated speech emotion recognition systems could provide an alternative, providing quantitative measures of emotion using acoustic data captured passively from a consented individual’s environment.  However, speech emotion recognition systems often falter when presented with data collected from unconstrained natural environments due to issues with robustness, generalizability, and invalid assumptions.  In this talk,  I will discuss our journey in speech-centric mental health modeling, explaining whether, how, and when emotion recognition can be applied to natural unconstrained speech data to measure changes in mental health symptom severity.

Oral Session 1: Spatial Audio (8:50 am—10:10 am)
Session Chairs: Mark Thomas and Daniele Giacobello
  • Paper ID: 101
    Title: Distribution of Modal Damping in Absorptive Shoebox Rooms
    Authors: Maximilian Schäfer, Karolina Prawda, Rudolf Rabenstein, Sebastian J. Schlecht
  • Paper ID: 106
    Title: Wide-area 6DOF rendering of multi-point Ambisonic recordings based on interpolation of spatial parameters
    Authors: Archontis Politis, Lauros Pajunen, Jussi Leppänen, Sujeet Mate, Antti Eronen
  • Paper ID: 148
    Title: Low-complexity Higher Order Scattering Delay Networks
    Authors: Leny Vinceslas, Matteo Scerbo, Zoran Cvetkovic, Hüseyin Hacıhabiboğlu, Enzo De Sena
  • Paper ID: 169
    Title: MITIGATING CROSS-DATABASE DIFFERENCES FOR LEARNING UNIFIED HRTF REPRESENTATION
    Authors: Yutong Wen, You Zhang, Zhiyao Duan
Poster Session 1: Music and Audio Signal Processing (10:30 am — 12:30 pm)
Session Chair: Kazuyoshi Yoshii
  • Paper ID: 10
    Title: A novel method to detect instrumental music in a large scale music catalog
    Authors: Wo Jae Lee, Emanuele Coviello
  • Paper ID: 20
    Title: Histogram Layer Time Delay Neural Networks for Passive Sonar Classification
    Authors: Jarin Ritu, Ethan Barnes, Riley E Martell, Alexandra Van Dine, Joshua Peeples
  • Paper ID: 23
    Title: Unsupervised Improvement of Audio-Text Cross-Modal Representations
    Authors: Zhepei Wang, Cem Subakan, Krishna Subramani, Junkai Wu, Tiago F Tavares, Fábio Ayres, Paris Smaragdis
  • Paper ID: 25
    Title: Music De-limiter Networks via Sample-wise Gain Inversion
    Authors: Chang-Bin Jeon, Kyogu Lee
  • Paper ID: 27
    Title: Convolutive Block-Matching Segmentation Algorithm with Application to Music Structure Analysis
    Authors: Axel Marmoret, Jeremy Cohen, Frédéric Bimbot
  • Paper ID: 40
    Title: Representation Learning for Audio Privacy Preservation using Source Separation and Robust Adversarial Learning
    Authors: Diep N Luong, Minh Tran, Shayan Gharib, Konstantinos Drossos, Tuomas Virtanen
  • Paper ID: 41 (virtual)
    Title: Neural Networks for Interference Reduction in Multi-track Recordings
    Authors: Rajesh R, Padmanabhan Rajan
  • Paper ID: 44
    Title: SINGLE-CHANNEL SPEAKER DISTANCE ESTIMATION IN REVERBERANT ENVIRONMENTS
    Authors: Michael Neri, Archontis Politis, Daniel A. Krause, Marco Carli, Tuomas Virtanen
  • Paper ID: 58
    Title: Perceptual Musical Similarity Metric Learning with Graph Neural Networks
    Authors: Cyrus Vahidi, Shubhr Singh, George Fazekas, Emmanouil Benetos, Dan Stowell, Huy Phan, Mathieu Lagrange
  • Paper ID: 80
    Title: Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects
    Authors: Shoichi Koyama, Masaki Nakada, Juliano G. C. Ribeiro, Hiroshi Saruwatari
  • Paper ID: 83
    Title: Multi-source direction-of-arrival estimation using group-sparse fitting of steered response power maps
    Authors: Elisa Tengan, Thomas Dietzen, Filip Elvander, Toon van Waterschoot
  • Paper ID: 93
    Title: AUDIO INPUTS FOR ACTIVE SPEAKER DETECTION AND LOCALIZATION VIA MICROPHONE ARRAY
    Authors: Davide Berghi, Philip JB Jackson
  • Paper ID: 113 (virtual)
    Title: ARRAY CONFIGURATION MISMATCH IN DEEP DOA ESTIMATION: TOWARDS ROBUST TRAINING
    Authors: Ayal Schwartz, Elior Hadad, Sharon Gannot, Shlomo E. Chazan
  • Paper ID: 126
    Title: Hyperbolic Unsupervised Anomalous Sound Detection
    Authors: François G Germain, Gordon Wichern, Jonathan LeRoux
  • Paper ID: 130
    Title: Towards on-device keyword spotting using low-footprint Quaternion neural models
    Authors: Aryan Chaudhary, Vinayak Abrol
  • Paper ID: 133
    Title: Pretraining Respiratory Sound Representations Using Metadata and Contrastive Learning
    Authors: Ilyass Moummad, Nicolas Farrugia
  • Paper ID: 147
    Title: Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries
    Authors: Julia Wilkins, Justin Salamon, Magdalena Fuentes, Juan P Bello, Oriol Nieto
  • Paper ID: 150
    Title: Annotating Jazz Recordings using Lead Sheet Alignment with Deep Chroma Features
    Authors: Ivan Shanin, Simon Dixon
  • Paper ID: 155
    Title: Sound source distance estimation in diverse and dynamic acoustic conditions
    Authors: Saksham Singh Kushwaha, Iran R Roman, Magdalena Fuentes, Juan P Bello
  • Paper ID: 172
    Title: Learning Sub-Dimensional HRTF Representations Towards Individualization Applications – Traditional And Deep Learning Approaches
    Authors: Devansh Zurale, Shlomo Dubnov
  • Paper ID: 181
    Title: Automatic Detection of Poor Tone Quality in Classical Guitar Playing Using Deep Anomaly Detection Method
    Authors: Kenta Ogawa, Hidefumi Ohmura, Kouichi Katsurada, Shun Sawada
Oral Session 2: Source Separation and Enhancement (4:00 pm—6:00 pm)
Session Chairs: Timo Gerkmann and Jesper Rindom Jensen
  • Paper ID: 11
    Title: Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations
    Authors: Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani
  • Paper ID: 109
    Title: Relative Transfer Function Vector Estimation for Acoustic Sensor Networks Exploiting Covariance Matrix Structure
    Authors: Wiebke Middelberg, Henri Gode, Simon Doclo
  • Paper ID: 186
    Title: AECSQI: Referenceless Acoustic Echo Cancellation Measures using Speech Quality and Intelligibility Improvement
    Authors: Jin Woo Lee, Hyeong-Seok Choi, Kyogu Lee
  • Paper ID: 78
    Title: Time-Domain Audio Source Separation Based on Gaussian Processes with Deep Kernel Learning
    Authors: Aditya Arie Nugraha, Diego Di Carlo, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii
  • Paper ID: 105
    Title: Exploring the integration of speech separation and recognition with self-supervised learning representation
    Authors: Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe
  • Paper ID: 165
    Title: Complete and separate: Conditional separation with missing target source attribute completion
    Authors: Dimitrios Bralios, Efthymios Tzinis, Paris Smaragdis

10/24 Tuesday

Keynote Speech 2 (8:00 am—8:50 am)

Speaker: Kristen Grauman
Title: “Audio-Visual Learning”
Abstract: Perception systems that can both see and hear have great potential to unlock problems in video understanding, augmented reality, and embodied AI.  I will present our recent work in audio-visual (AV) perception.  First, we explore how audio’s spatial signals can augment visual understanding of 3D environments.  This includes ideas for self-supervised feature learning from echoes, AV floorplan reconstruction, and active source separation, where an agent intelligently moves to hear things better in a busy environment.  Throughout this line of work, we leverage our open-source SoundSpaces platform, which allows state-of-the-art rendering of highly realistic audio (alongside visuals) in real-world scanned environments.  Next, building on these spatial AV ideas, we introduce new ways to enhance the audio stream – making it possible to transport a sound to a new physical environment observed in a photo, or to dereverberate speech so it is intelligible for machine and human ears alike.  Finally, I will overview Ego4D, a massive new egocentric video dataset built via a multi-institution collaboration that supports an array of exciting multimodal tasks.

Oral Session 3: Music Signal Processing (8:50 am—10:10 am)
Session Chairs: Magdalena Fuentes and Gaël Richard
  • Paper ID: 55
    Title: Diff-Pitcher: Diffusion-based Singing Voice Pitch Correction
    Authors: Jiarui Hai, Mounya Elhilali
  • Paper ID: 87
    Title: Leveraging synthetic data for improving chamber ensemble separation
    Authors: Saurjya Sarkar, Louise Thorpe, Emmanouil Benetos, Mark B Sandler
  • Paper ID: 108
    Title: All-In-One Metrical And Functional Structure Analysis With Neighborhood Attentions on Demixed Audio
    Authors: Taejun Kim, Juhan Nam
  • Paper ID: 127
    Title: A Differentiable Acoustic Guitar Model for String-Specific Polyphonic Synthesis
    Authors: Andrew F Wiggins, Youngmoo Kim
Poster Session 2: Microphone Arrays and Audio Coding (10:30 am — 12:30 pm)
(held in parallel with Demo Session 1)
Session Chair: Shoko Araki
  • Paper ID: 2
    Title: SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
    Authors: Yinghao A Li, Cong Han, Nima Mesgarani
  • Paper ID: 7
    Title: Yet Another Generative Model For Room Impulse Response Estimation
    Authors: Sungho Lee, Hyeong-Seok Choi, Kyogu Lee
  • Paper ID: 15 (virtual)
    Title: Region-of-Interest Oriented Constant-Beamwidth Beamforming with Rectangular Arrays
    Authors: Gal Itzhak, Israel Cohen
  • Paper ID: 38
    Title: Hybrid noise shaping for audio coding using perfectly overlapped window
    Authors: Byeongho Jo, Seung-Kwon Beack
  • Paper ID: 46 (virtual)
    Title: Blind Room Acoustic Parameters Estimation using Mobile Audio Transformer
    Authors: Shivam Saini, Jürgen Peissig
  • Paper ID: 59 (virtual)
    Title: DESIGN OF FREQUENCY-INVARIANT BEAMFORMERS WITH SPARSE CONCENTRIC CIRCULAR ARRAYS
    Authors: Yaakov Buchris, Israel Cohen, Alon Amar
  • Paper ID: 79
    Title: Perceptual Quality Enhancement of Sound Field Synthesis Based on Combination of Pressure and Amplitude Matching
    Authors: Keisuke Kimura, Shoichi Koyama, Hiroshi Saruwatari
  • Paper ID: 97
    Title: Temporal Noise Shaping on MDCT Subband Signals for Transform Audio Coding
    Authors: Richard Füg, Bernd Edler
  • Paper ID: 110
    Title: Optimizing Higher-Order Directional Audio Coding with Adaptive Mixing and Energy Matching for Ambisonic Compression and Upmixing
    Authors: Christoph Hold, Leo McCormack, Archontis Politis, Ville T Pulkki
  • Paper ID: 117
    Title: Computing Acoustic Onsets via an Eikonal Solver
    Authors: Samuel F Potter, Monte Hoover, Dmitry Zotkin, Ramani Duraiswami
  • Paper ID: 123
    Title: A Differentiable Image Source Model for Room Acoustics Optimization
    Authors: Bowen Zhi, Alisha Sharma, Dmitry Zotkin, Ramani Duraiswami
  • Paper ID: 128
    Title: Quaternion Anti-Transfer Learning for Speech Emotion Recognition
    Authors: Eric Guizzo, Tillman Weyde, Giacomo Tarroni, Danilo Comminiello
  • Paper ID: 134
    Title: A High-rate Extension to SoundStream
    Authors: Hong-Goo Kang, Jan Skoglund, Andrew Storus, Hengchin Yeh, Bastiaan Kleijn
  • Paper ID: 135
    Title: ESTIMATING THE DIRECTION OF ARRIVAL OF A SPOKEN WAKE WORD USING A SINGLE SENSOR ON AN ELASTIC PANEL
    Authors: Tre DiPassio, Michael Heilemann, Benjamin Thompson, Mark F Bocko
  • Paper ID: 146
    Title: Covariance Blocking and Whitening Method for Successive Relative Transfer Function Vector Estimation in Multi-Speaker Scenarios
    Authors: Henri Gode, Simon Doclo
  • Paper ID: 151
    Title: Robust Audio Anti-spoofing System Based on Low-frequency Sub-band Information
    Authors: Menglu Li, Xiao-Ping Zhang
  • Paper ID: 163
    Title: Inverted Cardioid Topology for Multi-Radius Spherical Microphone Arrays
    Authors: Mark R Thomas, Jan-Hendrik Hanschke
Demo Session 1 (10:30 am — 12:30 pm)
(held in parallel with Poster Session 2)
Session Chair: Kazuyoshi Yoshii
  • Paper ID: 200
    Title: Spatially Selective Deep Non-linear Filters for Real-time Multi-channel Speech Enhancement
    Authors: Kristina Tesch, Timo Gerkmann
  • Paper ID: 201
    Title: TOTAL VARIATION IN VOCALS OVER TIME
    Authors: Elena Georgieva, Brian McFee, Pablo Ripollés
  • Paper ID: 202
    Title: Novel Instrumental Sound Creation Using Creative Adversarial Networks
    Authors: Hiroki Ito, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada
  • Paper ID: 203
    Title: Music Ensemble Separation using Permutation Invariant Training
    Authors: Saurjya Sarkar, Emmanouil Benetos, Mark B Sandler
  • Paper ID: 204
    Title: Music Dissector: An Interactive Music Structure Visualizer
    Authors: Taejun Kim, Juhan Nam
  • Paper ID: 205
    Title: Recognise and Notify Sound Events using a Raspberry PI based Standalone Device
    Authors: Gabriel Bibbó, Arshdeep Singh, Mark D. Plumbley
  • Paper ID: 206
    Title: Acoustic Experiments with 3D Printed Heads
    Authors: Austin Lu, Ethaniel Moore, Kanad Sarkar, Manan Mittal, Ryan M Corey, Andrew C Singer, Paris Smaragdis
  • Paper ID: 207
    Title: Improved portable multiple sound spot synthesis system with a baffled circular array of 16 loudspeakers
    Authors: Takuma Okamoto, Katsushi Ueno, Tsukasa Okabe, Kentaro Tani, Yasuhiko Yoshikata, Miyuki Sudo, Manae Kuwahara, Keita Hikita
  • Paper ID: 212
    Title: Demo of the em64 Eigenmike®
    Authors: mh acoustics
Oral Session 4: Synthesis and Classification (4:00 pm—6:00 pm)
Session Chairs: François Germain and Mark Plumbley
  • Paper ID: 13
    Title: Neural Audio Decorrelation Using Generative Adversarial Networks
    Authors: Carlotta Anemüller, Oliver Thiergart, Emanuel Habets
  • Paper ID: 86
    Title: Differentiable Representation of Warping based on Lie Group Theory
    Authors: Atsushi Miyashita, Tomoki Toda
  • Paper ID: 88
    Title: CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models
    Authors: Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serra, Taylor Berg-Kirkpatrick, Julian McAuley
  • Paper ID: 136
    Title: General Purpose Audio Effect Removal
    Authors: Matthew Rice, Christian J. Steinmetz, George Fazekas, Joshua D. Reiss
  • Paper ID: 107
    Title: COMPRESSING AUDIO CNNS WITH GRAPH CENTRALITY BASED FILTER PRUNING
    Authors: James A King, Arshdeep Singh, Mark D. Plumbley
  • Paper ID: 182
    Title: Class Activation Mapping-Driven Data Augmentation: Masking Significant Regions for Enhanced Acoustic Scene Classification
    Authors: Pil Moo Byun, Jeong-Hwan Choi, Joon-Hyuk Chang

10/25 Wednesday

Open Panel Discussion (8:00 am—8:50 am)

Title: “Future of the audio research community in the era of big AI models”

We are collecting questions from the participants. Please join the #open-panel-discussion channel in the WASPAA 2023 Slack workspace and share your thoughts! The organizers will summarize the questions, designate a first answerer for each one, and welcome follow-up discussion.

Oral Session 5: Perception and Coding (8:50 am—10:10 am)
Session Chairs: Jan Skoglund and Simon Doclo
  • Paper ID: 8
    Title: AN IMPROVED METRIC OF INFORMATIONAL MASKING FOR PERCEPTUAL AUDIO QUALITY MEASUREMENT
    Authors: Pablo M Delgado, Juergen Herre
  • Paper ID: 103
    Title: Predicting thresholds in an auditory overshoot paradigm using a computational subcortical model with efferent feedback
    Authors: Afagh Farhadi, Laurel H. Carney
  • Paper ID: 61
    Title: LACE: A light-weight, causal model for enhancing coded speech through adaptive convolutions
    Authors: Jan Buethe, Jean-Marc Valin, Ahmed Mustafa
  • Paper ID: 183
    Title: Fitting Auditory Filterbanks with Multiresolution Neural Networks
    Authors: Vincent Lostanlen, Daniel Haider, Han Han, Mathieu Lagrange, Peter Balazs, Martin Ehler
Poster Session 3: Signal Enhancement and Separation (10:30 am — 12:30 pm)
Session Chair: Ante Jukić
  • Paper ID: 9
    Title: Multichannel Subband-Fullband Gated Convolutional Recurrent Neural Network For Direction-Based Speech Enhancement With Head-Mounted Arrays
    Authors: Benjamin Stahl, Alois Sontacchi
  • Paper ID: 16 (virtual)
    Title: Deep Adaptation Control for Stereophonic Acoustic Echo Cancellation
    Authors: Amir Ivry, Israel Cohen, Baruch Berdugo
  • Paper ID: 32
    Title: Efficient Deep Acoustic Echo Suppression with Condition-Aware Training
    Authors: Ernst Seidel, Pejman Mowlaee, Tim Fingscheidt
  • Paper ID: 37
    Title: Extending Audio Masked Autoencoders Toward Audio Restoration
    Authors: Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji
  • Paper ID: 43
    Title: Directional Target Speaker Extraction under Noisy Underdetermined Conditions through Conditional Variational Autoencoder with Global Style Tokens
    Authors: Rui Wang, Tomoki Toda
  • Paper ID: 45
    Title: Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase
    Authors: Yoshiki Masuyama, Natsuki Ueno, Nobutaka Ono
  • Paper ID: 60
    Title: Single Channel Speech Presence Probability Estimation Based on Hybrid Global-Local Information
    Authors: Shuai Tao, Yang Xiang, Himavanth Reddy Pundla, Jesper Rindom Jensen, Mads G. Christensen
  • Paper ID: 63
    Title: Consolidating Compression and Revisiting Expansion: An alternative amplification rule for wide dynamic range compression
    Authors: Alice Sokolova, Baris Aksanli, Fredric Harris, Harinath Garudadri
  • Paper ID: 67
    Title: Low bit rate binaural link for improved ultra low-latency low-complexity multichannel speech enhancement in Hearing Aids
    Authors: Nils L Westhausen, Bernd Meyer
  • Paper ID: 74
    Title: Correlation based glimpse proportion index
    Authors: Ahmed Alghamdi, Leonard Moen, Wai-Yip Geoffrey Chan, Daniel Fogerty, Jesper Jensen
  • Paper ID: 85
    Title: The Effect of Spoken Language on Speech Enhancement using Self-Supervised Speech Representation Loss Functions
    Authors: George L Close, Thomas Hain, Stefan Goetze
  • Paper ID: 91
    Title: Analysis of XLS-R for Speech Quality Assessment
    Authors: Bastiaan Tamm, Rik Vandenberghe, Hugo Van hamme
  • Paper ID: 100
    Title: ADAPTIVE SPARSE LINEAR PREDICTION IN FIXED-FILTER ANC HEADPHONE APPLICATIONS FOR MULTI-SPEAKER SPEECH REDUCTION
    Authors: Yurii Iotov, Sidsel Marie Nørholm, Valiantsin Belyi, Mads G. Christensen
  • Paper ID: 114
    Title: An objective evaluation of Hearing Aids and DNN-based speech enhancement in complex acoustic scenes
    Authors: Enric Gusó, Joanna Luberadzka, Martí Baig, Umut Sayin Saraç, Xavier Serra
  • Paper ID: 115
    Title: Slim-TasNet: A Slimmable Neural Network for Speech Separation
    Authors: Mohamed Elminshawi, Srikanth Raj Chetupalli, Emanuel Habets
  • Paper ID: 129
    Title: SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement
    Authors: Martin Strauss, Nicola Pia, Nagashree K S Rao, Bernd Edler
  • Paper ID: 137
    Title: LOCATION AS SUPERVISION FOR WEAKLY SUPERVISED MULTI-CHANNEL SOURCE SEPARATION OF MACHINE SOUNDS
    Authors: Ricardo Falcon Perez, Gordon Wichern, François G Germain, Jonathan LeRoux
  • Paper ID: 152
    Title: Flexible multichannel speech enhancement for noise-robust frontend
    Authors: Ante Jukić, Jagadeesh Balam, Boris Ginsburg
  • Paper ID: 159
    Title: Mixed-delay distributed beamforming for own-speech separation in hearing devices with wireless remote microphones
    Authors: Ryan M Corey
  • Paper ID: 179
    Title: MASKED FREQUENCY MODELING FOR IMPROVING PACKET LOSS CONCEALMENT IN SPEECH TRANSMISSION SYSTEM
    Authors: Da-Hee Yang, Donghyun Kim, Joon-Hyuk Chang
  • Paper ID: 191
    Title: Diffusion Posterior Sampling for Informed Single-Channel Dereverberation
    Authors: Jean-Marie Lemercier, Simon Welker, Timo Gerkmann