Retrieving Signals in the Frequency Domain with Deep Complex Extractors

Chiheb Trabelsi, Olexa Bilaniuk, Ousmane Amadou Dia, Ying Zhang, Mirco Ravanelli, Jonathan Binas, Negar Rostamzadeh, Christopher Pal

December 2019

Abstract

Recent advances have made it possible to create deep complex-valued neural networks. Despite this progress, the potential power of fully complex intermediate computations and representations has not yet been explored for many challenging learning problems. Building on recent advances, we propose a novel mechanism for extracting signals in the frequency domain. As a case study, we perform audio source separation in the Fourier domain. Our extraction mechanism could be regarded as a local ensembling method that combines a complex-valued convolutional version of Feature-Wise Linear Modulation (FiLM) and a signal averaging operation. We also introduce a new explicit amplitude and phase-aware loss, which is scale and time invariant, taking into account the complex-valued components of the spectrogram. Using the Wall Street Journal Dataset, we compare our phase-aware loss to several others that operate both in the time and frequency domains and demonstrate the effectiveness of our proposed signal extraction method and proposed loss. When operating in the complex-valued frequency domain, our deep complex-valued network substantially outperforms its real-valued counterparts even with half the depth and a third of the parameters. Our proposed mechanism improves significantly deep complex-valued networks’ performance and we demonstrate the usefulness of its regularizing effect.

Type

Workshop

Publication

Workshop at the Neural Information Processing Systems (NeurIPS)

Machine Learning

Christopher Pal

Distinguished Scientist

Distinguished Scientist at AI Research Partnerships & Ecosystem located at Montreal, QC, Canada.