Tutorial 6: Linear and Parametric Microphone Array Processing
Monday, May 27, 9 am-12 noon
Emanuël Habets, Sharon Gannot
A major challenge in modern acoustic communication systems is the acquisition of the sound of interest. In the last decade, spatial signal processing has been shown to provide useful and viable solutions. Especially in adverse acoustic conditions, such as reverberant and noisy environments, the use of multiple microphones has been shown to provide significant advantages compared to the use of a single microphone. This tutorial will briefly review room acoustics in order to explain the properties of different sound fields. It will then outline current and emerging techniques for spatial signal processing. In particular, the problem of acquiring an estimate of a desired sound will be addressed. This problem will be tackled from the perspective of (distributed and non-distributed) linear spatial processing and parametric spatial processing and will be supported by numerous audio examples. The tutorial will close with an outlook, highlighting important open questions and promising research directions.
The proposed tutorial is intended to be of relevance to researchers and development engineers working in many aspects of acoustic signal processing, optimal estimation and adaptive filtering. It will include a brief review of the relevant aspects of room acoustics and then present the state of the art of the various spatial processing approaches available for acoustic communication, divided into two categories, viz. linear spatial processing and parametric spatial processing. This division breaks the algorithms down into a manageable number of types, enabling participants to quickly gain an overview and understanding of the subject area and the relevant applications. At appropriate points in the tutorial, recent advances beyond the state of the art will also be described and new research results presented.
The following categories will be covered:
I. Linear Spatial Processing
The first part of this tutorial will summarize the state of the art in spatial speech processing. Several optimization criteria will be presented, highlighting their different attributes. Both closed-form and adaptive solutions will be introduced. We then explore blind and semi-blind estimation techniques that enable the design of feasible beamformers using only the available microphone signals. Next, we address the important role of single-channel postfilters, designed to further enhance the beamformer's output. We conclude by discussing robustness and performance issues.
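To give a concrete flavor of linear spatial processing, the sketch below implements the simplest beamformer, delay-and-sum: each channel is time-aligned toward the source and the channels are averaged, which attenuates uncorrelated noise. This is an illustrative toy (integer sample delays, known geometry), not material from the tutorial itself.

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Align each channel by its integer sample delay and average.

    signals: array of shape (num_mics, num_samples).
    delays:  per-mic arrival delay in samples relative to the reference mic.
    """
    num_mics, num_samples = signals.shape
    out = np.zeros(num_samples)
    for m in range(num_mics):
        # Advance each channel to undo its propagation delay before summing.
        out += np.roll(signals[m], -delays[m])
    return out / num_mics

# Toy example: a sinusoid arriving with known per-mic delays plus white noise.
rng = np.random.default_rng(0)
fs, n = 8000, 4000
t = np.arange(n) / fs
clean = np.sin(2 * np.pi * 440 * t)
delays = [0, 3, 7, 12]
signals = np.stack([np.roll(clean, d) + 0.5 * rng.standard_normal(n)
                    for d in delays])

enhanced = delay_and_sum(signals, delays)

# Averaging M aligned channels reduces uncorrelated noise power by roughly 1/M.
err_single = np.mean((signals[0] - clean) ** 2)
err_beam = np.mean((enhanced - clean) ** 2)
```

The optimization criteria covered in the tutorial (e.g., minimum-variance designs) generalize this idea by weighting channels to suppress directional interference as well, rather than averaging uniformly.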
II. Distributed Linear Spatial Processing
Recent technological advances facilitate the concept of distributed, randomly placed microphone arrays. A distributed microphone array is composed of multiple sub-arrays (nodes), each of which consists of several microphones, a signal processing unit and a wireless communication module. The large spatial spread of such microphone constellations increases the probability that a subset of the microphones is close to a relevant sound source. These new distributed structures also raise new challenges. In this part of the tutorial we will explore performance bounds of such distributed microphone arrays and introduce novel algorithms that satisfy the optimality criteria under constraints on the communication bandwidth between nodes.
III. Parametric Spatial Processing
Parametric spatial processing is a promising and emerging technique that is fundamentally different from traditional spatial processing. First, a relatively simple sound field model is adopted and its parameters, i.e., the direction of arrival and the diffuseness, are estimated in the time-frequency domain. Second, the estimated parameters are used to process the received microphone signals. This compact and efficient representation of the sound field can be used to develop algorithms for directional and spatial filtering and for source localization. In this tutorial, two distinct sound field models and the corresponding parameter estimation techniques will be presented. We will then present an algorithm for directional filtering and dereverberation. Finally, we show how one of the parametric representations can be used to create a virtual microphone signal with a pre-defined pickup pattern.
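The two-step idea above (estimate model parameters per time-frequency bin, then filter with them) can be sketched minimally as follows: the direction of arrival is read off the inter-channel phase of a two-microphone pair under a far-field, free-field plane-wave assumption, and a soft mask keeps only bins whose estimated direction lies near a target. All function names and parameters here are illustrative, not from the tutorial material.

```python
import numpy as np

def doa_per_bin(X1, X2, freqs, mic_dist, c=343.0):
    """Estimate a direction of arrival per time-frequency bin from the
    inter-channel phase of two microphones (far-field, free-field model).

    X1, X2: complex STFT coefficients of shape (num_freqs, num_frames).
    freqs:  frequency of each bin in Hz; mic_dist: spacing in meters.
    """
    phase = np.angle(X2 * np.conj(X1))
    # phase = 2*pi*f * (d/c) * sin(theta)  =>  solve for sin(theta).
    sin_theta = phase * c / (2.0 * np.pi * freqs[:, None] * mic_dist)
    return np.arcsin(np.clip(sin_theta, -1.0, 1.0))

def directional_mask(theta, target, width):
    """Soft spectral mask keeping bins whose estimated DOA lies near target."""
    return np.exp(-0.5 * ((theta - target) / width) ** 2)

# Toy example: a plane wave from 30 degrees at 1 kHz observed by two
# microphones 2 cm apart (synthetic STFT coefficients).
rng = np.random.default_rng(1)
freqs = np.array([1000.0])
mic_dist, theta0 = 0.02, np.deg2rad(30.0)
X1 = rng.standard_normal((1, 5)) + 1j * rng.standard_normal((1, 5))
X2 = X1 * np.exp(1j * 2.0 * np.pi * freqs[:, None] * mic_dist
                 * np.sin(theta0) / 343.0)

theta_hat = doa_per_bin(X1, X2, freqs, mic_dist)
mask = directional_mask(theta_hat, theta0, np.deg2rad(10.0))
filtered = mask * X1  # directional filtering: attenuate off-target bins
```

The models presented in the tutorial are richer than this (they include a diffuseness parameter, for instance), but the estimate-then-filter structure is the same.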
IV. Joint Linear and Parametric Spatial Processing
In the final part, it is shown how linear and parametric spatial processing can be combined. This joint processing yields increased robustness compared to purely linear spatial processing and improved performance (e.g., interference reduction) compared to purely parametric spatial processing.
The presenters are leading researchers in this area and have recently edited/co-authored a textbook titled Speech Processing in Modern Communication: Challenges and Perspectives, published by Springer.
The mode of presentation of the tutorial will be to split the material between the two presenters in order to maintain variety and to support slides with a number of illustrative audio demonstrations so as to keep the tutorial lively, enjoyable and informative.
Emanuël Habets is an Associate Professor at the International Audio Laboratories Erlangen (a joint institution of the Friedrich-Alexander University of Erlangen-Nuremberg and Fraunhofer IIS) and Chief Scientist Spatial Audio Processing at Fraunhofer IIS, Germany. He received the B.Sc. degree in electrical engineering from the Hogeschool Limburg, The Netherlands, in 1999, and the M.Sc. and Ph.D. degrees in electrical engineering from the Technische Universiteit Eindhoven (TU/e), The Netherlands, in 2002 and 2007, respectively.
From March 2007 until February 2009, he was a postdoctoral fellow at the Technion - Israel Institute of Technology and at the Bar-Ilan University in Ramat-Gan, Israel. In 2009, he was awarded a Marie Curie Intra-European Fellowship for Career Development. From February 2009 until November 2010, he was a Member of the Research Staff in the Communication and Signal Processing Group at Imperial College London, United Kingdom.
Dr. Habets is a co-author of the SpringerBriefs volume "Speech Enhancement in the STFT Domain". He was a member of the organizing committee of the 2005 International Workshop on Acoustic Echo and Noise Control (IWAENC) and is a member of the IEEE Signal Processing Society Technical Committee on Audio and Acoustic Signal Processing. He is a general co-chair of the International Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) to be held at the Mohonk Mountain House, New Paltz, New York in 2013.
His research interests are in the areas of speech and audio signal processing, and he has worked in particular on speech dereverberation, microphone array processing, echo cancellation and suppression, acoustic system identification and equalization, and localization and tracking of stationary and moving acoustic sources.
Sharon Gannot is an Associate Professor in the Faculty of Engineering at Bar-Ilan University, Israel. He received his B.Sc. degree (summa cum laude) from the Technion - Israel Institute of Technology, in 1986 and the M.Sc. (cum laude) and Ph.D. degrees from Tel-Aviv University, Israel in 1995 and 2000, respectively, all in electrical engineering.
In the year 2001 he held a post-doctoral position at the department of Electrical Engineering (ESAT) at K.U.Leuven, Belgium. From 2002 to 2003 he held a research and teaching position at the Faculty of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel.
Dr. Gannot is the recipient of the Bar-Ilan University outstanding lecturer award for the year 2010. He is a co-editor of the Speech Enhancement section of the Springer Handbook of Speech Processing (Springer, 2008), and a co-editor of Speech Processing in Modern Communication: Challenges and Perspectives (Springer, 2010). Dr. Gannot serves as an Associate Editor of the IEEE Transactions on Audio, Speech and Language Processing, and is a member of the IEEE Audio and Acoustic Signal Processing Technical Committee. He served as an Associate Editor of the EURASIP Journal on Advances in Signal Processing in 2003-2011, an editor of two special issues on multi-microphone speech processing of the same journal, and a guest editor of the Elsevier Speech Communication journal. He has been a member of the Technical and Steering Committee of the International Workshop on Acoustic Echo and Noise Control (IWAENC) since 2005 and was the general co-chair of IWAENC 2010 held in Tel-Aviv, Israel. He is a general co-chair of the International Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) to be held at the Mohonk Mountain House, New Paltz, New York in 2013.
His research interests include parameter estimation, statistical signal processing and speech processing using either single- or multi-microphone arrays and in particular, speaker extraction, noise reduction, dereverberation and speaker localization in adverse conditions.