Environment-aware Listener-Optimized Binaural Enhancement of Speech (E-LOBES)

Researchers: Mike Brookes, Patrick Naylor, Alastair Moore, Wei Xue, Leo Lightburn, Stuart Rosen (UCL), Mark Huckvale (UCL), Gaston Hilkhuysen (UCL), Tim Green (UCL)

Age-related hearing loss affects over half the UK population aged over 60. Hearing loss makes communication difficult and so has severe negative consequences for quality of life. The most common treatment for mild-to-moderate hearing loss is the use of hearing aids. However, even with aids, hearing-impaired listeners are worse at understanding speech in noisy environments because their auditory system is less able to separate wanted speech from unwanted noise. One solution is to use speech enhancement algorithms that selectively amplify the desired speech signals while attenuating the unwanted background noise.

It is well known that normal-hearing listeners can better understand speech in noise when listening with two ears rather than with only one. Differences between the signals at the two ears allow the speech and noise to be separated based on their spatial locations, resulting in improved intelligibility. Technological advances now make it feasible to use two hearing aids that share information via a wireless link. By sharing information in this way, it becomes possible for the speech enhancement algorithms within the hearing aids to localize sound sources more accurately and, by jointly processing the signals for both ears, to ensure that the spatial cues that are present in the acoustic signals are retained. The goal of this project is to exploit these binaural advantages by developing speech enhancement algorithms that jointly enhance the speech received by the two ears.
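As a rough illustration of the differences between the ear signals mentioned above, the following sketch estimates an interaural level difference and an interaural time difference from a pair of toy signals. It is not the project's localization method: the function names, the broadband energy ratio, the cross-correlation peak picking, and the synthetic test signal are all assumptions made for this example.

```python
import math

def interaural_level_difference_db(left, right):
    """Broadband level of the left signal relative to the right, in dB."""
    p_left = sum(x * x for x in left)
    p_right = sum(x * x for x in right)
    # Small floor avoids log(0) and division by zero.
    return 10.0 * math.log10((p_left + 1e-12) / (p_right + 1e-12))

def interaural_time_difference(left, right, sample_rate, max_lag):
    """Delay of `right` relative to `left` in seconds, taken as the lag
    (within +/- max_lag samples) that maximises the cross-correlation."""
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += x * right[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate

# Toy source (assumption): a decaying 500 Hz tone sampled at 16 kHz.
sample_rate = 16000
source = [math.sin(2 * math.pi * 500 * n / sample_rate) * math.exp(-n / 20)
          for n in range(64)]

# The right ear receives the source 4 samples later and at half amplitude.
delay = 4
left = source + [0.0] * delay
right = [0.0] * delay + [0.5 * x for x in source]

ild = interaural_level_difference_db(left, right)
itd = interaural_time_difference(left, right, sample_rate, max_lag=8)
```

Here the level difference is 10·log10(4), roughly 6 dB, and the time difference is 4 samples, or 0.25 ms. A practical system would estimate these cues per frequency band and per time frame rather than over a whole signal.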

Most current speech enhancement techniques have evolved from the telecommunications industry and are designed to act only on monaural signals. Many of these techniques can improve the perceived quality of already intelligible speech, but binary masking is one of the few that has been shown to improve the intelligibility of noisy speech for both normal-hearing and hearing-impaired listeners. In the binary masking approach, regions of the time-frequency domain that contain significant speech energy are left unchanged while regions that contain little speech energy are muted. In this project we will extend existing monaural binary masking techniques to provide binaural speech enhancement while preserving the inter-aural time and level differences that are critical for the spatial separation of sound sources.
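The binary masking idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's algorithm. It uses an oracle ("ideal") mask that needs the speech and noise powers separately, which a real hearing aid would not have. The local signal-to-noise criterion, the 0 dB threshold, and the function names are choices made for this example. Applying one shared mask to both ears is one simple way to leave the level ratio between the ears unchanged in the retained cells, and it is likewise an assumption.

```python
import math

def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Return a 0/1 mask: 1 where the local SNR exceeds threshold_db.

    speech_power and noise_power are lists of frames, each a list of
    per-frequency-bin powers (hypothetical inputs; in practice they would
    come from a short-time Fourier transform of each signal).
    """
    mask = []
    for s_frame, n_frame in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_frame, n_frame):
            # Small floor avoids log(0) and division by zero.
            snr_db = 10.0 * math.log10((s + 1e-12) / (n + 1e-12))
            row.append(1 if snr_db > threshold_db else 0)
        mask.append(row)
    return mask

def apply_mask(tf_values, mask):
    """Keep time-frequency cells where the mask is 1 and mute the rest."""
    return [[v * m for v, m in zip(v_frame, m_frame)]
            for v_frame, m_frame in zip(tf_values, mask)]

# Toy example: 2 frames x 3 frequency bins of speech and noise power.
speech = [[4.0, 0.1, 2.0], [0.2, 3.0, 0.1]]
noise = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
mask = ideal_binary_mask(speech, noise)

# Binaural use (assumption): the same mask is applied to both ear signals,
# so retained cells keep their original left/right level ratio.
left = [[5.0, 1.1, 3.0], [1.2, 4.0, 1.1]]
right = [[2.5, 0.55, 1.5], [0.6, 2.0, 0.55]]
left_out = apply_mask(left, mask)
right_out = apply_mask(right, mask)
```

In this toy case the mask keeps the first and third bins of the first frame and the second bin of the second frame. In every retained cell the left value is still twice the right value, as it was before masking.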

To train and tune our binaural speech enhancement algorithm, we will also develop within the project an intelligibility metric that is able to predict the intelligibility of a speech signal for a binaural listener with normal or impaired hearing in the presence of competing noise sources. This metric is the key to automatically finding the optimum settings for an individual listener’s hearing aids in a particular environment.

The final evaluation and development of the binaural enhancement algorithm will assess speech perception in noise in a panel of hearing-impaired listeners who will also be asked to assess the quality of the enhanced speech signals.

Relevant publications from SAP group:

  1. A. H. Moore, C. Evers, P. A. Naylor: Direction of Arrival Estimation in the Spherical Harmonic Domain Using Subspace Pseudointensity Vectors. In: IEEE/ACM Trans. Audio Speech Language Process., 25 (1), pp. 178-192, 2017.
  2. C. S. J. Doire, M. Brookes, P. A. Naylor: Robust and efficient Bayesian adaptive psychometric function estimation. In: J. Acoust. Soc. Am., 141 (4), pp. 2501-2512, 2017.
  3. A. H. Moore, M. Brookes, P. A. Naylor: Robust spherical harmonic domain interpolation of spatially sampled array manifolds. In: Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2017.
  4. L. Lightburn, E. De Sena, A. H. Moore, P. A. Naylor, M. Brookes: Improving the Perceptual Quality of Ideal Binary Masked Speech. In: Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2017.
  5. N. Dionelis, M. Brookes: Modulation-Domain Speech Enhancement using a Kalman Filter with a Bayesian Update of Speech and Noise in the Log-Spectral Domain. In: Proc. Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), San Francisco, CA, USA, 2017.
  6. C. S. J. Doire, M. Brookes, P. A. Naylor, C. M. Hicks, D. Betts, M. A. Dmour, S. Holdt Jensen: Single-Channel Online Enhancement of Speech Corrupted by Reverberation and Noise. In: IEEE/ACM Trans. Audio Speech Language Process., 25 (3), pp. 572-587, 2016.
  7. L. Lightburn, M. Brookes: A Weighted STOI Intelligibility Metric based on Mutual Information. In: Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2016.
  8. A. H. Moore, P. A. Naylor: Linear prediction based dereverberation for spherical microphone arrays. In: Proc. Intl. Workshop on Acoustic Signal Enhancement (IWAENC), 2016.
  9. A. H. Moore, C. Evers, P. A. Naylor: 2D direction of arrival estimation of multiple moving sources using a spherical microphone array. In: Proc. European Signal Processing Conf. (EUSIPCO), 2016.
  10. N. Dionelis, M. Brookes: Active speech level estimation in noisy signals with quadrature noise suppression. In: Proc. European Signal Processing Conf. (EUSIPCO), 2016.
