LIMA, I. A.; http://lattes.cnpq.br/6865119422219899; LIMA, Ísis de Andrade.
Abstract:
One of the main problems in the development of filters for speech signals is performance
evaluation. A technique cannot be evaluated solely by the SNR it achieves,
because the quality of the filtered signal is related to its intelligibility. Subjective evaluations
are also not conclusive.
This dissertation presents a comparative evaluation of finite impulse response (FIR) optimal
Wiener filters and sub-optimal filters, the latter allowing a trade-off between noise reduction
and distortion insertion through a parameter α. The comparison is based on the error rate of
an automatic speech recognition (ASR) system.
The 20th-order filters were implemented with an analysis window of 20 ms (over which the
speech signal can be considered stationary). The sub-optimal filter was tested for α = 0.5,
α = 0.7 and α = 0.8. The large vocabulary decoder Julius was chosen for the ASR system.
Hidden Markov Models (HMMs) and an N-gram language model for Brazilian Portuguese were
used for acoustic and language modeling, respectively.
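The abstract does not state how the sub-optimal filter is defined, so the following is only a minimal sketch under a loudly stated assumption: the taps are taken as h = u - α R_y⁻¹ R_v u, where R_y and R_v are the autocorrelation matrices of the noisy frame and of the noise, and u = [1, 0, ..., 0]ᵀ. Under this assumption α = 1 gives the Wiener filter and α = 0 leaves the signal unfiltered. The function names, the 16 kHz sampling rate (320 samples per 20 ms frame), the reading of "20th order" as 20 taps, and the availability of a separate noise frame are all hypothetical choices made for illustration.

```python
import numpy as np

def autocorr_matrix(x, order):
    """Biased autocorrelation estimate arranged as an order x order Toeplitz matrix."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order)])
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    return r[lags]

def suboptimal_wiener_taps(noisy_frame, noise_frame, order=20, alpha=0.7):
    """FIR taps h = u - alpha * R_y^{-1} R_v u (assumed parameterization)."""
    R_y = autocorr_matrix(noisy_frame, order)
    R_v = autocorr_matrix(noise_frame, order)
    u = np.zeros(order)
    u[0] = 1.0
    # Small diagonal loading keeps the solve well conditioned (an added safeguard).
    R_y = R_y + 1e-8 * (np.trace(R_y) / order) * np.eye(order)
    return u - alpha * np.linalg.solve(R_y, R_v @ u)

def filter_frame(noisy_frame, taps):
    """Causal FIR filtering of one frame, truncated to the frame length."""
    return np.convolve(noisy_frame, taps)[: len(noisy_frame)]

if __name__ == "__main__":
    fs = 16000                      # assumed sampling rate
    n = int(0.020 * fs)             # 20 ms analysis window
    rng = np.random.default_rng(0)
    noise = 0.1 * rng.standard_normal(n)
    noisy = np.sin(2 * np.pi * 200 * np.arange(n) / fs) + noise
    for alpha in (0.5, 0.7, 0.8, 1.0):
        taps = suboptimal_wiener_taps(noisy, noise, alpha=alpha)
        print(alpha, filter_frame(noisy, taps).shape)
```

In practice the noise statistics would have to be estimated, for example from speech-free segments; the sketch sidesteps that by using the known noise frame.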
The tests were performed with 20 sentences from different speakers, totaling 146 words.
The percentage of correctly recognized words was obtained for the clean speech signals, for
signals corrupted by additive white Gaussian noise (AWGN) at SNRs of 20 dB, 15 dB, 10 dB,
5 dB, 3 dB and 0 dB, and for the filtered signals.
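As an illustration of how a test signal at a prescribed SNR can be produced, the sketch below scales white Gaussian noise so that the ratio of signal power to noise power matches the target in dB. The function name `add_awgn` and the use of the whole-utterance average power as the reference are assumptions; the dissertation may have measured signal power differently.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise at the requested SNR (in dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

if __name__ == "__main__":
    t = np.arange(16000) / 16000.0          # one second at an assumed 16 kHz
    clean = np.sin(2 * np.pi * 220 * t)     # stand-in for a speech signal
    for snr_db in (20, 15, 10, 5, 3, 0):
        noisy = add_awgn(clean, snr_db, rng=np.random.default_rng(1))
        measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
        print(snr_db, round(measured, 2))
```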
To evaluate the distortion caused by filtering alone, filtered versions of the clean speech
signals were processed by the recognizer, and it was observed that the error rate decreases as
the parameter α is reduced (the Wiener filter corresponds to α = 1).
Based on the analysis of recognition results for different values of SNR, the sub-optimal
filter with α = 0.7 produces the best recognition rate for a given AWGN level
among the four designed filters. The observed improvement was 10% for the lowest SNR and
14% for the highest SNR evaluated.
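The recognition rates above are percentages of correctly recognized words. A minimal sketch of one way to compute such a figure is given below: the reference and recognized word sequences are aligned by minimum edit distance, and the words that match in the alignment are counted. The function name `percent_correct` and the tie-breaking rule (prefer the alignment with more matches) are assumptions, and the scoring tool actually used with Julius may align differently.

```python
def percent_correct(reference, hypothesis):
    """Percentage of reference words matched in a minimum-edit-distance alignment."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Each cell holds (edit_cost, hits); minimise cost, break ties by more hits.
    prev = [(j, 0) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        curr = [(i, 0)]
        for j in range(1, len(hyp) + 1):
            match = ref[i - 1] == hyp[j - 1]
            diag = (prev[j - 1][0] + (0 if match else 1),
                    prev[j - 1][1] + (1 if match else 0))
            deletion = (prev[j][0] + 1, prev[j][1])
            insertion = (curr[j - 1][0] + 1, curr[j - 1][1])
            curr.append(min(diag, deletion, insertion,
                            key=lambda cell: (cell[0], -cell[1])))
        prev = curr
    hits = prev[-1][1]
    return 100.0 * hits / len(ref)

if __name__ == "__main__":
    # Hypothetical sentences, not taken from the test set.
    print(percent_correct("o filtro reduz o ruido", "o filtro reduz ruido"))
```

Summing the matched words and the reference lengths over all sentences before dividing would give a corpus-level rate, as opposed to averaging per-sentence percentages.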