MOREIRA, D. C.; http://lattes.cnpq.br/5264368962812385; MOREIRA, Danilo Coura.
Abstract:
With the development of telephone networks, their use in support of criminal activity
has become increasingly common. Since a person can be individualized by their vocal
characteristics, this work proposes techniques for the semiautomatic recognition of
speakers' vocal identity in a telephone environment. The aim is to assist criminal
investigations by guiding the attribution of voice authorship, so that the results can
serve as forensic evidence. The speech signal is pre-processed with DC offset removal,
voice activity detection, spectral subtraction, normalization and pre-emphasis. These
techniques aim to minimize the negative effects that the telephone environment has on
the utterances transmitted through it, reducing errors in feature extraction and,
subsequently, in the creation of each speaker's pattern. Mel-Frequency Cepstral
Coefficients (MFCC) were employed for feature extraction because they offer better
processing efficiency and robustness to noise than other feature extraction methods.
The Gaussian Mixture Model (GMM) was used to create the speaker patterns and to perform
classification, because it provides better results in text-independent recognition,
which is required here since the speakers are non-cooperative. To find the best
parameter settings for the semiautomatic system, experiments were performed with an
automatic vocal identity recognition system. In these experiments, a correct
identification rate of up to 87.80% was reached, with a confidence level of 98%.
Lastly, the semiautomatic speaker identification system reached a probability of
99.95% that a given utterance belongs to a given speaker from a set of 30 suspects,
at a confidence level of 98%. The proposed technique can therefore provide, with a
rate close to 100%, a subset of suspect speakers for subsequent forensic analysis.
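
As an illustration only, three of the pre-processing steps named above (DC offset
removal, normalization and pre-emphasis) can be sketched in a few lines of Python.
This is a minimal sketch under stated assumptions, not the implementation used in this
work: the function names, the choice of peak normalization and the pre-emphasis
coefficient of 0.97 are all assumptions, and voice activity detection and spectral
subtraction are left out.

```python
def remove_dc_offset(samples):
    """Subtract the mean so the signal is centered on zero."""
    if not samples:
        return []
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def normalize_peak(samples):
    """Scale the signal so its largest absolute value is 1.0.

    Peak normalization is an assumption; the abstract does not say
    which normalization was used.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]


def pre_emphasis(samples, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is a commonly used value, assumed here.
    """
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def preprocess(samples, alpha=0.97):
    """Apply the three steps in the order listed in the abstract."""
    centered = remove_dc_offset(samples)
    scaled = normalize_peak(centered)
    return pre_emphasis(scaled, alpha)
```

In a full pipeline, the output of a chain like `preprocess` would be split into
frames before MFCC extraction and GMM training.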