CARNEIRO, G. M.; http://lattes.cnpq.br/4271071140769809; CARNEIRO, Guilherme de Melo.
Abstract:
In large-scale software projects, there is a growing demand to fix defects that are introduced
during development, slip past the tests and quality filters of the Quality Assurance team, and
reach the product's end customers. To document these defects so that they can later be analyzed
and corrected, software engineering relies on documents called Bug Reports (BR).
As pointed out by Anvik et al. [2], new BRs are opened at a high rate in large projects; Eclipse,
for example, was already receiving approximately 190 new BRs per day in 2005. Motivated by this
problem, this study proposes and evaluates a BR recommendation system based on textual similarity,
distinguished by its use of the state-of-the-art language understanding model BERT [3] as one of
the factors in the similarity calculation. Its objective is to improve the suggestion of BRs whose
context is close to the one provided by the maintainer, which is expected to increase the
maintainer's productivity and, consequently, the number of resolved BRs. The results show gains of
approximately 14% in the frequency of relevant BRs among the first 20 recommendations, compared
with the technique that used only TF-IDF as the text vectorization model. Finally, the BERT model
improved the evaluated metrics (precision, feedback, and likelihood) when used as a complement to
TF-IDF, but did not perform well when used on its own.
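
The idea of using a second similarity signal as a complement to TF-IDF can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the whitespace tokenizer, the smoothed IDF formula, the weighted sum controlled by the hypothetical parameter `alpha`, and the function names. The `embeddings` argument stands in for sentence vectors that a model such as BERT would produce; here they are passed in as plain lists so that the example runs with the standard library only.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (assumption; other variants exist).
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine_sparse(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_dense(a, b):
    """Cosine similarity between two dense vectors stored as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(query_idx, docs, embeddings, alpha=0.5, k=20):
    """Rank the other BRs by a weighted sum of the two similarities.

    alpha weights the TF-IDF similarity and (1 - alpha) weights the
    embedding similarity. The weighted sum is a hypothetical combination
    rule chosen for this sketch.
    """
    tfidf = tfidf_vectors(docs)
    scored = []
    for i in range(len(docs)):
        if i == query_idx:
            continue  # do not recommend the query BR itself
        lexical = cosine_sparse(tfidf[query_idx], tfidf[i])
        semantic = cosine_dense(embeddings[query_idx], embeddings[i])
        scored.append((alpha * lexical + (1 - alpha) * semantic, i))
    # Highest score first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]
```

Calling `recommend(0, docs, embeddings, k=20)` returns the indices of up to 20 candidate BRs for the BR at index 0, which mirrors the "first 20 recommendations" cutoff mentioned above. Setting `alpha=1.0` reduces the sketch to a TF-IDF-only ranking, and `alpha=0.0` to an embedding-only ranking.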