FERREIRA, J. M. S.; http://lattes.cnpq.br/4581901484987925; FERREIRA, José Manoel dos Santos.
Abstract:
Bug reports are critical artifacts in software quality assurance. However, bug reporting, whether by testers or users, is costly: it demands a considerable amount of data from the reporter, such as a summary, the steps required to reproduce the problem, expected and actual system behavior, severity and priority, and even attachments (screenshots, videos, or log files). Previous research has highlighted how often these data fields are neglected; in response, several guidelines for writing good reports can be found in the literature. Nevertheless, it is worth assessing the relative impact of those fields on the outcome of the reported bugs, especially the conditions under which they get resolved. In particular, which fields are the most important for helping developers fix a bug? This study investigates a dataset of 69k bug reports extracted from the Bugzilla platform. We evaluate five machine learning models to classify the bug resolution status (among FIXED, INVALID, INCOMPLETE, WONTFIX, WORKSFORME, MOVED, DUPLICATED, and INACTIVE), then determine the features that most influence the FIXED classification. The classification process employs standard ML techniques for model optimization, including class balancing, grouping, and fine-tuning. The Random Forest model performed best, achieving 71.81% precision, 74.46% accuracy, and 72.32% F-measure, with 95% accuracy in classifying FIXED reports. This model also allowed us to identify the most influential fields for resolution prediction. Among the fields considered, those related to textual data, such as summary, description, and comments, ranked among the most important. Furthermore, attachments added through the comments section showed considerable relevance to bug report resolution, as did the changes made throughout the bug report's lifecycle.
Given these results, filling in specific fields of a bug report can significantly assist in fixing the reported bug. Consequently, development teams may benefit from considering these findings to establish priorities during the bug-fixing process and to allocate quality assurance resources more effectively. Moreover, communicating the importance of these fields to reporters before they submit bug reports can lead to more focused and informative submissions and help reporters make better use of their time.
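
For readers unfamiliar with the evaluation measures quoted above, the following minimal sketch shows how accuracy, precision, and F-measure can be computed for a multi-class resolution classifier. It is illustrative only: the toy labels are invented, the function name `classification_metrics` is hypothetical, macro-averaging is an assumption (the abstract does not state which averaging scheme was used), and reading the per-class figure for FIXED as recall is likewise an assumption.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Compute accuracy plus macro-averaged precision, recall, and F-measure.

    Macro-averaging (unweighted mean over classes) is an assumption made
    for this sketch; the study may have used a different averaging scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # how many reports truly belong to each class
    pred_count = Counter(y_pred)   # how many reports were predicted as each class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision, recall, f_measure = {}, {}, {}
    for c in labels:
        p = correct[c] / pred_count[c] if pred_count[c] else 0.0
        r = correct[c] / true_count[c] if true_count[c] else 0.0
        precision[c], recall[c] = p, r
        f_measure[c] = 2 * p * r / (p + r) if (p + r) else 0.0

    def macro(per_class):
        return sum(per_class.values()) / len(per_class)

    return {
        "accuracy": sum(correct.values()) / len(y_true),
        "precision": macro(precision),
        "recall": macro(recall),
        "f_measure": macro(f_measure),
        "per_class_recall": recall,
    }


# Invented toy labels, using three of the resolution statuses named above.
y_true = ["FIXED", "FIXED", "FIXED", "FIXED",
          "INVALID", "INVALID", "WONTFIX", "WONTFIX"]
y_pred = ["FIXED", "FIXED", "FIXED", "INVALID",
          "INVALID", "FIXED", "WONTFIX", "WONTFIX"]

report = classification_metrics(y_true, y_pred)
print(report)
```

On this toy data, six of the eight reports are classified correctly, and three of the four FIXED reports are recovered. Per-class figures such as the one reported for FIXED can differ substantially from the overall averages when the classes are imbalanced, which is why both are worth reporting.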