http://lattes.cnpq.br/3845232602852992; COSTA, José Aldo Silva da.
Abstract:
As software evolves, developers perform repetitive edits while adding features and fixing
bugs. Programming-by-Example (PbE) techniques automate repetitive edits by inferring
transformations from examples. However, examples are ambiguous and limited, since users want to provide as few of them as possible (preferably one). Thus, PbE techniques need to rank the inferred transformations to select the ones that best fit the user's intent. Common ranking approaches favor the simplest or shortest transformations, or assign weights to specific characteristics (features) of the transformations. However, the ideal weight of each feature varies with the problem domain, and finding these weights requires manual effort and domain-specific knowledge. We propose a Machine Learning (ML) based approach to reduce the manual effort of finding the weights for efficient ranking functions, that is, functions that rank the desired transformation first using the minimum number of examples. Our approach comprises a) a training/testing database, b) feature extraction, c) model training and testing, and d) ranking instantiation. We also investigate the effect of negative examples on the efficiency of the ranking approaches, as well as the accuracy of the top-10 rank positions. We compare five approaches: a) Support Vector Machine (SVM), b) Logistic Regression (LR), c) Neural Networks (NN), d) Human-Expert (HE), and e) Random Weights (RW). We evaluate them in 28 scenarios from five C# projects on GitHub using REFAZER, a technique that learns multiple transformations from examples. We measure each approach's efficiency by counting the examples required to put the correct transformation in the first position, adding negative examples to prevent transformations that edit unneeded locations. As a result, LR showed efficiency similar to HE, with means of 1.67 and 1.64 examples, respectively. Compared to RW, LR shows a statistically significant difference (p-value < 0.05). Concerning effectiveness, LR is similar to HE, both with Precision and NDCG of 0.5, and superior to RW, with 0.2. Therefore, the ML-based ranking approach can be as efficient as HE while reducing the manual effort that PbE tool designers spend finding weights to build ranking functions.
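
The idea of a weight-based ranking function, and of measuring how high the desired transformation lands, can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work: the feature names, weights, bias, and candidate transformations below are hypothetical, and the weights are written by hand here, whereas in an ML-based setting they would come from a trained model such as LR. The sketch scores each candidate with a logistic function over a weighted sum of its features, sorts the candidates, and computes NDCG over the top-10 positions using binary relevance.

```python
import math

# Hypothetical weights for two made-up features:
# [transformation size, number of constants]. Purely illustrative values.
WEIGHTS = [-0.5, -1.0]
BIAS = 0.0

def score(features, weights=WEIGHTS, bias=BIAS):
    """Logistic score of one candidate transformation from its feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def rank(candidates):
    """Return the candidates sorted from highest to lowest score."""
    return sorted(candidates, key=lambda c: score(c["features"]), reverse=True)

def ndcg_at_k(ranked, k=10):
    """NDCG over the first k positions, with gain 1 for the desired transformation."""
    gains = [1.0 if c["desired"] else 0.0 for c in ranked]
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = sorted(gains, reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical candidates inferred from one example; exactly one is desired.
candidates = [
    {"name": "A", "features": [3, 2], "desired": False},
    {"name": "B", "features": [2, 0], "desired": True},
    {"name": "C", "features": [1, 3], "desired": False},
]

ranked = rank(candidates)
position = next(i for i, c in enumerate(ranked, start=1) if c["desired"])
print([c["name"] for c in ranked], position, ndcg_at_k(ranked))
```

With these illustrative weights, the desired candidate receives the highest score, so a single example would suffice in this toy case. With poorly chosen weights it could land lower, and further positive or negative examples would be needed to prune the competing transformations.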