MONGIOVI, M.; http://lattes.cnpq.br/7535849756393864; SABINO, Melina Mongiovi Cunha Lima.
Resumo:
Defining and implementing refactorings is a nontrivial task since it is difficult to define preconditions to guarantee that the transformation preserves the program behavior. There fore, refactoring engines may have overly weak preconditions, overly strong preconditions, and transformation issues related to the refactoring definition. In practice, developers manually write test cases to check their refactoring implementations. We find that 84% of the test suites of Eclipse and JRRT are concerned with identifying these kinds of bugs. However, bugs are still present. Researchers have proposed a number of techniques for testing refactoring engines. Nevertheless, they may have limitations related to the bug type, program generation, time consumption, and number of refactoring engines necessary to evaluate the implementations. In this work, we propose a technique to scale testing of refactoring engines by extending a previous technique. It automatically generates programs as test inputs using Dolly, a Java and C program generator. We add more Java constructs in DOLLY, such abstract classes and methods and interface, and a skip parameter to reduce the time to test the refactoring implementations by skipping some consecutive test inputs. Our technique uses SAFEREFACTORIMPACT to identify failures related to behavioral changes. It generates test cases only for the methods impacted by a transformation. Also, we propose a new oracle to evaluate whether refactoring preconditions are overly strong by disabling a subset of them. Finally, we present a technique to identify transformation issues related to the refactoring definition. We evaluate our technique in 28 refactoring implementations of Java (Eclipse and JRRT) and C (Eclipse) and find 119 bugs related to compilation errors, behavioral changes, overly strong preconditions, and transformation issues. The technique reduces the time in 90% and 96% using skips of 10 and 25 in Dolly while missing only 3% and 6% of the bugs, respectively. Additionally, it finds the first failure in general in a few seconds using skips. Finally, we evaluate our proposed technique by using other test inputs, such as the input programs of Eclipse and JRRT refactoring test suites. We find 31 bugs not detected by the developers.