SANTOS, David Moises Barreto dos.; http://lattes.cnpq.br/5904935830618045
Abstract:
The automated induction of classification models comprises three steps: sampling, splitting and classification. Because many different techniques may be used in each of these steps, the results are highly unstable: no single sampling technique, splitting technique or classification technique is best in all cases.
One way to induce the best classification model within a search space is to
combine all available sampling, splitting and classification techniques and search
exhaustively for the best model. However, this implies a high
computational cost. In this work, we describe how to discover heuristics that help
reduce this computational cost. We also discuss the design and implementation of a
framework, built on the discovered heuristics, that guides the
miner in choosing a classification model that satisfies his or her quality requirements,
expressed as the accuracy of the induced classification model, at an acceptable
computational cost.
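
The exhaustive search mentioned above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the work itself: the toy data set, the two samplers, the two splitters, the two classifiers, and the function name `exhaustive_search` are all hypothetical. It only shows why the cost grows with the product of the number of techniques available at each step.

```python
import random
from collections import Counter
from itertools import product

# Hypothetical toy data: one numeric feature x, label 1 when x >= 50.
DATA = [(x, int(x >= 50)) for x in range(100)]

# --- Sampling techniques (illustrative only) ---
def sample_all(rows):
    return rows

def sample_every_other(rows):
    return rows[::2]

# --- Splitting techniques (illustrative only) ---
def split_holdout(rows, train_fraction=0.7, seed=0):
    rows = rows[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)   # deterministic shuffle
    cut = int(train_fraction * len(rows))
    return rows[:cut], rows[cut:]

def split_alternate(rows):
    return rows[::2], rows[1::2]

# --- Classification techniques (illustrative only) ---
def train_majority(train_set):
    label = Counter(y for _, y in train_set).most_common(1)[0][0]
    return lambda x: label

def train_threshold(train_set):
    zeros = [x for x, y in train_set if y == 0]
    ones = [x for x, y in train_set if y == 1]
    cut = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x >= cut)

SAMPLERS = {"all": sample_all, "every_other": sample_every_other}
SPLITTERS = {"holdout": split_holdout, "alternate": split_alternate}
CLASSIFIERS = {"majority": train_majority, "threshold": train_threshold}

def exhaustive_search(data, samplers, splitters, classifiers):
    """Evaluate every (sampler, splitter, classifier) combination."""
    best = None
    evaluated = 0
    for (s_name, sample), (p_name, split), (c_name, train) in product(
        samplers.items(), splitters.items(), classifiers.items()
    ):
        train_set, test_set = split(sample(data))
        model = train(train_set)
        accuracy = sum(model(x) == y for x, y in test_set) / len(test_set)
        evaluated += 1
        if best is None or accuracy > best[0]:
            best = (accuracy, s_name, p_name, c_name)
    return best, evaluated

best, evaluated = exhaustive_search(DATA, SAMPLERS, SPLITTERS, CLASSIFIERS)
print(best, evaluated)
```

With two techniques per step the loop already induces 2 x 2 x 2 = 8 models; each additional technique at any step multiplies the number of models to induce, which is the cost that heuristics are meant to reduce.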