TOEBE, J.; http://lattes.cnpq.br/3080787144714822; TOEBE, Josué.
Abstract:
A data mining process is a series of steps for obtaining knowledge from data. These steps include data preparation, random data sampling, knowledge induction from the samples using induction algorithms, and estimation of the accuracy of the induced knowledge. The complexity of the process stems from the many interrelated variables involved in these steps. In this work, we propose a data mining process to guide the miner through the mining task. It takes into account the diversity of sample selection and preparation techniques, as well as the existence of various induction algorithms. To reduce its cost, the process is driven by heuristics that help discover the combination of techniques that best fits a given dataset. The process is conceived as an object-oriented, process-driven framework. Its infrastructure is designed so that when new pieces are created (a new inducer, a new preparation technique, or a new sampling technique), they can be incorporated with minimal impact on the other pieces in the framework. To illustrate its application, experiments that follow the process are described.
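As an illustration only, the pluggable structure described above can be sketched in Python. Everything in this sketch is an assumption made for the example and is not taken from the work itself: the interface names `Preparer`, `Sampler`, and `Inducer`, the concrete classes, the toy dataset, and the function `best_combination`. In particular, the abstract does not state which heuristics drive the process, so the sketch substitutes a plain exhaustive search over all combinations.

```python
from abc import ABC, abstractmethod
from collections import Counter
from itertools import product
import random


class Preparer(ABC):
    @abstractmethod
    def prepare(self, rows):
        """Return a cleaned copy of rows, each a (value, label) pair."""


class Sampler(ABC):
    @abstractmethod
    def split(self, rows, rng):
        """Return (train, test) lists drawn at random from rows."""


class Inducer(ABC):
    @abstractmethod
    def induce(self, train):
        """Return a classifier: a function mapping a value to a label."""


class DropMissing(Preparer):
    def prepare(self, rows):
        return [(x, y) for x, y in rows if x is not None]


class Holdout(Sampler):
    def __init__(self, train_fraction=0.7):
        self.train_fraction = train_fraction

    def split(self, rows, rng):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * self.train_fraction)
        return shuffled[:cut], shuffled[cut:]


class MajorityClass(Inducer):
    def induce(self, train):
        majority = Counter(y for _, y in train).most_common(1)[0][0]
        return lambda x: majority


class Threshold(Inducer):
    """Hypothetical inducer for binary labels 0/1 on one numeric attribute."""

    def induce(self, train):
        by_label = {0: [], 1: []}
        for x, y in train:
            by_label[y].append(x)
        if not by_label[0] or not by_label[1]:
            return MajorityClass().induce(train)
        mean0 = sum(by_label[0]) / len(by_label[0])
        mean1 = sum(by_label[1]) / len(by_label[1])
        cut = (mean0 + mean1) / 2  # midpoint between the two class means
        high_label = 1 if mean1 > mean0 else 0
        return lambda x: high_label if x > cut else 1 - high_label


def accuracy(classifier, test):
    if not test:
        return 0.0
    return sum(classifier(x) == y for x, y in test) / len(test)


def best_combination(rows, preparers, samplers, inducers, seed=0):
    """Exhaustive search (a stand-in for the heuristics, which are not given)."""
    best = None
    for p, s, i in product(preparers, samplers, inducers):
        rng = random.Random(seed)  # same split for every combination
        train, test = s.split(p.prepare(rows), rng)
        score = accuracy(i.induce(train), test)
        names = (type(p).__name__, type(s).__name__, type(i).__name__)
        if best is None or score > best[0]:
            best = (score, names)
    return best


# Toy data: label is 1 when the value is at least 10; two rows lack a value.
rows = [(x, int(x >= 10)) for x in range(20)] + [(None, 0), (None, 1)]
result = best_combination(
    rows, [DropMissing()], [Holdout(0.7)], [MajorityClass(), Threshold()]
)
print(result)
```

Under these assumptions, adding a new technique amounts to writing one more subclass and passing an instance to `best_combination`; the existing classes are left unchanged, which is the property the framework aims for.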