LIMA JÚNIOR, G. E.; http://lattes.cnpq.br/1297720325243111; LIMA JÚNIOR, Gladson Euler.
Abstract:
Precise composition control in distillation columns is essential to ensure product quality and process performance. However, the PID feedback controllers commonly used in these processes can be limited depending on the control loop adopted. Data-driven control based on reinforcement learning (RL) has therefore become an attractive alternative, owing to the adaptability of these algorithms to varying control conditions. Recent research has focused largely on the performance of the RL algorithms themselves, often overlooking the fidelity of the environment model used for training. This work proposes an integrated dynamic training algorithm, built with Python and Aspen Plus Dynamics, for evaluating different RL models. Using Aspen Plus Dynamics for both training and validation preserves the complexity, non-linearity, and transient nature of the distillation process. To evaluate the algorithm, the Q-Learning and Deep Q-Network (DQN) methodologies, each coupled with a PID controller, were applied. The first stage evaluated Q-Learning under two strategies: a fixed control-action rate of 0.5% and a flexible rate ranging from 0.1% to 5%. In the second stage, Q-Learning was replaced by DQN while the better action-rate strategy was retained. COM-based communication made it possible to track the observable variables in Aspen Plus Dynamics and to modify the dynamic model from Python throughout the simulation. The results confirm the limitations of the inferential control loop in maintaining the isobutane composition at the bottom of the column during disturbances in the feed propane composition. Q-Learning with a variable action rate was roughly twice as accurate as the fixed-rate strategy, expanding the regions of correct operation. The DQN controller, in turn, kept the composition within specification in 96% of the evaluated test scenarios, with an integral absolute error (IAE) 52.9% lower than that of Q-Learning with a variable action rate. The DQN approach thus proved capable of handling this high-dimensional, non-linear process more robustly.
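The tabular Q-Learning stage described above can be sketched as a standard temporal-difference update over a discretized state space, with the action set encoding the control-action rate. The sketch below is illustrative only: the state discretization, action set, reward, and all names are assumptions, not taken from the thesis, and the toy update stands in for the Aspen Plus Dynamics environment.

```python
import numpy as np

# Variable action-rate set (0.1% to 5% changes to the manipulated
# variable), as opposed to a single fixed 0.5% step. Illustrative values.
ACTIONS = [-5.0, -1.0, -0.5, -0.1, 0.1, 0.5, 1.0, 5.0]  # % changes
N_STATES = 11  # hypothetical number of discretized composition-error bins

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One Q-Learning temporal-difference update on table Q."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Minimal usage: one update on an initially zero Q-table, with a
# negative reward penalizing deviation from the composition setpoint.
Q = np.zeros((N_STATES, len(ACTIONS)))
Q = q_update(Q, s=5, a=2, r=-1.0, s_next=5)
```

In the thesis workflow, the reward and next state would instead be read back from Aspen Plus Dynamics through the COM interface after each control action is written to the dynamic model.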