JERÔNIMO, C. L. M.; http://lattes.cnpq.br/8814983860861046; JERÔNIMO, Caio Libânio Melo.
Abstract:
Fake news detection methods based on textual features allow this type of content to be detected early. This strategy does not require information such as the number of likes or shares, which becomes available only after the news has already been disseminated on social networks. Within this scope, lexicons stand out as a resource for building classification features because they add prior knowledge to the classification process. However, building a lexicon often requires the participation of specialists, which in many contexts makes the process very costly or even unfeasible. This research proposes a method for the automatic construction of fake news lexicons. The method takes fake and real news documents as input and extracts the terms that help differentiate the two types of documents. It also proposes a strategy, based on semantic similarity, for building classification features from the generated lexicons. Models trained with the constructed lexicons are evaluated and compared with models trained with lexicons already present in the literature. The main results show that the models using the generated lexicons were superior in different scenarios and also achieved better results when combined with the lexicons from the literature. Finally, an explainability analysis of the models is presented, revealing nuances of fake news that could only be observed with the help of the lexicons generated in this research.