A study on the Detection of Fake news in Spanish
DOI:
https://doi.org/10.61467/2007.1558.2024.v15i2.467
Keywords:
Fake News, Twitter, Natural Language, Machine Learning, Deep Learning, Transformers
Abstract
False information published with the intention of misleading social media users is known as fake news. Such content is crafted to appear credible and genuine, can manipulate opinions, and is disseminated for political or financial purposes (Kaliyar et al., 2021). Fake news spreads especially widely on Twitter (now X) because of its high capacity for user interaction and its retweet and comment features, which allow information to reach a larger audience.
This study proposes a model for detecting fake news in Spanish, a task that faces challenges such as linguistic diversity and the limited resources available for preprocessing. Using a database of approximately 40,000 news items extracted from two well-known Mexican news accounts on Twitter, “Reforma” and “El Deforma”, from 2019 to 2024, the study developed a model based on Natural Language Processing, Machine Learning, Deep Learning, and transformer models. This model determines whether a Spanish-language news headline published on Twitter is true or fake.
The algorithms used include Logistic Regression, Naïve Bayes, Support Vector Machines, LSTM, Bidirectional LSTM, mBERT, and BETO. When their results were compared, the best accuracy, 0.98, was obtained with BETO; transformer-based models thus outperformed the other approaches used in the study in terms of accuracy. The study also identified the words frequently used in the fake news corpus, concluding that fake news often uses expressions with exaggerated adjectives and words expressing certainty or amazement in social, political, and entertainment contexts.
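As an illustration of the simplest family of classifiers named above, the following is a minimal sketch of a multinomial Naïve Bayes headline classifier with Laplace smoothing, written in plain Python. It is not the authors' implementation: the `tokenize` function, the `NaiveBayes` class, and the four training headlines are hypothetical and invented for this example, and the study's actual preprocessing and features are not described in the abstract.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, including Spanish accents."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        # Per-class word frequencies.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        self.totals = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def predict(self, text):
        n_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(self.doc_counts[c] / n_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[c][tok]
                    score += math.log((count + 1) / (self.totals[c] + vocab_size))
            scores[c] = score
        return max(scores, key=scores.get)


# Invented toy headlines (NOT taken from the study's corpus).
train_texts = [
    "Gobierno anuncia nuevo presupuesto para educación",
    "Banco central mantiene la tasa de interés",
    "Increíble: científicos confirman que los gatos gobiernan el mundo",
    "Asombroso hallazgo: hombre sobrevive un año comiendo solo tacos",
]
train_labels = ["real", "real", "fake", "fake"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("Increíble hallazgo sobre los gatos"))
print(model.predict("Banco central anuncia presupuesto"))
```

With only four training headlines the classifier simply favors the class whose vocabulary overlaps the input, which loosely mirrors the abstract's observation that fake headlines lean on words of amazement. A realistic baseline would need the full labeled corpus and a held-out test set to measure accuracy.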
License
Copyright (c) 2024 International Journal of Combinatorial Optimization Problems and Informatics
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.