The 10th International Congress on Information and Communication Technology (ICICT 2025), held concurrently with the ICT Excellence Awards, will take place in London, United Kingdom | February 18 - 21, 2025.
Authors - Eduardo Puraivan, Pablo Ormeno-Arriagada, Steffanie Kloss, Connie Cofre-Morales
Abstract - We live in the information age, but also in the era of disinformation, with millions of fake news items circulating daily. Researchers in various fields are working to identify and understand fake news. We focus on hybrid approaches that combine machine learning and natural language processing and rely on surface linguistic features, which are independent of language and therefore enable a multilingual approach. Many studies rely on binary classification, overlook multiclass problems and class imbalance, and often focus only on English. We propose a methodology that applies surface linguistic features to multiclass fake news detection in a multilingual context. Experiments were conducted on two imbalanced datasets, LIAR (English) and CLNews (Spanish). Balancing the classes with the Synthetic Minority Oversampling Technique (SMOTE), Random Oversampling (ROS), and Random Undersampling (RUS) improved class detection. For example, in LIAR, classification of the ‘false’ class improved by 43.38% using SMOTE with Adaptive Boosting. In CLNews, ROS with Random Forest raised accuracy to 95%, a 158% relative improvement over the unbalanced scenario. These results highlight the effectiveness of our approach for multiclass fake news detection in an imbalanced, multilingual context.
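The two random resampling strategies named in the abstract, ROS and RUS, can be illustrated with a minimal plain-Python sketch. This is not the authors' implementation: the function names, the toy feature tuples, and the three-class label set are hypothetical and do not come from LIAR or CLNews. SMOTE is left out because it synthesizes new minority-class points by interpolating between neighbors instead of copying or dropping existing samples.

```python
import random
from collections import Counter


def _group_by_class(samples, labels):
    """Collect samples into a dict keyed by class label."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return by_class


def random_oversample(samples, labels, seed=0):
    """ROS: duplicate randomly chosen samples until every class
    is as large as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


def random_undersample(samples, labels, seed=0):
    """RUS: randomly drop samples until every class
    is as small as the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = min(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        out_x.extend(rng.sample(xs, target))
        out_y.extend([y] * target)
    return out_x, out_y


# Hypothetical toy data: each sample is a tuple of two made-up surface
# features; the labels are illustrative only.
X = [(i, i % 3) for i in range(9)]
y = ["true"] * 5 + ["half-true"] * 3 + ["false"] * 1

over_x, over_y = random_oversample(X, y)
under_x, under_y = random_undersample(X, y)
print(Counter(y))        # imbalanced: 5 / 3 / 1
print(Counter(over_y))   # every class has 5 samples
print(Counter(under_y))  # every class has 1 sample
```

In practice, resampling of this kind is normally applied to the training split only, so that evaluation data keeps its original class distribution; whether the authors followed that protocol is not stated in the abstract.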