Measuring the Performance of Large Language Models with Hyperparameters Calibration and Machine Learning Approaches for Sentiment Analysis in Spanish Texts

Authors

  • Mario A. Cruz-Miguel, Systems Department, Autonomous Metropolitan University
  • José A. Reyes-Ortiz, Systems Department, Autonomous Metropolitan University
  • Josué Padilla-Cuevas, Systems Department, Autonomous Metropolitan University
  • Leonardo D. Sánchez-Martínez, Systems Department, Autonomous Metropolitan University

DOI:

https://doi.org/10.61467/2007.1558.2025.v16i4.574

Keywords:

Natural Language Processing, Sentiment Analysis, Large Language Models, Hyperparameter calibration, Machine Learning

Abstract

Social networks have become an important source of information in recent years, offering a platform for observing people's opinions on various industries and research topics. These opinions may be expressed in Spanish, providing fresh, up-to-date insights that can be analyzed using computational approaches. This research aims to understand the sentiment conveyed in ideas or opinions within Spanish text. To that end, several methods are evaluated to identify the most effective approach. Computational techniques, especially Natural Language Processing (NLP), have become essential for automating this analysis. While traditional machine learning algorithms have been employed for sentiment analysis, the rise of large language models since 2017 has introduced significant challenges in assessing their impact on various NLP tasks. This paper compares machine learning approaches with calibrated large language models for sentiment analysis in Spanish texts and measures their performance. The calibration process consists of two stages: a coarse calibration using exploratory methods and a fine calibration that uses an algorithm to search for hyperparameter values. The evaluation showed that the most effective machine learning approach combines unigrams and bigrams with the Bayes algorithm, along with exploratory parameter tuning and feature selection, achieving an accuracy of 72.72% and an F1 score of 72.81%. Furthermore, applying LoRA, a technique that optimizes the fine-tuning of pre-trained models, produced a best model that we name and store as twitter-xlm-roberta-base/SentUAM; it reached an accuracy of 71.99% and an F1 score of 71.78% in the sentiment analysis of Spanish texts.
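
The two-stage calibration mentioned in the abstract (a coarse exploratory pass followed by a finer algorithmic search) can be illustrated with a minimal sketch. This is not the authors' procedure: the hyperparameters (learning rate and dropout), the toy `validation_score` function, and the use of random search for the fine stage are all assumptions made for illustration, since the abstract does not name the search algorithm or the parameters tuned.

```python
import math
import random


def validation_score(learning_rate: float, dropout: float) -> float:
    """Hypothetical stand-in for training a model and scoring it on held-out data.

    A smooth toy surface that peaks near learning_rate=3e-4 and dropout=0.2.
    """
    lr_term = (math.log10(learning_rate) - math.log10(3e-4)) ** 2
    dropout_term = (dropout - 0.2) ** 2
    return 1.0 - 0.1 * lr_term - 2.0 * dropout_term


def coarse_search(score_fn):
    """Stage 1: evaluate a small, widely spaced grid and keep the best point."""
    grid = [(lr, dr) for lr in (1e-5, 1e-4, 1e-3, 1e-2) for dr in (0.0, 0.25, 0.5)]
    return max(grid, key=lambda point: score_fn(*point))


def fine_search(score_fn, center, trials=50, seed=0):
    """Stage 2 (assumed): random search in a narrow window around the coarse best."""
    rng = random.Random(seed)
    best_point, best_score = center, score_fn(*center)
    lr0, dr0 = center
    for _ in range(trials):
        lr = lr0 * 10 ** rng.uniform(-0.5, 0.5)  # half a decade either side
        dr = min(max(dr0 + rng.uniform(-0.1, 0.1), 0.0), 0.9)  # clamp to a valid range
        score = score_fn(lr, dr)
        if score > best_score:
            best_point, best_score = (lr, dr), score
    return best_point, best_score


coarse_best = coarse_search(validation_score)
fine_best, fine_score = fine_search(validation_score, coarse_best)
print("coarse:", coarse_best, round(validation_score(*coarse_best), 4))
print("fine:  ", fine_best, round(fine_score, 4))
```

Because the fine stage starts from the coarse optimum and only accepts improvements, its result is never worse than the coarse one. In a real setting, `validation_score` would be replaced by an actual train-and-evaluate run, which is the expensive step that motivates keeping the coarse grid small.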

Published

2025-10-12

How to Cite

Cruz-Miguel, M. A., Reyes-Ortiz, J. A., Padilla-Cuevas, J., & Sánchez-Martínez, L. D. (2025). Measuring the Performance of Large Language Models with Hyperparameters Calibration and Machine Learning Approaches for Sentiment Analysis in Spanish Texts. International Journal of Combinatorial Optimization Problems and Informatics, 16(4), 515–526. https://doi.org/10.61467/2007.1558.2025.v16i4.574

Section

Ontologies and Knowledge Graphs