Emotion Analysis in Spanish Texts Using Machine Translation and Multilingual BERT Models
DOI:
https://doi.org/10.5281/zenodo.17525359
Keywords:
Emotion analysis, Natural Language Processing (NLP), Pre-trained models (BERT)
Abstract
Emotion analysis of written text with natural language processing (NLP) techniques is an expanding research area with key applications in mental health, marketing, education, and recommendation systems. This article proposes a systematic approach, implemented as an NLP pipeline, that classifies emotions in Spanish texts by leveraging pretrained models originally developed for English. Because the most advanced emotion detection models, such as those based on BERT (Bidirectional Encoder Representations from Transformers), have been trained primarily on English datasets, the proposed solution first translates the Spanish texts into English automatically with the Helsinki-NLP/opus-mt-es-en model. The translated texts are then processed by a DistilRoBERTa model fine-tuned for emotion classification (j-hartmann/emotion-english-distilroberta-base), which predicts an emotional category from labels such as joy, sadness, anger, fear, love, and surprise. The pipeline is implemented in Python using Hugging Face Transformers for the translation and classification tasks and Scikit-learn for the statistical evaluation of model performance. Predictions are compared with ground-truth labels, and the confusion matrix, precision, recall, specificity, accuracy, and F1-scores (macro and weighted) are computed to assess system effectiveness.
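The translate-then-classify flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `classify_spanish_texts` and `load_hf_stages` are hypothetical, and the two stages are passed in as callables so the flow can be exercised without downloading model weights. Only the two model identifiers come from the abstract. The label names are whatever the classifier itself returns.

```python
from typing import Callable, Dict, List


def classify_spanish_texts(
    texts: List[str],
    translate: Callable[[List[str]], List[dict]],
    classify: Callable[[List[str]], List[dict]],
) -> List[Dict[str, object]]:
    """Translate Spanish texts to English, then attach one emotion label to each.

    `translate` must return dicts with a "translation_text" key and `classify`
    dicts with "label" and "score" keys, which is the output shape of the
    Hugging Face translation and text-classification pipelines.
    """
    if not texts:
        return []
    english = [out["translation_text"] for out in translate(texts)]
    predictions = classify(english)
    return [
        {"text": src, "translation": en, "label": pred["label"], "score": pred["score"]}
        for src, en, pred in zip(texts, english, predictions)
    ]


def load_hf_stages():
    """Build the two stages from the models named in the abstract.

    Hypothetical helper: calling it downloads the model weights.
    """
    from transformers import pipeline  # imported lazily because it is a heavy dependency

    translate = pipeline("translation", model="Helsinki-NLP/opus-mt-es-en")
    classify = pipeline(
        "text-classification",
        model="j-hartmann/emotion-english-distilroberta-base",
    )
    return translate, classify


# Example use (requires the transformers package and network access):
#   translate, classify = load_hf_stages()
#   rows = classify_spanish_texts(["Estoy muy feliz hoy."], translate, classify)
```

Keeping the two stages as separate callables also makes it straightforward to inspect the intermediate English translations, which is where errors specific to this approach would be introduced.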
The results show an overall accuracy of 83%, indicating that, despite the language barrier, combining automatic translation with robust pretrained models can produce reliable and replicable emotion classification results for Spanish texts. This study highlights the potential of integrating multilingual NLP tools into real-world affective analysis applications.
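The evaluation metrics listed in the abstract could be computed along these lines with Scikit-learn. This is a sketch, not the authors' evaluation script: the function name `evaluate_predictions` is hypothetical. Scikit-learn has no dedicated specificity function, so specificity is derived here per class (one-vs-rest) from the confusion matrix as TN / (TN + FP), which is one common convention and an assumption about how the study defines it.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_recall_fscore_support,
)


def evaluate_predictions(y_true, y_pred, labels):
    """Return the confusion matrix plus overall and per-class metrics."""
    # Rows are true labels, columns are predicted labels, both ordered as `labels`.
    cm = confusion_matrix(y_true, y_pred, labels=labels)

    # One-vs-rest counts for each class, derived from the confusion matrix.
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - (tp + fp + fn)

    # Specificity = TN / (TN + FP); set to 0.0 where the denominator is zero.
    negatives = tn + fp
    specificity = np.divide(
        tn, negatives, out=np.zeros(len(labels), dtype=float), where=negatives > 0
    )

    precision, recall, f1, support = precision_recall_fscore_support(
        y_true, y_pred, labels=labels, zero_division=0
    )
    per_class = {
        label: {
            "precision": float(precision[i]),
            "recall": float(recall[i]),
            "specificity": float(specificity[i]),
            "f1": float(f1[i]),
            "support": int(support[i]),
        }
        for i, label in enumerate(labels)
    }
    return {
        "confusion_matrix": cm,
        "accuracy": float(accuracy_score(y_true, y_pred)),
        "f1_macro": float(
            f1_score(y_true, y_pred, labels=labels, average="macro", zero_division=0)
        ),
        "f1_weighted": float(
            f1_score(y_true, y_pred, labels=labels, average="weighted", zero_division=0)
        ),
        "per_class": per_class,
    }
```

Macro F1 averages the per-class scores equally, while weighted F1 weights each class by its support, so the two diverge when the emotion classes are imbalanced.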
References
Cañete, J., Chaperon, G., Fuentes, R., Ho, J.-H., Kang, H., & Pérez, J. (2020). Spanish pre-trained BERT model and evaluation data. Proceedings of the Practical ML for Developing Countries Workshop at ICLR 2020.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of NAACL-HLT 2019, 4171–4186.
Dos Santos, C. N., & Gatti, M. (2014). Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts. Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014), 69–78.
Hartmann, J. (2022). j-hartmann/emotion-english-distilroberta-base. Hugging Face.
License
Copyright (c) 2025 Abraham Jorge Jiménez Alfaro, Norma-Karen, Jhacer-Kharen, Claudia-Teresa

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
