Four Supervised Models for Identifying Suicide Indicators in Text Data

Authors

  • Ricardo Ulises Caballero Hidalgo, Faculty of Computer Science, Benemerita Universidad Autonoma de Puebla
  • Mireya Tovar Vidal, Faculty of Computer Science, Benemerita Universidad Autonoma de Puebla
  • Meliza Contreras González, Faculty of Computer Science, Benemerita Universidad Autonoma de Puebla

DOI:

https://doi.org/10.61467/2007.1558.2025.v16i4.582

Abstract

Mental health disorders such as depression and anxiety are increasingly prevalent and can lead to suicide, and affected individuals frequently express suicidal thoughts on social media. Applying machine learning techniques to social media texts could help prevent these outcomes, although predicting suicide risk remains a challenge. In this study, machine learning classifiers were developed to detect suicide indicators using the Kaggle Suicide and Depression Detection dataset (Komati, N. Suicide Watch). Four models were tested: Multinomial Naive Bayes, Gradient Boosting Machine (GBM), Random Forest, and Support Vector Machines (SVM). The results were promising; among the four models, SVM performed best, with a precision of 0.95 and an F1 score of 0.94.

Keywords: Suicide, Supervised Learning, Machine Learning, Naïve Bayes, Gradient Boosting Machine, Random Forest, Support Vector Machines.
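
The abstract reports model quality as precision and F1 score. As a reminder of how these two metrics are defined for a binary classifier, the following is a minimal sketch in plain Python. The function name, the toy labels, and the convention that label 1 marks a text with suicide indicators are assumptions made for illustration; they are not taken from the paper, and the numbers below are unrelated to the reported results.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: predicted negative but actually positive.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical toy labels (assumption: 1 = suicide indicators present, 0 = absent).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a model can score well on it only when both are high. This matters for screening tasks of this kind, where both missed cases and false alarms carry a cost.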

Published

2025-10-12

How to Cite

Caballero Hidalgo, R. U., Tovar Vidal, M., & Contreras González, M. (2025). Four Supervised Models for Identifying Suicide Indicators in Text Data. International Journal of Combinatorial Optimization Problems and Informatics, 16(4), 505–514. https://doi.org/10.61467/2007.1558.2025.v16i4.582

Section

Ontologies and Knowledge Graphs
