Enhancing Explainability, Privacy, and Fairness in Recidivism Prediction through Local LLMs and Synthetic data

Ma. Angelina Alarcón Romero; José Antonio Orizaga Trejo; Daniel Hernández Mota; Luis Fernando Baltazar Villalpando; Ma. Hidalia Cruz Herrera

doi:10.61467/2007.1558.2025.v16i2.1074

Authors

Ma. Angelina Alarcón Romero Universidad de Guadalajara
José Antonio Orizaga Trejo Universidad de Guadalajara
Daniel Hernández Mota Instituto Tecnológico y de Estudios Superiores de Occidente
Luis Fernando Baltazar Villalpando Universidad de Guadalajara
Ma. Hidalia Cruz Herrera Universidad de Guadalajara

DOI:

https://doi.org/10.61467/2007.1558.2025.v16i2.1074

Keywords:

Explainable Artificial Intelligence, Trustworthy AI

Abstract

Predictive policing is a high-stakes context in which the main challenges of deploying an AI solution are ensuring the privacy and fairness of the system while preserving high performance. This usually places specific demands on the technologies used and on their explainability. To reduce the barriers to adopting a recidivism model, this study explores an approach that combines synthetic data with state-of-the-art NLP techniques, such as transformer-based models running locally. This approach enhances the representation of crimes while preserving data privacy. In particular, we focus on comparing several language models for multilabel classification in Spanish, together with techniques such as data reduction, data augmentation, and in-distribution validation. The resulting methodology shows the benefits and drawbacks of each language model and highlights the ability to identify populations for which the model performs significantly worse than average and to mitigate that performance gap.

Published

2025-03-25

How to Cite

Alarcón Romero, M. A., Orizaga Trejo, J. A., Hernández Mota, D., Baltazar Villalpando, L. F., & Cruz Herrera, M. H. (2025). Enhancing Explainability, Privacy, and Fairness in Recidivism Prediction through Local LLMs and Synthetic data. International Journal of Combinatorial Optimization Problems and Informatics, 16(2), 60–70. https://doi.org/10.61467/2007.1558.2025.v16i2.1074