Enhancing Explainability, Privacy, and Fairness in Recidivism Prediction through Local LLMs and Synthetic Data
DOI: https://doi.org/10.61467/2007.1558.2025.v16i2.1074

Keywords: Explainable Artificial Intelligence, Trustworthy AI

Abstract
Predictive policing is considered a high-stakes context, where the main challenges in employing an AI solution are to ensure the privacy and fairness of the system while preserving high performance. This usually imposes specific demands on the technologies used and their explainability. To alleviate the emerging impediments to adopting a recidivism model, this study explores an approach that combines synthetic data with state-of-the-art NLP techniques, such as transformer-based models running locally. This approach enhances the representation of crimes while preserving data privacy. In particular, we focus on comparing several language models for multilabel classification in Spanish, along with techniques such as data reduction, data augmentation, and in-distribution validation. The resulting methodology shows the benefits and drawbacks of each language model and highlights the ability to identify and mitigate subpopulations for which the model performs significantly worse than average.
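To make the multilabel setup concrete, the sketch below shows one common way to run a Spanish transformer locally for multilabel classification with the Hugging Face transformers library. The checkpoint name, the crime labels, and the decision threshold are illustrative assumptions, not the paper's actual configuration, and the model would need fine-tuning on labeled crime descriptions before its predictions are meaningful.

```python
# Minimal sketch: multilabel classification of Spanish crime descriptions with a
# locally run transformer. Model name, labels, and threshold are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "dccuchile/bert-base-spanish-wwm-cased"  # assumed Spanish BERT checkpoint
LABELS = ["robo", "lesiones", "fraude"]               # hypothetical crime labels

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # independent sigmoid per label
)

def predict(text: str, threshold: float = 0.5) -> list[str]:
    """Return every label whose sigmoid score exceeds the threshold."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0)
    return [label for label, p in zip(LABELS, probs) if p >= threshold]

print(predict("El acusado fue detenido por robo con violencia."))
```

Because inference runs entirely on local hardware, the text of the case files never leaves the institution, which is the privacy motivation for preferring locally hosted models over hosted LLM APIs.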
License
Copyright (c) 2025 International Journal of Combinatorial Optimization Problems and Informatics

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.