Instance Selection for Hybrid and Incomplete Data based on Clustering
DOI:
https://doi.org/10.61467/2007.1558.2025.v16i3.845
Keywords:
instance selection, hybrid and incomplete data, clustering
Abstract
This paper presents the HICCS algorithm, a novel clustering approach that handles mixed and incomplete data. HICCS improves clustering by using compact sets as initial clusters, employing holotypes to measure intergroup dissimilarity, and merging clusters based on similarity in an order-independent manner. It also accepts a user-defined similarity function, making it adaptable to various real-world domains. In addition, we introduce the IS-HICCS algorithm for instance selection, which reduces the instance set without compromising classifier accuracy, highlighting the potential of clustering to enhance supervised classification models. We evaluate HICCS and IS-HICCS on synthetic and real-life datasets; both show statistically superior performance compared to other clustering and instance selection methods, respectively.
License
Copyright (c) 2025 International Journal of Combinatorial Optimization Problems and Informatics

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.