Applied Sciences, Vol. 15, Pages 13020: Credit Evaluation Through Integration of Supervised and Unsupervised Machine Learning: Empirical Improvement and Unsupervised Component Analysis


Applied Sciences doi: 10.3390/app152413020

Authors:
Rodrigue G. Atteba
Thanda Shwe
Israel Mendonça
Masayoshi Aritsugi

In the financial sector, machine learning has become essential for credit risk assessment, often outperforming traditional statistical approaches such as linear regression, discriminant analysis, or model-based expert judgment. Although machine learning techniques are increasingly used, further research is needed to understand how they can be combined effectively and how different models interact during credit evaluation. This study proposes a technique that integrates hierarchical clustering, namely agglomerative clustering and Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), with individual supervised models and a self-organizing map-based consensus model. This design makes it possible to examine how different clustering algorithms influence model performance. To support it, we performed a detailed unsupervised component analysis using metrics such as the silhouette score and the Adjusted Rand Index to assess cluster quality and its relationship with the classification results. The approach was applied to multiple datasets, including a Taiwanese credit dataset, and was extended to a multiclass classification scenario to evaluate its generalization ability. The results show that cluster quality metrics correlate with classification performance, highlighting the importance of combining unsupervised clustering with self-organizing map consensus methods for improving credit evaluation.
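
The cluster quality analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, uses a synthetic two-class dataset from make_blobs as a stand-in for credit data, and assumes the Adjusted Rand Index is computed against the true class labels. The helper name evaluate_clusterers is hypothetical.

```python
from sklearn.cluster import AgglomerativeClustering, Birch
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a credit dataset (assumption: not the paper's data).
X, y = make_blobs(n_samples=300, centers=2, n_features=5, random_state=0)
X = StandardScaler().fit_transform(X)


def evaluate_clusterers(X, y, n_clusters=2):
    """Fit both hierarchical clusterers and score each partition.

    Hypothetical helper: silhouette measures cohesion and separation of the
    clusters themselves; ARI measures agreement with the reference labels y.
    """
    clusterers = {
        "agglomerative": AgglomerativeClustering(n_clusters=n_clusters),
        "birch": Birch(n_clusters=n_clusters),
    }
    results = {}
    for name, model in clusterers.items():
        labels = model.fit_predict(X)
        results[name] = {
            "labels": labels,
            "silhouette": silhouette_score(X, labels),
            "ari": adjusted_rand_score(y, labels),
        }
    return results


results = evaluate_clusterers(X, y)
for name, r in results.items():
    print(f"{name}: silhouette={r['silhouette']:.3f}, ARI={r['ari']:.3f}")
```

In a study like the one summarized here, scores of this kind would be collected per dataset and per clustering algorithm and then compared with downstream classification metrics; how the paper performs that comparison is not specified in the abstract.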


