Prediction of Bank Loans Risks Using K-Nearest Neighbor Classification Algorithm

Win, Hlaing Hlaing

URI: https://onlineresource.ucsy.edu.mm/handle/123456789/2592

Date: 2021-11-23

Abstract:

Data mining techniques are used extensively for data analysis in a wide range of applications. Credit analysts, in particular, often use predictive analysis of credit risk to decide whether an applicant can be trusted. In the banking industry, predictive analysis of credit data is critical to financial risk management when extending credit to customers. Building a model that applies data mining techniques to estimate financial risk from credit data is challenging. A key task is classifying credit data to determine whether a customer's loan is a good loan or a bad loan. This system experiments with a supervised learning technique, K-Nearest Neighbors (KNN), and its improved variant, distance-weighted K-Nearest Neighbors (WKNN), and compares and evaluates the performance of the two classifiers. This thesis aims to predict credit risk using the KNN classification model, which describes an individual's trustworthiness for receiving a loan. The classification models are trained and tested on banking credit data from the UCI repository; the dataset consists of 1000 instances and twenty-one attributes. The proposed system also aims to help in choosing a suitable classifier for credit analysis. To evaluate credit risk, the KNN and WKNN classifiers are tested and analyzed with different values of k. The experimental results show that KNN outperforms WKNN in terms of accuracy. Based on this performance analysis, the better-performing classifier is selected for further prediction of credit risk.
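The difference between the two classifiers compared in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the Euclidean distance, the inverse-distance weighting used for WKNN, the function names, and the two-feature toy records are all assumptions made for illustration, and the toy data is not the UCI credit data.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k, weighted=False):
    """Predict a label for `query` from its k nearest training records.

    train    : list of (features, label) pairs
    weighted : False -> plain KNN, each neighbor casts one vote
               True  -> WKNN, each neighbor votes with weight 1 / distance
                        (an assumed weighting scheme, not taken from the thesis)
    """
    neighbors = sorted(train, key=lambda row: euclidean(row[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        d = euclidean(features, query)
        # The small constant avoids division by zero for exact matches.
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


def accuracy(train, test, k, weighted=False):
    """Fraction of test records whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k, weighted) == y for x, y in test)
    return correct / len(test)


# Hypothetical toy records: (feature vector, loan class).
train = [
    ((1.0, 1.0), "good"),
    ((1.2, 0.8), "good"),
    ((0.9, 1.1), "good"),
    ((3.0, 3.0), "bad"),
    ((3.2, 2.9), "bad"),
]
test = [((1.1, 1.0), "good"), ((3.1, 3.0), "bad"), ((2.6, 2.6), "bad")]

# Evaluate both classifiers for several k values, as the abstract describes.
for k in (1, 3, 5):
    print(k, accuracy(train, test, k), accuracy(train, test, k, weighted=True))
```

On this toy data the two methods differ only when k is large enough for distant neighbors to enter the vote. This says nothing about their relative accuracy on the real credit data, where the thesis reports that KNN performs better.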