Abstract:
Data mining techniques are extensively used for data analysis in a variety of
applications. In particular, credit analysts often use predictive analysis of credit risk to
determine whether an applicant can be trusted. Predictive analysis of credit data for
financial risk management is critical within the banking industry when extending credit to
customers. Building a model that uses data mining techniques to
estimate the financial risk in credit data is challenging. In particular, it is important to classify
credit data to determine whether a customer's loan is a good loan or a bad loan. Supervised
learning techniques, namely K-Nearest Neighbors (KNN) and its improved variant,
distance-weighted K-Nearest Neighbors (WKNN), are examined in this
system. In addition, the KNN classification model is compared with WKNN, and
the performance of both classifiers is evaluated. This thesis aims to predict credit risk using the
KNN classification model, which indicates the trustworthiness of an individual applying for
a loan. The classification model is trained and tested on the credit data. The
classification performance of KNN and WKNN is compared using banking credit data
from the UCI repository. The dataset consists of 1000 instances and twenty-one attributes.
The proposed system also aims to help in choosing a suitable classifier for credit
analysis. To evaluate credit risk, the KNN and WKNN classifiers are tested and analyzed
with different values of k. The experimental results show that KNN outperforms WKNN in
terms of accuracy. Based on this performance comparison between KNN and WKNN, the
better-performing classifier is selected for further prediction of credit risk.
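
As an illustration of how the two classifiers differ, the following is a minimal sketch and not the implementation used in this thesis. It assumes Euclidean distance and an inverse-distance vote weight (1/d), which is only one common weighting scheme for WKNN. The function name `knn_predict` and the toy data are hypothetical, and the data is not the UCI credit dataset.

```python
import math
from collections import defaultdict


def knn_predict(train, query, k, weighted=False):
    """Classify `query` from its k nearest training points.

    train: list of (feature_vector, label) pairs.
    weighted=False -> plain KNN (each neighbour casts one vote).
    weighted=True  -> distance-weighted KNN (assumed vote weight: 1 / distance).
    """
    # Sort training points by Euclidean distance to the query; keep the k closest.
    neighbours = sorted(
        ((math.dist(x, query), label) for x, label in train),
        key=lambda pair: pair[0],
    )[:k]

    votes = defaultdict(float)
    for distance, label in neighbours:
        # Small epsilon avoids division by zero when the query equals a training point.
        votes[label] += 1.0 / (distance + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


# Hypothetical toy data (two made-up numeric features), NOT the UCI credit data.
toy_train = [
    ((1.0, 1.0), "good"),
    ((1.2, 0.9), "good"),
    ((3.0, 3.0), "bad"),
    ((3.2, 2.9), "bad"),
    ((2.9, 3.3), "bad"),
]
query = (1.4, 1.2)

# Try several k values with both voting rules.
for k in (1, 3, 5):
    plain = knn_predict(toy_train, query, k)
    dist_weighted = knn_predict(toy_train, query, k, weighted=True)
    print(f"k={k}: KNN={plain}, WKNN={dist_weighted}")
```

On this toy data the two rules disagree only at k = 5, where the three distant "bad" points outvote the two nearby "good" points under plain majority voting. This shows the mechanism only and says nothing about which classifier is more accurate on the credit data.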