The Comparison of Classification Methods on Software Defect Data Sets

San, Hnin Yi

URI: http://onlineresource.ucsy.edu.mm/handle/123456789/2251

Date: 2018-12

Abstract:

Nowadays, it is difficult to imagine a life without devices that are controlled by software, and software quality has become the main concern during software development. Software quality is a field of study and practice that describes the desirable attributes of software products. Software quality prediction is the process of using software metrics, such as code-level measurements and defect data, to build classification models that can estimate the quality of program modules. The major problems that affect the quality of these datasets are high dimensionality and class imbalance. One useful and efficient mechanism is the k-Nearest Neighbor (k-NN) method, which classifies each instance of the testing dataset based on its k nearest neighbors in the training dataset. Another mechanism is Class Based Weighted k-Nearest Neighbor (CBW k-NN) with the BINER Algorithm. Instead of using the whole training dataset, the BINER Algorithm narrows it down to the range that has the maximum likelihood of occurrence, and CBW k-NN then classifies the instances of the testing dataset based on this range. This thesis compares the two classification methods on the NASA MDP datasets (PC1, CM1 and JM1). The methods are compared in terms of Accuracy, Reliability, Mean Absolute Error and Root Mean Squared Error.
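
As a rough illustration of the ideas named in the abstract, the following minimal sketch shows a plain k-NN majority vote, a class-weighted vote, and three of the listed evaluation measures. It is not the thesis's implementation: the inverse-class-frequency weighting, the function names and the toy data are all assumptions made for illustration, the BINER range-narrowing step is not reproduced, and Reliability is left out because the abstract does not define it.

```python
import math
from collections import Counter


def inverse_class_frequency(labels):
    """Hypothetical weighting scheme: each class gets weight 1 / (its count)."""
    counts = Counter(labels)
    return {label: 1.0 / n for label, n in counts.items()}


def knn_predict(train_X, train_y, query, k=3, class_weights=None):
    """Label `query` by a (optionally class-weighted) vote of its k nearest neighbors."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter()
    for i in order[:k]:
        label = train_y[i]
        votes[label] += 1.0 if class_weights is None else class_weights[label]
    # On a tie, the label of the closest tied neighbor wins (insertion order).
    return max(votes, key=votes.get)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy, imbalanced data: 0 = non-defective module, 1 = defective module.
train_X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (3.0, 3.0)]
train_y = [0, 0, 0, 0, 1]
query = (2.6, 2.6)

plain = knn_predict(train_X, train_y, query, k=3)
weighted = knn_predict(train_X, train_y, query, k=3,
                       class_weights=inverse_class_frequency(train_y))
print(plain, weighted)
```

On this toy data the plain vote is dominated by the majority class, while the weighted vote lets the single nearby minority-class neighbor decide, which is the kind of effect class weighting is meant to have on imbalanced defect data.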
