UCSY's Research Repository

THE CAR INSURANCE CLAIM PREDICTION SYSTEM BY USING MACHINE LEARNING ALGORITHMS ON APACHE SPARK PLATFORM


dc.contributor.author Ko, Thein Than
dc.date.accessioned 2023-06-04T15:09:16Z
dc.date.available 2023-06-04T15:09:16Z
dc.date.issued 2023-05
dc.identifier.uri https://onlineresource.ucsy.edu.mm/handle/123456789/2795
dc.description.abstract Car insurance companies face a major challenge in handling insurance claims, which are prone to fraud and growing in volume. This makes it difficult for insurers to classify claims during the review process. To address this issue, this study develops four car insurance claim prediction classifiers, two using Random Forest (RF) and two using Logistic Regression (LR), trained on a car insurance claim dataset, and compares them to determine which method and which attributes are more suitable for car insurance companies. First, the proposed system builds a feature selection model using the Variance Threshold Selector method to select the important attributes that affect the accuracy of the classifiers. The dataset is randomly split into a training set (80%) and a testing set (20%). For the two classifiers that use all attributes, the training set is used to build one LR classifier and one RF classifier. For the two classifiers that use feature selection, the system creates new training and testing datasets by removing low-variance attributes with the Variance Threshold Selector method, and then builds one LR classifier and one RF classifier from these new datasets. The system evaluated attribute counts of 30, 32, 34, 36, 38, 40 and 42 to choose the number of attributes and the important attributes, and tested each attribute count 10 times because the training and testing sets are split randomly. Finally, the system compares the evaluation results using two metrics: accuracy and F-score. The RF classifiers, with and without feature selection, are more suitable for the proposed system than the LR classifiers. Among the attribute counts tested, the classifiers based on 38 and 40 attributes perform best, and those based on 42 attributes are second best. en_US
dc.language.iso en en_US
dc.publisher University of Computer Studies, Yangon en_US
dc.subject APACHE SPARK PLATFORM en_US
dc.subject MACHINE LEARNING ALGORITHMS en_US
dc.title THE CAR INSURANCE CLAIM PREDICTION SYSTEM BY USING MACHINE LEARNING ALGORITHMS ON APACHE SPARK PLATFORM en_US
dc.type Thesis en_US
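
The abstract outlines a pipeline: drop low-variance attributes, split the data 80/20 at random, then compare classifiers by accuracy and F-score. The following is a minimal plain-Python sketch of those three steps, not the thesis's Spark implementation. All function names, the toy data, and the threshold value are hypothetical, and the use of sample variance is an assumption. Spark MLlib ships its own VarianceThresholdSelector, which the thesis presumably uses instead.

```python
import random
import statistics


def low_variance_columns(rows, threshold):
    """Return indices of columns whose sample variance is <= threshold."""
    n_cols = len(rows[0])
    return [
        j for j in range(n_cols)
        if statistics.variance([row[j] for row in rows]) <= threshold
    ]


def drop_columns(rows, dropped):
    """Return a copy of rows without the column indices in `dropped`."""
    dropped = set(dropped)
    return [[v for j, v in enumerate(row) if j not in dropped] for row in rows]


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle indices with a seeded RNG and split into train/test parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return (
        [rows[i] for i in train_idx], [labels[i] for i in train_idx],
        [rows[i] for i in test_idx], [labels[i] for i in test_idx],
    )


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): column 1 is constant, column 2 barely varies.
rows = [
    [1.0, 0.0, 5.0],
    [2.0, 0.0, 5.1],
    [3.0, 0.0, 4.9],
    [4.0, 0.0, 5.0],
    [5.0, 0.0, 5.0],
]
labels = [0, 0, 1, 1, 1]

dropped = low_variance_columns(rows, threshold=0.01)
reduced = drop_columns(rows, dropped)
x_train, y_train, x_test, y_test = train_test_split(reduced, labels)
```

Because the split is random, a single run can be misleading; repeating it with different seeds and averaging the metrics mirrors the 10 repetitions per attribute count described in the abstract.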

