Master Thesis

Master Thesis https://onlineresource.ucsy.edu.mm/handle/123456789/2230 2026-07-29T20:26:14Z 2026-07-29T20:26:14Z OPINION MINING SYSTEM OF CUSTOMER REVIEWS BY USING FEATURE EXTRACTION LWIN, NANDAR MOH MOH https://onlineresource.ucsy.edu.mm/handle/123456789/2796 2023-08-09T15:25:24Z 2023-07-01T00:00:00Z

OPINION MINING SYSTEM OF CUSTOMER REVIEWS BY USING FEATURE EXTRACTION LWIN, NANDAR MOH MOH With the rapid growth of e-commerce, online reviews have become an important source for prediction and decision making, both for potential customers and for service providers. Opinion mining techniques have become popular for automatically processing customer reviews by extracting features and the opinions users express about them. Because scanning a large number of reviews one by one is impractical, there is growing interest in processing reviews automatically and presenting the information that is useful to customers and service providers. Dependency relations make it possible to identify the semantic relationships between the features and opinions in each review, and SentiWordNet provides a numeric score for every feature. This system collects customer reviews from the tourism domain, extracts the related features and opinions to rate the services, and finally ranks each agency according to the results for its review sentences. In this thesis, the Stanford Parser is used in preprocessing to generate the features, opinions, and dependency relations for each trip review. Two feature extraction methods, frequency-based and dependency grammar-based, are used to extract the most relevant trip features. SentiWordNet 3.0 supplies the positive and negative scores for each trip feature, and the system calculates the total weight of a trip review from these scores. The objective of the system is to rank the travel agencies by their final weight, which is obtained by adding the total weights of the trip reviews for each agency. In this way, the system applies feature extraction to summarize reviewers' opinions and feelings efficiently and effectively for future customers planning trips.
Tourism reviews serve as the case study for identifying which elements of an agency affect sales most and which features customers like or dislike, so that trip managers and agency owners can focus on those areas. The system is developed in Java, with MySQL as the database.
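The scoring and ranking step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the positive and negative scores are combined, so the sketch assumes a review's weight is the sum of (positive minus negative) over its opinion words. The `TOY_SCORES` table, the function names, and the sample reviews are hypothetical; the scores are made-up stand-ins, not real SentiWordNet 3.0 values.

```python
from collections import defaultdict

# Hypothetical (positive, negative) scores standing in for SentiWordNet 3.0
# lookups. These numbers are invented for illustration only.
TOY_SCORES = {
    "friendly": (0.625, 0.0),
    "clean": (0.5, 0.0),
    "late": (0.0, 0.375),
    "dirty": (0.0, 0.75),
}


def review_weight(pairs, scores=TOY_SCORES):
    """Assumed rule: sum (positive - negative) over a review's opinion words.

    `pairs` is a list of (feature, opinion) tuples, as a dependency-based
    extractor might produce. Unknown opinion words contribute nothing.
    """
    total = 0.0
    for _feature, opinion in pairs:
        pos, neg = scores.get(opinion, (0.0, 0.0))
        total += pos - neg
    return total


def rank_agencies(reviews):
    """Add up review weights per agency and sort from highest to lowest."""
    totals = defaultdict(float)
    for agency, pairs in reviews:
        totals[agency] += review_weight(pairs)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical extracted (feature, opinion) pairs for three reviews.
reviews = [
    ("Agency A", [("guide", "friendly"), ("hotel", "clean")]),
    ("Agency A", [("bus", "late")]),
    ("Agency B", [("hotel", "dirty"), ("guide", "friendly")]),
]
ranking = rank_agencies(reviews)
print(ranking)
```

Keeping the per-review weight separate from the per-agency total mirrors the two-level aggregation the abstract describes, so a different combination rule could be swapped in without touching the ranking.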

2023-07-01T00:00:00Z THE CAR INSURANCE CLAIM PREDICTION SYSTEM BY USING MACHINE LEARNING ALGORITHMS ON APACHE SPARK PLATFORM Ko, Thein Than https://onlineresource.ucsy.edu.mm/handle/123456789/2795 2023-06-04T15:10:22Z 2023-05-01T00:00:00Z

THE CAR INSURANCE CLAIM PREDICTION SYSTEM BY USING MACHINE LEARNING ALGORITHMS ON APACHE SPARK PLATFORM Ko, Thein Than Car insurance companies face a major challenge in handling insurance claims, which are prone to fraud and growing in volume. This makes it difficult for insurers to classify claims during the review process. To address this issue, this study develops four car insurance claim prediction classifiers, using Random Forest (RF) and Logistic Regression (LR) on a car insurance claim dataset, and compares which method and which attributes are more suitable for car insurance companies. First, the proposed system builds a feature selection model with the Variance Threshold Selector method to select the attributes that most affect classifier accuracy. The dataset is randomly split into a training set (80%) and a testing set (20%). For the two classifiers that use all attributes, the training set is used to build the LR classifier and the RF classifier. For the two classifiers that use feature selection, the system creates new training and testing sets by removing low-variance attributes with the Variance Threshold Selector method, and then builds an LR classifier and an RF classifier from the new sets. The system evaluates attribute counts of 30, 32, 34, 36, 38, 40, and 42 to choose the number of attributes and the important attributes, and runs each configuration 10 times because the training and testing sets are split randomly. Finally, the system compares the results using accuracy and F-score. The RF classifiers, with and without feature selection, are more suitable for the proposed system than the LR classifiers. Among the attribute counts, the classifiers based on 38 and 40 attributes perform best, and the classifier based on 42 attributes is second best.
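The two data preparation steps named in this abstract, variance-based attribute removal and a random 80/20 split, can be illustrated with a minimal plain-Python sketch. It does not use the Spark API and is not the thesis implementation: the function names, the threshold value, the seed, and the toy rows are all hypothetical, and the use of sample variance is an assumption.

```python
import random
import statistics


def select_by_variance(rows, threshold):
    """Keep only columns whose sample variance exceeds `threshold`.

    Returns the kept column indices and the reduced rows.
    """
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns)
            if statistics.variance(col) > threshold]
    return keep, [[row[i] for i in keep] for row in rows]


def split_80_20(rows, seed):
    """Shuffle a copy of the rows and split it 80% / 20%."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


# Toy data (hypothetical): column 0 is constant, so its variance is 0.
rows = [
    [1.0, 0.0, 10.0],
    [1.0, 1.0, 20.0],
    [1.0, 0.0, 30.0],
    [1.0, 1.0, 40.0],
    [1.0, 0.0, 50.0],
]
kept, reduced = select_by_variance(rows, threshold=0.1)
train, test = split_80_20(reduced, seed=0)
print(kept, len(train), len(test))
```

Because the split is random, a different seed yields a different partition, which is why the abstract reports repeating each configuration 10 times.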

2023-05-01T00:00:00Z CLASSIFICATION OF BANK MARKETING DATA USING SUPPORT VECTOR MACHINE Khin, Ei Ei https://onlineresource.ucsy.edu.mm/handle/123456789/2794 2023-06-04T14:35:22Z 2023-05-01T00:00:00Z

CLASSIFICATION OF BANK MARKETING DATA USING SUPPORT VECTOR MACHINE Khin, Ei Ei Banking plays an important role in the financial sector worldwide, and the banking industry needs more accurate predictive models for its services and products. Bank staff can build such models manually, but the process takes a long time and many man-hours. For these reasons, machine learning techniques are useful for predicting outcomes from large amounts of data. Classification is an important technique for analyzing and predicting data. This system classifies bank marketing data using a support vector machine (SVM), a supervised learning model used for classification and prediction, to predict whether a customer will subscribe to a term deposit. Precision, recall, and F-measure, derived from the confusion matrix, are used to evaluate the system. In the first experiment, on the training data, accuracy is 86% without feature engineering, 83% with feature engineering, and 96% with feature engineering using the correlation matrix and Principal Component Analysis (PCA). In the second experiment, on the testing data, accuracy is 85% without feature engineering, 83% with feature engineering before PCA, and 95% after PCA. The system gives its best results on both training and testing data after applying PCA.
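For readers unfamiliar with the evaluation measures mentioned in this abstract, the sketch below shows how precision, recall, F-measure, and accuracy follow from a binary confusion matrix. The function name and the counts are hypothetical and chosen only to keep the arithmetic easy to check; they are not results from the thesis.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Zero denominators yield 0.0 instead of an error.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical counts for a "subscribes to term deposit" classifier.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=140)
print(metrics)
```

With imbalanced classes, as is common when few customers subscribe, accuracy alone can look high while precision and recall on the positive class stay low, which is one reason to report all of them.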

2023-05-01T00:00:00Z MYANMAR ENTITY IDENTIFICATION FOR NATURAL LANGUAGE UNDERSTANDING USING BIDIRECTIONAL LONG SHORT TERM MEMORY (BiLSTM) PHWAY, SAUNG THAZIN https://onlineresource.ucsy.edu.mm/handle/123456789/2790 2023-02-17T06:20:22Z 2023-01-01T00:00:00Z

MYANMAR ENTITY IDENTIFICATION FOR NATURAL LANGUAGE UNDERSTANDING USING BIDIRECTIONAL LONG SHORT TERM MEMORY (BiLSTM) PHWAY, SAUNG THAZIN Entity identification is a demanding task that has traditionally required a large amount of knowledge, in the form of feature engineering and word lists, to achieve high performance. Entity Identification (EI) is essential for recognizing the elements of a text from raw input and determining the category each word represents. This work presents entity recognition (ER) for the Myanmar language using Bidirectional Long Short Term Memory (BiLSTM), eliminating the need for most feature engineering. Entities include persons, locations, organizations, dates and times, numerical values, and so on. Named entities (NE) in Myanmar remain difficult to analyze, as does ordinary text, because the language has no delimiter between words and no capitalization as in some other languages. Myanmar Natural Language Processing (NLP) is said to be growing slowly and has been difficult to develop. For that reason, an entity-tagged corpus for Myanmar ER is annotated and built as part of this thesis. The annotated corpus is crucial for advancing Myanmar ER research. The entity-tagged corpus is used as the data for the proposed entity classification experiments and is evaluated as well. With a BiLSTM-based network architecture, the best accuracy achieved is 83.62%. The approach therefore removes the feature engineering step and requires neither linguistic nor domain expertise.

2023-01-01T00:00:00Z