Effective Malicious Features Extraction and Classification for Incident Handling Systems

San, Cho Cho

dc.contributor.author	San, Cho Cho
dc.date.accessioned	2019-11-13T02:20:04Z
dc.date.available	2019-11-13T02:20:04Z
dc.date.issued	2019-10
dc.identifier.uri	http://onlineresource.ucsy.edu.mm/handle/123456789/2376
dc.description.abstract	Every day, malware writers create new variants, new innovations, new infections, and more heavily obfuscated malware by using packing and encryption techniques. Malware classification and detection play an important role in cyber security research and remain a major challenge. Because the false alarm rate is increasing, accurate classification and detection of malware is a pressing problem. This research provides a classification system that differentiates malware from benign files and classifies malware by type. It contributes the Malicious Sample Names Extraction (MSNE) procedure and the Naming Malicious Samples using the Regular Expression (NMS_RE) technique to label the malicious samples. It also contributes the Malware Feature Extraction Algorithm (MFEA), which identifies the dominant features in the generated report files. The features are the API, DLL, and PROCESS items called by malicious and benign executables, collected through automated analysis. During the experiments, the extracted raw data were cleansed, the n-gram technique was applied, and the malicious dataset was represented and prepared for the malware classification system. This research uses two datasets for malware classification. The Benign Malware Classification (BMC) dataset is used for binary classification, to decide whether a sample is malicious or not, and the Benign Malware Family Classification (BMFC) dataset is used for multi-class classification, to identify the malware family. The Chi-Square and Principal Component Analysis (PCA) feature selection methods are applied to select the best features. The classification algorithms k-Nearest Neighbor (kNN), Random Forest (RF), and Support Vector Classification (SVC) are used for both multi-class and binary classification.
The proposed approach classifies malicious and benign executable files effectively using Machine Learning (ML) classifiers. The experimental findings show that the extracted API_DLL features give the best evaluation results in terms of accuracy, confusion matrix (CM), True Positive Rate (TPR), False Positive Rate (FPR), and area under the Receiver Operating Characteristic (ROC) curve.	en_US
dc.language.iso	en_US	en_US
dc.publisher	University of Computer Studies, Yangon	en_US
dc.title	Effective Malicious Features Extraction and Classification for Incident Handling Systems	en_US
dc.type	Thesis	en_US
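
The abstract mentions applying the n-gram technique to the extracted features and ranking them with Chi-Square. The following is a minimal Python sketch of those two steps only, not the thesis's MFEA or its datasets. The toy API call traces, the labels (1 = malicious, 0 = benign), and the helper names ngrams and chi_square are assumptions made for illustration.

```python
def ngrams(calls, n=2):
    """Return the contiguous n-grams of a call sequence as tuples."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def chi_square(feature_sets, labels, feature):
    """Chi-square statistic of a 2x2 table: feature presence vs. binary label."""
    a = b = c = d = 0
    for feats, y in zip(feature_sets, labels):
        present = feature in feats
        if present and y == 1:
            a += 1  # present, malicious
        elif present:
            b += 1  # present, benign
        elif y == 1:
            c += 1  # absent, malicious
        else:
            d += 1  # absent, benign
    total = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # feature or label is constant, so it carries no signal
    return total * (a * d - b * c) ** 2 / denom


# Hypothetical traces: (ordered API calls, label). Not data from the thesis.
traces = [
    (["CreateFileW", "WriteFile", "CloseHandle"], 0),
    (["CreateFileW", "ReadFile", "CloseHandle"], 0),
    (["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"], 1),
    (["OpenProcess", "VirtualAllocEx", "WriteProcessMemory"], 1),
]

feature_sets = [set(ngrams(calls, 2)) for calls, _ in traces]
labels = [label for _, label in traces]

# Score every observed 2-gram and rank from most to least discriminative.
vocab = sorted(set().union(*feature_sets))
scores = {f: chi_square(feature_sets, labels, f) for f in vocab}
ranked = sorted(vocab, key=lambda f: scores[f], reverse=True)

for feature in ranked[:3]:
    print(feature, round(scores[feature], 3))
```

In this sketch the highest-ranked 2-grams would become the columns of a presence/absence matrix that a classifier such as kNN, RF, or SVC could be trained on. The number of features to keep is a tuning choice that this record does not specify.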