Text Classification Using Naïve Bayesian Classifier with Bigram

Tin, Thandar

dc.contributor.author	Tin, Thandar
dc.date.accessioned	2019-07-22T03:27:31Z
dc.date.available	2019-07-22T03:27:31Z
dc.date.issued	2010-12-16
dc.identifier.uri	http://onlineresource.ucsy.edu.mm/handle/123456789/1102
dc.description.abstract	Classification is a form of data analysis that can be used to extract models describing important data classes or to predict future data trends. Data classification is a two-step process. This system studies the Naïve Bayesian classifier and uses it to predict the class labels of data sets. The classifier is built on the training data sets and tested on unknown data sets, and its accuracy is then calculated using the F1-measure (F1-score). Naïve Bayesian (NB) classifiers are among the most popular techniques and serve as the basis of many classification applications, both in theory and in practice. As a preprocessing step before the classifier is built, standard text documents are read, stop words and punctuation are removed, words are stemmed using the Porter stemming algorithm, and features are extracted using keyword-based bigram probabilities. The experiment is performed on standard research documents from IEEE and ACM. The system determines the subject of a document, such as medicine, computer, engineering, or agriculture, using the Naïve Bayesian classifier.	en_US
dc.language.iso	en	en_US
dc.publisher	Fifth Local Conference on Parallel and Soft Computing	en_US
dc.title	Text Classification Using Naïve Bayesian Classifier with Bigram	en_US
dc.type	Article	en_US
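
The pipeline described in the abstract (stop-word and punctuation removal, bigram feature extraction, Naïve Bayesian classification, F1 evaluation) can be illustrated with a minimal sketch. This is not the author's implementation: the stop-word list, the class and function names (`bigrams`, `BigramNaiveBayes`, `f1_score`), the add-one smoothing, and the toy corpus are all assumptions made for illustration, and the Porter stemming step is left out so that the example depends only on the Python standard library.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption; the paper's list is not given).
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"}


def bigrams(text):
    """Lowercase the text, drop punctuation and stop words, return adjacent word pairs.

    Porter stemming, mentioned in the abstract, is omitted in this sketch.
    """
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    return list(zip(words, words[1:]))


class BigramNaiveBayes:
    """Multinomial Naive Bayes over bigram counts with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_docs = Counter(labels)    # number of training documents per class
        self.counts = defaultdict(Counter)   # bigram counts per class
        for doc, label in zip(docs, labels):
            self.counts[label].update(bigrams(doc))
        self.vocab = {bg for c in self.counts.values() for bg in c}
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        feats = bigrams(doc)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            total = sum(self.counts[label].values())
            # log prior plus the sum of smoothed log likelihoods of each bigram
            score = math.log(n_docs / self.total_docs)
            for bg in feats:
                score += math.log((self.counts[label][bg] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f1_score(true, pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(t == positive and p == positive for t, p in zip(true, pred))
    fp = sum(t != positive and p == positive for t, p in zip(true, pred))
    fn = sum(t == positive and p != positive for t, p in zip(true, pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus, invented for illustration only.
train_docs = [
    "The patient received a clinical trial drug for heart disease",
    "Clinical trial results for the heart disease treatment",
    "The operating system schedules threads on the processor",
    "Operating system kernels manage memory and processor time",
]
train_labels = ["medicine", "medicine", "computer", "computer"]

model = BigramNaiveBayes().fit(train_docs, train_labels)
print(model.predict("A new clinical trial for heart disease"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and add-one smoothing keeps a bigram that was never seen in a class from forcing that class's probability to zero. A real experiment would add stemming and use a held-out test set for the F1 computation.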