Abstract:
Text classification is the task of automatically assigning documents
to categories (classes or topics) drawn from a predefined set. It plays an
important role in natural language processing and sits at the crossroads of
information retrieval and machine learning. The dramatic growth of text documents in
digital form on news websites has made text classification more popular over the last
ten years. Applications include spam filtering, question
answering, and language identification. This book presents the text classification
process in terms of machine learning techniques and illustrates how Myanmar
news documents are classified by applying a genetic algorithm. The system uses
Myanmar online news articles from Myanmar news websites for
training and testing the system. The Term Frequency-Inverse Document Frequency (TF-IDF) algorithm, which is
also applied in many text mining methods, was used to select relevant features from the labelled documents.
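
To make the feature selection step concrete, the following is a minimal sketch of TF-IDF weighting, not the system's actual implementation. It assumes the documents are already segmented into word tokens (the English tokens stand in for segmented Myanmar text), and it assumes one common variant of the formula: term frequency normalized by document length, multiplied by the natural logarithm of the inverse document frequency. The names tf_idf and top_terms are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(doc_weights, k):
    """Keep the k highest-weighted terms of one document as features."""
    return sorted(doc_weights, key=doc_weights.get, reverse=True)[:k]


# Placeholder corpus: three tiny, already tokenized news documents.
docs = [
    ["election", "vote", "party", "vote"],
    ["football", "match", "goal", "match"],
    ["election", "football", "news"],
]
weights = tf_idf(docs)
print(top_terms(weights[0], 2))
```

Terms that are frequent in one document but rare across the collection receive the highest weights, which is why they are useful candidates for features. A term that occurs in every document receives a weight of zero under this variant.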