BIDIRECTIONAL LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORK BASED MYANMAR DIALOGUE ACT RECOGNITION

YEE, SANN SU SU

dc.contributor.author	YEE, SANN SU SU
dc.date.accessioned	2024-07-11T05:32:52Z
dc.date.available	2024-07-11T05:32:52Z
dc.date.issued	2024-06
dc.identifier.uri	https://onlineresource.ucsy.edu.mm/handle/123456789/2807
dc.description.abstract	This research aims to develop a deep learning-based Myanmar Dialogue Act Recognition (MDAR) system to enhance Myanmar dialogue systems. Dialogue Act (DA) recognition is a foundational aspect of dialogue understanding, capturing user intent at the sentence level with labels such as greeting, question, and inform. By identifying these intents, dialogue systems can interact more naturally and effectively with users. This study explores current approaches to DA recognition, focusing specifically on Myanmar dialogues, a previously underrepresented area in Natural Language Processing (NLP) research. Initially, two machine learning techniques, a Naïve Bayes classifier and a Support Vector Machine (SVM), were applied to the MmTravel corpus, a dataset comprising Myanmar travel-related conversations. Both approaches demonstrated moderately good accuracy for Myanmar dialogue tagging, with the SVM performing slightly better. Recognizing the critical role of Spoken Language Understanding (SLU) in dialogue systems, this research emphasizes DA recognition as an essential preprocessing step for speech understanding. To further improve DA recognition accuracy, this research proposes a deep learning-based DA model utilizing a Bidirectional Long Short-Term Memory (Bi-LSTM) Recurrent Neural Network (RNN). The proposed model architecture includes a word-encoding layer to transform input text into word embeddings, a Bi-LSTM layer to capture context from both past and future inputs, and a softmax layer for classifying the dialogue acts. The use of word2vec for language modeling in MDAR enhances the system's ability to understand and process Myanmar dialogues more effectively. A significant contribution of this work is the creation and annotation of the MmTravel corpus, which consists of 80,000 utterances from human-human travel domain conversations. 
The construction of the MmTravel corpus is especially crucial for low-resource languages like Myanmar, providing a robust data foundation necessary for training effective machine learning models. This corpus not only facilitates the development of the MDAR system but also contributes valuable resources to the broader NLP community, promoting further research and development in underrepresented languages. The research reports a detailed analysis and comparison of the proposed Bi-LSTM model with traditional RNN, LSTM, and baseline SVM models. Experimental results demonstrate that the Bi-LSTM model outperforms previous approaches, achieving an accuracy improvement of over 2% compared to the SVM model on the MmTravel corpus. This research not only advances Myanmar dialogue act recognition but also contributes to the broader field of multilingual NLP by providing robust methodologies and resources for underrepresented languages. The insights gained from this research can be applied to other low-resource languages, paving the way for more inclusive and diverse NLP technologies.	en_US
dc.language.iso	en	en_US
dc.publisher	University of Computer Studies, Yangon	en_US
dc.subject	MYANMAR DIALOGUE ACT RECOGNITION	en_US
dc.title	BIDIRECTIONAL LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORK BASED MYANMAR DIALOGUE ACT RECOGNITION	en_US
dc.type	Thesis	en_US
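
Illustrative sketch (not part of the thesis record). The abstract describes a three-stage architecture: a word-encoding layer, a Bi-LSTM layer that reads the utterance in both directions, and a softmax layer over dialogue acts. The forward pass below is a minimal sketch of that pipeline in plain Python, under several assumptions that are not taken from the thesis: the class names `LSTMCell` and `BiLSTMClassifier`, the layer sizes, the English toy vocabulary, and the randomly initialised, untrained weights are all hypothetical. In particular, the embedding table here is random, whereas the thesis reports using word2vec, and the three labels are only the examples named in the abstract.

```python
import math
import random

# Example labels taken from the abstract; the real tag set may be larger.
DIALOGUE_ACTS = ["greeting", "question", "inform"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """Standard LSTM cell with input (i), forget (f), output (o) and candidate (g) gates."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.W = {k: rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])] for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        """Process a sequence and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in xs:
            h, c = self.step(x, h, c)
        return h


class BiLSTMClassifier:
    """Embedding lookup -> forward and backward LSTM -> softmax over dialogue acts."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.vocab = {w: idx for idx, w in enumerate(vocab)}
        # Random stand-in for pretrained word vectors; the last row is for unknown words.
        self.embed = rand_matrix(len(vocab) + 1, embed_dim, rng)
        self.fwd = LSTMCell(embed_dim, hidden_dim, rng)
        self.bwd = LSTMCell(embed_dim, hidden_dim, rng)
        self.W_out = rand_matrix(len(DIALOGUE_ACTS), 2 * hidden_dim, rng)

    def predict_proba(self, tokens):
        unk = len(self.vocab)
        xs = [self.embed[self.vocab.get(t, unk)] for t in tokens]
        h_fwd = self.fwd.run(xs)        # reads the utterance left to right
        h_bwd = self.bwd.run(xs[::-1])  # reads the utterance right to left
        return softmax(matvec(self.W_out, h_fwd + h_bwd))


if __name__ == "__main__":
    model = BiLSTMClassifier(["hello", "where", "is", "the", "hotel"])
    probs = model.predict_proba(["where", "is", "the", "hotel"])
    for act, p in zip(DIALOGUE_ACTS, probs):
        print(f"{act}: {p:.3f}")
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how the final forward and backward hidden states are concatenated before classification. A practical system would train these parameters on annotated utterances and would need a Myanmar word segmenter to produce the tokens.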