<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
<title>Faculty of Computer Science</title>
<link href="https://onlineresource.ucsy.edu.mm/handle/123456789/694" rel="alternate"/>
<subtitle/>
<id>https://onlineresource.ucsy.edu.mm/handle/123456789/694</id>
<updated>2026-06-08T10:29:51Z</updated>
<dc:date>2026-06-08T10:29:51Z</dc:date>
<entry>
<title>Dependency Head Annotation for Myanmar Dependency Treebank</title>
<link href="https://onlineresource.ucsy.edu.mm/handle/123456789/2545" rel="alternate"/>
<author>
<name>Aye, Hnin Thu Zar</name>
</author>
<author>
<name>Pa, Win Pa</name>
</author>
<id>https://onlineresource.ucsy.edu.mm/handle/123456789/2545</id>
<updated>2020-12-30T10:13:44Z</updated>
<published>2020-11-01T00:00:00Z</published>
<summary type="text">Dependency Head Annotation for Myanmar Dependency Treebank
Aye, Hnin Thu Zar; Pa, Win Pa
Complete manual annotation of a dependency treebank requires resources such as annotators
and annotation tools, takes a long time, and carries a high risk of inconsistent annotations
for free word order languages such as Myanmar. This paper describes a dependency head
annotation scheme based on Universal part-of-speech tags and Universal Dependencies for the
Myanmar dependency treebank. Currently, 22,810 sentences and 680,218 tokens from three
corpora have been annotated for the treebank. Some language-specific issues are also
described with examples. Raw syntactic structures were annotated automatically by UDPipe
according to Universal Dependencies, based on the Universal part-of-speech tag scheme.
These unsupervised dependency head annotations were then manually updated in
post-processing. To make post-processing faster and more reliable, with fewer errors in
manual updating, selected sentences were added to the training data after being updated.
The model was then retrained and the remaining sentences were parsed by UDPipe.
Post-processing was repeated until all sentences were updated. Some specifications of the
dependency annotation scheme for sentences encountered in post-processing are presented
with examples. To measure the parsing performance of the annotated data, cross-validation
tests and parsing experiments were performed. Moreover, the annotated treebank data were
evaluated with the CoNLL 2017 evaluation script for parsing performance. Results of the
parsing experiments and evaluation are reported as unlabeled and labeled attachment scores
and demonstrate that the proposed method is a suitable way of building Myanmar dependency
trees. The syntactic structures of the treebank are also analyzed and the syntax
information is presented. As far as we know, this is the first work on dependency head
annotation for a Myanmar dependency treebank.
</summary>
<dc:date>2020-11-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Towards Burmese (Myanmar) Morphological Analysis: Syllable-based Tokenization and Part-of-speech Tagging</title>
<link href="https://onlineresource.ucsy.edu.mm/handle/123456789/2544" rel="alternate"/>
<author>
<name>Ding, Chen Chen</name>
</author>
<author>
<name>Aye, Hnin Thu Zar</name>
</author>
<author>
<name>Pa, Win Pa</name>
</author>
<author>
<name>Nwet, Khin Thandar</name>
</author>
<author>
<name>Soe, Khin Mar</name>
</author>
<author>
<name>Utiyama, Masao</name>
</author>
<author>
<name>Sumita, Eiichiro</name>
</author>
<id>https://onlineresource.ucsy.edu.mm/handle/123456789/2544</id>
<updated>2020-12-30T10:13:44Z</updated>
<published>2019-06-01T00:00:00Z</published>
<summary type="text">Towards Burmese (Myanmar) Morphological Analysis: Syllable-based Tokenization and Part-of-speech Tagging
Ding, Chen Chen; Aye, Hnin Thu Zar; Pa, Win Pa; Nwet, Khin Thandar; Soe, Khin Mar; Utiyama, Masao; Sumita, Eiichiro
This article presents a comprehensive study on two primary tasks in Burmese (Myanmar) morphological
analysis: tokenization and part-of-speech (POS) tagging. Twenty thousand Burmese newswire sentences
are annotated with two-layer tokenization and POS-tagging information, as one component of the Asian
Language Treebank Project. The annotated corpus has been released under a CC BY-NC-SA license, and it was
the largest open-access database of annotated Burmese when this manuscript was prepared in 2017. Detailed
descriptions of the preparation, refinement, and features of the annotated corpus are provided in the first half
of the article. Facilitated by the annotated corpus, experiment-based investigations are presented in the second
half of the article, wherein the standard sequence-labeling approach of conditional random fields and a long
short-term memory (LSTM)-based recurrent neural network (RNN) are applied and discussed. We obtained
several general conclusions, covering the effect of joint tokenization and POS tagging and the importance of
ensembling for stabilizing the performance of the LSTM-based RNN. This study provides a solid
basis for further studies on Burmese processing.
</summary>
<dc:date>2019-06-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Unsupervised Dependency Corpus Annotation for Myanmar Language</title>
<link href="https://onlineresource.ucsy.edu.mm/handle/123456789/2543" rel="alternate"/>
<author>
<name>Aye, Hnin Thu Zar</name>
</author>
<author>
<name>Pa, Win Pa</name>
</author>
<author>
<name>Thu, Ye Kyaw</name>
</author>
<id>https://onlineresource.ucsy.edu.mm/handle/123456789/2543</id>
<updated>2020-12-30T10:13:44Z</updated>
<published>2018-01-01T00:00:00Z</published>
<summary type="text">Unsupervised Dependency Corpus Annotation for Myanmar Language
Aye, Hnin Thu Zar; Pa, Win Pa; Thu, Ye Kyaw
Dependency parsing represents the connections between linguistic
units (words) as directed links. This paper presents the annotation of a
general-domain corpus using an unsupervised approach that applies
Universal part-of-speech (U-POS) tags to build a treebank for
unsupervised dependency parsing of the Myanmar language. Obtaining
complete syntactic structures for the Myanmar language is still a hard
task. Dependency structures of words in Myanmar sentences are also
presented, covering general word and phrase orders and the relations of
basic sentence structures. UDPipe is used to annotate with U-POS.
Moreover, preliminary results of the annotated trees and a parsing
experiment are presented. Parsing experiments are evaluated by UDPipe
in terms of unlabeled and labeled attachment scores (UAS and LAS),
which are 93.20% and 91.21%, respectively, in the test experiment.
</summary>
<dc:date>2018-01-01T00:00:00Z</dc:date>
</entry>
<entry>
<title>Improving accuracy of part-of-speech (POS) tagging using hidden markov model and morphological analysis for Myanmar language</title>
<link href="https://onlineresource.ucsy.edu.mm/handle/123456789/2531" rel="alternate"/>
<author>
<name>Cing, Dim Lam</name>
</author>
<author>
<name>Soe, Khin Mar</name>
</author>
<id>https://onlineresource.ucsy.edu.mm/handle/123456789/2531</id>
<updated>2020-12-30T10:13:44Z</updated>
<published>2020-04-01T00:00:00Z</published>
<summary type="text">Improving accuracy of part-of-speech (POS) tagging using hidden markov model and morphological analysis for Myanmar language
Cing, Dim Lam; Soe, Khin Mar
In Natural Language Processing (NLP), word segmentation and Part-of-Speech
(POS) tagging are fundamental tasks. POS information is also necessary in
preprocessing for NLP applications such as machine translation (MT) and
information retrieval (IR). Currently, many research efforts develop word
segmentation and POS tagging separately, using different methods to achieve
high performance and accuracy. For the Myanmar language, there are also
separate word segmenters and POS taggers based on statistical approaches
such as Neural Networks (NN) and Hidden Markov Models (HMMs). However,
because of the Myanmar language's complex morphological structure, the
out-of-vocabulary (OOV) problem still exists. To avoid errors and improve
segmentation by using POS information, it should be possible to perform
segmentation and labeling at the same time. The main goal of developing a
POS tagger for any language is to improve tagging accuracy and remove
ambiguity in sentences that arises from language structure. This paper
focuses on developing word segmentation and a Part-of-Speech (POS) tagger
for the Myanmar language. It presents a comparison of separate word
segmentation and POS tagging with joint word segmentation and POS tagging.
</summary>
<dc:date>2020-04-01T00:00:00Z</dc:date>
</entry>
</feed>
