Abstract:
Text-to-image synthesis is the generation of images from natural-language text input. Learning can become easier when spoken words are visualized with images. It is a popular research field at the intersection of natural language processing (NLP) and computer vision. Generative Adversarial Networks (GANs) are increasingly used to generate images from text descriptions. We build a baseline system for Myanmar text-to-image synthesis, together with an annotated image dataset, because no suitable annotated image dataset exists for this task. The dataset was created from a subset of the Oxford-102 flowers dataset. The Word2Vec algorithm is used to convert the words of the input sentence into vectors, which serve as input to the GAN. The GAN then generates images from Myanmar-language text. This is the first GAN-based text-to-image generation system for the Myanmar language. Two evaluation metrics are used to assess the generated images. The Inception Score measures the quality of the generated images, and the Fréchet Inception Distance (FID) measures the distance between the real images (from the original dataset) and the images generated by the model.
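
For reference, the two metrics named above are usually defined as follows. This is a sketch of the standard formulations, assumed here to be the ones used; the abstract itself does not spell them out. The symbols are notation introduced for illustration: p_g is the distribution of generated images, p(y | x) is the label distribution predicted by a pretrained Inception network for image x, and (mu_r, Sigma_r), (mu_g, Sigma_g) are the mean and covariance of Inception features for real and generated images.

```latex
% Inception Score: exponentiated expected KL divergence between the
% per-image label distribution p(y|x) and the marginal p(y) over generated images.
\mathrm{IS} = \exp\!\Big( \mathbb{E}_{x \sim p_g}\, D_{\mathrm{KL}}\big( p(y \mid x) \,\|\, p(y) \big) \Big)

% Frechet Inception Distance: Frechet distance between two Gaussians fitted
% to Inception features of real (r) and generated (g) images.
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
             + \operatorname{Tr}\!\big( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \big)
```

Under these definitions a higher Inception Score and a lower FID both indicate better generated images.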