myG2P

Myanmar (Burmese) Language Grapheme to Phoneme (myG2P) Conversion Dictionary for speech recognition (ASR) and speech synthesis (TTS).

To read this in Myanmar (Burmese), see: README in Myanmar Language

License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License
License details

Contact email: wasedakuma[at]gmail.com

Introduction

We developed the myG2P (Myanmar Grapheme-to-Phoneme) dictionary during 2014-2015 for the Myanmar language project of VoiceTra, the multilingual speech translation application of NICT, Japan. The entries are drawn mainly from the MLC (Myanmar Language Commission) dictionary. If you use the myG2P dictionary, please cite the ICCA 2015 paper and/or the COLING 2016 paper. If you discuss sentence-level grapheme-to-phoneme conversion for the Myanmar language, please cite the PACLING 2015 paper.

Grapheme to Phoneme Mapping

The Myanmar Language Commission (MLC) Pronunciation Dictionary can be used as a basis for pronunciation mapping, but we found it necessary to extend it with foreign pronunciations. The proposed mapping table has 23 phonetic symbols for the 33 consonants (some consonants share the same pronunciation, for example “ဒ”, “ဓ”, “ဍ” and “ဎ” in Table 1), 87 vowel combinations, and 20 special symbols for foreign-word pronunciations. Consonants are grouped according to their pronunciation into unaspirated, aspirated, voiced and nasal groups, as shown in Table 1. Many Myanmar syllables containing unaspirated and aspirated consonants are pronounced as voiced consonants, depending on the neighboring context. Some foreign pronunciations have to be expressed by special vowel combinations because the corresponding sounds do not occur in native Myanmar pronunciation. See Table 3. The MLC dictionary was extended by defining 26 more symbols to cover phoneme mappings for foreign words. For example, the Myanmar phonetic representation of the foreign name “Alex”, written “အဲလက်(စ်)”, is e:le’S (here, S stands for (စ်)), and “Swift”, written “ဆွစ်(ဖ်)(ထ်)”, is hswi’HPHT (here, HP stands for (ဖ်) and HT for (ထ်)).
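The foreign-coda convention described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three symbols (S, HP, HT) are the ones quoted in the text, the helper names are hypothetical, the base pronunciation of the native part of the word is assumed to be known already, and the sketch writes the glottal mark as an ASCII apostrophe, as the dictionary sample does.

```python
import re

# Only the three coda symbols quoted in the text; the full myG2P table defines more.
FOREIGN_CODA_SYMBOLS = {
    "စ်": "S",
    "ဖ်": "HP",
    "ထ်": "HT",
}

# In the examples, foreign codas are written in parentheses, e.g. "(စ်)".
CODA_PATTERN = re.compile(r"\(([^()]+)\)")


def foreign_coda_suffix(word):
    """Concatenate the symbols of every parenthesised coda found in `word`."""
    return "".join(FOREIGN_CODA_SYMBOLS[coda] for coda in CODA_PATTERN.findall(word))


def with_foreign_codas(word, base_pronunciation):
    """Append foreign-coda symbols to an already known base pronunciation."""
    return base_pronunciation + foreign_coda_suffix(word)


print(with_foreign_codas("အဲလက်(စ်)", "e:le'"))     # e:le'S
print(with_foreign_codas("ဆွစ်(ဖ်)(ထ်)", "hswi'"))  # hswi'HPHT
```

A coda that is missing from the small table raises a KeyError, which makes gaps in the illustrative mapping visible instead of silently dropping a symbol.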

Table 1: Groups of Myanmar consonants and their pronunciations

Contextually Independent Pronunciation

This section explains how the pronunciation of a Myanmar syllable is normally derived from its orthographic structure. A Myanmar syllable generally consists of an initial consonant followed by zero or more vowel combinations. A vowel combination can be a single vowel or a sequence of vowels starting with a consonant that modifies the pronunciation of the first vowel. The pronunciations of consonants combined with vowels are shown in Table 2.

Table 2: Examples of vowel combinations and their pronunciations
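As a rough illustration of this consonant-plus-vowel-combination rule, the Python sketch below composes a syllable pronunciation from two small lookup tables. The tables are assumptions: they contain only the handful of symbols that can be read off the sample entries in the Dictionary Format section (for example တ → ta. and တေ → tei), not the full mapping of Table 2, and the function names are hypothetical.

```python
# Consonant symbols inferred from the sample dictionary entries (not the full table).
CONSONANTS = {
    "တ": "t",
    "သ": "th",
    "စ": "s",
    "ပ": "p",
}

# Vowel-combination symbols inferred from the same entries.
# The empty key stands for a bare consonant, pronounced with its inherent vowel.
VOWEL_COMBINATIONS = {
    "": "a.",
    "ိ": "i.",
    "ီ": "i",
    "ု": "u.",
    "ေ": "ei",
    "ာ": "a",
}


def pronounce_syllable(syllable):
    """Context-independent pronunciation: consonant symbol + vowel symbol."""
    consonant, vowels = syllable[0], syllable[1:]
    return CONSONANTS[consonant] + VOWEL_COMBINATIONS[vowels]


def pronounce_word(syllables):
    """Pronounce a syllable-segmented word, one symbol group per syllable."""
    return " ".join(pronounce_syllable(s) for s in syllables)


print(pronounce_word(["သု", "တ", "စာ", "ပေ"]))  # thu. ta. sa pei
```

Syllables with final consonants, medials, or tone marks are outside these tiny tables and raise a KeyError; covering them would require the complete mapping.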

Contextually Dependent Pronunciations

Some Myanmar syllables do not conform to these standard rules of pronunciation: their pronunciation can depend on the surrounding syllables. Table 3 shows examples of words whose correct pronunciation differs from the standard pronunciation.

Table 3: Examples of contextually dependent pronunciations of some Myanmar words
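One simple way to surface candidate context-dependent syllables is to collect every pronunciation a syllable receives across dictionary entries and keep those with more than one. The Python sketch below does this on three (syllable-segmented word, pronunciation) pairs copied from the sample entries in the Dictionary Format section. It is a heuristic with hypothetical function names, not the authors' method, and a syllable flagged this way may also reflect a lexical exception rather than a true context effect.

```python
from collections import defaultdict

# (syllable-segmented word, pronunciation) pairs copied from the sample entries.
SAMPLE = [
    ("သု တ", "thu. ta."),
    ("သု တိ", "thu. ti."),
    ("သု ဓာ ဘုတ်", "thou' da bou'"),
]


def pronunciation_variants(pairs):
    """Map each syllable to the set of pronunciations observed for it."""
    variants = defaultdict(set)
    for syllables, phonemes in pairs:
        syllable_list = syllables.split()
        phoneme_list = phonemes.split()
        if len(syllable_list) != len(phoneme_list):
            continue  # skip entries that do not align one-to-one
        for syllable, phoneme in zip(syllable_list, phoneme_list):
            variants[syllable].add(phoneme)
    return variants


def context_dependent_candidates(pairs):
    """Keep only syllables that were seen with more than one pronunciation."""
    return {
        syllable: sorted(found)
        for syllable, found in pronunciation_variants(pairs).items()
        if len(found) > 1
    }


print(context_dependent_candidates(SAMPLE))  # {'သု': ["thou'", 'thu.']}
```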

Dictionary Format

The dictionary is distributed as a plain text file with one entry per line, in the following format:

Word-ID<TAB>Word<TAB>Syllable-Breaked-Word<TAB>Pronunciation<TAB>IPA

Example:

19663	သုတ	သု တ	thu. ta.	θṵ ta̰
19664	သုတစာပေ	သု တ စာ ပေ	thu. ta. sa pei	θṵ ta̰ sà pè
19665	သုတိ	သု တိ	thu. ti.	θṵ tḭ
19666	သုတေသန	သု တေ သ န	thu. tei tha- na.	θṵ tè θə na̰
19667	သုတေသီ	သု တေ သီ	thu. tei thi	θṵ tè θì
19668	သုဓမ္မာဇရပ်	သု ဓမ် မာ ဇ ရပ်	thu. da- ma za- ja'	θṵ də mà zə jaʔ
19669	သုဓာဘုတ်	သု ဓာ ဘုတ်	thou' da bou'	θoʊʔ dà boʊʔ
19670	သုနာပရန္တတိုင်း	သု နာ ပ ရန် တ တိုင်း	thu. na pa- ran ta. tain:	θṵ nà pə ɹàɴ ta̰ táɪɴ
19671	သုဘရာဇာ	သု ဘ ရာ ဇာ	thu. ba. ja za	θṵ ba̰ jà zà
19672	သုမင်္ဂလ	သု မင် ဂ လ	thu. min ga- la.	θṵ mɪ̀ɴ ɡə la̰
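A minimal Python sketch for reading one line of this format is shown below. The `G2PEntry` class and `parse_entry` function are hypothetical names introduced for illustration; the sketch assumes exactly five tab-separated fields, with syllables, pronunciation symbols and IPA symbols separated by spaces, as in the sample entries above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class G2PEntry:
    word_id: int
    word: str
    syllables: List[str]  # syllable-segmented word
    phonemes: List[str]   # myG2P pronunciation symbols, one per syllable
    ipa: List[str]        # IPA symbols, one per syllable


def parse_entry(line: str) -> G2PEntry:
    """Parse one line: Word-ID, Word, syllables, pronunciation, IPA."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}")
    word_id, word, syllables, phonemes, ipa = fields
    return G2PEntry(int(word_id), word, syllables.split(), phonemes.split(), ipa.split())


line = "19666\tသုတေသန\tသု တေ သ န\tthu. tei tha- na.\tθṵ tè θə na̰"
entry = parse_entry(line)
print(entry.word_id, entry.syllables, entry.phonemes)
```

Keeping the three per-syllable columns as lists makes it easy to check that they have the same length, which is a quick sanity test when loading the full file.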

Versions

Version 1.0, Release Date: May 30, 2017
Version 1.1, Release Date: Feb 25, 2019
Version 2.0, Release Date: Feb 15, 2021

Development and Support

The following people contributed to the development of the myG2P dictionary:

myG2P (Version 1.0)

  • Win Pa Pa
  • Ye Kyaw Thu

myG2P (Version 2.0)

  • Honey Htun (Ph.D. Candidate, Yangon Technological University, Myanmar)
  • Ni Htwe Aung (Ph.D. Candidate, Yangon Technological University, Myanmar)
  • Shwe Sin Moe (Master's student, Yangon Technological University, Myanmar)
  • Wint Theingi (Master's student, Yangon Technological University, Myanmar)
  • Ye Kyaw Thu (National Electronics and Computer Technology Center, Thailand)

Acknowledgement

We would like to express our gratitude to Ms. Aye Mya Hlaing and Ms. Hay Mar Soe Naing for checking the G2P mappings. We would also like to thank our NICT colleagues, especially Dr. Jinfu Ni and Dr. Yoshinori Shiga, for their valuable suggestions on myG2P development.

To Do

- Add new Myanmar words from various domains

Publication

Ye Kyaw Thu, Win Pa Pa, Andrew Finch, Aye Mya Hlaing, Hay Mar Soe Naing, Eiichiro Sumita and Chiori Hori, "Syllable Pronunciation Features for Myanmar Grapheme to Phoneme Conversion", In Proceedings of the 13th International Conference on Computer Applications (ICCA 2015), February 5-6, 2015, Yangon, Myanmar, pp. 161-167. [Best Paper Award]

Ye Kyaw Thu, Win Pa Pa, Andrew Finch, Jinfu Ni, Eiichiro Sumita and Chiori Hori, "The Application of Phrase Based Statistical Machine Translation Techniques to Myanmar Grapheme to Phoneme Conversion", In Proceedings of the Pacific Association for Computational Linguistics Conference (PACLING 2015), May 19-21, 2015, Legian, Bali, Indonesia, pp. 170-176. (A revised version was published in Springer Communications in Computer and Information Science (CCIS), ISSN: 1865-0929, pp. 238-250.)
☝️ For this PACLING 2015 conference paper, we used the myG2P dictionary plus 5,276 sentences extracted from the BTEC corpus.

Ye Kyaw Thu, Win Pa Pa, Yoshinori Sagisaka and Naoto Iwahashi, "Comparison of Grapheme-to-Phoneme Conversion Methods on a Myanmar Pronunciation Dictionary", In Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP), COLING 2016, December 11-17, 2016, Osaka, Japan, pp. 11-22.

Workshop Presentation

Title: Grapheme-to-IPA Phoneme Conversion for Burmese (myG2P Version 2.0)
Workshop: the 2nd joint Workshop on NLP/AI R&D, iSAI-NLP 2020, Bangkok, Thailand.
Authors: Honey Htun (YTU, Myanmar), Ni Htwe Aung (YTU, Myanmar), Shwe Sin Moe (YTU, Myanmar), Wint Theingi (YTU, Myanmar), Nyein Nyein Oo (YTU, Myanmar), Thepchai Supnithi (NECTEC, Thailand) and Ye Kyaw Thu (NECTEC, Thailand)

Journal Paper

to appear

Reference

  1. Myanmar-English Dictionary (1993), Department of the Myanmar Language Commission, Ministry of Education, Union of Myanmar.
  2. https://en.wikipedia.org/wiki/International_Phonetic_Alphabet