A Grapheme to Phoneme Based Text to Speech Conversion Technique in Unicode Language
DOI:
https://doi.org/10.56294/dm2023191
Keywords:
CNN, Grapheme to Phoneme, NLP, TTS Conversion, Unicode
Abstract
Text-to-speech (TTS) conversion can follow two approaches: a dictionary-based (database) approach and grapheme-to-phoneme (G2P) mapping. A drawback of the dictionary-based approach is that its performance depends on the size of the dictionary or database. For domain-specific conversion, a simple rule-based technique plays pre-recorded audio for each equivalent token. This is easy to design, but it is limited by the mapping to the sound database and by the availability of the audio files in that database. G2P conversion, in contrast, can be used in any domain. Its advantages are the small database it requires, ease of mapping, and compatibility with the target domain. However, G2P suffers from pronunciation ambiguity when forming the audio output. This paper discusses grapheme-to-phoneme mapping and its application in a text-to-speech conversion system. Assamese (a scheduled language of India, encoded in Unicode) is used as the experimental language, and its performance is analyzed against another Unicode language, Hindi. English (ASCII) is used as a benchmark for comparison with the target language.
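The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not the authors' system: the audio table, the file name, the tiny Hindi grapheme table, and the function names `dictionary_lookup` and `naive_g2p` are all hypothetical, chosen only for illustration. The sketch also assumes that one source of the pronunciation ambiguity mentioned above is the inherent vowel (schwa) of Devanagari consonants, which a naive letter-by-letter mapping always emits even where speakers drop it.

```python
# Toy resources: every entry here is an illustrative assumption,
# not data taken from the paper.
AUDIO_DB = {"नमक": "namak.wav"}  # dictionary approach: token -> pre-recorded file

CONSONANTS = {"क": "k", "म": "m", "ल": "l", "न": "n"}
VOWEL_SIGNS = {"ा": "aː", "ि": "ɪ", "ी": "iː"}
VIRAMA = "्"   # suppresses the inherent vowel of the preceding consonant
SCHWA = "ə"    # inherent vowel assumed after a bare consonant


def dictionary_lookup(token):
    """Return the audio file for a token, or None if the database lacks it."""
    return AUDIO_DB.get(token)


def naive_g2p(word):
    """Map each grapheme to a phoneme, always keeping the inherent schwa."""
    phonemes = []
    pending_schwa = False  # True while the last consonant still owes its schwa
    for ch in word:
        if ch in CONSONANTS:
            if pending_schwa:
                phonemes.append(SCHWA)
            phonemes.append(CONSONANTS[ch])
            pending_schwa = True
        elif ch in VOWEL_SIGNS:
            # An explicit vowel sign replaces the inherent schwa.
            phonemes.append(VOWEL_SIGNS[ch])
            pending_schwa = False
        elif ch == VIRAMA:
            pending_schwa = False
        else:
            raise ValueError(f"grapheme not in toy table: {ch!r}")
    if pending_schwa:
        phonemes.append(SCHWA)
    return phonemes


if __name__ == "__main__":
    word = "कमला"
    print(dictionary_lookup(word))  # None: the word is not in the audio database
    print(naive_g2p(word))          # ['k', 'ə', 'm', 'ə', 'l', 'aː']
```

The lookup fails for any token missing from the database, which is the coverage limitation of the dictionary approach. The G2P function handles the unseen word with a small table, but its output keeps a schwa after म that is usually dropped in spoken Hindi, so a practical system would need additional rules or a learned model to resolve such cases.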
License
Copyright (c) 2023 Chandamita Nath, Bhairab Sarma (Author)
This work is licensed under a Creative Commons Attribution 4.0 International License.
The article is distributed under the Creative Commons Attribution 4.0 License. Unless otherwise stated, associated published material is distributed under the same licence.