International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 630
Text to Speech Synthesis for Hindi Language using Festival Framework.
Mrs. Mangal Joshi1, Samridhi Agarwal2, Shabnam Shaikh3, Priya Pitale4
1Professor, Dept. of Electronics and Telecommunication (E&TC) Engineering, Cummins College of Engineering,
Maharashtra, India
2,3,4Student, Dept. of E&TC Engineering, Cummins College of Engineering, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - This paper describes the development of a complete
system that takes a text file as input from the user and gives
output in audio form. Many text-to-speech synthesizers exist,
but very few have the regional tone and soothing, natural-sounding
voices. Two of the motives are to give visually impaired people
a way to hear data instead of reading it, and to give people
with speech impairments a surrogate voice through a speech
synthesizer. Syllable units are chosen mainly because Indian
languages are syllable-centric in nature. Festival is a platform
that serves as a text-to-speech synthesizer for many languages.
The system uses the Festival software together with a syllable
segmentation method. Extraction of syllables and their
concatenation constitute the process of converting Hindi
text into speech.
Key Words: Hindi TTS (Text to Speech), Syllable
segmentation, festival, speech synthesis, Hindi
Language, etc.
1. INTRODUCTION
Speech research aims to build systems that have
human-like capabilities in generating, understanding and
encoding speech for a range of human-machine
interactions. Speech is one of the primary media of
communication, so it is natural for human beings to expect to
communicate with computing devices in spoken form.
Hindi is an Indo-Aryan language spoken by 545 million
people, of whom 425 million are native speakers. Hindi has 13
vowels and 33 consonants, and it is spoken using
combinations of both.
Text-to-Speech synthesis has the potential to make ICT
(Information and Communication Technology) based
services accessible to people, which is highly beneficial.
However, good-quality Hindi TTS systems that are practical
to use do not yet exist: none of the existing systems is of
a quality comparable to TTS systems for languages like
English, German and French. The main reason is that
developing a TTS system for a new language like Hindi
requires inputs for resolving language-specific issues. We
chose the Festival framework for developing Hindi TTS. As
Festival does not provide complete language processing
support for some languages, it needs to be augmented to
facilitate the development of TTS systems in certain new
languages.
This syllable-based TTS system uses a concatenative speech
processing technique. It will be a boon for Hindi speakers
if the user interface of the computer is in Hindi, and in
spoken form at that. One of the most important applications
of a text-to-speech converter is a natural language interface
for the user. The synthesizer can act as an automatic text
reader for blind or specially-abled people. Another important
application is reading web pages, emails and newspapers
aloud. In previous research from 2016 [1], the concatenative
method for speech synthesis was used, and the pitch
frequencies of the sound signals were extracted by a cepstral
pitch detection algorithm in a noiseless environment. In the
same year [2], a system was developed that accepted written
text in any language of the Devanagari script via an MS Word
utility through MATLAB. The text was converted into Romanized
script during text analysis, and tokenization was used to map
each token to its respective phoneme.
1.1 Text to Speech Conversion
Text-to-speech synthesis has two stages:
1.) Training phase: Hindi words are segmented into syllable-sized
units using segmentation techniques. Each segment is
given a unique label. An audio file database of the uniquely
labeled segments is then provided to the controller.
2.) Synthesis phase: The text file to be synthesized is
imported into the program. Here, the controller is given the
logic for segmenting the Hindi words into syllables.
The syllables are matched against the database provided
in the training phase. The word is then built by concatenating
the matching units from the dictionary (database). The speech
is output through audio amplifiers and speakers.
Characteristics such as fluency, softness and accuracy will be
tested. The speech is expected to match the input provided
accurately.
Fig. -1: Basic TTS system
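The two phases described above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the splitting rule is a simplified orthographic approximation of syllable segmentation (it groups each consonant cluster with its vowel sign and modifiers), and the names `split_aksharas`, `synthesize` and `unit_db` are hypothetical. Audio is represented as plain lists of sample values so that the example stays self-contained.

```python
# Devanagari code points (Unicode block U+0900 to U+097F).
VIRAMA = "\u094D"                                   # halant, joins consonants
NUKTA = "\u093C"
MATRAS = {chr(c) for c in range(0x093E, 0x094D)}    # dependent vowel signs
MODIFIERS = {"\u0901", "\u0902", "\u0903"}          # chandrabindu, anusvara, visarga


def is_consonant(ch):
    """True for Devanagari consonant letters."""
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def split_aksharas(word):
    """Split one word into syllable-like orthographic units (simplified)."""
    units = []
    current = ""
    for ch in word:
        if ch in MATRAS or ch in MODIFIERS or ch in (VIRAMA, NUKTA):
            current += ch                 # sign attaches to the current unit
        elif is_consonant(ch) and current.endswith(VIRAMA):
            current += ch                 # consonant continues a cluster
        else:
            if current:
                units.append(current)
            current = ch                  # start a new unit
    if current:
        units.append(current)
    return units


def synthesize(word, unit_db):
    """Concatenate the recorded samples of each unit found in unit_db."""
    samples = []
    for unit in split_aksharas(word):
        if unit not in unit_db:
            raise KeyError(f"no recording for unit {unit!r}")
        samples.extend(unit_db[unit])
    return samples
```

In this sketch the training phase corresponds to filling `unit_db` with one labeled recording per unit, and the synthesis phase corresponds to calling `synthesize` on each word of the input text.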
1.2 Festival Framework
The Festival Speech Synthesis System is a general multilingual
speech synthesis system developed by Alan W. Black at the
Centre for Speech Technology Research (CSTR) at the
University of Edinburgh. Festival is designed to support
multiple languages; it comes with support for English
(British and American pronunciation), Welsh and Spanish.
Voice packages exist for several other languages, such as
Finnish, Italian, Polish and Russian. The Festival speech
tools include festival-2.4, festvox-2.1, speechtools-2.4 and
festlexCMU. As Festival does not provide complete language
processing support for some languages, it needs to be
augmented to facilitate the development of TTS systems
in certain new languages. For this application, we used
Festvox as our basic framework. Festvox 2.7.0 is an
open-source text-to-speech architecture.
2. METHODOLOGY
The Festival toolbox operates through a command-line interface
on the Linux operating system. To set up Festvox on Linux:
1. Obtain Festvox version 2.7.0 from the Festival software
website. Create a directory dedicated to the software. Keep
all the setup files in the same folder (festival-2.4-release.tar,
speech_tools-2.4-release.tar, festvox-2.7.0-release.tar,
festlex_OALD.tar.gz and festlex_POSLEX.tar).
2. Unpack all the tar.gz files using the tar command.
3. Compile the speech tools by following the steps in the
installation guide, using the gmake test and gmake install
commands.
4. Compile Festival by following the installation guide in
the festival folder (commands: ./configure, gmake test,
gmake install).
5. After Festival has been installed successfully, compile
Festvox.
6. To run Festival, export the three variables PATH,
FESTVOXDIR and ESTDIR.
7. Next, execute the festival command to enter the Festival
shell.
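Step 6 can be illustrated with a short Python helper that prepares the three variables before Festival is launched. This is only a sketch: the directory layout under `root` (`speech_tools`, `festvox`, `festival/bin`) and the function name `festival_env` are assumptions for illustration, and the actual paths depend on where the archives were unpacked and built.

```python
import os


def festival_env(root, base_env=None):
    """Return a copy of the environment with PATH, FESTVOXDIR and ESTDIR set.

    `root` is the (assumed) directory that holds the unpacked and compiled
    speech_tools, festvox and festival folders.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ESTDIR"] = os.path.join(root, "speech_tools")
    env["FESTVOXDIR"] = os.path.join(root, "festvox")
    # Put the festival binary directory first so that `festival` resolves to it.
    festival_bin = os.path.join(root, "festival", "bin")
    env["PATH"] = festival_bin + os.pathsep + env.get("PATH", "")
    return env
```

Passing the returned mapping as `env=` to `subprocess.run(["festival"], env=env)` would then start the Festival shell of step 7, provided the tools were actually built under `root`.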
To introduce a new language voice in Festival, or to build
your own Festival-based voice, a template must be designed.
1. First, create a directory inside the Festival folder to
hold the voice.
2. Build the basic structure of the new voice to be added.
Define the phone set for the language, which is the set of
symbols together with a definition of whether each one is a
consonant or a vowel. The places of the vowels with respect
to each nasal sound are identified and specified. When the
basic template is built, a Scheme file (.scm) for the phone
set is created, in which these definitions are made.
Corresponding Scheme files (.scm) are created for each
parameter considered for the Hindi language, with the aim of
generating more natural-sounding audio output.
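The idea of a phone set, a table of symbols with features such as vowel or consonant and nasal or non-nasal, can be sketched in Python as below. In Festival the phone set itself is written in a Scheme file; this snippet only models the same information as data. The Romanized symbols and the small selection of phones are hypothetical and do not reproduce the phone set used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phone:
    symbol: str             # hypothetical Romanized label
    is_vowel: bool
    is_nasal: bool = False


# A tiny illustrative subset, not a complete Hindi phone set.
PHONE_SET = {
    p.symbol: p
    for p in [
        Phone("a", True),
        Phone("aa", True),
        Phone("i", True),
        Phone("k", False),
        Phone("m", False, is_nasal=True),
        Phone("n", False, is_nasal=True),
    ]
}


def vowels(phone_set):
    """Symbols of all vowel phones, sorted alphabetically."""
    return sorted(s for s, p in phone_set.items() if p.is_vowel)


def nasals(phone_set):
    """Symbols of all nasal phones, sorted alphabetically."""
    return sorted(s for s, p in phone_set.items() if p.is_nasal)
```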
The database is created by recording Hindi words and
labeling them according to their contents. All the
recorded .wav files are stored under one directory [3]. Phone
synthesis can be tested after this step, and any labeling
errors that are present can be corrected. Pronunciation can
be defined using a large database (lexicon) or using
letter-to-sound rules. Festival uses a lexicon structure for
pronunciation. After adding the various intonation models
(.scm files), the basic synthesizer can be tested in Festival.
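The two ways of defining pronunciation mentioned above, a lexicon and letter-to-sound rules, are often combined: the lexicon is consulted first and the rules serve as a fallback. The Python sketch below shows that combination on Romanized input. It is an assumption-laden illustration, not Festival's lexicon format: the entries, the rule table and the name `pronounce` are all hypothetical.

```python
# Hypothetical lexicon: word -> list of phone symbols.
LEXICON = {
    "namaste": ["n", "a", "m", "a", "s", "t", "e"],
}

# Hypothetical letter-to-sound rules; two-letter rules take priority.
LTS_RULES = {
    "kh": "kh", "aa": "aa",
    "k": "k", "h": "h", "n": "n", "a": "a",
}


def pronounce(word, lexicon, lts_rules):
    """Look the word up in the lexicon, else apply longest-match rules."""
    if word in lexicon:
        return list(lexicon[word])
    phones = []
    i = 0
    while i < len(word):
        for length in (2, 1):                  # try the longer rule first
            chunk = word[i:i + length]
            if len(chunk) == length and chunk in lts_rules:
                phones.append(lts_rules[chunk])
                i += length
                break
        else:
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phones
```

For example, `pronounce("khaana", LEXICON, LTS_RULES)` is not in the lexicon and is resolved by the rules, while `"namaste"` is returned directly from the lexicon.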
3. CHALLENGES IN TEXT TO SPEECH SYNTHESIS
Text-to-speech synthesis is the process of converting text
input into audio form. TTS synthesizers are available for
many languages, and a good deal of software is also
available for Indian languages. Text-to-speech conversion is
highly language-dependent. Thus, it is simple for some
languages and complicated for others. A large set of
rules and their exceptions is needed to produce
correct pronunciation and prosody for synthesized speech.
Text processing is the initial stage of converting text into
speech. Challenges in text processing arise when reading
numbers, units, abbreviations, dates, special characters
and symbols. For example, the number 1990 can be read
as "nineteen ninety" if it is a year and as "one thousand nine
hundred ninety" if it is a quantity. Likewise, "Kg." should be
read as "kilograms". To process such text accurately, a very
large set of rules needs to be applied. Once the text is
processed, another difficult task is to produce the correct
pronunciation for each word, which requires a large database.
The amount of stress applied to a particular word in a
sentence, the proper duration of its pronunciation and the
correct intonation also need to be taken into consideration.
Timing at the sentence level, or grouping words into phrases
correctly, is difficult because prosodic phrasing is not
always marked in the text by punctuation, and phrasal
accentuation is almost never marked.
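The 1990 and "Kg." examples above can be made concrete with a small text-normalization sketch in Python. It uses English words to mirror the examples in the text; a Hindi system would need its own number words and abbreviation table. The function names and the abbreviation table are hypothetical, `cardinal` covers 0 to 9999 only, and `year` is a simplification intended for four-digit years such as 1990.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
ABBREVIATIONS = {"Kg.": "kilograms"}   # illustrative table with one entry


def two_digits(n):
    """Words for 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def cardinal(n):
    """Words for 0..9999 read as a quantity."""
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    parts = []
    if thousands:
        parts.append(f"{ONES[thousands]} thousand")
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if rest or not parts:
        parts.append(two_digits(rest))
    return " ".join(parts)


def year(n):
    """Words for a four-digit year read as two two-digit halves."""
    high, low = divmod(n, 100)
    if low == 0:
        return f"{two_digits(high)} hundred"
    return f"{two_digits(high)} {two_digits(low)}"


def normalize(token, is_year=False):
    """Expand one token; deciding is_year from context is left to the caller."""
    if token in ABBREVIATIONS:
        return ABBREVIATIONS[token]
    if token.isdigit():
        return year(int(token)) if is_year else cardinal(int(token))
    return token
```

The hard part in practice is deciding from context whether a token is a year, which this sketch leaves to the caller through the `is_year` flag.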
Most languages have some special features which
make the development of speech synthesis either easier or
more difficult. Letter-to-sound rules also differ from
language to language. A lot of work has already been done on
these parameters to improve the synthesized speech output.
For some synthesizers the output is quite good, but the
naturalness of the speech still needs to be improved. The
synthesized speech sounds machine-generated rather
than human-like. This may become irritating after a while for
a person who is using TTS to read a book or information
from the internet. The proposed work focuses on
improving the naturalness of Hindi TTS output.
4. CONCLUSION
Speech synthesis systems are used in various applications.
These systems can be useful as an assistant for visually
impaired people and are currently used in many educational
institutes as a learning aid for children. Speech synthesis
and recognition techniques are also used by many research
companies and in web browsers. Speech synthesizers for many
languages have been developed using the Festival framework.
Here we are mainly working on the Hindi language. We are
considering a few parameters to improve the naturalness of
the utterances created. A parameter such as schwa deletion [4]
can be considered to make the synthesized voice more
human-like. To include these parameters for natural
sound, Scheme (.scm) files are created.
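As an illustration of the schwa deletion parameter mentioned above, the Python sketch below handles only the simplest case: dropping the inherent vowel at the end of a word of more than one syllable. Schwa deletion in Hindi also applies in word-medial positions and has exceptions, none of which are covered here. The phone symbols and the name `delete_final_schwa` are hypothetical, and this is not the method of [4].

```python
SCHWA = "a"   # hypothetical label for the inherent vowel
VOWELS = {"a", "aa", "i", "ii", "u", "uu", "e", "ai", "o", "au"}


def delete_final_schwa(phones):
    """Drop a word-final inherent vowel that follows a consonant.

    Words with a single vowel are left unchanged so that they
    keep their only syllable.
    """
    vowel_count = sum(1 for p in phones if p in VOWELS)
    if (len(phones) >= 2 and phones[-1] == SCHWA
            and phones[-2] not in VOWELS and vowel_count > 1):
        return list(phones[:-1])
    return list(phones)
```

With these assumptions, the phone sequence `k a m a l a` becomes `k a m a l`, while the one-syllable sequence `n a` is returned unchanged.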
REFERENCES
[1] G. D. Ramteke, R. J. Ramteke, “Hindi Spoken Signals
for Speech Synthesizer”, 2nd International
Conference on Next Generation Computing
Technologies (NGCT-2016), 2016.
[2] Shilpi Kannojia, Ghanpriya Singh, Dr. Sanjay Mathur,
“A Text to Speech Synthesizer using Acoustic Unit
based Concatenation for any Indian Language of
Devanagari Script”, 11th International Conference
on Industrial and Information Systems, 2016.
[3] Somnath Roy, “A Technical Guide to Concatenative
Speech Synthesis for Hindi using Festival”,
International Journal of Computer Applications
(0975 – 8887) Volume 86 – No 8, January 2014.
[4] Kalika Bali, Partha Pratim Talukdar, N. Sridhar
Krishna, A. G. Ramakrishnan, “Tools for the
Development of a Hindi Speech Synthesis System”.

More Related Content

PDF
Implementation of Marathi Language Speech Databases for Large Dictionary
iosrjce
 
PDF
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET Journal
 
PDF
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
ijnlc
 
PPTX
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
University of Southern Denmark
 
PDF
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...
kevig
 
PDF
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...
kevig
 
PDF
K AMBA P ART O F S PEECH T AGGER U SING M EMORY B ASED A PPROACH
ijnlc
 
PDF
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATION
kevig
 
Implementation of Marathi Language Speech Databases for Large Dictionary
iosrjce
 
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET Journal
 
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
ijnlc
 
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
University of Southern Denmark
 
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...
kevig
 
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...
kevig
 
K AMBA P ART O F S PEECH T AGGER U SING M EMORY B ASED A PPROACH
ijnlc
 
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATION
kevig
 

What's hot (18)

PDF
EXTRACTING LINGUISTIC SPEECH PATTERNS OF JAPANESE FICTIONAL CHARACTERS USING ...
kevig
 
PDF
Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
IOSR Journals
 
PDF
551 466-472
idescitation
 
PDF
Ey4301913917
IJERA Editor
 
PDF
B034205010
inventionjournals
 
PDF
Speech Recognition Application for the Speech Impaired using the Android-base...
TELKOMNIKA JOURNAL
 
PDF
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
IJERA Editor
 
PDF
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ijnlc
 
PDF
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
ijnlc
 
PDF
Marathi Isolated Word Recognition System using MFCC and DTW Features
IDES Editor
 
PDF
An Application for Performing Real Time Speech Translation in Mobile Environment
Association of Scientists, Developers and Faculties
 
PDF
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
cscpconf
 
PDF
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
kevig
 
PDF
Efficient Speech Emotion Recognition using SVM and Decision Trees
IRJET Journal
 
PDF
Hindi digits recognition system on speech data collected in different natural...
csandit
 
PDF
Development of text to speech system for yoruba language
Alexander Decker
 
DOC
B tech project_report
abhiuaikey
 
PDF
IRJET - Storytelling App for Children with Hearing Impairment using Natur...
IRJET Journal
 
EXTRACTING LINGUISTIC SPEECH PATTERNS OF JAPANESE FICTIONAL CHARACTERS USING ...
kevig
 
Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
IOSR Journals
 
551 466-472
idescitation
 
Ey4301913917
IJERA Editor
 
B034205010
inventionjournals
 
Speech Recognition Application for the Speech Impaired using the Android-base...
TELKOMNIKA JOURNAL
 
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
IJERA Editor
 
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ijnlc
 
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
ijnlc
 
Marathi Isolated Word Recognition System using MFCC and DTW Features
IDES Editor
 
An Application for Performing Real Time Speech Translation in Mobile Environment
Association of Scientists, Developers and Faculties
 
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
cscpconf
 
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
kevig
 
Efficient Speech Emotion Recognition using SVM and Decision Trees
IRJET Journal
 
Hindi digits recognition system on speech data collected in different natural...
csandit
 
Development of text to speech system for yoruba language
Alexander Decker
 
B tech project_report
abhiuaikey
 
IRJET - Storytelling App for Children with Hearing Impairment using Natur...
IRJET Journal
 
Ad

Similar to IRJET- Text to Speech Synthesis for Hindi Language using Festival Framework (20)

PDF
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
IJERA Editor
 
PPTX
visH (fin).pptx
tefflontrolegdy
 
PDF
Intern Presentation
Apurva Singh
 
PDF
Implementation of Text To Speech for Marathi Language Using Transcriptions Co...
IJERA Editor
 
DOC
12EEE032- text 2 voice
Nsaroj kumar
 
PDF
A Short Introduction To Text-To-Speech Synthesis
Cynthia King
 
PDF
Ceis 2
Alexander Decker
 
PDF
Tutorial - Speech Synthesis System
IJERA Editor
 
PDF
Speech to text conversion for visually impaired person using µ law companding
iosrjce
 
PDF
H010625862
IOSR Journals
 
PDF
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
iosrjce
 
PDF
The main-principles-of-text-to-speech-synthesis-system
Cemal Ardil
 
PDF
G1803013542
IOSR Journals
 
PDF
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
ravi sharma
 
PPT
Improvement in Quality of Speech associated with Braille codes - A Review
inscit2006
 
PPTX
SAP (SPEECH AND AUDIO PROCESSING)
dineshkatta4
 
PDF
A Marathi Hidden-Markov Model Based Speech Synthesis System
iosrjce
 
PPTX
Speech Synthesis.pptx
Subramanian Mani
 
PPT
Gujarati Text-to-Speech Presentation
samyakbhuta
 
PDF
An expert system for automatic reading of a text written in standard arabic
ijnlc
 
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
IJERA Editor
 
visH (fin).pptx
tefflontrolegdy
 
Intern Presentation
Apurva Singh
 
Implementation of Text To Speech for Marathi Language Using Transcriptions Co...
IJERA Editor
 
12EEE032- text 2 voice
Nsaroj kumar
 
A Short Introduction To Text-To-Speech Synthesis
Cynthia King
 
Tutorial - Speech Synthesis System
IJERA Editor
 
Speech to text conversion for visually impaired person using µ law companding
iosrjce
 
H010625862
IOSR Journals
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
iosrjce
 
The main-principles-of-text-to-speech-synthesis-system
Cemal Ardil
 
G1803013542
IOSR Journals
 
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
ravi sharma
 
Improvement in Quality of Speech associated with Braille codes - A Review
inscit2006
 
SAP (SPEECH AND AUDIO PROCESSING)
dineshkatta4
 
A Marathi Hidden-Markov Model Based Speech Synthesis System
iosrjce
 
Speech Synthesis.pptx
Subramanian Mani
 
Gujarati Text-to-Speech Presentation
samyakbhuta
 
An expert system for automatic reading of a text written in standard arabic
ijnlc
 
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
IRJET Journal
 
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
PDF
Kiona – A Smart Society Automation Project
IRJET Journal
 
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
PDF
Breast Cancer Detection using Computer Vision
IRJET Journal
 
PDF
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
PDF
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
PDF
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
PDF
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
IRJET Journal
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
Kiona – A Smart Society Automation Project
IRJET Journal
 
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
IRJET Journal
 
BRAIN TUMOUR DETECTION AND CLASSIFICATION
IRJET Journal
 
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
IRJET Journal
 
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
IRJET Journal
 
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
IRJET Journal
 
Breast Cancer Detection using Computer Vision
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
IRJET Journal
 
Auto-Charging E-Vehicle with its battery Management.
IRJET Journal
 
Analysis of high energy charge particle in the Heliosphere
IRJET Journal
 
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 

Recently uploaded (20)

PDF
CAD-CAM U-1 Combined Notes_57761226_2025_04_22_14_40.pdf
shailendrapratap2002
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PPTX
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
PPTX
FUNDAMENTALS OF ELECTRIC VEHICLES UNIT-1
MikkiliSuresh
 
PDF
Unit I Part II.pdf : Security Fundamentals
Dr. Madhuri Jawale
 
PDF
EVS+PRESENTATIONS EVS+PRESENTATIONS like
saiyedaqib429
 
PDF
Packaging Tips for Stainless Steel Tubes and Pipes
heavymetalsandtubes
 
PDF
Natural_Language_processing_Unit_I_notes.pdf
sanguleumeshit
 
PDF
The Effect of Artifact Removal from EEG Signals on the Detection of Epileptic...
Partho Prosad
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PPTX
MT Chapter 1.pptx- Magnetic particle testing
ABCAnyBodyCanRelax
 
DOCX
SAR - EEEfdfdsdasdsdasdasdasdasdasdasdasda.docx
Kanimozhi676285
 
PDF
All chapters of Strength of materials.ppt
girmabiniyam1234
 
PPTX
business incubation centre aaaaaaaaaaaaaa
hodeeesite4
 
PPTX
IoT_Smart_Agriculture_Presentations.pptx
poojakumari696707
 
PDF
settlement FOR FOUNDATION ENGINEERS.pdf
Endalkazene
 
PDF
Zero Carbon Building Performance standard
BassemOsman1
 
PPTX
Module2 Data Base Design- ER and NF.pptx
gomathisankariv2
 
PPTX
Information Retrieval and Extraction - Module 7
premSankar19
 
PPTX
MSME 4.0 Template idea hackathon pdf to understand
alaudeenaarish
 
CAD-CAM U-1 Combined Notes_57761226_2025_04_22_14_40.pdf
shailendrapratap2002
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
FUNDAMENTALS OF ELECTRIC VEHICLES UNIT-1
MikkiliSuresh
 
Unit I Part II.pdf : Security Fundamentals
Dr. Madhuri Jawale
 
EVS+PRESENTATIONS EVS+PRESENTATIONS like
saiyedaqib429
 
Packaging Tips for Stainless Steel Tubes and Pipes
heavymetalsandtubes
 
Natural_Language_processing_Unit_I_notes.pdf
sanguleumeshit
 
The Effect of Artifact Removal from EEG Signals on the Detection of Epileptic...
Partho Prosad
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
MT Chapter 1.pptx- Magnetic particle testing
ABCAnyBodyCanRelax
 
SAR - EEEfdfdsdasdsdasdasdasdasdasdasdasda.docx
Kanimozhi676285
 
All chapters of Strength of materials.ppt
girmabiniyam1234
 
business incubation centre aaaaaaaaaaaaaa
hodeeesite4
 
IoT_Smart_Agriculture_Presentations.pptx
poojakumari696707
 
settlement FOR FOUNDATION ENGINEERS.pdf
Endalkazene
 
Zero Carbon Building Performance standard
BassemOsman1
 
Module2 Data Base Design- ER and NF.pptx
gomathisankariv2
 
Information Retrieval and Extraction - Module 7
premSankar19
 
MSME 4.0 Template idea hackathon pdf to understand
alaudeenaarish
 

IRJET- Text to Speech Synthesis for Hindi Language using Festival Framework

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 04 | Apr 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 630 Text to Speech Synthesis for Hindi Language using Festival Framework. Mrs. Mangal Joshi1, Samridhi Agarwal2, Shabnam Shaikh3, Priya Pitale4 1Professor, Dept. of Electronics and Telecommunication (E&TC) Engineering, Cummins College of Engineering, Maharashtra, India 2,3,4Student, Dept. of E&TC Engineering, Cummins College of Engineering, Maharashtra, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - The paper is based on developing a complete system that takes text file as input from the user and gives output in audio form. There exist many text to speech synthesizer but very few have theregionaltoningandsoothing voices that are natural sounding. A visually impaired person using speech synthesis as a platform to hear the data instead of reading and a tongue-tied person’s ability to express through speech synthesizer as a surrogate voice is one of the motives. Syllable units are chosen mainly because Indian languages are syllable-centric in nature. Festival is aplatform which serves as text to speech synthesizerformanylanguages. The system uses the festival softwarewhichis basedonsyllable segmentation method. Extraction of syllables and the concatenation constitute to the process of converting Hindi text into speech form. Key Words: Hindi TTS (Text to Speech), Syllable segmentation, festival, speech synthesis, Hindi Language, etc. 1. INTRODUCTION Speech research aims to build the systems that have human-like capabilities in generating, understanding and encoding speech for the range of machine to human interactions. 
Speech is one the primary medium for communication, so it is natural for human being to expect to have communication in spoken form with computerdevices. Hindi is an Indo-Aryan language spoken by 545 million people, 425 million of them arenativespeakers.Hindihas13 vowels and 33 consonants and it is spoken using a combination of both. Text-to-Speech synthesis has the potential to make ICT (Information and Communication Technology) based services accessible to people which is very beneficial. However, good quality Hindi TTS systems that can be used potentially are not yet existing. None of the existing TTS systems are of a quality that can be compared to TTS systems in languages like English, German and French. The main reason for this is to develop a TTS system in a new language like Hindi needs inputs for resolving language- specific issues. We are choosing the Festival framework for developing Hindi TTS. As Festival does not provide the complete language processing support specific to some languages, there is a need for augmentation to facilitate the development of TTS systems in certain new languages. This syllable-based TTS system aims to work using concatenative speech processing technique. It will be boon for Hindi speakers if the user interfaces withthecomputer is in Hindi and that too in the form of speech. One of the greatest applications of text to speech converter is Natural language interface to the user. The synthesizer will act as an automatic text reader for blind or specially-abled people. Another important application of it can be reading web pages, emails as well as newspapers. According to the previous researches in 2016, the [1] concatenative method for speech synthesis was used. 
The pitch frequencies of the sound signals were extracted by cepstral pitch detection algorithm in the noiseless environment.Inthesameyear[2], the system so developed accepted the written text of any language of Devanagari script via MS word utility through MATLAB which then converted into Romanizedscriptunder text analysis and tokenization was used in order to map the respective phoneme. 1.1 Text to Speech Conversion The text to speech synthesis has two stages: 1.) Training phase: Hindi words are segmented into syllable sized units using segmentation techniques. Each segment is given a unique label. An audio file database of the unique labels is then provided to the controller. 2.) Synthesis phase: The text file (to be synthesized) is imported to the program. Here, the logic is provided to the controller for segmentation oftheHindiwordsintosyllables. The syllables will be matched with the database provided initially in the training phase. Using concatenation,theword from the dictionary (database) is created. Through audio amplifiers and speakers, the speech will be generated. The characteristics like fluency, softness, accuracy will betested. Accurate speech is expected accordingtotheinputprovided. Fig. -1: Basic TTS system
1.2 Festival Framework
The Festival Speech Synthesis System is a general multilingual speech synthesis system that was developed by Alan W. Black at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh. Festival is designed to support multiple languages; it comes with support for the English language (British and American pronunciation), Welsh, and Spanish. Voice packages exist for several other languages, such as Finnish, Italian, Polish, and Russian. Some of the Festival speech tools are festival-2.4, festvox-2.1, speech_tools-2.4, and festlex_CMU. Because Festival does not provide complete language processing support for some languages, it must be augmented to facilitate the development of TTS systems in certain new languages. For this application, we used Festvox as our basic framework. Festvox 2.7.0 is an open source text to speech architecture.
2. METHODOLOGY
The Festival toolbox operates through a command-line interface on a Linux operating system. To set up Festvox on Linux:
1. Obtain Festvox version 2.7.0 from the Festival software website. Create a directory dedicated to the software. Keep all the setup-related files in the same folder (festival-2.4-release.tar, speech_tools-2.4-release.tar, festvox-2.7.0-release.tar, festlex_OALD.tar.gz, and festlex_POSLEX.tar).
2. Unpack all the tar.gz files using the tar command.
3. Compile the speech tools by following the steps in the installation guide, using the gmake test and gmake install commands.
4. Compile Festival by following the installation guide in the festival folder (commands: ./configure, gmake test, gmake install).
5. After Festival has been installed successfully, compile Festvox.
6. To run Festival, export the three variables PATH, FESTVOXDIR, and ESTDIR.
7. Next, execute the festival command to enter the Festival shell.
To introduce a new language voice in Festival, or to build your own enhanced Festival voice, a template must be designed:
1. First, create a directory to hold the voice inside the festival folder.
2. Build the basic structure of the new voice to be added.
Define the phone set for the language, which is the set of symbols together with a definition of whether each one is a consonant or a vowel. The place of each vowel with respect to each nasal sound is identified and specified. When the basic template is built, a Scheme file for the phone set is created, and these definitions are placed there. For each parameter considered for the Hindi language, a corresponding Scheme (.scm) file is created, with the aim of generating more natural-sounding audio output. The database is created by recording Hindi words and labeling them according to their contents. Store all the recorded .wav files under one directory [3]. Phone synthesis can be tested after this step, and any labeling errors that are present can be corrected. Pronunciation can be defined using a large lexicon or using letter-to-sound rules. Festival uses a lexicon structure for pronunciation. After the various intonation models (.scm files) have been added, the basic synthesizer can be tested in Festival.
3. CHALLENGES IN TEXT TO SPEECH SYNTHESIS
Text to speech synthesis is the process of converting text input into audio form. Many TTS synthesizers are available for many languages, and a good deal of software is also available for Indian languages. Text to speech conversion depends heavily on the language; it is fairly simple for some languages and complicated for others. A large set of rules and their exceptions is needed to produce correct pronunciation and prosody for synthesized speech. Text processing is the initial stage of converting text into speech.
Challenges in text processing arise when reading numbers, units, abbreviations, dates, special characters, symbols, etc. For example, the number 1990 can be read as nineteen ninety if it is a year and as one thousand nine hundred ninety if it is a number. Similarly, Kg. should be read as kilograms. To process such text accurately, a very large set of rules needs to be applied. Once the text is processed, another difficult task is to produce the correct pronunciation for each word, which requires a large database. The amount of stress applied to a particular word in a sentence, the proper duration of pronunciation, and the correct intonation also need to be taken into consideration. Timing at the sentence level and grouping words into phrases correctly are difficult because prosodic phrasing is not always marked in the text by punctuation, and phrasal accentuation is almost never marked. Most languages have some special features that make the development of speech synthesis either easier or more difficult. Letter-to-sound rules also differ from language to language. A lot of work has already been done on these parameters to improve the synthesized speech output. For some synthesizers the output is quite good, but the naturalness of the speech still needs to be improved. The synthesized speech sounds more machine-generated than human-like. This may become irritating after a while for a person who is using TTS to read a book or information from the internet. The proposed work focuses on improving the naturalness of Hindi TTS output.
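The text-processing ambiguity described in this section (1990 read as a year or as a number, Kg. read as kilograms) can be illustrated with a small rule-based normalizer. This is only a sketch under stated assumptions: the cue words in `YEAR_CUES`, the 1000 to 2099 year range, and the tiny `ABBREVIATIONS` table are hypothetical choices made for the example, and the year reading is deliberately simplified. A practical system would need far more rules, as noted above.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hypothetical rule tables, chosen only for this illustration.
ABBREVIATIONS = {"kg.": "kilograms", "kg": "kilograms"}
YEAR_CUES = {"in", "year", "since"}


def below_hundred(n):
    """Read 0..99 as words."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])


def number_to_words(n):
    """Cardinal reading for 0 <= n <= 9999."""
    parts = []
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest or not parts:
        parts.append(below_hundred(rest))
    return " ".join(parts)


def year_to_words(n):
    """Simplified year reading: the number is read as two pairs of digits."""
    high, low = divmod(n, 100)
    if low == 0:
        return below_hundred(high) + " hundred"
    if low < 10:
        return below_hundred(high) + " oh " + ONES[low]
    return below_hundred(high) + " " + below_hundred(low)


def normalize(text):
    """Expand numbers and known abbreviations, token by token."""
    out = []
    prev = ""
    for token in text.split():
        key = token.lower()
        if token.isdecimal() and int(token) <= 9999:
            n = int(token)
            # Treat the number as a year only after a cue word.
            if prev in YEAR_CUES and 1000 <= n <= 2099:
                out.append(year_to_words(n))
            else:
                out.append(number_to_words(n))
        elif key in ABBREVIATIONS:
            out.append(ABBREVIATIONS[key])
        else:
            out.append(token)
        prev = key
    return " ".join(out)
```

With these rules, "born in 1990" is normalized with the year reading, while "weighs 1990 Kg." gets the cardinal reading followed by the expanded unit. The same token receives two readings purely from its left context, which is why the rule set grows quickly in practice.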
4. CONCLUSION
Speech synthesis systems are used in various applications. These systems can assist visually impaired persons and are currently used in many educational institutes as a learning aid for children. Speech synthesis and recognition techniques are also used by many research companies and web browsers. Speech synthesizers for many languages have been developed using the Festival framework. Here we are working mainly on the Hindi language. We are considering a few parameters to improve the naturalness of the utterances created. A parameter such as schwa deletion [4] can be considered to make the synthesized voice more human-like. Scheme (.scm) files are created to include these parameters for natural sound.
REFERENCES
[1] G. D. Ramteke, R. J. Ramteke, “Hindi Spoken Signals for Speech Synthesizer”, 2nd International Conference on Next Generation Computing Technologies (NGCT-2016), 2016.
[2] Shilpi Kannojia, Ghanpriya Singh, Dr. Sanjay Mathur, “A Text to Speech synthesizer using acoustic unit based concatenation for any Indian Language of Devanagari Script”, 11th International Conference on Industrial and Information System, 2016.
[3] Somnath Roy, “A Technical Guide to Concatenative Speech Synthesis for Hindi using Festival”, International Journal of Computer Applications (0975 – 8887), Volume 86, No. 8, January 2014.
[4] Kalika Bali, Partha Pratim Talukdar, N. Sridhar Krishna, A. G. Ramakrishnan, “Tools for the development of a Hindi speech synthesis system”.