E-book
Author Hinterleitner, Florian

Title Quality of synthetic speech : perceptual dimensions, influencing factors, and instrumental assessment / Florian Hinterleitner
Published Singapore : Springer, [2017]
Online access available from:
Springer eBooks

Description 1 online resource
Series T-labs series in telecommunication services
Contents Acknowledgements; Contents; Acronyms; Abstract; 1 Introduction; 1.1 Motivation; 1.2 Outline; References; 2 Speech Synthesis; 2.1 Setup of a Speech Synthesizer; 2.1.1 Natural Language Processing (NLP); 2.1.2 Prosody Generation; 2.1.3 Concatenation and Generation of Speech-Signal Parameters; 2.1.4 Speech Signal Generation; 2.2 The Mary Text-to-Speech System (MaryTTS); References; 3 Auditory and Instrumental Quality Evaluation Metrics; 3.1 What Is Perceptual Quality?; 3.2 Taxonomy for the Quality Assessment of Synthetic Speech; 3.2.1 Glass Box Versus Black Box
3.2.2 Laboratory Versus Field Studies; 3.2.3 Linguistic Versus Acoustic; 3.2.4 Auditory Versus Instrumental; 3.3 Auditory Quality Evaluation Metrics; 3.3.1 Functional Tests (previously published in a slightly different version in [6]); 3.3.2 Judgment Tests (parts previously published in a slightly different version in [13] and [6]); 3.4 Instrumental Quality Evaluation Metrics; 3.4.1 Reference-Based Measures (parts previously published in a slightly different version in [21])
3.4.2 Reference-Free Measures; References; 4 Perceptual Quality Dimensions; 4.1 State-of-the-Art Perceptual Quality Dimensions (parts previously published in a slightly different version in [1]); 4.1.1 Study: Kraft and Portele (Kraft1995); 4.1.2 Study: Mayo et al. I (Mayo2005); 4.1.3 Study: Viswanathan and Viswanathan (Vis2005); 4.1.4 Study: Seget (Seget2007); 4.1.5 Study: Hinterleitner (Hint2010); 4.1.6 Study: Mayo et al. II (Mayo2011); 4.1.7 Restrictions of Discussed Studies
4.2 Semantic Differential and Factor Analysis (parts previously published in a slightly different version in [13]); 4.2.1 Experimental Setup; 4.2.2 Statistical Analysis; 4.3 Sorting Task and Multidimensional Scaling (parts previously published in a slightly different version in [16]); 4.3.1 Experimental Setup; 4.3.2 Statistical Analysis; 4.4 Summary of the SD/FA and ST/MDS Studies (parts previously published in a slightly different version in [16])
4.5 Universal Perceptual Quality Dimensions; 4.5.1 Naturalness of Voice; 4.5.2 Prosodic Quality; 4.5.3 Fluency and Intelligibility; 4.5.4 Absence of Disturbances; 4.5.5 Calmness; 4.5.6 Instructions for TTS Quality Assessment; 4.6 Summary; References; 5 Influencing Factors on Perceptual Quality; 5.1 Influence of the Application (parts previously published in a slightly different version in [1]); 5.1.1 Pretest; 5.1.2 Main Test (previously published in a slightly different version in [10]); 5.1.3 Conclusions
Summary This book reviews research on the perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) absence of disturbances, and (v) calmness. It introduces a test protocol for the efficient identification of those dimensions in a listening test and examines several factors that influence them. Different techniques for the instrumental quality assessment of TTS signals are introduced, reviewed, and tested. Finally, the requirements for integrating an instrumental quality measure into a concatenative TTS system are examined.
Bibliography Includes bibliographical references
Notes Vendor-supplied metadata
Subject Speech processing systems.
Speech synthesis.
Telecommunication.
Text-to-speech software.
Form Electronic book
ISBN 9789811037344 (electronic bk.)
9811037345 (electronic bk.)