HMM-Based Smoothing for Concatenative Speech Synthesis December 1, ..

Corpus-based approaches to speech synthesis have been advocated to overcome the limitations of concatenative synthesis from a fixed acoustic unit inventory.

Abstract: In speech synthesis, concatenative data-driven synthesis methods prevail.

This paper presents a new approach to speech synthesis in which a set of cross-word, decision-tree state-clustered, context-dependent hidden Markov models is used to define a set of subphone units for use in a concatenation synthesizer.


High Quality Arabic Concatenative Speech Synthesis - …

Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc.

Models of segmental coarticulation and other phonetic factors are an important part of a text-to-speech system. The control part of a synthesis system calculates the parameter values at each time frame. Two main types of approaches can be distinguished: rule-based methods that use an explicit formulation of existing knowledge and library-based methods that replace rules by a collection of segment combinations. Clearly, each approach has its advantages. If the data are coded in terms of targets and slopes, we need methods to calculate the parameter tracks. The efforts of Holmes et al. (1964) and the filtered square wave approach by Liljencrants (1969) provide some classical examples in this context.
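As a rough illustration of how a parameter track might be calculated from targets, the sketch below builds a step function ("square wave") from a list of target values and then smooths it with a first-order low-pass filter. This is only loosely inspired by the filtered square wave idea mentioned above and is not a reconstruction of Liljencrants' method: the function names, the 5 ms frame step, the 25 ms time constant, and the example formant values are all assumptions chosen for illustration.

```python
import math


def square_wave(targets, frame_ms=5.0):
    """Expand (target_value, duration_ms) pairs into a step track.

    Each target is held constant for its duration; the result has one
    value per frame. frame_ms is an assumed frame step.
    """
    track = []
    for value, duration_ms in targets:
        n_frames = max(1, round(duration_ms / frame_ms))
        track.extend([float(value)] * n_frames)
    return track


def smooth(track, time_constant_ms=25.0, frame_ms=5.0):
    """Smooth a step track with a first-order (exponential) low-pass filter.

    The filter state starts at the first target, so the output moves
    gradually toward each new target instead of jumping.
    """
    alpha = 1.0 - math.exp(-frame_ms / time_constant_ms)
    smoothed = []
    y = track[0]
    for x in track:
        y += alpha * (x - y)
        smoothed.append(y)
    return smoothed


if __name__ == "__main__":
    # Hypothetical formant-frequency targets in Hz, each held for 50 ms.
    targets = [(500.0, 50.0), (1500.0, 50.0)]
    steps = square_wave(targets)
    for step_value, track_value in zip(steps, smooth(steps)):
        print(f"{step_value:7.1f} {track_value:7.1f}")
```

A shorter time constant makes the track follow the targets more abruptly; a longer one gives slower transitions and may prevent short targets from being reached at all.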


Models of Speech Synthesis | Voice Communication …

Olive, J. P. (1990), "A new algorithm for a concatenative speech synthesis system using an augmented acoustic inventory of speech sounds," Proceedings of the ESCA Workshop on Speech Synthesis, Autrans, France.

Employing speech models in concatenative speech synthesis

However, for musical composition, this doesn't matter so much, because certain units can interactively be forced to appear or not to appear.

The system proves that applying the concept of data-driven concatenative synthesis based on non-uniform unit selection to musical synthesis is a valid approach.



The author wishes to thank Xavier Rodet, Kelly Fitz, Ross Ballany, and Guillaume Lemaitre for their valuable support in writing this article.

Using 5 ms segments in concatenative speech synthesis

TTS applications such as YAKiToMe! and Speakonia are often used to add synthetic voices to YouTube videos for comedic effect, as in Barney Bunch videos. YAKiToMe! is also used to convert entire books for personal podcasting purposes, RSS feeds and web pages for news stories, and educational texts for enhanced learning.

Concatenative Speech Synthesis - CORE

Speech synthesis techniques are also used in entertainment productions such as games and anime. In 2007, Animo Limited announced the development of a software application package based on its speech synthesis software FineSpeech, explicitly geared towards customers in the entertainment industries and able to generate narration and lines of dialogue according to user specifications. The application reached maturity in 2008, when NEC announced a web service that allows users to create phrases from the voices of Code Geass: Lelouch of the Rebellion R2 characters.