Applying an analysis of acted vocal emotions to improve the simulation of synthetic speech

Iain R. Murray, John L. Arnott

    Research output: Contribution to journal › Article › peer-review

    47 Citations (Scopus)

    Abstract

    All speech produced by humans includes information about the speaker, including conveying the emotional state of the speaker. It is thus desirable to include vocal affect in any synthetic speech where improving the naturalness of the speech produced is important. However, the speech factors which convey affect are poorly understood, and their implementation in synthetic speech systems is not yet commonplace. A prototype system for the production of emotional synthetic speech using a commercial formant synthesiser was developed based on vocal emotion descriptions given in the literature. This paper describes work to improve and augment this system, based on a detailed investigation of emotive material spoken by two actors (one amateur, one professional). The results of this analysis are summarised, and were used to enhance the existing emotion rules used in the speech synthesis system. The enhanced system was evaluated by naive listeners in a perception experiment, and the simulated emotions were found to be more realistic than in the original version of the system. (C) 2007 Elsevier Ltd. All rights reserved.
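    The abstract describes a rule-based approach in which per-emotion rules adjust the control parameters of a formant synthesiser. The paper's actual rules and parameters are not given here, so the following Python sketch is purely illustrative: the parameter names (`f0_mean_hz`, `f0_range_hz`, `rate_wpm`) and all scaling values are invented assumptions, not taken from the system described.

    ```python
    # Hypothetical sketch of per-emotion prosody rules for a formant
    # synthesiser. All parameter names and numeric values are invented
    # for illustration; they are not the rules from the paper.

    BASELINE = {"f0_mean_hz": 120.0, "f0_range_hz": 40.0, "rate_wpm": 180.0}

    # Each rule multiplicatively scales the baseline prosodic settings.
    EMOTION_RULES = {
        "neutral": {"f0_mean_hz": 1.00, "f0_range_hz": 1.00, "rate_wpm": 1.00},
        "anger":   {"f0_mean_hz": 1.15, "f0_range_hz": 1.40, "rate_wpm": 1.10},
        "sadness": {"f0_mean_hz": 0.90, "f0_range_hz": 0.60, "rate_wpm": 0.85},
    }

    def apply_emotion(emotion: str, baseline: dict = BASELINE) -> dict:
        """Return synthesiser settings with the emotion's scaling applied."""
        scales = EMOTION_RULES[emotion]
        return {name: round(value * scales[name], 2)
                for name, value in baseline.items()}
    ```

    Under this kind of scheme, evaluating and refining the system amounts to tuning the per-emotion scale factors against listener judgements, which is broadly the enhancement process the abstract describes.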

    Original language: English
    Pages (from-to): 107-129
    Number of pages: 23
    Journal: Computer Speech and Language
    Volume: 22
    Issue number: 2
    DOIs
    Publication status: Published - 2008

    Keywords

    • Speech analysis
    • Speech perception
    • Emotion
    • Affect
    • Synthesis-by-rule
    • System evaluation
