Hidden conditional random fields for visual speech recognition

Adrian Pass, Jianguo Zhang, Darryl Stewart

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    1 Citation (Scopus)

    Abstract

    In this paper we present the application of hidden conditional random fields (HCRFs) to modeling speech for visual speech recognition. HCRFs can be readily adapted to model long-range dependencies across an observation sequence. As a result, visual word recognition performance can improve, because the model can draw on more context when generating state sequences. Results are presented from a speaker-dependent, isolated-digit visual speech recognition task and compared against a baseline HMM system. First, we show that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations that each state takes into account. Second, we compare model performance at various levels of video compression applied to the test set. As far as we are aware, this is the first attempted use of HCRFs for visual speech recognition.
    Original language: English
    Title of host publication: Proceedings 13th International Machine Vision and Image Processing Conference, 2009
    Subtitle of host publication: IMVIP '09.
    Editors: Ken Dawson-Howe, Rozenn Dahyot, Anil Kokaram, Gerard Lacey
    Place of Publication: Los Alamitos, Calif.
    Publisher: IEEE
    Pages: 117-122
    Number of pages: 6
    ISBN (Electronic): 9780769537962
    ISBN (Print): 9781424448753
    Publication status: Published - 2009
    Event: 13th International Machine Vision and Image Processing Conference - Dublin, Ireland
    Duration: 2 Sept 2009 to 4 Sept 2009

    Conference

    Conference: 13th International Machine Vision and Image Processing Conference
    Country/Territory: Ireland
    City: Dublin
    Period: 2/09/09 to 4/09/09
