This paper describes a simple but effective approach to speaker verification from video sequences of lip movements. We use motion history images (MHIs) to provide a biometric template of a spoken word for each speaker. Class-dependent correlation filters are then created by a weighted optimization of training MHI samples. Features are extracted by correlating a test MHI against each correlation filter, and a Bayesian classifier performs the classification. We carry out an extensive performance evaluation of our approach with respect to the number of training samples and the word spoken. The results demonstrate the efficacy of our method.
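As an illustration of the template step only, the following is a minimal sketch of a generic motion history image computed by frame differencing. It is not taken from the paper: the function name `motion_history_image`, the parameters `tau` and `diff_threshold`, and the choice to let the history span the whole clip are all assumptions made for this example, and the correlation-filter and Bayesian classification stages are not shown.

```python
import numpy as np


def motion_history_image(frames, tau=None, diff_threshold=30):
    """Return a motion history image scaled to [0, 1].

    frames: array-like of grayscale frames shaped (T, H, W).
    tau: history length in frames (assumed default: T - 1, the whole clip).
    diff_threshold: absolute intensity change treated as motion (assumed value).
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two grayscale frames shaped (T, H, W)")
    if tau is None:
        tau = frames.shape[0] - 1

    mhi = np.zeros(frames.shape[1:], dtype=np.float32)
    for prev, curr in zip(frames[:-1], frames[1:]):
        # A pixel counts as moving when its intensity changes by more than the threshold.
        moving = np.abs(curr - prev) > diff_threshold
        # Moving pixels are reset to tau; all other pixels decay by one, floored at zero.
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi / float(tau)


if __name__ == "__main__":
    # Synthetic clip: a single bright pixel moving one column to the right per frame.
    clip = np.zeros((4, 8, 8), dtype=np.float32)
    for t in range(4):
        clip[t, 4, t] = 255.0
    print(np.round(motion_history_image(clip)[4, :4], 3))
```

In the resulting image, recently moving pixels are brightest and older motion fades, so a single 2D array summarises where and when the lips moved during the word.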
|Title of host publication|Proceedings 2008 International Machine Vision and Image Processing Conference
|Editors|Bryan Scotney, Philip Morrow
|Place of Publication|Los Alamitos, Calif.
|Publication status|Published - 2008
|IMVIP 2008 International Machine Vision and Image Processing Conference - University of Ulster, Portrush, Coleraine, Ireland
Duration: 3 Sept 2008 → 5 Sept 2008