Can a voice impersonator challenge a speaker verification system?

Elisabeth Zetterholm, Department of Linguistics and Phonetics, Lund University

Acoustic and perceptual evidence suggests that professional impersonators can successfully imitate a wide range of target voices. Security-demanding services protected by computer-based speaker verification systems may therefore be vulnerable to imitations of a true client's voice. This raises the questions of how sensitive the systems are and what can be done to improve their resistance to this type of fraud.

Over a fixed-network telephone connection, one impersonator spoke a preset four-digit sequence to a computer-based speaker verification system. Phonetic-acoustic measurements were made, and a perception test was conducted. The significantly increased verification scores, combined with supporting phonetic-acoustic evidence, showed that the impersonator successfully changed his natural voice and speech in his imitations. These results are encouraging but not sufficient on their own.

A more challenging experiment will therefore be performed to give further substance to our findings, which were obtained with the KTH text-dependent system of 2001. We will extend our results by using a speaker verification system (also telephone-based) that can prompt the client with a new password phrase for each new test. The prototype version of the system, called PER ('Prototype Entrance Receptionist', developed by Håkan Melin, KTH), is currently in use at the entrance of the Department of Speech, Music and Hearing (TMH), KTH. In the proposed experiments, the impersonator will perform the login dialogue over the telephone (ISDN connection) instead of being physically present. This is possible because all users have enrolled under both conditions. He will select identities from a list of around 50 registered users (staff and students at TMH).

The results will generalise those of a previous study, which used a fixed phrase and three target speakers, to a wider phonetic coverage and a larger number of target speakers. A professional impersonator has extensive experience in imitating other voices, so his results give an upper limit on the false accept rate to be expected from non-professionals. We have previously studied non-professional impersonators, and results from a professional impersonator will complement those findings.
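As an illustration of the false accept rate mentioned above, the following minimal sketch counts the fraction of impostor attempts whose verification score reaches the accept threshold. It is not taken from the KTH system or from PER: the function name, the scores, the threshold, and the convention that a higher score means "more like the client" are all assumptions made for the example.

```python
def false_accept_rate(impostor_scores, threshold):
    """Return the fraction of impostor attempts that would be accepted.

    Assumes a higher score means the voice is more similar to the
    claimed client, and that an attempt is accepted when its score
    is at or above the threshold.
    """
    if not impostor_scores:
        raise ValueError("need at least one impostor attempt")
    accepted = sum(1 for score in impostor_scores if score >= threshold)
    return accepted / len(impostor_scores)


# Hypothetical scores, invented for illustration (not data from the study).
natural_voice = [-1.2, -0.8, -0.5, 0.1]   # impostor speaking naturally
imitation = [-0.3, 0.2, 0.4, 0.9]         # same impostor imitating the client
threshold = 0.0                           # assumed accept threshold

far_natural = false_accept_rate(natural_voice, threshold)
far_imitation = false_accept_rate(imitation, threshold)
print(f"FAR, natural voice: {far_natural:.2f}")
print(f"FAR, imitation:     {far_imitation:.2f}")
```

With these invented numbers, the imitation attempts are accepted more often than the natural-voice attempts, which is the kind of shift a comparison between impersonation and natural speech would look for.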