Automatic speaker identification
(from Utrecht Lexicon of Linguistics)
Latest revision as of 14:09, 23 May 2013
Depending on the application, speaker recognition systems are divided into speaker identification systems and speaker verification systems. Speaker identification assigns the input speech signal to one person in a known group, while speaker verification confirms or rejects the identity claimed by the user of the system.
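The difference between the two tasks can be illustrated with a minimal sketch. It assumes that each enrolled speaker is represented by a single feature vector and that similarity is measured by Euclidean distance; the names `identify`, `verify`, `enrolled`, and the threshold value are hypothetical and chosen only for illustration, not taken from any particular system.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(features, enrolled):
    """Identification: pick the enrolled speaker whose template is nearest."""
    return min(enrolled, key=lambda name: distance(features, enrolled[name]))


def verify(features, claimed, enrolled, threshold):
    """Verification: accept the claimed identity only if it is close enough."""
    return distance(features, enrolled[claimed]) <= threshold


# Hypothetical templates for a known group of two speakers.
enrolled = {"alice": [1.0, 2.0, 0.5], "bob": [4.0, 0.0, 1.5]}
sample = [1.2, 1.8, 0.6]

who = identify(sample, enrolled)                     # one-of-N decision
ok_alice = verify(sample, "alice", enrolled, 1.0)    # yes/no decision
ok_bob = verify(sample, "bob", enrolled, 1.0)
```

Identification always returns one member of the known group, so its difficulty grows with the number of enrolled speakers, whereas verification is a single accept/reject decision against one template.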
Comments
The two largest factors affecting automatic speaker identification performance are the size of the population to be distinguished and the degradation introduced by noise (e.g. telephone transmission). Automatic speaker recognition is fundamental to all systems that deliver services or restricted information, particularly when a high degree of security is necessary.
Possible applications include retrieval of private information, automatic financial transactions, and access control for secure or restricted areas. Another range of possible applications lies in criminal investigation: e.g. identifying telephone speakers in sexual harassment cases, bomb threats, etc.
The usual approach to speaker recognition is based on the classification of acoustic parameters derived from the speech signal. Generally, the parameters are obtained via short-time spectral analysis and contain both phonetic information, related to the uttered text, and individual information, related to the speaker. Since the problem of separating the phonetic information from the individual information has not yet been solved, many speaker recognition systems operate in a text-dependent way (i.e. the user must utter a predefined sentence).
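As a rough illustration of short-time spectral analysis, the sketch below cuts a signal into overlapping frames, applies a Hamming window to each, and computes a magnitude spectrum per frame with a direct discrete Fourier transform. The frame length, hop size, sampling rate, and function names are assumptions made for this example; practical systems typically use an FFT and derive further parameters (such as cepstral coefficients) from these spectra.

```python
import cmath
import math


def frames(signal, frame_len, hop):
    """Split the signal into overlapping frames of frame_len samples."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def magnitude_spectrum(frame):
    """Hamming-window one frame and return DFT magnitudes for bins 0..n/2."""
    n = len(frame)
    windowed = [
        s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def short_time_spectra(signal, frame_len=64, hop=32):
    """One magnitude spectrum per frame: the short-time spectral parameters."""
    return [magnitude_spectrum(f) for f in frames(signal, frame_len, hop)]


# Synthetic input: a 1 kHz tone at an assumed 8 kHz sampling rate.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(256)]
spectra = short_time_spectra(tone)

# Bin with the most energy in the first frame; bin k corresponds to
# the frequency k * rate / frame_len.
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
```

Each resulting spectrum mixes what was said with who said it, which is why a classifier built directly on such parameters tends to work best when the uttered text is fixed.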