FORENSIC COMMUNICATION ASSOCIATES
"Offering expert services for 20+ years"
| Senior-
J. Koster, Ph.D. P. French, Ph.D.
Vocal Behaviors
A number of psychological or physiological states can sometimes be deduced from speech and voice analysis. Two of these, psychological stress and intoxication, have been selected for brief review.
Determining how a person is feeling solely from hearing the voice is not easy, but there are times when there is little else to go on. Hence, this area is quite important.
First, it should be noted that the term "stress" denotes a negative psychological state, but is it fear, anger or anxiety? A reasonable definition of stress would appear to be that it is a "psychological state which results as a response to a perceived threat and is accompanied by the specific emotions of fear and anxiety" (10, 12, 30).
It has long been accepted that listeners can identify some emotions (including stress) from speech samples alone and do so very well. If this is true, what are some of the vocal correlates of this psychological state? First, increases in pitch, or speaking fundamental frequency, appear to correlate with stress increments. However, if this relationship is to be useful, the subject's baseline data ordinarily should be available, since the behavior actually appears as a shift from that speaker's norm. Second, while frequency variability is often cited as a correlate of stress, it actually is a poor predictor. Third, vocal intensity is another acoustic parameter that may correlate with psychological stress; the data here are somewhat mixed, but the best evidence is that vocal intensity tends to increase with stress. Fourth, while identification of the prosodic speaking characteristics related to stress is a fairly complex process, a temporal pattern of fewer speech bursts appears to correlate with it. Finally, an important recent finding is that speaker nonfluencies appear to increase sharply with stress.
A predictive model of the vocal correlates of stress has been developed; it may be found in Figure 8. As may be seen, changes occur in speaking fundamental frequency, nonfluencies, vocal intensity, speech rate and the number of speech bursts. However, it should be remembered that information of this type will be of greatest value when it can be contrasted with reference profiles for that person's normal speech.
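The point about reference profiles can be illustrated with a minimal sketch. It is not the predictive model of Figure 8: the feature names and all sample numbers below are hypothetical, chosen only to show what comparing a sample with the same speaker's baseline might look like. Only the directions of change stated in the text are encoded; speech rate is left out because the text does not give its direction.

```python
# Illustrative sketch only. Feature names and numbers are hypothetical;
# they are not measurements or parameters taken from the source.

# Direction of change the text associates with stress (+1 = rises, -1 = falls).
EXPECTED_DIRECTION = {
    "f0_hz": +1,                  # speaking fundamental frequency tends to rise
    "intensity_db": +1,           # vocal intensity tends to rise
    "nonfluencies_per_min": +1,   # nonfluencies rise sharply
    "speech_bursts_per_min": -1,  # fewer speech bursts
}


def shift_from_baseline(baseline, sample):
    """Return sample minus baseline for every feature in the baseline profile."""
    return {name: sample[name] - baseline[name] for name in baseline}


def matches_stress_pattern(shifts):
    """Report, per feature, whether the shift has the sign described in the text."""
    return {
        name: shifts[name] * sign > 0
        for name, sign in EXPECTED_DIRECTION.items()
    }


# Hypothetical reference profile (normal speech) and questioned sample.
baseline = {
    "f0_hz": 110.0,
    "intensity_db": 65.0,
    "nonfluencies_per_min": 2.0,
    "speech_bursts_per_min": 20.0,
}
sample = {
    "f0_hz": 125.0,
    "intensity_db": 68.0,
    "nonfluencies_per_min": 5.0,
    "speech_bursts_per_min": 16.0,
}

shifts = shift_from_baseline(baseline, sample)
pattern = matches_stress_pattern(shifts)
for name in EXPECTED_DIRECTION:
    print(f"{name}: shift {shifts[name]:+.1f}, matches stress pattern: {pattern[name]}")
```

A sketch like this only flags whether each shift goes in the expected direction; it says nothing about how large a shift must be to be meaningful, which the text does not specify.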
The PSE
It would be ill advised to leave this area without some reference to "voice stress analyzers." These devices, it is claimed, can be used to detect both stress and lying. There is no question that legal, law enforcement, intelligence and related agencies would benefit greatly from the availability of an effective method for the detection of stress and, especially, deception. Taking deception first, can lies be detected by any means at all? Is there any such thing as a lie response? Perhaps Lykken (28) has articulated the key concept here. He argues that, if lies are to be detected, there must be some sort of a "lie response," a measurable physiological or psychological event which always occurs. He correctly suggests that, until a lie response has been identified and its validity and reliability have been established, no one can claim to be able to measure, detect and/or identify falsehoods on anything remotely approaching an absolute level. But has such a lie response been isolated? Simple logic can be used to test this possibility. For example, consider what would happen if it were possible to determine the beliefs and intent of politicians simply from hearing them speak. There would be no need for trials by jury, as the guilt or innocence of anyone accused of a crime could be determined simply by asking them: "Did you do it?" Consider also the impact an infallible lie detection system would have on family relationships! The answer seems clear.
Yet the voice analysts claim they can detect falsehoods and do it with certainty. They market a number of "systems" for that purpose. How do these devices work? Unfortunately, it is almost impossible to answer this question, as their claims are quite vague. One explanation is that they utilize the micro-tremors of human muscles. Such micro-tremors do exist in the long muscles of the body; however, there is very little chance that they either exist in (or can affect) the antagonistic actions of the numerous and complexly interacting respiratory, laryngeal and vocal tract muscles. Indeed, there is substantial evidence that they do not (31). And is the presence of stress equivalent to lying in the first place? A myriad of such questions can be asked but, at present, there are virtually no valid data to support the claims of the "voice" stress evaluators; rather, the great preponderance of the evidence demonstrates that they are quite invalid (12, 18). Indeed, it appears that the "PSE" is an even greater fraud than are "voiceprints."
Almost anyone who is asked to do so probably will describe the speech of an inebriated talker as "slurred," "misarticulated," or "confused." But do commonly held stereotypes of this type square with reality and with the results of research? More importantly, are there data which suggest that it is possible to determine a person's sobriety solely from analysis of his or her speech? Only limited research has been reported; a good general review may be found in Chin and Pisoni (5).
The rationale for an intoxication-speech link is clear cut. Since cognitive function and sensory-motor performance can both be impaired by intoxication (1, 11, 33), so too can the speech act, which results from the operation of a number of high-level integrated systems (sensory, cognitive, motor). Moreover, it can be argued (from research) that articulation is degraded, speech rate slowed and perception of impairment raised as intoxication increases. Degradations in morphology and/or syntax also have been reported, as have articulatory problems. Perhaps more important, it has been found that: a) speaking fundamental frequency level is changed and its variability increased, b) speaking rate often is slowed, c) the number and length of pauses are often increased, d) amplitude or intensity levels are sometimes reduced and e) nonfluencies are markedly increased (see 5, 23 for reviews).
The basic problem with virtually all of the relationships cited is that they are quite variable and the reasons for this are not clear. Moreover, any number of other behavioral states -- stress, fatigue, depression, effort, emotions and speech/voice disorders -- can complicate attempts to determine intoxication level from speech analysis. And, such determinations might not be possible in the first place unless the target utterances can be compared to that person's speech when sober.
Having recognized the confusions and contradictions associated with the intoxication-speech dilemma, a team at the University of Florida developed a research program focused on resolution of these conflicts. New approaches designed to induce acute alcohol intoxication were employed; here, subjects received doses of 80 proof rum or vodka mixed with both a soft drink (orange juice, cola) and Gatorade. The subjects drank at their own pace, but breath alcohol concentration levels (BRAC) were measured at 10-15 minute intervals. The approach was efficient, with nausea and discomfort sharply reduced; serial measurements were permitted and intoxication level was highly controlled. Moreover, large groups could be (and were) studied, with subjects participating in all procedures related to their experiment.
Data were taken at "windows" or intoxication levels (ascending or descending) including (among others) BRAC 0.00 (sober), BRAC 0.04-0.05 (mild), BRAC 0.08-0.09 (legal) and BRAC 0.12-0.13 (severe). Subjects were carefully selected on the basis of 27 behavioral and medical criteria. After training, they were required to produce four types of speech at each intoxication level. Included were: a) a standard 98-word oral reading passage, b) articulation test sentences, c) a set of diadochokinetic gestures and d) extemporaneous speech.
As may be deduced from these descriptions, very careful and precise procedures were carried out for all conditions and levels. Analysis included auditory processing by listeners (drunk-sober, intoxication level, etc.), acoustic analysis of the signal, and various classification/sorting (behavioral) tests.
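The "window" idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the research team's procedure: only the four windows named in the text are included (the text says there were others), the function names are hypothetical, and the serial readings are invented for the example.

```python
# Illustrative sketch only. The four windows come from the text; the
# function names and the sample readings are hypothetical.

# (label, lowest BRAC, highest BRAC) for each window named in the text.
WINDOWS = [
    ("sober", 0.00, 0.00),
    ("mild", 0.04, 0.05),
    ("legal", 0.08, 0.09),
    ("severe", 0.12, 0.13),
]


def brac_window(brac):
    """Return the label of the window containing this reading, or None."""
    for label, low, high in WINDOWS:
        if low <= brac <= high:
            return label
    return None


def recordable_readings(readings):
    """Keep only the serial readings that fall inside a named window."""
    labelled = [(value, brac_window(value)) for value in readings]
    return [(value, label) for value, label in labelled if label is not None]


# Hypothetical serial breath measurements on the ascending side.
serial_readings = [0.00, 0.02, 0.045, 0.07, 0.085, 0.125]
labels = [brac_window(value) for value in serial_readings]
print(labels)
print(recordable_readings(serial_readings))
```

Readings that fall between windows return `None`, which mirrors the idea that speech data were taken only when a subject's level sat inside one of the defined windows.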
A number of relationships already have emerged. First, it appears that auditors tend to overestimate speaker impairment for individuals who are only mildly (to moderately) intoxicated. On the other hand, they tend to underestimate the level of involvement for subjects who are severely intoxicated (see Figure 9). Second, it appears possible to accurately simulate rather severe levels of intoxication and even to reduce the percept of intoxication if an inebriated individual attempts to sound sober (17). Moreover (and surprisingly), there seem to be only minor gender differences and few to none for drinking level (light, moderate, heavy). Perhaps the most powerful data so far are those observed for large groups of subjects in the "primary" experiments (see Figure 10). As can be seen, they show shifts for all of the speaking characteristics measured except vocal intensity. Note also that speaking fundamental frequency (heard pitch) is raised with increases in intoxication level, a relationship which was noted by clinicians but not by previous researchers. Perhaps the most striking relationship of all is that between nonfluencies and intoxication level. The correlation here is a very high one, and the pattern seen in the figure has been confirmed. While some variability exists, the predictable relationships are that speech slows as intoxication increases and the number of nonfluencies rises sharply under the same conditions.
Copyright 1999, Forensic Communications Associates