Toward a Meaningful Model of Speech under Stress

D Finan, J Hansen

 

In speech signal processing, understanding and characterizing stress is a critical challenge both for modeling human speech production and for developing effective algorithms for speech and speaker recognition.  Historically, the field has assumed a plane wave model of air propagation for speech production, which lends itself to the many linear predictive algorithms used for feature estimation in speech recognition and speech coding.  However, these methods ignore some of the fundamental properties of human speech production that arise from airflow dynamics.  In this study, we present an overview of recent research on the analysis of speech under stress.  In particular, we consider several aspects of how speech production changes under stress.  We believe that nonlinear energy-based processing schemes built on the Teager Energy Operator offer an attractive alternative to linear analysis procedures, as they are more firmly grounded in the physics of speech production.  Finally, we consider the potential of multi-sensor physiologic and acoustic recordings, which we feel offer promising directions for future research and model building in both speech production and speech signal processing.
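
As a point of reference for the Teager Energy Operator mentioned above, the following is a minimal sketch assuming the commonly used discrete-time form, psi[n] = x[n]^2 - x[n-1] * x[n+1]. The function name teager_energy and the test-tone parameters are illustrative assumptions, not taken from this study, and the study's own processing schemes may differ.

```python
import math


def teager_energy(x):
    """Discrete Teager energy of a sample sequence (assumed standard form).

    Computes psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples only,
    so the result is two samples shorter than the input.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative test tone (hypothetical values): A * cos(omega * n + phase).
amplitude, omega, phase = 2.0, 0.3, 0.5
tone = [amplitude * math.cos(omega * n + phase) for n in range(200)]

psi = teager_energy(tone)

# For a pure tone the operator is constant and equal to (A * sin(omega))^2,
# so it tracks amplitude and frequency jointly rather than amplitude alone.
expected = (amplitude * math.sin(omega)) ** 2
```

Because the output depends on both the amplitude and the frequency of an oscillation, the operator responds to changes that a purely amplitude-based energy measure would miss, which is one reason it is attractive for nonlinear analysis.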