A Generalizable Speech Emotion Recognition Model Reveals Depression and Remission

Abstract

Affective disorders are associated with atypical voice patterns; however, automated voice analyses suffer from small sample sizes and untested generalizability on external data. We investigated a generalizable approach to aid clinical evaluation of depression and remission from voice using transfer learning: we trained machine learning models on easily accessible non-clinical datasets and tested them on novel clinical data in a different language. A generalizable speech emotion recognition model can effectively reveal changes in speaker depressive states before and after remission in patients with major depressive disorder (MDD). Data collection settings and data cleaning are crucial when considering automated voice analysis for clinical purposes.
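The transfer-learning idea in the abstract — pretrain a speech emotion recognition model on large non-clinical data, then reuse its outputs as features for a small clinical task — can be sketched in miniature. Everything below is a hypothetical illustration on synthetic data, not the paper's actual pipeline: the feature dimensions, the linear softmax "emotion model", and the assumed link between one emotion score and depression status are all stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "source" task: a large non-clinical emotion corpus,
# 4 emotion classes, 20-dim acoustic features (all synthetic).
n_src, n_feat, n_emotions = 2000, 20, 4
W_true = rng.normal(size=(n_feat, n_emotions))
X_src = rng.normal(size=(n_src, n_feat))
y_src = np.argmax(
    X_src @ W_true + rng.normal(scale=0.5, size=(n_src, n_emotions)), axis=1
)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Pretrain a linear softmax emotion classifier by gradient descent.
W = np.zeros((n_feat, n_emotions))
for _ in range(300):
    p = softmax(X_src @ W)
    grad = X_src.T @ (p - np.eye(n_emotions)[y_src]) / n_src
    W -= 0.5 * grad

# Transfer step: reuse the pretrained model's emotion posteriors as
# features for a small "clinical" task (depressed vs. remitted),
# training only a tiny logistic-regression head on top.
n_clin = 60
X_clin = rng.normal(size=(n_clin, n_feat))
# Assumed link (purely illustrative): depression correlates with a
# low score on the first emotion class.
y_clin = (softmax(X_clin @ W_true)[:, 0] < 0.25).astype(int)

F = softmax(X_clin @ W)  # emotion-probability features from pretraining
w = np.zeros(n_emotions)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
    w -= 0.5 * F.T @ (p - y_clin) / n_clin
    b -= 0.5 * (p - y_clin).mean()

pred = (1.0 / (1.0 + np.exp(-(F @ w + b))) > 0.5).astype(int)
acc = (pred == y_clin).mean()
print(f"accuracy of the transferred model on the synthetic clinical task: {acc:.2f}")
```

The design point the sketch illustrates is that only the small head (`w`, `b`) is fit on clinical data, so the scarce clinical sample is spent on a few parameters while the representation comes from the abundant non-clinical source task.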

Publication
Acta Psychiatrica Scandinavica
