Transferability Evaluation of Speech Emotion Recognition Between Different Languages

Iosifov, Ievgen and Iosifova, Olena and Romanovskyi, O. and Sokolov, V. Y. and Sukaylo, I. (2022) Transferability Evaluation of Speech Emotion Recognition Between Different Languages. Lecture Notes on Data Engineering and Communications Technologies (134). pp. 413-426. ISSN 2367-4512

Advances in automated speech recognition have significantly accelerated the automation of contact centers, creating a need for robust Speech Emotion Recognition (SER) as an integral part of measuring the customer net promoter score. However, training a model for a specific language requires a dataset of emotions labeled specifically for that language, which is a significant limitation. Emotion detection datasets cover only English, German, Mandarin, and Indian. Our results show a difference between predicting two and four emotions, which leads us to narrow datasets down to particular practical use cases rather than train the model on the whole given dataset. We found that if emotion transfers well enough from a source language to a target language, this does not necessarily reflect the same quality of transferability in the reverse direction. Hence, engineers cannot expect the same transferability in the mirror direction. Chinese-language datasets are the hardest to transfer to other languages. The transferability of the English dataset is among the lowest; hence, for a production environment, engineers cannot rely on a model trained on English for their own language. This paper reports more than 140 experiments across seven languages that evaluate the transferability of speech emotion recognition models trained on different languages, giving a clear framework for choosing which starting dataset to use to achieve good accuracy in practical implementation. The novelty of this study lies in the fact that models for different languages have not yet been compared with each other.
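
The directional comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each experiment yields one score (for example, accuracy) for a model trained on a source language and evaluated on a target language. The function name `transfer_asymmetry`, the language labels, and all numbers are hypothetical placeholders and are not taken from the paper.

```python
from itertools import combinations


def transfer_asymmetry(scores, languages):
    """Return the score gap between the two transfer directions of each pair.

    scores maps (source, target) -> score of a model trained on `source`
    and evaluated on `target`. A gap of 0.0 means the transfer is symmetric.
    """
    gaps = {}
    for a, b in combinations(languages, 2):
        # Positive gap: a -> b transfers better than b -> a.
        gaps[(a, b)] = scores[(a, b)] - scores[(b, a)]
    return gaps


# Placeholder values for illustration only; not results from the paper.
languages = ["lang_a", "lang_b", "lang_c"]
scores = {
    ("lang_a", "lang_b"): 0.75, ("lang_b", "lang_a"): 0.5,
    ("lang_a", "lang_c"): 0.5, ("lang_c", "lang_a"): 0.5,
    ("lang_b", "lang_c"): 0.25, ("lang_c", "lang_b"): 0.625,
}

gaps = transfer_asymmetry(scores, languages)
for pair, gap in gaps.items():
    print(pair, gap)
```

A large absolute gap for a pair signals that good transfer in one direction does not guarantee good transfer in the mirror direction.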

Item Type: Article
Additional Information: DOI: 10.1007/978-3-031-04812-8_35 EID: 2-s2.0-85129602152
Uncontrolled Keywords: Emotion detection; Engagement analysis; Sentiment analysis; Speech emotion recognition
Subjects: This is the archival subject classification of Borys Grinchenko Kyiv University > Articles in scientometric databases > Scopus
Divisions: These are the archival divisions of Borys Grinchenko Kyiv University > Faculty of Information Technology and Mathematics > Professor Volodymyr Buriachok Department of Information and Cyber Security
Depositing User: Volodymyr Sokolov
Date Deposited: 20 May 2022 08:32
Last Modified: 20 May 2022 08:32
