University of Leicester

Automatic voice emotion recognition of child-parent conversations in natural settings

journal contribution
posted on 2020-07-16, 16:05 authored by Effie Lai-Chong Law, Samaneh Soleimani, Dawn Watkins, Joanna Barwick
While voice communication of emotion has been researched for decades, the accuracy of automatic voice emotion recognition (AVER) has yet to improve. In particular, intergenerational communication has been under-researched, as indicated by the lack of an emotion corpus of child–parent conversations. In this paper, we present our work applying Support-Vector Machines (SVMs), an established class of machine learning models, to analyze 20 pairs of child–parent dialogues on everyday life scenarios. Among the many issues facing the emerging work on AVER, we explored two critical ones: the methodological issue of optimising its performance against computational costs, and the conceptual issue of the emotionally neutral state. We used a minimalistic or an extended acoustic feature set extracted with OpenSMILE and a small or a large set of annotated utterances for building models, and analyzed the prevalence of the class neutral. Results indicated that the bigger the combined sets, the better the training outcomes. Nevertheless, the classification models yielded modest average recall when applied to the child–parent data, indicating their low generalizability. Implications for improving AVER and its potential uses are drawn.
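The abstract reports "average recall" and highlights the prevalence of the class neutral. As an illustration only, the sketch below assumes that average recall means the unweighted mean of per-class recalls (the abstract does not specify this), and uses made-up labels rather than any data from the paper. It shows why such a measure can be informative when one class dominates: a classifier that always predicts neutral can score well on plain accuracy but poorly on average recall.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return (mean of per-class recalls, per-class recall dict).

    Every class counts equally, regardless of how many samples it has.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = {label: correct[label] / total[label] for label in total}
    return sum(recalls.values()) / len(recalls), recalls


# Hypothetical labels, NOT taken from the paper: neutral is the majority class.
y_true = ["neutral"] * 6 + ["happy"] * 2 + ["angry"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["neutral"] * 10

uar, per_class = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"accuracy = {accuracy:.2f}")
print(f"unweighted average recall = {uar:.2f}")
print(per_class)
```

In this toy case the always-neutral classifier is right on 6 of 10 utterances, yet it recalls only one of the three classes, so its average recall is one third.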

Citation

Behaviour and Information Technology, 2020, https://doi.org/10.1080/0144929X.2020.1741684

Author affiliation

School of Informatics

Version

  • AM (Accepted Manuscript)

Published in

Behaviour and Information Technology

Publisher

Taylor and Francis

issn

0144-929X

eissn

1362-3001

Acceptance date

2020-03-03

Copyright date

2020

Available date

2021-03-17

Language

English