STUDY OF SPEECH EMOTION RECOGNITION USING BLSTM WITH ATTENTION

Dalia Sherman, Gershon Hazan, Sharon Gannot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We present a study of a neural network-based method for speech emotion recognition that uses audio-only features. In the studied scheme, acoustic features are extracted from the audio utterances and fed to a neural network that consists of convolutional neural network (CNN) layers, bidirectional long short-term memory (BLSTM) combined with an attention mechanism layer, and a fully-connected layer. To illustrate and analyze the classification capabilities of the network, we use the t-distributed stochastic neighbor embedding (t-SNE) method. We evaluate our model on the Ryerson audio-visual dataset of emotional speech and song (RAVDESS) and the interactive emotional dyadic motion capture (IEMOCAP) dataset, achieving weighted accuracy (WA) of 80% and 66%, respectively.
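
The abstract does not say which form of attention is used. As a rough illustration only, the sketch below assumes a simple dot-product attention that pools a sequence of per-frame vectors (standing in for BLSTM outputs) into one utterance-level vector before classification. The names `attention_pool`, `frames`, and `query` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Pool T frame vectors into one vector using attention weights.

    frames: list of T equal-length lists (assumed stand-in for BLSTM outputs).
    query:  assumed learned vector of the same length (hypothetical).
    Returns (pooled_vector, weights).
    """
    # One scalar score per frame: dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in frames]
    # Normalize the scores so the weights sum to 1 across time.
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted sum of the frames, one dimension at a time.
    pooled = [sum(w * h[d] for w, h in zip(weights, frames)) for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    pooled, weights = attention_pool(frames, query=[2.0, 0.0])
    print(weights, pooled)
```

In a network of the kind described, the pooled vector would then be passed to the fully-connected layer; frames that score higher against the query contribute more to the utterance-level representation.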

Original language: English
Title of host publication: 31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Pages: 416-420
Number of pages: 5
ISBN (Electronic): 9789464593600
DOIs
State: Published - 2023
Event: 31st European Signal Processing Conference, EUSIPCO 2023 - Helsinki, Finland
Duration: 4 Sep 2023 to 8 Sep 2023

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491

Conference

Conference: 31st European Signal Processing Conference, EUSIPCO 2023
Country/Territory: Finland
City: Helsinki
Period: 4/09/23 to 8/09/23

Bibliographical note

Publisher Copyright:
© 2023 European Signal Processing Conference, EUSIPCO. All rights reserved.

Funding

This project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 871245.

Funders                             Funder number
Horizon 2020 Framework Programme
Horizon 2020                        871245

    Keywords

    • Attention Mechanism
    • Deep Neural Network
    • Speech Emotion Recognition
