In this paper, we address the five-class open performance sub-challenge of the INTERSPEECH 2009 Emotion Challenge. Our systems are evaluated on the well-known FAU Aibo database. We use the openSMILE toolkit to extract low-level descriptors and compute their delta coefficients. The Gaussian Mixture Model (GMM) is a popular approach in speaker identification and speaker verification; here we apply GMM-based systems to speech emotion recognition. We build four systems. The first is a plain GMM system. The second is a GMM-UBM system, which mitigates the scarcity of training data. The third is a GMM-SVM system, which uses GMM supervectors as input features. The fourth is an identity-vector (i-vector) system, which applies factor analysis (FA) to the GMM supervectors.
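The plain GMM system described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: one GMM is trained per emotion class, and an utterance is assigned to the class whose model yields the highest log-likelihood. The feature matrices here are random stand-ins for the openSMILE descriptors, and the component count and class labels are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# The five classes of the FAU Aibo task (labels assumed for illustration).
classes = ["anger", "emphatic", "neutral", "positive", "rest"]

# Train one diagonal-covariance GMM per class on that class's feature frames.
# Random features centered at a different mean per class stand in for real data.
models = {}
for i, c in enumerate(classes):
    frames = rng.normal(loc=i, scale=1.0, size=(200, 16))  # placeholder features
    gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
    gmm.fit(frames)
    models[c] = gmm

def classify(utterance_frames):
    """Pick the class whose GMM gives the highest average log-likelihood."""
    scores = {c: m.score(utterance_frames) for c, m in models.items()}
    return max(scores, key=scores.get)

# Test frames drawn near the mean of class index 2 ("neutral").
test = rng.normal(loc=2, scale=1.0, size=(50, 16))
print(classify(test))
```

The GMM-UBM variant would instead adapt each class model from a universal background model trained on all data, which is what helps when per-class training data is scarce.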
With the dynamic modeling classifiers, we achieve an unweighted average (UA) recall of 39.2% with the GMM system and 39.3% with the GMM-UBM system, against a baseline of 35.5%. With the static modeling classifiers, we use SMOTE and under-sampling to counter the class imbalance in the data, and achieve a UA recall of 38.9% with the GMM-SVM system and 40.5% with the i-vector system, against a baseline of 38.2%. These results confirm that systems developed for speaker recognition can also be applied to speech emotion recognition and can improve emotion recognition accuracy.
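The rebalancing step mentioned above can be sketched as follows. This is a hand-rolled illustration of the SMOTE idea combined with random under-sampling, not the paper's implementation: synthetic minority samples are created by interpolating toward a random nearest neighbour, and the majority class is randomly thinned. Class sizes, feature dimensions, and target counts are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def smote(X, n_new, k=5):
    """Generate n_new synthetic samples by nearest-neighbour interpolation."""
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X))
        d = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(d)[1:k + 1]          # k nearest neighbours, excluding self
        j = rng.choice(nn)
        lam = rng.random()                   # interpolation factor in [0, 1)
        out.append(X[i] + lam * (X[j] - X[i]))
    return np.array(out)

# Illustrative imbalanced data: 600 majority vs 100 minority samples.
major = rng.normal(0.0, 1.0, size=(600, 8))
minor = rng.normal(3.0, 1.0, size=(100, 8))

minor_bal = np.vstack([minor, smote(minor, 200)])            # oversample 100 -> 300
major_bal = major[rng.choice(600, size=300, replace=False)]  # under-sample 600 -> 300
print(len(major_bal), len(minor_bal))  # 300 300
```

In practice the `imbalanced-learn` package provides ready-made `SMOTE` and `RandomUnderSampler` classes that implement the same two operations.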