The Audio/Visual Emotion Challenge and Workshop (AVEC 2012) will be the second competition event aimed at comparison of multimedia processing and machine learning methods for automatic audio, visual and audiovisual emotion analysis, with all participants competing under strictly the same conditions. The goal of the challenge is to provide a common benchmark test set for individual multimodal information processing and to bring together the audio and video emotion recognition communities, to compare the relative merits of the two approaches to emotion recognition under well-defined and strictly comparable conditions and establish to what extent fusion of the approaches is possible and beneficial. A second motivation is the need to advance emotion recognition systems to be able to deal with naturalistic behavior in large volumes of un-segmented, non-prototypical and non-preselected data as this is exactly the type of data that both multimedia retrieval and human-machine/human-robot communication interfaces have to face in the real world.
We are calling for teams to participate in emotion recognition from acoustic audio analysis, linguistic audio analysis, video analysis, or any combination of these. As benchmarking database the SEMAINE database of naturalistic dialogues will be used. Emotion will have to be recognized in terms of continuous time, continuous valued dimensional affect in four dimensions: arousal, expectation, power and valence. Besides participation in the Challenge we are calling for papers addressing the overall topics of this workshop, in particular works that address the differences between audio and video processing of emotive data, and the issues concerning combined audio-visual emotion recognition.
Please visit our website for more information: http://sspnet.eu/avec2012/challenge-guidelines/.
|