Multimodal Analysis of Human Nonverbal Behaviour in Real-World Settings

European Research Council Starting Grant (FP7) MAHNOB
Project lifespan: 2008 – 2013

Existing tools for human interactive behaviour analysis typically handle only deliberately displayed, exaggerated expressions. Because they are usually trained only on series of such exaggerated expressions, they lack models of the expressive behaviour found in real-world settings and cannot handle the subtle changes in audiovisual expressions typical of such spontaneous behaviour.

The main aim of the MAHNOB project is to address this problem by building automated tools for machine understanding of human interactive behaviour in naturalistic contexts. MAHNOB technology will comprise a set of audiovisual spatiotemporal methods for automatic analysis of spontaneous (as opposed to posed and exaggerated) patterns of human behavioural cues, including head pose, facial expression, visual focus of attention, hand and body movements, and vocal outbursts like laughter and yawns.

As a proof of concept, MAHNOB technology will be developed for two specific application areas: automatic analysis of mental states like fatigue and confusion in Human-Computer Interaction contexts, and non-obtrusive deception detection in standard interview settings.

A team of 5 Research Assistants (RAs) with a background in signal processing and machine learning, led by the PI, will develop MAHNOB technology. The expected result after 5 years is MAHNOB technology with the following capabilities:

  • analysis of human behaviour from facial expressions, hand and body movements, gaze, and non-linguistic vocal cues like speech rate and laughter
  • interpretation of user behaviour with respect to mental states, social signals, dialogue dynamics, and deceit/veracity
  • near real-time, robust, and adaptive processing by means of incremental processing, robust observation models, and learning person-specific behavioural patterns
  • provision of a large, annotated, online dataset of audiovisual recordings that can serve as a basis for benchmarking efforts in machine analysis of human behaviour.