Machine Perception Laboratory FACS
Figure 1. Fully automated facial action coding system, using machine learning techniques

a. Results on the DFAT-504 dataset
We developed a system for fully automatic facial action coding of a subset of FACS. This prototype
system recognizes all seven upper-face AUs (1, 2, 4, 5, 6, 7, and 9) and is 100% automated. Face
images are detected and aligned automatically in the video frames and sent directly to the
recognition system. The system was trained on Cohn and Kanade's DFAT-504 dataset (Kanade,
Cohn, & Tian, 2000). This is a dataset of video sequences of university students posing facial
expressions. In addition to being labeled for basic emotion, the dataset was coded for FACS
by two certified FACS coders. There were 313 sequences from 90 subjects, with 1 to 6 emotions
per subject. All faces in this dataset were successfully detected by the automatic face tracking
system. The automatically detected faces were then passed to the expression analysis system (as
shown in Figure 1).
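The pipeline described above can be sketched as follows. This is a hypothetical outline, not the lab's actual code: the function names (`detect_face`, `align_face`) and the classifier interface are illustrative stand-ins for the detection, alignment, and recognition stages.

```python
# Illustrative sketch of the fully automatic pipeline: detect a face in
# each frame, align it, and pass it to one binary detector per AU.
# All function names here are assumptions, not the lab's API.

def code_sequence(frames, detect_face, align_face, au_classifiers):
    """Run fully automatic AU coding on a list of video frames.

    au_classifiers maps an AU name to a callable returning a real-valued
    detector output for an aligned face image.
    """
    codes = []
    for frame in frames:
        face = detect_face(frame)          # automatic face detection
        if face is None:
            codes.append(None)             # no face found in this frame
            continue
        aligned = align_face(face)         # automatic registration
        codes.append({au: clf(aligned) for au, clf in au_classifiers.items()})
    return codes
```

With stand-in detectors, frames without a detectable face simply yield `None` while the rest receive a per-AU score dictionary.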
Seven support vector machines, one for each AU, were trained to detect the presence of a given
AU, regardless of whether it occurred alone or in combination with other AUs. The expression
recognition system was trained on the last frame of each sequence, which contained the highest
magnitude of the target expression (AU peak). Negative examples consisted of all peak frames that
did not contain the target AU, plus 313 neutral images consisting of the first frame of each
sequence. A nonlinear radial basis function kernel was employed. Generalization to new subjects
was tested using leave-one-out cross-validation (Tukey, 1951). The results are shown in Table 1.
Table 1
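The evaluation protocol above (testing generalization to new subjects by holding each subject out in turn) can be sketched generically. The training and prediction callables here are placeholders; the actual system used RBF-kernel support vector machines, which are not reimplemented in this sketch.

```python
# Sketch of leave-one-subject-out cross-validation as described above.
# `train` and `predict` are stand-ins for the RBF-kernel SVM used in the
# real system; the data layout is an assumption for illustration.

def leave_one_subject_out(samples, train, predict):
    """samples: list of (subject_id, features, label) tuples.

    For each subject, fit on all other subjects' data, test on that
    subject's data, and return overall agreement with the labels.
    """
    subjects = sorted({s for s, _, _ in samples})
    correct = total = 0
    for held_out in subjects:
        train_set = [(x, y) for s, x, y in samples if s != held_out]
        test_set = [(x, y) for s, x, y in samples if s == held_out]
        model = train(train_set)            # fit on the remaining subjects
        for x, y in test_set:
            correct += int(predict(model, x) == y)
            total += 1
    return correct / total                   # pooled accuracy over all folds
```

Because every test subject is absent from its training fold, the resulting accuracy estimates performance on novel subjects rather than on faces the classifiers have already seen.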
https://inc.ucsd.edu/mplab/grants/project1/research/Fully-Auto-FACS-Coding.html 1/3
10/10/2022 23:15 Machine Perception Laboratory
System outputs for full image sequences are shown in Figure 2. These are test sequences, i.e.,
sequences that were not used for training.
Figure 2
Fully automated FACS measurements for image sequences. a. System
outputs for full image sequences of surprise expressions from four subjects,
scored by the human coder as containing AUs 1, 2, and 5 (inner brow raise,
outer brow raise, and upper lid raise). Curves show automated system output
for AUs 1, 2, and 5. b. System outputs for full image sequences of disgust
expressions from four subjects, scored by the human coder as containing
AUs 4, 7, and 9 (brow lower, lower lid raise, nose wrinkle). Curves show
system output for AUs 4, 7, and 9.
The system obtained a mean of 93.6% agreement with human FACS labels for fully automatic
recognition of the seven upper facial actions. This is an exciting result, as performance rates equal
or exceed those of other systems tested on this dataset that employed manual registration (Tian,
Kanade, & Cohn, 2001; Kapoor, Qi, & Picard, 2003). Kapoor et al. obtained 81.2% correct on this
dataset, using hand-marked pupil positions for alignment. Tian et al. obtained a similar level of performance
to ours, but hand-marked a set of feature points in neutral expression images immediately
preceding each movement. The high performance rate obtained by our system is the result of many
years of systematic comparisons investigating which image features (representations) are most
effective (Bartlett et al., 1999; Donato et al., 1999), which classifiers are most effective (Littlewort
et al., 2004), the optimal resolution and spatial frequency (Donato et al., 1999; Littlewort et al.,
submitted), feature selection techniques (Littlewort et al., submitted), and flow-based versus
texture-based recognition (Bartlett et al., 1999; Donato et al., 1999).
We are presently testing fully automatic FACS recognition in the continuous video stream. Face
images were automatically detected. Alignment in the 2D plane was then refined using
automatically detected eye locations. The resulting images were then passed to the AU detectors
trained on the DFAT-504 database. The figure below shows example system outputs for a video
sequence that contains AU 1 and AU 2 events. Preliminary results based on 1 subject show a mean
agreement rate of 88.7% between the automated system and the human FACS codes for the four
actions with sufficient data to test (AUs 1, 2, 6, and 7). (Results by action are AU 1, 87.3% for
AU 2, 92.4% for AU 6, and 86.4% for AU 7.) Here 'agreement' is the percent of frames above or
below threshold in accordance with the human codes. In the coming months, we will train AU
detectors directly on the spontaneous expression samples from Rutgers as the data becomes
available.
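The frame-level agreement measure defined above can be sketched directly: a frame counts as agreeing when the detector output is above threshold exactly when the human coder marked the AU as present. The threshold value and data layout here are illustrative assumptions.

```python
# Sketch of the agreement measure described above: percent of frames
# whose thresholded detector output matches the human FACS code.
# The default threshold is an illustrative assumption.

def frame_agreement(outputs, human_codes, threshold=0.0):
    """outputs: per-frame real-valued detector outputs.
    human_codes: per-frame 0/1 labels from the human coder.
    Returns the percentage of frames in agreement.
    """
    assert len(outputs) == len(human_codes)
    matches = sum(int((o > threshold) == bool(h))
                  for o, h in zip(outputs, human_codes))
    return 100.0 * matches / len(outputs)
```

For example, a detector that fires above threshold on three of four frames in accordance with the coder would score 75% agreement on that clip.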
Fully automated FACS coding on a sample subject from the RU-FACS-1 dataset. b. Outputs
for the AU 1 detector and the AU 2 detector. Arrows indicate the onset, offset, and peak as
scored by the human coder. AU 1 was scored as intensity D; AU 2 was scored as intensity C.
c. Output of the AU 1 detector for 500 frames of video. A human-coded AU 1 event from onset
to offset is shown in green, and the peak is identified by the red dot.