Facebook AI Learns Human Facial Reactions From Hundreds of Video Chats

Researchers at Facebook’s Artificial Intelligence Lab have created a new human-like bot. The algorithm was trained on hundreds of video conversations so that it could analyze and learn how humans change their expressions in different scenarios.

The study focuses on face-to-face conversations between an agent (an animation controlled by AI) and a human. The agent learns to alter its facial expression in response to the expression of the human on the other side.

The researchers trained a deep neural network on hundreds of video chats (each a conversation between two people), with no external human supervision. It was later tested with a number of people, and the researchers found that the human and the AI agent appeared equally realistic and natural.


To optimize the learning process, the user’s face and the agent’s face are each represented as a 136-dimensional vector corresponding to 68 two-dimensional keypoints. These keypoints are extracted from video chats containing the frontal faces of two people. Each frame is treated as independent while training the VAE (Variational Autoencoder). The researchers used 20 latent stochastic dimensions with a hidden layer size of 400.
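
To make this representation concrete, here is a minimal Python sketch of how 68 two-dimensional keypoints could be flattened into a single 136-dimensional vector and restored again. The function names and the interleaved x, y ordering are assumptions made for illustration; the article does not say how the researchers ordered the coordinates.

```python
from typing import List, Tuple

NUM_KEYPOINTS = 68                  # facial landmarks per frame (from the article)
VECTOR_SIZE = NUM_KEYPOINTS * 2     # 136 values: one x and one y per landmark

Keypoint = Tuple[float, float]


def flatten_keypoints(keypoints: List[Keypoint]) -> List[float]:
    """Turn 68 (x, y) pairs into one vector [x0, y0, x1, y1, ...] (assumed ordering)."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}")
    return [coord for point in keypoints for coord in point]


def unflatten_keypoints(vector: List[float]) -> List[Keypoint]:
    """Inverse of flatten_keypoints: rebuild the 68 (x, y) pairs."""
    if len(vector) != VECTOR_SIZE:
        raise ValueError(f"expected {VECTOR_SIZE} values, got {len(vector)}")
    return [(vector[i], vector[i + 1]) for i in range(0, VECTOR_SIZE, 2)]
```

Keeping the two helpers as exact inverses means a frame can move between the landmark view and the vector view without losing information.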

Once training is complete, the Variational Autoencoder is used to encode the facial landmarks in every frame. The posterior distribution for a frame is obtained by a forward pass through the encoder, with the two-dimensional keypoints as input.

The mean of this distribution is used as the encoding of the corresponding frame. Each dimension affects a particular aspect of the face, such as mouth movement, face orientation or eye movement. Because it is modeled by stochastic latent variables, this low-dimensional manifold can capture the variations across data points.
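
The encoding step can be sketched roughly as follows. This is not the researchers' implementation: the weights are random stand-ins for trained ones, the tanh activation is an assumption, and the `Encoder` class and its method names are hypothetical. Only the layer sizes (136, 400 and 20) come from the article.

```python
import math
import random
from typing import List, Tuple

INPUT_SIZE = 136    # 68 keypoints x 2 coordinates (from the article)
HIDDEN_SIZE = 400   # hidden layer size (from the article)
LATENT_SIZE = 20    # latent stochastic dimensions (from the article)

Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights, standing in for weights learned during training."""
    return [[rng.gauss(0.0, 0.05) for _ in range(cols)] for _ in range(rows)]


def linear(weights: Matrix, bias: List[float], x: List[float]) -> List[float]:
    """Affine layer: weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class Encoder:
    """Hypothetical VAE encoder: 136 -> 400 -> (20 means, 20 log-variances)."""

    def __init__(self, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w_hidden = random_matrix(HIDDEN_SIZE, INPUT_SIZE, rng)
        self.b_hidden = [0.0] * HIDDEN_SIZE
        self.w_mean = random_matrix(LATENT_SIZE, HIDDEN_SIZE, rng)
        self.b_mean = [0.0] * LATENT_SIZE
        self.w_log_var = random_matrix(LATENT_SIZE, HIDDEN_SIZE, rng)
        self.b_log_var = [0.0] * LATENT_SIZE

    def posterior(self, frame: List[float]) -> Tuple[List[float], List[float]]:
        """Forward pass returning the mean and log-variance of q(z | frame)."""
        if len(frame) != INPUT_SIZE:
            raise ValueError(f"expected {INPUT_SIZE} values, got {len(frame)}")
        # tanh is an assumed activation; the article does not name one.
        hidden = [math.tanh(h) for h in linear(self.w_hidden, self.b_hidden, frame)]
        mean = linear(self.w_mean, self.b_mean, hidden)
        log_var = linear(self.w_log_var, self.b_log_var, hidden)
        return mean, log_var

    def encode(self, frame: List[float]) -> List[float]:
        """Use the posterior mean as the frame's 20-dimensional encoding."""
        mean, _ = self.posterior(frame)
        return mean
```

Taking the mean rather than drawing a random sample makes the encoding of a given frame deterministic, which matches the description above.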

Source: Learning non-verbal interaction through observation

Similarly, the trained VAE decoder is used for generation. The prediction model predicts the values of the 20 VAE dimensions for every frame. This predicted representation is treated as a sample from the posterior, and a forward pass through the decoder yields the corresponding 136-dimensional vector, which represents the 68 two-dimensional keypoints of the predicted frame.
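
The generation step can be illustrated with a similar sketch. As before, this is only an assumption-laden outline: the `Decoder` class is hypothetical, its weights are random rather than trained, and the tanh activation and the interleaved x, y output ordering are choices made for the example. The prediction model that produces the 20 latent values is not shown.

```python
import math
import random
from typing import List, Tuple

LATENT_SIZE = 20    # latent stochastic dimensions (from the article)
HIDDEN_SIZE = 400   # hidden layer size (from the article)
OUTPUT_SIZE = 136   # 68 keypoints x 2 coordinates (from the article)

Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights, standing in for weights learned during training."""
    return [[rng.gauss(0.0, 0.05) for _ in range(cols)] for _ in range(rows)]


def linear(weights: Matrix, bias: List[float], x: List[float]) -> List[float]:
    """Affine layer: weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class Decoder:
    """Hypothetical VAE decoder: 20 -> 400 -> 136."""

    def __init__(self, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w_hidden = random_matrix(HIDDEN_SIZE, LATENT_SIZE, rng)
        self.b_hidden = [0.0] * HIDDEN_SIZE
        self.w_out = random_matrix(OUTPUT_SIZE, HIDDEN_SIZE, rng)
        self.b_out = [0.0] * OUTPUT_SIZE

    def decode(self, latent: List[float]) -> List[Tuple[float, float]]:
        """Map a predicted 20-dimensional code to 68 (x, y) keypoints."""
        if len(latent) != LATENT_SIZE:
            raise ValueError(f"expected {LATENT_SIZE} values, got {len(latent)}")
        # tanh is an assumed activation; the article does not name one.
        hidden = [math.tanh(h) for h in linear(self.w_hidden, self.b_hidden, latent)]
        flat = linear(self.w_out, self.b_out, hidden)
        # Regroup the 136 outputs into 68 pairs (assumed x, y interleaving).
        return [(flat[i], flat[i + 1]) for i in range(0, OUTPUT_SIZE, 2)]
```

Calling `Decoder().decode(predicted_code)` once per frame would give the sequence of landmark sets that drives the agent's animated face.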

The animations were, however, quite basic, and they don’t shed any light on whether a humanoid robot using this algorithm would be able to express natural reactions.

What Others Say

‘Learning some basic rules of facial interaction may not be enough to build a real conversation partner’, said Goren Gordon at Tel Aviv University, Israel. ‘Actual human facial expressions are based on what they think and feel’, he added.

‘Machines are not perfect at learning the subtle elements of human interaction. We know that people prefer to talk to robots that mimic facial expressions, but Facebook is now working on AI to put robot conversations on a whole new level’, says Gordon. He hopes to eventually develop a machine that isn’t in the uncanny valley, a term for the awkward feeling you get when you see something that seems human but in reality is something different.

According to Louis-Philippe Morency at Carnegie Mellon University, Pittsburgh, the Facebook AI team has created a sort of average personality. In the coming years, more advanced agents may be able to pick from multiple personalities or adapt their own to match the people they are communicating with.