- Engineers develop neural networks for self-driving cars to detect people and predict their next movements.
- The system accurately predicts poses and future positions for multiple pedestrians simultaneously, up to 45 meters from the vehicle.
Most self-driving vehicles generate and maintain an internal map of their surroundings using an array of sensors such as cameras, LiDAR, and GPS. Algorithms then process these inputs, plot a path, and send instructions to the vehicle’s actuators, which control steering, acceleration, and braking.
Other components, such as predictive models, hard-coded rules, and obstacle-avoidance and object-discrimination algorithms, help the software navigate while following traffic rules. However, most of the work in this area has only looked at still images, which do not capture how pedestrians move in three dimensions.
To address this issue, researchers at the University of Michigan have developed an AI that can detect people and predict their next movements with higher precision than existing technologies. It can predict poses and future positions for multiple pedestrians concurrently, up to 45 meters from the vehicle.
Biomechanically Inspired Recurrent Neural Network
So far, autonomous-driving technology has relied on machine learning methods trained on millions of two-dimensional images, which can, for instance, recognize stop signs in real time in the real world.
The new machine learning technique, on the other hand, uses video clips several seconds long to recognize motion and accurately predict where pedestrians will move next.
The system observes pedestrians’ poses: whether they are looking left or right, or checking their cellphones. These cues reveal a lot about what a pedestrian is most likely to do next.
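Feeding several seconds of observed poses, rather than a single frame, amounts to building sliding windows over per-frame pose vectors. The window length, frame rate, and pose dimension below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def make_windows(poses, window, horizon=1):
    """Build (input window, target) pairs from a sequence of per-frame poses.

    poses: array of shape (num_frames, pose_dim), one pose vector per frame.
    window: number of past frames the predictor sees.
    horizon: how many frames ahead the target pose lies.
    """
    inputs, targets = [], []
    for t in range(len(poses) - window - horizon + 1):
        inputs.append(poses[t:t + window])               # several seconds of motion
        targets.append(poses[t + window + horizon - 1])  # the pose to predict
    return np.stack(inputs), np.stack(targets)

# Illustrative: 90 frames (~3 s at an assumed 30 fps) of a 48-dim pose vector
poses = np.random.randn(90, 48)
X, y = make_windows(poses, window=30)
print(X.shape, y.shape)  # (60, 30, 48) (60, 48)
```

Each training example thus pairs a short clip of observed poses with the pose that follows it, which is what lets the model exploit motion cues a single image cannot provide.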
The neural network is based on the long short-term memory (LSTM) architecture, with inspiration from the biomechanics of human gait, for instance the bilateral (mirror) symmetry of the human body and the periodicity of human walking.
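The core building block, an LSTM cell stepping through a pose sequence, can be sketched in plain NumPy. The weights below are random and untrained, and the pose dimension and hidden size are assumptions; the biomechanical priors (symmetry, gait periodicity) would enter through feature and loss design, which this sketch omits:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM step. x: input (D,), h/c: hidden and cell state (H,),
    W: stacked gate weights (4H, D+H), b: stacked gate biases (4H,)."""
    z = W @ np.concatenate([x, h]) + b
    H = h.shape[0]
    i = sigmoid(z[:H])          # input gate
    f = sigmoid(z[H:2 * H])     # forget gate
    o = sigmoid(z[2 * H:3 * H]) # output gate
    g = np.tanh(z[3 * H:])      # candidate cell update
    c = f * c + i * g           # new cell state
    h = o * np.tanh(c)          # new hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 48, 64                   # pose dim and hidden size: assumed values
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(30, D)):  # run over a 30-frame pose sequence
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)  # (64,)
```

The final hidden state summarizes the observed motion; a trained network would map it to the pedestrian's predicted next pose and position.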
How Accurate Is It?
The outcomes were quite impressive: the median translation error was about 10 centimeters after one second and less than 80 centimeters after six seconds, whereas comparable techniques were off by up to 700 centimeters.
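Median translation error is simply the median Euclidean distance between predicted and actual pedestrian positions at a given horizon. A minimal computation, using made-up positions rather than the paper's data:

```python
import numpy as np

def median_translation_error(pred, truth):
    """Median Euclidean distance (same units as the inputs) between
    predicted and ground-truth pedestrian positions at one horizon."""
    return float(np.median(np.linalg.norm(pred - truth, axis=-1)))

# Illustrative 2-D positions for three pedestrians, in meters
pred  = np.array([[1.0,  2.0], [0.5, 0.0], [3.0, 3.0]])
truth = np.array([[1.25, 2.0], [0.5, 0.0], [3.0, 3.5]])
print(median_translation_error(pred, truth))  # 0.25
```

Using the median rather than the mean keeps the metric robust to a few badly mispredicted pedestrians.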
To make the network more efficient, the team applied several physical constraints that govern the human body, such as the fastest possible walking or running speed and the inability to fly, so that the system does not have to evaluate every physically impossible movement.
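One such constraint, a cap on per-frame displacement implied by a maximum human speed, can be sketched by projecting any predicted step back inside a feasible radius. The 12 m/s cap and 30 fps frame rate below are illustrative assumptions, not the paper's values:

```python
import numpy as np

def clamp_step(prev_pos, pred_pos, max_speed=12.0, dt=1.0 / 30.0):
    """Clip a predicted position so the implied speed stays within a
    physically plausible human maximum (assumed 12 m/s here)."""
    step = pred_pos - prev_pos
    dist = np.linalg.norm(step)
    limit = max_speed * dt               # farthest a person can move in one frame
    if dist > limit:
        step = step * (limit / dist)     # rescale onto the feasible radius
    return prev_pos + step

prev = np.array([0.0, 0.0])
wild = np.array([3.0, 4.0])              # 5 m in a single frame: impossible
print(clamp_step(prev, wild))            # clipped to 0.4 m along the same direction
```

Pruning impossible outputs like this shrinks the space of futures the network must consider, which is where the efficiency gain comes from.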
They trained the neural network on two NVIDIA TITAN X GPUs using CUDA, with the PedX dataset, which was captured at real intersections in Michigan.
The system was implemented in Python 3.6 and takes about one millisecond to predict the next step of each person in each frame. According to the researchers, the code can be optimized further to yield better results.
The AI could raise the bar for what driverless cars are capable of. It may also benefit gait studies of bipedal robots and could be applied to the development of clinical gait-rehabilitation systems.