- Researchers develop a deep neural network that evaluates the ability of a camera to see clearly.
- Self-driving vehicles can use this network to make better decisions.
Dozens of companies are working on autonomous vehicle technology, and they all approach the engineering challenges in different ways. To mimic the human ability to see, the technology relies mainly on three basic elements: radar, cameras, and lidar.
However, factors such as rain, snow, and other kinds of blockage can degrade camera vision. This hinders the perception system's ability to make sense of its surroundings and to validate the data coming in from sensors.
To detect invalid sensor data as early as possible in the processing pipeline, before it reaches downstream modules, researchers at NVIDIA have developed an AI model that evaluates the ability of a camera to see clearly.
This model uses a deep neural network, named ClearSightNet, to discover the root causes of blockages, occlusions, and reductions in visibility. It has the potential to:
- Reason across a broad range of possible causes of reduced camera visibility.
- Provide actionable data.
- Run on various cameras with low computational overhead.
How It Works
The network segments each camera image into two kinds of regions: one associated with occlusion and the other with reduced visibility.
Occlusion covers the portions of the camera’s field of view that are blocked by opaque objects (like snow, mud, or dust) or that contain no data (for example, pixels saturated by sunlight). In these portions, perception is entirely impaired.
Reduced visibility covers portions that are partially blocked by fog, glare, or heavy rain. In such cases, the decisions made by algorithms should be marked as ‘lower confidence’.
The left side shows the input image, while the right side shows the same image overlaid with the neural network output mask. Nearly 84 percent of the image pixels are affected by partial or complete occlusion.
To show these portions, ClearSightNet overlays a mask on the input video or image in real time. Regions of reduced visibility are marked in green, and completely occluded regions are marked in red. The network also reports what share of the input frame is affected by reduced visibility or occlusion.
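The overlay and the affected-area figure can be illustrated with a minimal sketch. This is not NVIDIA's implementation: the class labels (`CLEAR`, `REDUCED`, `OCCLUDED`), the list-of-rows mask format, and the function names `coverage` and `overlay` are all assumptions made for illustration, since the article does not describe the network's actual output encoding.

```python
# Hypothetical per-pixel labels; the real output encoding is not given in the article.
CLEAR, REDUCED, OCCLUDED = 0, 1, 2

# Colors follow the article: green for reduced visibility, red for full occlusion.
OVERLAY_COLORS = {REDUCED: (0, 255, 0), OCCLUDED: (255, 0, 0)}


def coverage(mask):
    """Return the fraction of pixels carrying each label in a 2D mask (list of rows)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    counts = {CLEAR: 0, REDUCED: 0, OCCLUDED: 0}
    for row in mask:
        for label in row:
            counts[label] += 1
    return {label: n / total for label, n in counts.items()}


def overlay(image, mask, alpha=0.5):
    """Blend the class color into every affected RGB pixel; clear pixels are unchanged."""
    out = []
    for img_row, mask_row in zip(image, mask):
        out_row = []
        for pixel, label in zip(img_row, mask_row):
            color = OVERLAY_COLORS.get(label)
            if color is None:
                out_row.append(pixel)
            else:
                out_row.append(
                    tuple(round((1 - alpha) * p + alpha * c) for p, c in zip(pixel, color))
                )
        out.append(out_row)
    return out
```

For a 2×2 mask `[[CLEAR, REDUCED], [OCCLUDED, OCCLUDED]]`, `coverage` reports that a quarter of the pixels have reduced visibility and half are fully occluded. These are the kind of per-frame percentages the article says the network reports.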
This data could be used in several ways. A self-driving car, for example, can choose not to engage automated features when visibility is low, and can alert the driver to clean the windshield or camera lens. Vehicles can also use the network's output to judge how far camera-based perception can be trusted at a given moment.
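A simple policy along these lines might look like the sketch below. The thresholds, the action names, and the function `camera_action` are hypothetical; the article does not say how a vehicle actually maps visibility figures to behavior.

```python
def camera_action(occluded_frac, reduced_frac, occluded_limit=0.5, reduced_limit=0.5):
    """Map per-frame visibility fractions to a coarse action.

    The limits are illustrative placeholders, not values from the article.
    """
    if occluded_frac >= occluded_limit:
        # Too much of the view is fully blocked: stop relying on this camera
        # and ask the driver to clean the windshield or lens.
        return "disable_and_alert"
    if reduced_frac >= reduced_limit:
        # The view is degraded but usable: keep running, with lower confidence.
        return "lower_confidence"
    return "normal"
```

Checking occlusion first reflects the article's distinction: in occluded regions perception is entirely impaired, whereas reduced visibility only calls for lower confidence.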
The team plans to further improve ClearSightNet to provide end-to-end calculations and more detailed information about camera visibility, enabling greater control over how autonomous vehicles are implemented.
In terms of performance, the current version of ClearSightNet runs in about 1.3 milliseconds (integrated GPU) and 0.7 milliseconds (discrete GPU) per frame on Xavier. It is already available in NVIDIA DRIVE 9.0.