Virtual Reality (VR) and Mixed Reality (MR) are immersive platforms that draw users into new and compelling experiences, from professional training to gaming. Almost all of the giant tech companies have started focusing on VR and MR technologies.
By 2016, there were at least 230 companies working on VR-related products. Facebook alone had around 400 engineers working on VR development, and Apple, Google, Microsoft, Amazon, Samsung and Sony all had dedicated VR and Augmented Reality groups.
In December 2017, Google revealed that they are working on a new approach to enhancing the visual experience in VR and MR. They have developed a pipeline that exploits the characteristics of human visual perception to provide a compelling visual experience at a low power cost. Let’s find out how the whole system works and how efficient it is.
Current VR/MR Technologies
The VR and MR technologies we have today face a fundamental challenge: presenting imagery at the high resolutions required for lifelike environments demands a powerful rendering engine and transmission process. Most headsets have insufficient display resolution, which limits the field of view and degrades the experience.
Driving a high-resolution VR/MR device with the conventional rendering pipeline requires more computational power than even high-end mobile processors can deliver. Dozens of research efforts are already underway to deliver higher-resolution displays, and the challenge of driving those displays continues to grow.
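To get a feel for the scale of the problem, here is a back-of-envelope estimate. The per-eye resolution and refresh rate below are illustrative assumptions, not figures from the article:

```python
# Hypothetical headset specs (illustrative, not from the article).
width, height = 4000, 4000      # pixels per eye, approaching "retinal" quality
eyes = 2
refresh_hz = 90                 # a typical VR refresh rate

pixels_per_second = width * height * eyes * refresh_hz
print(f"{pixels_per_second / 1e9:.1f} Gpix/s")  # → 2.9 Gpix/s
```

Nearly three billion shaded pixels per second is far beyond what a mobile SoC can sustain with conventional rendering, which is exactly the gap foveation tries to close.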
Moreover, the current limitations are not restricted to content generation; they also extend to transferring data, dealing with latency, and allowing interaction with real objects in mixed reality apps.
Google’s proposed pipeline takes the full system into account, including memory bandwidth, rendering engine and display module capabilities. The pipeline consists of:
- Foveated Rendering, which decreases the compute per pixel
- Foveated Image Processing, which reduces visual artifacts
- Foveated Transmission, which reduces the bits per pixel transmitted
Source: Google Research Blog
For those who don’t know, foveated imaging is a digital processing technique in which image resolution varies across the image according to one or more fixation points, where a fixation point is the highest-resolution area of the picture. Below, we’ll explain these 3 key components of the proposed pipeline.
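As a rough sketch of the idea, the snippet below assigns every pixel a downsampling factor based on its distance from the fixation point. The function name and the radius thresholds are my own illustrative choices, not values from Google’s pipeline:

```python
import numpy as np

def foveation_factor(shape, fixation, inner=0.15, outer=0.4):
    """Per-pixel downsampling factor for a frame of the given shape.
    `fixation` is the (row, col) of the highest-resolution point; the
    radius thresholds are illustrative assumptions."""
    rows, cols = np.indices(shape)
    fy, fx = fixation
    # Eccentricity: distance from the fixation point, normalized
    # by the image diagonal.
    dist = np.hypot(rows - fy, cols - fx) / np.hypot(*shape)
    factor = np.ones(shape, dtype=int)
    factor[dist > inner] = 2   # mid-periphery: half resolution
    factor[dist > outer] = 4   # far periphery: quarter resolution
    return factor

f = foveation_factor((100, 100), fixation=(50, 50))
print(f[50, 50], f[0, 0])  # → 1 4  (full res at fixation, coarser in the corner)
```

A real renderer would of course foveate during rendering rather than as a per-pixel post-pass, but the mapping from eccentricity to resolution is the core of the idea.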
Foveated Rendering

The human brain pays less attention to things in peripheral vision, and foveated rendering takes advantage of this fact. It decreases the rendered resolution of content in our peripheral vision in order to improve performance.
To make this work well, the High Acuity region must be constantly updated with eye-tracking so that it stays aligned with eye saccades (the quick, simultaneous movements of both eyes between fixation points). Systems without eye-tracking, on the other hand, must render a much wider High Acuity region.
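A quick, hypothetical calculation shows why eye-tracking matters so much here. We treat the High Acuity region as a circle on a square field of view and charge Low Acuity pixels 1/16 of the shading cost; all numbers and the flat-screen geometry are simplifying assumptions of mine, not Google’s:

```python
import math

def shaded_fraction(ha_radius_deg, fov_deg=110, lo_factor=4):
    """Fraction of full-resolution shading work, with a circular High
    Acuity (HA) region and Low Acuity (LA) pixels rendered at
    1/lo_factor resolution per axis. Illustrative flat-screen model."""
    ha_share = min(math.pi * ha_radius_deg ** 2 / fov_deg ** 2, 1.0)
    return ha_share + (1 - ha_share) / lo_factor ** 2

tracked = shaded_fraction(10)    # narrow HA region that follows the gaze
untracked = shaded_fraction(30)  # wide HA region covering likely gaze points
print(f"{tracked:.0%} vs {untracked:.0%}")  # → 9% vs 28%
```

Even in this toy model, eye-tracking cuts the shading work by roughly 3x compared to a system that must keep a wide High Acuity region.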
A conventional foveation approach might split the frame buffer into several regions of differing spatial resolution. The aliasing introduced by rendering at lower spatial resolution can cause noticeable artifacts when you move your head or when new animated content appears. For instance, the GIF (on the right) shows temporal artifacts generated by head rotation.
To reduce these artifacts, Google’s developers presented two different techniques: conformal rendering and world-aligned low-acuity rendering.
In conformal rendering, the scene is rendered with a smoothly varying reduction in resolution that matches the fall-off of the eye’s visual acuity, based on a nonlinear mapping. This has two advantages that enable aggressive foveation while preserving the same level of quality.
- By precisely matching the eye’s visual fidelity, the number of pixel computations can be decreased.
- By using a smooth fall-off in fidelity, users never see a sharp boundary between the Low Acuity and High Acuity regions.
It works by warping the vertices of the scene into a nonlinear space, rasterizing the scene at a lower resolution in that space, and then unwarping the result back into linear space.
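The warp/unwarp pair can be sketched with a simple radial mapping. The specific function r' = r / (1 + s·r) is my own illustrative choice, not the mapping Google uses; the point is that it compresses the periphery smoothly and is exactly invertible:

```python
import numpy as np

def warp(xy, strength=0.5):
    """Warp normalized screen coordinates (fixation at the origin) into
    a nonlinear space that compresses peripheral content. Illustrative
    radial mapping: r' = r / (1 + strength * r)."""
    r = np.maximum(np.linalg.norm(xy, axis=-1, keepdims=True), 1e-9)
    return xy * (1.0 / (1.0 + strength * r))

def unwarp(xy_w, strength=0.5):
    """Inverse mapping back to linear screen space:
    r = r' / (1 - strength * r')."""
    r_w = np.maximum(np.linalg.norm(xy_w, axis=-1, keepdims=True), 1e-9)
    return xy_w / (1.0 - strength * r_w)

pt = np.array([[0.8, 0.0]])
round_trip = unwarp(warp(pt))
print(np.allclose(round_trip, pt))  # → True
```

Rasterizing in the warped space means a peripheral vertex at radius 0.8 lands at radius ~0.57, so the same screen-space detail covers fewer pixels there, which is where the savings come from.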
In world-aligned rendering, the Low Acuity region is aligned rotationally to the world (for example, always facing north) instead of to the head. Since the aliasing artifacts are then almost invariant to head orientation, they are far less detectable. After up-sampling, the pixels of this region are reprojected onto the display to compensate for head rotation.
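As a toy, yaw-only illustration of that reprojection step (a real system would use full 3-D head-pose rotations; the angles are arbitrary):

```python
def reproject(world_angle_deg, head_yaw_deg):
    """Map a direction rendered in the world-aligned Low Acuity frame
    to a display-relative angle, compensating for the current head yaw.
    Yaw-only toy model of the reprojection described above."""
    return (world_angle_deg - head_yaw_deg) % 360

# The same world-aligned LA content serves both head poses; only the
# cheap reprojection changes, so the aliasing pattern stays fixed in
# the world rather than shimmering with every head movement.
print(reproject(90, 0), reproject(90, 30))  # → 90 60
```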
Although this technique reduces the severity of visual artifacts, it is more expensive to compute than conventional foveation at the same acuity reduction level.
Foveated Image Processing
Image credit: Google
Head Mounted Displays usually need to perform image processing steps after rendering, such as lens distortion correction, local tone mapping or light blending. In foveated image processing, these tasks are performed differently in different regions of the image.
For instance, lens distortion correction might not require the same level of spatial accuracy across all parts of the display. We can save computation by performing lens distortion correction on the foveated image data before upscaling, and this approach doesn’t create any noticeable artifacts.
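To get a rough sense of the savings, assume one correction operation per pixel processed. The buffer sizes below are hypothetical, not Google’s:

```python
# Hypothetical buffer sizes (illustrative, not from the article).
full_frame = 2000 * 2000            # pixels if correcting after upscaling
ha_region = 800 * 800               # High Acuity pixels at full resolution
la_region = (2000 * 2000) // 16     # Low Acuity pixels at 1/4 resolution per axis
foveated = ha_region + la_region

# Correcting lens distortion before upscaling touches only foveated pixels.
print(f"{foveated / full_frame:.0%} of the per-pixel work")  # → 22% of the per-pixel work
```

The exact ratio depends on the region sizes and acuity reduction chosen, but any correction pass moved before the upscale scales with the foveated pixel count rather than the display resolution.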
Foveated Transmission

One of the main sources of power consumption in Head Mounted Displays is data transmission from the SoC to the display module. The objective of foveated transmission is to save bandwidth and power by transferring only the essential data to the display module.
Image credit: Google
This requires moving the basic blending and upscaling tasks to the display side and transferring only the foveated rendered portion. Since current displays are not designed for foveated content, a foveal region that moves with eye-tracking would increase complexity and could cause noticeable temporal artifacts.
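A back-of-envelope bandwidth comparison shows the appeal; the resolution, bit depth and refresh rate below are illustrative assumptions, not Google’s numbers:

```python
# Hypothetical link parameters (illustrative, not from the article).
bits_per_pixel = 24     # RGB, 8 bits per channel
refresh_hz = 90
full_frame = 2000 * 2000                    # pixels per eye, unfoveated
foveated = 800 * 800 + (2000 * 2000) // 16  # HA region + downsampled LA region

full_gbps = full_frame * bits_per_pixel * refresh_hz / 1e9
fov_gbps = foveated * bits_per_pixel * refresh_hz / 1e9
print(f"{full_gbps:.1f} vs {fov_gbps:.1f} Gbit/s per eye")  # → 8.6 vs 1.9 Gbit/s per eye
```

Since the SoC-to-display link is a major power consumer, cutting the transmitted bits by this kind of factor translates fairly directly into power savings.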
No doubt, the proposed pipeline could enhance VR/MR experiences on mobile hardware, but for now it remains a research experiment. Without eye-tracking headsets, it is promising tech that stays ‘just on the horizon’.
We may not see fruitful results from this research for a while, but it’s exciting to see Google dedicating resources for the long haul. It is only a matter of time before we see Head Mounted Displays with built-in eye tracking, and it would be better to be ready to utilize it as soon as it lands, rather than waiting for software to take advantage of the hardware.