- MIT researchers develop a new machine learning-based model to automate many parts of the photo editing process.
- It can be used for simulating certain types of cameras, replacing backgrounds, and adjusting colors.
Most expert editors use Photoshop to make photos look great with creative enhancements. However, making these pictures look realistic is not as simple as it sounds. It requires capturing subtle aesthetic transitions between background and foreground, which can be extremely difficult for complex materials such as animal hair.
In these pictures, each pixel doesn’t belong solely to one element. It’s often tough to figure out which pixels are part of the foreground (or a specific subject) and which pixels correspond to the background.
Nowadays, filmmakers rely heavily on CGI, so editors have to be proficient at ‘compositing’ – a method of combining background and foreground pictures in such a way that the scene looks real. This includes putting actors on different planets or giving them a new face, as the effects team did for the fictional character Davy Jones in Pirates of the Caribbean.
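At its core, compositing blends each foreground pixel over the background according to a per-pixel opacity (alpha) value: C = αF + (1−α)B. The following is a minimal NumPy sketch of that operation; the function name and toy images are ours for illustration, not part of the MIT system.

```python
import numpy as np

def composite(foreground, background, alpha):
    """Alpha-composite a foreground image over a background.

    foreground, background: float arrays of shape (H, W, 3), values in [0, 1]
    alpha: float array of shape (H, W), per-pixel opacity in [0, 1]
    """
    alpha = alpha[..., None]  # broadcast alpha over the color channels
    return alpha * foreground + (1.0 - alpha) * background

# Toy example: blend a white foreground over a black background at 50% opacity
fg = np.ones((2, 2, 3))            # white foreground
bg = np.zeros((2, 2, 3))           # black background
a = np.full((2, 2), 0.5)           # uniform 50% opacity
result = composite(fg, bg, a)      # every pixel becomes mid-gray [0.5 0.5 0.5]
```

The hard part is not the blend itself but estimating a fractional alpha for every pixel, which is exactly what soft segmentation provides.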
Getting every detail right in each frame is time-consuming, tedious work, even for experts. To make things easier, researchers at MIT have developed a new machine learning-based model that automates many parts of the photo editing process.
The main aim of the system is to make the image-editing process easier and faster so that experts don’t have to spend hours tweaking a picture pixel-by-pixel and frame-by-frame. It should take one click to merge photos and build realistic fantasy scenes.
How It Works
The model takes a photo and splits it into multiple layers separated by a set of ‘soft transitions’. The researchers call the technique Semantic Soft Segmentation. It examines the color and texture of an image and fuses them with object-recognition data obtained from a neural network.
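To illustrate the idea of fusing low-level color features with semantic network features into fractional (soft) pixel memberships, here is a toy sketch using soft k-means. This is not the MIT method (the actual paper builds a graph Laplacian and analyzes its eigenvectors); the function and parameters below are our assumptions purely to show what a "soft" segment looks like.

```python
import numpy as np

def soft_segments(image, semantic_features, k=4, temperature=10.0, iters=10, seed=0):
    """Toy sketch: fuse color with semantic features into k soft segments.

    image: (H, W, 3) float RGB in [0, 1]
    semantic_features: (H, W, D) per-pixel features from some recognition network
    Returns (H, W, k) soft assignments that sum to 1 at every pixel.
    """
    h, w = image.shape[:2]
    # Fuse low-level (color) and high-level (semantic) per-pixel features
    feats = np.concatenate([image.reshape(h * w, -1),
                            semantic_features.reshape(h * w, -1)], axis=1)
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(h * w, size=k, replace=False)]
    for _ in range(iters):
        # Squared distance of every pixel to every cluster center
        d2 = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Softmax over negative distances -> fractional memberships
        logits = -temperature * d2
        logits -= logits.max(axis=1, keepdims=True)
        soft = np.exp(logits)
        soft /= soft.sum(axis=1, keepdims=True)
        # Update centers as membership-weighted means
        centers = (soft.T @ feats) / soft.sum(axis=0)[:, None]
    return soft.reshape(h, w, k)
```

A pixel on a hair strand might come out 0.6 foreground / 0.4 background rather than being forced into one segment, which is what keeps composited edges from looking cut-out.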
The system computes these soft segments automatically, so you don’t have to isolate the image’s individual layers by hand. In short, it makes manual editing tasks, such as adjusting colors and replacing backgrounds, much easier.
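For example, once a soft segment is available, recoloring just that layer is a weighted blend rather than a hard cut-out. A short sketch of such an edit (the helper name and parameters are hypothetical, not from the paper):

```python
import numpy as np

def tint_layer(image, soft_mask, tint, strength=1.0):
    """Shift the color of one soft segment without introducing hard edges.

    image: (H, W, 3) RGB in [0, 1]
    soft_mask: (H, W) fractional membership of the target layer, in [0, 1]
    tint: length-3 target color
    """
    w = (strength * soft_mask)[..., None]   # per-pixel blend weight
    return np.clip((1 - w) * image + w * np.asarray(tint, dtype=float), 0.0, 1.0)
```

Because the mask is fractional, transition pixels (hair, fur, motion blur) pick up only a proportional amount of the new color, so the edit stays smooth at the boundary.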
Reference: The Computational Fabrication Group | MIT CSAIL
To implement the algorithm, the researchers used MATLAB’s eigendecomposition and direct solver. This step took nearly 3 minutes for a 640×480 image. The relaxed sparsification step uses MATLAB’s preconditioned conjugate gradient optimization; each soft segment converges in 50–80 iterations, taking about half a minute. Furthermore, the runtime increases linearly with the number of pixels.
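For readers who don’t use MATLAB, the two numerical building blocks, an eigendecomposition of a sparse Laplacian and a preconditioned conjugate gradient solve, look roughly like this in Python with SciPy. The random affinity matrix below is a toy stand-in: the real Laplacian encodes color, texture, and semantic affinities between pixels.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy stand-in for a per-pixel affinity graph over n "pixels"
n = 100
rng = np.random.default_rng(0)
W = sp.random(n, n, density=0.05, random_state=0)
W = (W + W.T) * 0.5                                   # symmetric affinities
L = sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W   # graph Laplacian

# Step 1: eigendecomposition (MATLAB eig analogue); for a real image this
# would be a sparse solver for just the smallest few eigenvectors.
vals, vecs = np.linalg.eigh(L.toarray())
smallest = vecs[:, :5]                # eigenvectors behind the soft layers

# Step 2: the sparsification stage solves linear systems iteratively;
# preconditioned conjugate gradients is the analogue of MATLAB's pcg.
A = (L + 1e-3 * sp.eye(n)).tocsr()    # regularized, symmetric positive definite
b = rng.standard_normal(n)
M = sp.diags(1.0 / A.diagonal())      # Jacobi (diagonal) preconditioner
x, info = spla.cg(A, b, M=M)          # info == 0 means it converged
```

Since both steps operate on per-pixel graphs, their cost grows with image size, which matches the reported linear scaling in the number of pixels.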
At present, Semantic Soft Segmentation is capable of editing only static images. However, researchers believe that the next version would work with videos as well, opening a variety of film-production applications.
The current model can be used on social platforms like Snapchat and Instagram to make filters more realistic, especially for simulating certain types of cameras or replacing backgrounds. In the future, the researchers will try to reduce the computation time from minutes to seconds and improve the model’s ability to handle shadows, illumination, and color matching.