Last December, Google Photos added a great new feature: Cinematic Photos. They are generated automatically by the app and can be found by tapping the Recent Highlights section.
How do Cinematic Photos work in Google Photos?
On its blog, Google explained how it manages to add movement to photos and give them such an eye-catching 3D effect. As usual, the answer involves its neural networks and computational expertise.
According to Google, Cinematic Photos aim to let the user relive “the immersive feeling of the moment when they took the photo” by simulating both the movement of the camera and 3D parallax. But how do they turn a 2D image into a 3D one?
Google uses neural networks trained on photos taken with the Pixel 4 to estimate depth from a single RGB image
Google explains that, just as with portrait mode or augmented reality, Cinematic Photos require a depth map that describes the 3D structure of the scene. To achieve the effect on any phone, even one without a dual camera, they trained a convolutional neural network to predict a depth map from a single RGB image.
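To make the idea concrete, here is a minimal sketch of what such a model looks like: a small convolutional encoder-decoder in PyTorch that maps a 3-channel RGB image to a 1-channel depth map. This is an illustrative toy, not Google's production network.

```python
# Toy monocular depth network: an encoder-decoder CNN that takes an RGB
# image and predicts a dense depth map. Illustrative only.
import torch
import torch.nn as nn

class TinyDepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: downsample while increasing channels
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: upsample back to the input resolution
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, rgb):
        # rgb: (batch, 3, H, W) -> depth: (batch, 1, H, W)
        return self.decoder(self.encoder(rgb))

model = TinyDepthNet()
image = torch.rand(1, 3, 256, 256)   # a fake RGB image
depth = model(image)                  # predicted depth map, shape (1, 1, 256, 256)
```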
With only one point of view (the plane of the photo), the network estimates depth from monocular cues such as the relative sizes of objects, the perspective of the scene, blur, and so on. To make the training data more complete, Google combines data captured with the Pixel 4’s camera with photographs taken by its team using professional cameras.
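One practical issue when mixing depth data from different capture setups is that the ground-truth depths may not share a common scale. A common remedy in the monocular-depth literature is a scale-invariant loss; the sketch below shows the idea, without claiming it is what Google actually uses.

```python
# Scale-invariant depth loss (in the spirit of Eigen et al., 2014).
# Working in log space turns a global scale difference into a constant
# offset, which the second term cancels out. A sketch of the general
# technique, not Google's recipe.
import torch

def scale_invariant_loss(pred_depth, gt_depth, lam=1.0, eps=1e-6):
    d = torch.log(pred_depth + eps) - torch.log(gt_depth + eps)
    n = d.numel()
    # With lam=1.0 the loss ignores any global scale difference entirely
    return (d ** 2).mean() - lam * (d.sum() ** 2) / (n ** 2)

pred = torch.rand(1, 1, 64, 64) * 10     # hypothetical predicted depths
gt = pred * 3.0                          # same structure, different scale
print(scale_invariant_loss(pred, gt))    # ~0: the scale mismatch is forgiven
```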
The technique is similar to the Pixel’s portrait mode: the image is analyzed and segmented, and once the background is isolated, movement is simulated by shifting it relative to the subject. It is considerably more complex, however, as it requires several corrections and additional analysis of the photograph, since a few misinterpreted pixels could ruin the final result.
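As a rough illustration of that final step, the toy sketch below shifts each pixel horizontally in proportion to its inverse depth, which is the essence of parallax: near pixels move more than far ones as a virtual camera slides sideways. It also shows why the real pipeline needs so much extra correction: naive warping leaves holes behind foreground objects.

```python
# Toy parallax warp: given an image and its depth map, shift pixels by an
# amount proportional to inverse depth to fake a sideways camera move.
# A heavily simplified illustration, not Google Photos' actual renderer.
import numpy as np

def parallax_frame(image, depth, max_shift=10):
    """image: (H, W, 3) uint8, depth: (H, W) where larger means farther."""
    h, w = depth.shape
    # Disparity (inverse depth), normalized to [0, 1]
    disparity = 1.0 / (depth + 1e-6)
    disparity = (disparity - disparity.min()) / (disparity.max() - disparity.min() + 1e-6)
    shift = (disparity * max_shift).astype(int)   # per-pixel horizontal shift

    out = np.zeros_like(image)
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    new_cols = np.clip(cols + shift, 0, w - 1)
    # Forward-warp: pixels land at their shifted column; the black holes
    # left behind are exactly what the real pipeline must inpaint/correct
    out[rows, new_cols] = image[rows, cols]
    return out

# Fake data: a random image and a depth map that is farther toward the top
img = np.random.randint(0, 255, (120, 160, 3), dtype=np.uint8)
dep = np.tile(np.linspace(5.0, 1.0, 120)[:, None], (1, 160))
frame = parallax_frame(img, dep)
```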