Baoquan Chen: Reality Capture
Author: Advanced Innovation Center for Future Visual Entertainment
01 The Essence of Reality Capture
Good afternoon, everyone. I’m glad to be here to talk about my research over the last decade, which can be summed up in one phrase: Reality Capture (RC). We need to develop new reality capture technologies for the films of the future, because RC is the foundation for acquiring digital content, and unique content is essential to any film yet to be made.
Surely, reality capture is a large topic. Reality is rich enough, anyway. Today, I will focus on scene capture. You could write a whole history of its technological development, and a splendid recent chapter is the capture of 3D objects and scenes by laser scanning. Why do we claim the term for our work? Because we are talking about capturing large outdoor scenes and complex indoor ones, rather than relatively small objects.
MassiveTrees SIGGRAPH 2011
My team started working on laser scanning in 2002, when the first commercial tripod-mounted outdoor laser scanner appeared. Back then, you had to leave the device on its tripod for about half an hour to get a single range image.
It was quite awesome then. You were able to get an information-dense 3D range image through straightforward measurement, with no supplementary computations required. Now, if you want to capture something like Mt Rushmore, you may just place this small machine at its foot, and wait.
3D scanning of Mt Rushmore
With the captured data, we can move around the object in 3D space and view it as a bird would. For an even larger scene, we have to scan from several positions and integrate the data into a single spatial coordinate system to get a dense 3D point cloud.
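The step of bringing scans from several positions into one coordinate system can be illustrated with a minimal Python sketch. It is only an illustration under simplifying assumptions, not the pipeline used in our work: the function names `rigid_transform` and `merge_scans` are hypothetical, each scanner station's pose is assumed to be already known, and the rotation is limited to the vertical axis. In practice the poses have to be estimated, for example with a registration algorithm such as ICP.

```python
import math

def rigid_transform(points, yaw_deg, translation):
    """Rotate points about the vertical (z) axis, then translate them.

    points: list of (x, y, z) tuples in the scanner's local frame.
    yaw_deg, translation: assumed-known pose of the scanner station
    in the shared world frame (hypothetical inputs for this sketch).
    """
    yaw = math.radians(yaw_deg)
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz)
            for x, y, z in points]

def merge_scans(scans):
    """Merge several local scans into one point cloud in the world frame.

    scans: list of (points, yaw_deg, translation), one entry per station.
    """
    cloud = []
    for points, yaw_deg, translation in scans:
        cloud.extend(rigid_transform(points, yaw_deg, translation))
    return cloud

# Two stations facing each other, 10 m apart, both observing the same
# physical point: it should land at the same world position from each scan.
station_a = ([(4.0, 0.0, 1.0)], 0.0, (0.0, 0.0, 0.0))
station_b = ([(6.0, 0.0, 1.0)], 180.0, (10.0, 0.0, 0.0))
cloud = merge_scans([station_a, station_b])
```

Once every scan is expressed in the same frame, overlapping measurements of the same surface coincide up to sensor noise, and the merged list is the dense point cloud mentioned above.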
As you might have seen, the raw data captured in this way is far from complete. Here is where computation comes in, for reconstruction. Our algorithm can recover the complete details of individual objects, such as a building or a tree, in an automatic or semi-automatic manner. Now let’s see how it may be useful for filmmaking: you can digitize a scene in a short time, and then do editing or recombination as you wish.
02 City in Computer
We set our sights on something larger in 2006. Wouldn’t it be fabulous if we could do RC all around a city, and construct a fully fledged city-scale 3D scene with algorithms in a relatively short time? (Remember that Google Earth was launched at more or less the same time, but its 3D modeling was largely manual then.)
So we ran an experiment that lasted a whole summer. The device was mounted on a vehicle to save us the effort of moving it, but efficiency was still poor, since we had to stop whenever a scan began. Even worse, a large part of the data could not be integrated into one coordinate system.
In fact, the area we scanned was less than 2 km², and not all the data was properly processed and reconstructed, so we concluded that the scanning technology was still too immature for large-scale use. Nevertheless, we were noticed by some media, including CBS.
Two years later, in 2008, I came back to China and settled in Shenzhen. Fortunately, the first commercial vehicle-based mobile laser scanner had just been launched, and it worked while the vehicle was moving at normal speed.
Vehicle-based mobile laser scanning
As is evident in this video, the system was capable of capturing city-scale 3D data at a speed that had been unimaginable merely two years before. We were really thrilled! However, a new challenge showed up immediately: how can we quickly reconstruct such dense 3D point cloud data when it covers an area as large as a whole city?
We dived into this issue in the following years, and developed a series of 3D point cloud processing algorithms. For example, we can reconstruct a residential building of over twenty floors in fine detail, including the balconies, in 40 minutes.
Mobile Laser Scanning
Let’s look at the streets again. There are many naturally growing plants of various genera and features. Is it possible to reconstruct them automatically? Tricky as it is, we accomplished this by automatically identifying the genera from the 3D point cloud, and reconstructing the trees according to the 3D characteristics of the identified genera. The technical details are recorded in our papers. A 30-minute episode of Approaching Science (CCTV), called “City in a Computer”, introduced our research.
We were flooded with inquiry calls after the show, which made us realize that the demand for quick 3D capture was tremendous and far beyond our capacity.
Different applications need high precision, real-time reconstruction, or the ability to handle dynamic scenes. For instance, in Smart City applications, the 3D scene has to be updated promptly whenever and wherever something changes in the city. And a filmmaker might want to digitize scenes quickly.
Three years ago, we began to use robots to do scanning and reconstruction, as a step towards true autonomy. In our first attempt, we had a robot automatically capture the 3D geometry of a small object, holding the object in one hand and a 3D scanner in the other.
Robotic autonomous object scanning SIGGRAPH ASIA 2014
The scanning path is determined automatically by the algorithm, and real-time 3D reconstruction begins when each designated location is reached. The experiment demonstrates the feasibility of automatic 3D object capture. Technically, it breaks with the earlier serial procedure, in which reconstruction came after scanning: here the two processes run at the same time, and where to scan next is partly determined by the results of the reconstruction so far. In other words, it’s truly closed-loop computation.
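To make the closed-loop idea concrete, here is a toy Python sketch of a scan, reconstruct, plan cycle on a 2D grid. It is a deliberately simplified illustration and not the algorithm from our papers: the names `visible_cells` and `closed_loop_scan` are hypothetical, the "sensor" simply sees every cell within a fixed radius with no occlusion, and the planner is a greedy rule that moves to the viewpoint expected to reveal the most unknown cells.

```python
def visible_cells(position, workspace, radius):
    """Cells of the workspace within `radius` of `position` (no occlusion)."""
    px, py = position
    return {(x, y) for (x, y) in workspace
            if (x - px) ** 2 + (y - py) ** 2 <= radius ** 2}

def closed_loop_scan(workspace, start, radius=2):
    """Alternate scanning and planning until the whole workspace is known.

    workspace: set of (x, y) grid cells to be captured.
    Returns the list of visited viewpoints and the set of captured cells.
    """
    known = set()        # the reconstruction built so far
    path = []
    position = start
    while True:
        # Scan from the current viewpoint and update the reconstruction.
        known |= visible_cells(position, workspace, radius)
        path.append(position)
        unknown = workspace - known
        if not unknown:
            break
        # Plan: the next viewpoint depends on what is still missing
        # from the reconstruction, which closes the loop.
        position = max(
            sorted(workspace),
            key=lambda c: len(visible_cells(c, workspace, radius) & unknown),
        )
    return path, known

# A small 6 x 4 workspace, starting from one corner.
workspace = {(x, y) for x in range(6) for y in range(4)}
path, known = closed_loop_scan(workspace, start=(0, 0))
```

The point of the sketch is the data flow: the planner reads the partial reconstruction (`known`) before every move, whereas in the serial procedure the whole path would be fixed before any reconstruction existed.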
The same idea works for indoor spaces as well. You can build a scanning robot that determines where to go, and then scans in an autonomous and real-time way until the room is completely captured.
Robotic indoor autonomous scanning SIGGRAPH ASIA 2015
These successes have made us very confident. In fact, robots can fly as well as walk. With the spread of UAVs, it is worthwhile to build an autonomous closed-loop capture-and-reconstruction system based on their photographic and navigational capabilities. We have been cooperating with DJI since 2009 on 3D reconstruction from aerial photography. Here is a video showing a 3D roam produced with these techniques.
3D reconstruction and roam based on aerial photography
03 Reality Capture in Films
The quick 3D scene capture toolkit will benefit many industries, but let’s consider filmmaking first. This is a new martial arts film by Haofeng Xu, The Hidden Sword, and it was filmed on the Great Wall.
The opening ceremony of The Hidden Sword
Our center has experimented with digitizing the Great Wall, where the film was shot, as technical support for planning and post-production. In the video, you can see a virtual camera whose effect is exactly the same as that of a physical one. The actors and actresses can be embedded into the virtual scene, so that the virtual camera can “capture” the hybrid scene authentically.
Staff of The Hidden Sword experience the virtual camera
Yet another challenge for films in the future is to capture real objects and events in motion, like this plant. We capture its continuous growth, and quickly reconstruct and analyze the 3D data. Its significance lies in its inspirational potential. For example, we may record an entire blossoming process, and watch its 3D reconstruction from any angle later.
Scanning a growing plant SIGGRAPH ASIA 2013
The next step is to capture physical properties, in addition to 3D geometry, of objects, like this lotus.
To achieve this, we need to 3D scan it, acquire its physical parameters, and run a simulation on a computer, say, of how it moves when a drop of dew falls on it or a breeze blows by. This is tremendously promising for the creators of the future.
04 Human Emotion Synthesis
Human movement is our foremost concern when it comes to video making. The industry has put great effort into motion capture, and our contribution has been to enrich captured motion sequences with something more complex and dramatic, beyond simply controlling the movements. Here is a sample dance.
Can we make the dancer a little more cheerful, or a little more downcast? If the playwright requires new emotions in a dance, an automatic synthesis program is here to help.
Synthesizing various emotions based on captured performance
In the end, reality capture is in the service of creation. About ten years ago, we tried to make something aesthetically expressive with our captured scenes. The digitized 3D scenes give us more information, so we can process videos better and more freely and present them as sketches, cartoons and watercolors, among other styles.
Sketch: Northrop Mall, U of M
Cartoon: Back of Walter Library, U of M
Watercolor roam: Stone Arch Bridge, Minneapolis
These works have all been exhibited as digital art. More than a decade has passed, but I still find them intriguing whenever I look at them. Reality capture, which has developed from static shapes to dynamic and physical properties, will provide increasingly rich material for artists.