Acquiring and Visualizing Compelling Interior Architecture
Summary

The acquisition and visualization of compelling interior architecture is one of the great challenges of computer graphics today. Worldwide, a large number of visually striking interior designs have been created for which no detailed digital models exist. Our ultimate goal is to develop a robust and practical approach for capturing the geometric and photometric details of interior architecture and for supporting a variety of tools and visualization applications.

To date, there is no robust, accurate, and widely used acquisition process for interior architecture. Many approaches in image-based rendering, 3D scanning, and computer vision began with the objective of capturing single objects and were later extended to acquire interior spaces. However, these approaches have overlooked several fundamental challenges that must be addressed to acquire interior spaces. In particular:

  • Interconnected Spaces. Indoor spaces consist of a network of narrowly interconnected spaces covering a large area; this severely complicates scene reconstruction and, in particular, camera pose estimation. While small positional errors might be tolerable, even small camera orientation errors have large ramifications for the estimation of distant structure. In fact, simultaneously recovering camera position and camera orientation is a fundamentally ill-conditioned problem: small errors in the measurements can produce large errors in the recovered pose and structure.

  • In-place Acquisition. While individual objects can be placed on controlled stages, the size and complexity of indoor spaces make explicit modeling prohibitive, and acquisition must, by definition, occur in place. Furthermore, important and compelling locations are in frequent and continual use. This precludes fully controlled scenarios; in practice, acquisition may have to occur while a site is in active use.

  • Large and Tedious Task. Capturing an indoor space is extremely tedious and monotonous. Moreover, since acquisition must capture many details with high accuracy, highly skilled personnel are often needed. The tedium of the work and the specialized skills it requires have so far prevented many capture efforts and keep us far from “point-and-shoot” acquisition of interior architecture.
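
The sensitivity to orientation error noted under Interconnected Spaces can be illustrated with simple arithmetic. The Python sketch below is not part of the project; it assumes a point seen at a given distance along the viewing ray and computes how far that point is displaced sideways when the viewing direction is off by a small angle. The function name and the example distances are hypothetical.

```python
import math

def displacement_from_orientation_error(distance_m, error_deg):
    """Lateral displacement of a point at distance_m when the viewing
    direction is rotated by error_deg (plain geometry: d * tan(angle))."""
    return distance_m * math.tan(math.radians(error_deg))

# A positional error shifts reconstructed structure by roughly its own
# magnitude regardless of distance; an orientation error grows with distance.
for d in (1.0, 10.0, 30.0):
    shift = displacement_from_orientation_error(d, 1.0)
    print(f"1 degree error at {d:5.1f} m -> {shift:.3f} m lateral shift")
```

Under these assumptions, a one-degree orientation error displaces structure 10 m away by roughly 0.17 m, which is why errors accumulate quickly across a chain of connected rooms.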

We are investigating a new approach to acquiring interior architecture that departs significantly from current acquisition strategies and addresses the fundamental problems that have, to date, limited efforts to acquire indoor spaces. The intended result is a novel semi-automatic acquisition process for indoor spaces that significantly increases robustness and decreases acquisition time. This work will impact computer scientists and engineers and provide significant new tools and applications for architects, designers, and curators. Our effort is divided into the following three major research categories.

Geometry Reconstruction and Refinement. The objective of this research task is to obtain a geometric model of interior architecture while completely removing the ill-conditioned problem of simultaneously computing camera position and camera orientation. This is particularly difficult and contrasts sharply with the current formulation of geometry reconstruction. We formulate and use a new set of reconstruction equations that is significantly more robust and frees us from depending on accurate camera pose estimation, which is very troublesome to perform within the complex interconnected areas (or rooms) of interior spaces.

Current Status: We have already developed a framework for eliminating variables from the 3D reconstruction equations. General variable elimination is hard, especially when the number of unknowns and constants in the equations is high, as is the case in this problem. Symbolic elimination tools developed for polynomial equations typically cannot handle a problem of this size and often produce high-degree polynomial expressions. In our framework, however, we use an invariant-based method: we parameterize the standard reconstruction equations by the parameters to omit, generate an equivalent set of equations invariant to those parameters, and formulate new low-degree polynomial equations that omit the chosen pose parameters. Removing parameters not only makes acquisition easier but also greatly improves the robustness of the numerical computations and yields significantly more accurate solutions. As preliminary results, we have removed camera orientation from the standard reconstruction equations and used the resulting equations to provide a very fast 3D structure recovery approach, a significantly more robust cost function for geometry refinement, and a notably more accurate vision-based registration method for mixed and augmented reality. In addition, we have recent results for removing both camera position and camera orientation parameters from the reconstruction equations, albeit with higher-degree polynomial equations and a requirement for depth estimates of the tracked features.
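
To make the idea of an equation that is invariant to camera orientation more concrete, the Python sketch below uses a standard geometric fact rather than the project's actual equations, which are not given here: the angle between the viewing rays to two scene points is unchanged by any rotation of the camera, so a constraint built on that angle involves camera position but not orientation. All names, point coordinates, and rotation angles are hypothetical.

```python
import math

def rotation_zyx(yaw, pitch, roll):
    """3x3 rotation matrix Rz(yaw) * Ry(pitch) * Rx(roll) as nested lists."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def bearing(R, cam_pos, point):
    """Unit ray toward `point` in the camera frame (R maps world to camera)."""
    v = [p - c for p, c in zip(point, cam_pos)]
    n = math.sqrt(sum(x * x for x in v))
    return [sum(R[i][j] * v[j] for j in range(3)) / n for i in range(3)]

def observed_cos(R, cam_pos, a, b):
    """Cosine of the angle between two observed rays (uses the rotated rays)."""
    ba, bb = bearing(R, cam_pos, a), bearing(R, cam_pos, b)
    return sum(x * y for x, y in zip(ba, bb))

def predicted_cos(cam_pos, a, b):
    """Same quantity from positions only; no orientation parameter appears."""
    va = [p - c for p, c in zip(a, cam_pos)]
    vb = [p - c for p, c in zip(b, cam_pos)]
    dot = sum(x * y for x, y in zip(va, vb))
    return dot / math.sqrt(sum(x * x for x in va) * sum(x * x for x in vb))

cam = [0.5, -1.0, 1.2]
p1, p2 = [4.0, 2.0, 1.0], [-3.0, 5.0, 2.5]
orientations = [(0.0, 0.0, 0.0), (0.7, -0.3, 1.1), (2.5, 0.9, -0.4)]
residuals = [
    abs(observed_cos(rotation_zyx(*o), cam, p1, p2) - predicted_cos(cam, p1, p2))
    for o in orientations
]
print(max(residuals))  # numerically zero for every orientation
```

A residual of this kind can be evaluated, and minimized over camera position and structure, without ever estimating the rotation. The project's own equations are described only as low-degree polynomials, so this example should be read as an analogy for the principle, not as their formulation.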

Photometry Reconstruction and Refinement. The goal of this research task is to replace the explicit modeling of individual surface-light interactions with a highly redundant image dataset captured from a dense set of viewpoints. Dense acquisition replaces complex and fragile algorithms, which interpolate to fill in missing samples, with more robust extrapolation and refinement, yielding a globally consistent photometric model that does not depend on accurate camera pose.

Current Status: We have developed several preliminary methods for photometric reconstruction that use a dense sampling of images and do not depend on accurate camera pose. These methods have been applied to several datasets of 2,000 to 10,000 omnidirectional images of 1024x1024 pixels, covering 30 to 1,000 square feet of various indoor spaces. First, we created two prototype systems for dense acquisition in indoor environments. These prototypes use a 4D parameterization of the plenoptic function suitable for representing the photometric information of a bounded indoor environment with flat ground surfaces. Second, we presented a tailored interactive image compression algorithm. The captured images are stored in a spatial image hierarchy combined with a model-based compression algorithm, which provides quick access to images along arbitrary viewpoint paths. Third, we described a novel high-quality image reconstruction algorithm that tolerates errors in pose estimation by exploiting dense image sampling. This feature globalization algorithm detects 2D features in each image, tracks them to neighboring images, matches them with similar features, and re-labels matched features as being the same. Our results show that, even in the presence of occluders, bad pose, and noisy images, the redundancy provided by dense sampling allows more correspondences to be found than with other methods.
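
The re-labeling step of feature globalization can be pictured as merging per-image feature observations into global identities. The Python sketch below is only an illustration under the assumption that detection, tracking, and matching have already produced pairwise matches; it uses a union-find structure, which is a choice made here for the example and is not stated to be the project's method. The class, function, and feature identifiers are hypothetical.

```python
class DisjointSet:
    """Minimal union-find over hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Keep the smaller root so the resulting labels are deterministic.
            self.parent[max(ra, rb)] = min(ra, rb)

def globalize(matches):
    """Map every (image_id, feature_id) observation to one global label.

    matches: iterable of pairs of (image_id, feature_id) tuples that were
    tracked or matched as the same feature.
    """
    ds = DisjointSet()
    for a, b in matches:
        ds.union(a, b)
    return {obs: ds.find(obs) for obs in list(ds.parent)}

matches = [
    ((0, 3), (1, 7)), ((1, 7), (2, 2)),   # a track across images 0, 1, 2
    ((5, 1), (6, 4)),                     # a second track in images 5, 6
    ((2, 2), (5, 1)),                     # similarity match joining both tracks
    ((3, 0), (4, 0)),                     # an unrelated feature
]
labels = globalize(matches)
print(labels[(6, 4)] == labels[(0, 3)])   # True: one feature after merging
```

In this toy input, the feature is lost between images 2 and 5 (for example, behind an occluder), and the single similarity match is enough to give all five observations the same label. This is the kind of extra correspondence that dense, redundant sampling makes available.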

Mobile Platform Acquisition. Our approach frees us from having to solve the challenging task of precise localization and requires only approximate mapping of a mobile platform. This makes automatic navigation particularly attractive, because it can be performed robustly with naive strategies while still acquiring visually compelling model information. We therefore expect that a semi-automatic mobile platform, requiring only minimal supervision by an untrained technician, will capture active, in-use indoor environments in about one day.

Current Status: We have developed several preliminary components of a mobile acquisition platform. In initial work, we developed an untethered, radio-controlled mobile acquisition platform for indoor spaces, built from off-the-shelf components. To obtain a 360-degree horizontal field of view (FOV) and a large vertical FOV, the camera uses a convex paraboloidal mirror with an orthographic projection. This setup successfully captured 10,000 images in a few hours, from a constant height, in indoor environments up to 1,000 square feet in size. An operator navigates the platform via radio remote control in a simple zigzag pattern through the environment. Pose estimates of guaranteed accuracy are triangulated from a set of fiducials carefully placed in the environment using a heuristic solution to a variation of the classical art-gallery problem. Furthermore, we developed a multi-camera design called lag cameras to perform interactive foreground segmentation. This design consists of a small cluster of cameras in which at least one camera follows (or “lags”) behind a lead camera in order to interactively acquire space-time samples of the environment. Moving objects can be interactively detected while the camera cluster itself is continuously moving through the environment.
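
The omnidirectional camera described above relies on a geometric property of paraboloidal mirrors: rays aimed at the mirror's focus are reflected parallel to the mirror axis, so an orthographic camera sees them from a single effective viewpoint. The Python sketch below shows the resulting mapping between image points and viewing directions for an idealized mirror. It is not the project's calibration code; it assumes image coordinates centered on the mirror axis and expressed in the same units as the mirror parameter h (the mirror radius at the height of the focus), and the function names are hypothetical.

```python
import math

def pixel_to_ray(x, y, h):
    """Unit viewing direction through the focus for image point (x, y).

    The mirror surface is z = (h^2 - r^2) / (2h) with its focus at the
    origin; the point (x, y, z) on the mirror gives the ray direction.
    """
    r2 = x * x + y * y
    z = (h * h - r2) / (2.0 * h)
    norm = (h * h + r2) / (2.0 * h)  # equals sqrt(r2 + z*z) for this surface
    return (x / norm, y / norm, z / norm)

def ray_to_pixel(d, h):
    """Image point for a unit direction d = (dx, dy, dz); inverse of the above."""
    dx, dy, dz = d
    if dz <= -1.0 + 1e-12:
        raise ValueError("direction along the negative axis is not imaged")
    s = h / (1.0 + dz)
    return (s * dx, s * dy)

h = 200.0
ray = pixel_to_ray(120.0, -50.0, h)
print(ray)
print(ray_to_pixel(ray, h))  # recovers (120.0, -50.0) up to rounding
```

In this idealized model, image points on the circle of radius h correspond to horizontal viewing directions, and points outside that circle look below the horizon. A real system would additionally need to estimate the mirror center and h from calibration.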


