Contributions and Thesis Outline

The scope of the current thesis is geometric registration for augmented reality. The general objective is to ensure that the displayed virtual content appears at the correct position and from the right perspective. Geometric registration notably includes tracking, spatial registration, and display calibration. Within this context, various algorithms are presented that allow the numerical estimation of the registration parameters. User knowledge is one important source of information, and we show how the presented algorithms can optimally exploit this data for accurate registration.

In Chapter 2, we present a closed-form solver to estimate the parameters of a single Euclidean or similarity transformation. The presented approach is a further generalization of recent state-of-the-art perspective-n-point (PnP and GPnP) algorithms and has several advantageous properties. First, it is universally applicable to arbitrary geometric configurations, including the planar and the non-planar PnP case. Second, it can handle both minimal and overconstrained cases, and only requires a linear effort with regard to the number of input correspondences. Third, it computes all relevant minima at once, including the global solution. Our derivation is based on the idea that the PnP problem can be interpreted as the least-squares registration between a set of points and corresponding lines in 3D space. The generalization consists of extending the applicability to also cover correspondences between points and planes, points and other points, or any mixture of these three correspondence types. The algorithm is based on a decoupling of the linear parameters (translation and scale) and the rotation using an orthogonal complement formulation. For the rotation parameters, a system of multivariate polynomial equations is set up, and its solutions are obtained by means of the Gröbner basis method.
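To give a concrete sense of the point-to-line least-squares formulation and of the linear decoupling of the translation, the following sketch evaluates the registration cost and recovers the optimal translation for a fixed rotation. It is a simplified illustration of the underlying objective, not the chapter's actual solver (which additionally estimates rotation and scale via a Gröbner basis); all function names are ours.

```python
import numpy as np

def point_to_line_cost(R, t, points, line_origins, line_dirs):
    """Sum of squared distances between transformed points R p + t and
    their corresponding 3D lines (each given by an origin and a direction)."""
    cost = 0.0
    for p, o, d in zip(points, line_origins, line_dirs):
        d = d / np.linalg.norm(d)
        r = (R @ p) + t - o
        # residual: component of r orthogonal to the line direction
        r_perp = r - (r @ d) * d
        cost += r_perp @ r_perp
    return cost

def optimal_translation(R, points, line_origins, line_dirs):
    """For a fixed rotation R, the optimal translation solves a small linear
    system: sum_i P_i (R p_i + t - o_i) = 0 with P_i = I - d_i d_i^T.
    This illustrates why the linear parameters decouple from the rotation."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, o, d in zip(points, line_origins, line_dirs):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)
        A += P
        b += P @ (o - R @ p)
    return np.linalg.solve(A, b)
```

For non-degenerate configurations (line directions not all parallel) the 3x3 system is well conditioned, so the translation comes essentially for free once the rotation is known.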

In comprehensive evaluations on synthetic data, we show that our formulation is not only more general but also faster than previous approaches. The results of this chapter are based on our awarded ACCV conference publication [WK17] and its CVIU journal follow-up [WSFTK18]:

• WIENTAPPER F., SCHMITT M., FRAISSINET-TACHET M., KUIJPER A.: A universal, closed-form approach for absolute pose problems. Computer Vision and Image Understanding (CVIU) (2018) — [WSFTK18]

• WIENTAPPER F., KUIJPER A.: Unifying algebraic solvers for scaled Euclidean registration from point, line and plane constraints. In Proc. Asian Conf. on Computer Vision (ACCV) (2017), pp. 52–66. Best Paper Honorable Mention Award for regular papers at the ACCV conference. — [WK17]

Algorithms for the PnP problem are an important element of dynamic motion estimation within tracking or SLAM. They are used to continuously estimate how the camera used for tracking is positioned and oriented with regard to the surrounding environment or the real-world model. Beyond that, our presented algorithm is also particularly useful for spatial registration. Based on correspondences between the real world and the virtual model, a Euclidean or similarity transformation can be computed, which allows the reconstructed real-world model and its associated camera path to be transformed into the coordinate system of the virtual model.
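For the special case of pure point-to-point correspondences, such a similarity alignment has a well-known closed form (the Umeyama/Horn solution), which conveys the idea even though the thesis's solver handles the more general mixed point/line/plane case. The following sketch, with names of our choosing, recovers scale, rotation, and translation from corresponding 3D points:

```python
import numpy as np

def similarity_align(src, dst):
    """Closed-form similarity transform (s, R, t) minimizing
    sum_i || s * R @ src[i] + t - dst[i] ||^2  (Umeyama's method)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    X, Y = src - mu_s, dst - mu_d            # centered point sets
    Sigma = Y.T @ X / len(src)               # cross-covariance
    U, D, Vt = np.linalg.svd(Sigma)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                         # guard against reflections
    R = U @ S @ Vt
    var_src = (X ** 2).sum() / len(src)      # variance of the source set
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_d - s * (R @ mu_s)
    return s, R, t
```

Given correspondences between a reconstructed map and the virtual model, this transform maps the entire reconstruction, including the camera path, into the model's coordinate frame.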

Chapter 3 focuses on the simultaneous, iterative refinement of a large number of motion and structure parameters based on measurements of an image sequence as part of a bundle adjustment (BA). Our main contribution consists of embedding scene knowledge provided by the user as constraints into the BA. Instead of integrating these constraints by means of a standard Lagrangian formulation [TMHF00], we propose to interpret them as manifolds in parameter space, leading to a strictly feasible optimization-on-manifold approach for BA. Compared to a Lagrangian formulation, it preserves the least-squares character of the minimization and also reduces the number of optimization parameters. Since the aim is to compensate low-frequency deformations of the reconstructed model and the camera path, we refer to this technique as non-rigid registration. The chapter is dedicated to the differences between our proposed approach and ordinary BA, and it comprises three steps. First, the constraints and the parameters for BA must be transformed into a common coordinate system. This involves a rigid similarity transformation and represents a use case for the algorithm of Chap. 2. Second, the constrained parameters must be projected onto their respective constraint manifolds to ensure the feasibility of the constraints right from the start. Third, during iterative minimization, feasibility is further maintained by forcing the parameters to evolve only along manifold geodesics. We exemplify these operations for various types of constraints, including fully known parameters, line- and plane-constrained points and translations, and axis-constrained rotations. We also discuss to what extent existing sparse minimization libraries can be used for the same purpose.
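The projection and geodesic-update steps are easy to illustrate for the simplest constraint manifolds, lines and planes, where the manifold is flat and its geodesics are straight lines within it. The snippet below is a minimal sketch under that assumption (the thesis treats further constraint types such as axis-constrained rotations); the function names are hypothetical:

```python
import numpy as np

def project_to_plane(p, n, d):
    """Closest point to p on the plane {x : n . x = d}; feasibility step
    before optimization begins."""
    n = n / np.linalg.norm(n)
    return p - (n @ p - d) * n

def project_to_line(p, o, u):
    """Closest point to p on the line {o + s * u}."""
    u = u / np.linalg.norm(u)
    return o + ((p - o) @ u) * u

def constrained_step(p, delta, n):
    """Update a plane-constrained point: only the component of the update
    lying in the plane's tangent space (here, the plane itself) is applied,
    so feasibility is preserved throughout the iterations."""
    n = n / np.linalg.norm(n)
    return p + delta - (n @ delta) * n
```

Because every iterate stays on its constraint manifold by construction, no Lagrange multipliers are needed and the problem keeps its least-squares form.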

Next, we present two concrete application examples of these techniques in the chapters that follow. In Chapter 4, we consider a marker-based SLAM pipeline. In this case, the user provides scene knowledge in advance by placing reference markers on planar surfaces or along edges of the target environment and by associating partial registration information with these reference markers. During runtime, all observed markers are first reconstructed in a local coordinate system. Once enough reference markers are reconstructed, the rigid alignment, parameter projection, and manifold-constrained BA for non-rigid registration are executed automatically, and tracking continues in the desired coordinate frame of the virtual model.

Chapter 5 addresses the case where a user manually establishes correspondences between a pre-reconstructed feature map and a virtual model using an appropriate interface. This user-provided information is exploited for rigid and subsequent non-rigid registration within our constrained BA framework. Moreover, we also demonstrate how this preparative stage additionally serves to improve feature recognition and tracking performance in order to set up a ready-to-use natural feature tracking-based application for AR.

For both use cases we show that by internalizing the user information into the BA, a substantial improvement in registration accuracy is gained, as low-frequency deformations of the reconstructed real-world model and the camera path that occur due to systematic or random measurement errors are compensated to a large extent. We have participated with our system in various public tracking benchmarks for AR, and we show the corresponding results at the end. Chapters 3–5 are mainly based on our IEEE 3DIMPVT conference publication [WWK11b], our Computers & Graphics journal publication [WWK11a], and in parts also on our CVIU journal publication [WSFTK18] (listed above):

• WIENTAPPER F., WUEST H., KUIJPER A.: Composing the feature map retrieval process for robust and ready-to-use monocular tracking. Computers & Graphics 35, 4 (2011), 778–788 — [WWK11a]

• WIENTAPPER F., WUEST H., KUIJPER A.: Reconstruction and accurate alignment of feature maps for augmented reality. In IEEE Proc. of Int'l Conf. on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT) (2011), pp. 140–147 — [WWK11b]

In Chapter 6, we consider the visualization side of geometric registration. We present a generic and cost-effective camera-based calibration for an automotive head-up display (HUD) as one example of an optical see-through display for AR. Our contribution comprises two aspects. First, we present a model that maps the setup consisting of the user (observer) and the display to pinhole camera parameters as needed for rendering.

These are naturally divided into user-related (head position) and hardware-related parameters. The latter consist of the view-independent spatial geometry, i.e. the exact location, orientation, and scaling of the virtual plane, and a view-dependent image warping transformation for correcting the distortions caused by the optics and the irregularly curved windshield. View dependency is achieved by extending the classical polynomial distortion model for cameras and projectors to a generic five-variate mapping with the head position of the viewer as additional input.
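The structure of such a five-variate polynomial warp can be sketched as follows: the two image coordinates and the three head-position coordinates are stacked into one input vector, monomials up to a chosen degree are formed, and each output coordinate is a learned linear combination of those monomials. This is an illustrative sketch only; the actual degree, basis, and coefficient fitting used in the thesis are not reproduced here, and all names are hypothetical:

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_features(x, degree):
    """All monomials of the 5-vector (u, v, hx, hy, hz) up to `degree`,
    including the constant term."""
    feats = [1.0]
    for deg in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), deg):
            feats.append(np.prod([x[i] for i in idx]))
    return np.array(feats)

def warp(uv, head, coeffs_u, coeffs_v, degree=2):
    """View-dependent warp: image point (u, v) plus head position (hx, hy, hz)
    are mapped through one polynomial per output coordinate. The coefficients
    would be estimated once during calibration."""
    x = np.concatenate([np.asarray(uv, float), np.asarray(head, float)])
    f = poly_features(x, degree)
    return np.array([f @ coeffs_u, f @ coeffs_v])
```

For five inputs and degree 2 this yields 21 monomials per output; because the head position enters the same polynomial as the image coordinates, a single coefficient set covers the whole range of viewpoints.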

Our model enables the HUD to be used together with a head tracker to form a head-coupled display, which ensures a perspectively correct rendering of any 3D object in vehicle coordinates from a large range of possible viewpoints. Second, we propose a procedure for retrieving the calibration parameters. The calibration involves capturing an image sequence from varying viewpoints while displaying a known target pattern on the HUD. For the accurate registration of the camera path we use the techniques presented in Chapters 3 and 5, which is why HUD calibration can be regarded as another application example. In the resulting approach all necessary data is acquired directly from the images, so no external tracking equipment needs to be installed. The accuracy of our HUD calibration is evaluated quantitatively and qualitatively. Finally, the calibration method is also extended to OST-HMD calibration. The separation of user and hardware parameters allows for quick user adaptation by means of a simple user interface. The results of this chapter are based on our ISMAR full paper [WWRF13] and the ISMAR short HMD demo paper [WEK14]. Moreover, we also published a patent based on this work [GWW15].

• WIENTAPPER F., WUEST H., ROJTBERG P., FELLNER D.: A camera-based calibration for automotive augmented reality head-up-displays. In IEEE Proc. Int'l Symp. on Mixed and Augmented Reality (ISMAR) (Oct 2013), pp. 189–197, awarded by the Fraunhofer IGD and TU Darmstadt's Interactive Systems Group (GRIS) with the Best Paper Award – Honorable Mentions in the category Impact on Business. — [WWRF13]

• WIENTAPPER F., ENGELKE T., KEIL J., WUEST H., MENSIK J.: [demo] User friendly calibration and tracking for optical stereo see-through augmented reality. In IEEE Proc. Int'l Symp. on Mixed and Augmented Reality (ISMAR) (Sept 2014), pp. 385–386 — [WEK14]

• GIEGERICH P., WIENTAPPER F., WUEST H.: Method and apparatus for controlling an image generating device of a head-up display. Patent WO/2015/044280, 02.04.2015 — [GWW15]

Finally, Chapter 7 concludes this thesis with a summary and a discussion of the achieved results.