5. Automatic Reconstruction of Indoor Spaces

5.1. Point Cloud Pre-Processing

Pre-processing of the point cloud in this work includes outlier removal, downsampling, noise removal, point cloud leveling and finally furniture and clutter removal.

5.1.1. Outlier Removal

Compared with TLS, data collected by range cameras contain more noise and more outliers due to measurement errors. This affects the accuracy of subsequent processing steps, which include normal vector estimation and the estimation of walls. In the first pre-processing step, outliers are removed using the statistical outlier removal filter implemented in the Point Cloud Library (PCL) (Rusu and Cousins, 2011). The filter is based on the distribution of the distances between neighboring points: at each point, the mean distance to its K nearest neighbors is computed. Assuming a Gaussian distribution for these mean distances, points whose mean distance lies outside an interval defined by the global mean and standard deviation are removed using a one-tailed normal distribution test. Figure 5.1 depicts the mean distances computed for an exemplary scene. In this example, outliers are removed based on a 1σ one-tailed normal distribution test:

$$\mathrm{dist} \sim N(\mu, \sigma^2), \qquad P(\mathrm{dist} \le \mu + \sigma) \approx 84.1\% \tag{5.1}$$

Figure 5.1 – Top: statistical outlier removal (9K out of 73K points are removed, equivalent to the 1σ confidence level one-tailed normal distribution test); Bottom: corresponding mean distance (in meters) to the 20 nearest neighbors versus point index (mean: 0.011m, standard deviation: 0.005m).
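The filtering rule described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the PCL implementation: the function names, the brute-force neighbor search, the use of the population standard deviation, and the synthetic test data are all assumptions made for this example.

```python
import math
import statistics

def mean_knn_distances(points, k):
    """Mean distance from each point to its k nearest neighbors (brute force)."""
    means = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        means.append(sum(dists[:k]) / k)
    return means

def remove_statistical_outliers(points, k=20, n_sigma=1.0):
    """Keep points whose mean k-NN distance is at most mu + n_sigma * sigma."""
    mean_dists = mean_knn_distances(points, k)
    mu = statistics.fmean(mean_dists)
    sigma = statistics.pstdev(mean_dists)  # population std. dev. (assumption)
    threshold = mu + n_sigma * sigma
    return [p for p, d in zip(points, mean_dists) if d <= threshold]

# Fraction of a Gaussian lying below mu + 1 sigma (one-tailed test), cf. Eq. 5.1.
one_tailed_fraction = statistics.NormalDist().cdf(1.0)

# Synthetic data: a dense 5 x 5 x 5 grid (1 cm spacing) plus two stray points.
grid = [(0.01 * x, 0.01 * y, 0.01 * z)
        for x in range(5) for y in range(5) for z in range(5)]
strays = [(1.0, 1.0, 1.0), (-1.0, 0.5, 2.0)]
filtered = remove_statistical_outliers(grid + strays, k=8, n_sigma=1.0)
```

In this toy scene the two stray points dominate the global standard deviation, so only they exceed the threshold. On real range camera data the distribution is much less extreme, and a 1σ threshold removes a noticeable share of the points, as in Figure 5.1.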

5.1.2. Downsampling

Point clouds collected by range cameras are usually very dense. The number of points grows rapidly once data acquisition starts; each range image frame captured by Kinect (for Xbox 360) delivers 640 × 480 ≈ 300K 3D points, and each frame captured by Kinect V2 delivers 1920 × 1080 ≈ 2 million points. Reconstruction programs can therefore have difficulty handling the large number of points collected from large spaces.

In order to manage memory efficiently and to achieve a uniform point density, the registered point cloud is downsampled by the voxel grid filter implemented in the PCL software library, in which points are replaced by the centroid of the corresponding bounding voxel generated by an octree data structuring process (see figure 5.2). The voxel size can be set based on the overall noise of the registered point cloud (e.g. 3-5cm when using Kinect), or, for example, on the maximum tolerance allowed by the adopted standard for building reconstruction (e.g. the surface flatness tolerance suggested by the DIN 18202 standard).

Figure 5.2 – Voxel grid filtering: points inside each voxel are replaced by the centroid of the corresponding voxel (red points).
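A minimal sketch of voxel grid downsampling is given below. It is an illustration under simplifying assumptions rather than the PCL filter: voxels are indexed with a hash map instead of an octree, each occupied voxel is represented by the centroid of the points it contains, and the function and variable names are chosen for this example only.

```python
import math
from collections import defaultdict

def voxel_grid_downsample(points, voxel_size):
    """Replace all points that fall into the same voxel by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    centroids = []
    for members in buckets.values():
        n = len(members)
        centroids.append(tuple(sum(m[axis] for m in members) / n
                               for axis in range(3)))
    return centroids

# Points per frame quoted in the text.
points_per_frame_kinect = 640 * 480        # 307200, i.e. about 300K
points_per_frame_kinect_v2 = 1920 * 1080   # 2073600, i.e. about 2 million

# Three points share one 5 cm voxel, the fourth lies in another voxel.
cloud = [(0.01, 0.01, 0.0), (0.02, 0.03, 0.0), (0.03, 0.02, 0.0), (0.26, 0.01, 0.0)]
reduced = voxel_grid_downsample(cloud, voxel_size=0.05)
```

Because every occupied voxel contributes exactly one output point, the point density after filtering is bounded by the voxel size regardless of how many frames covered the same surface.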

5.1.3. Noise Removal

Noise in the point cloud causes erroneous object fitting and feature extraction. Although the wall estimation algorithm used in the subsequent modeling process can cope with noise in the point cloud to some extent, noise removal significantly improves the accuracy of feature detection in modeling. The point cloud is smoothed using the moving least squares approximation algorithm, originally introduced by Lancaster and Salkauskas (1981) and implemented in the PCL software library. As described by Nealen (2004), the idea of the moving least squares algorithm is to start with a weighted least squares surface estimator (a polynomial of degree n) for an arbitrary fixed point, in which the weights depend on the distances to the neighboring points within a given radius. The point is then moved over the entire parameter domain, and a weighted least squares fit is estimated for each point individually in order to estimate the overall surface. The global function $f(x)$ is obtained from a set of local functions $f_x(x)$ that minimize the following cost function:

$$f(x) = f_x(x), \qquad \sum_{i \in I} \theta(\lVert x - x_i \rVert)\, \lVert f_x(x_i) - f_i \rVert^2 \to \min \tag{5.2}$$

in which $\theta$ is the weight function, which tends to zero as the distance tends to infinity. During the process, small holes can be filled by resampling techniques, e.g. based on higher-order polynomial interpolation. This can further remove the “double wall” artifacts caused by erroneous registration of multiple scans. Figure 5.3 depicts an example in which an overall noise of 34mm is reduced to 25mm by local plane fitting using a moving least squares process with a search radius of 15cm.

Figure 5.3 – Before and after noise removal. In the cross section before the noise removal, the limited resolution of Kinect disparity measurements is noticeable as a stripe pattern.
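To make the moving least squares idea concrete, the following sketch applies it to a one-dimensional profile instead of a 3D surface. It is a simplified illustration, not the PCL implementation: the local estimator is a degree-1 polynomial, the Gaussian weight function and its scale are assumptions, and the noisy profile is synthetic.

```python
import math

def gaussian_weight(distance, radius):
    """Weight that decays with distance and is zero outside the search radius."""
    if distance > radius:
        return 0.0
    return math.exp(-(distance / (radius / 2.0)) ** 2)

def mls_value(x, samples, radius):
    """Evaluate f(x) = f_x(x) from a weighted least squares line fit around x."""
    s_w = s_wd = s_wdd = s_wf = s_wdf = 0.0
    for x_i, f_i in samples:
        w = gaussian_weight(abs(x - x_i), radius)
        d = x_i - x
        s_w += w
        s_wd += w * d
        s_wdd += w * d * d
        s_wf += w * f_i
        s_wdf += w * d * f_i
    if s_w == 0.0:
        raise ValueError("no samples within the search radius")
    det = s_w * s_wdd - s_wd * s_wd
    if abs(det) < 1e-12:
        return s_wf / s_w  # degenerate case: fall back to the weighted mean
    # Local model f_x(t) = a + b * (t - x); its value at t = x is a.
    return (s_wf * s_wdd - s_wd * s_wdf) / det

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

# Synthetic profile: a slightly inclined line sampled every 1 cm, +/- 3 cm noise.
truth = [2.0 + 0.1 * 0.01 * i for i in range(101)]
samples = [(0.01 * i, truth[i] + (0.03 if i % 2 else -0.03)) for i in range(101)]
smoothed = [mls_value(x_i, samples, radius=0.15) for x_i, _ in samples]

rms_before = rms([f - t for (_, f), t in zip(samples, truth)])
rms_after = rms([s - t for s, t in zip(smoothed, truth)])
```

Each evaluation point gets its own local fit, which is what distinguishes the moving variant from a single global least squares fit. For surfaces, the same scheme is applied with a local plane or higher-degree polynomial over a 2D parameter domain.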

5.1.4. Leveling the Point Cloud

The resulting point cloud has to be leveled for the upcoming processing steps, i.e. the generation of the point height histogram and the projection of points onto a horizontal plane. The leveling is performed by analyzing the normal vectors of the point cloud. Since most surfaces in man-made scenes are aligned either horizontally or vertically, the normal vectors can be clustered into two main groups. Assuming that the vertical axis of the point cloud’s local coordinate system is inclined by less than 45° with respect to the vertical axis of the world coordinate system, the groups of horizontal and vertical surfaces can easily be distinguished, as their normal directions differ by 90°. The average of the normal vectors corresponding to horizontal (or alternatively vertical) surface points is then used to find the tilt, and thus to level the point cloud. The procedure is iterative: in each step, after leveling the point cloud based on the estimated tilt, the classification of horizontal and vertical surface points is updated based on the newly transformed normal vectors. Figure 5.4 depicts an example of a normal vector histogram for a sample point cloud after leveling.

Figure 5.4 – Left: computed point normal vectors. Right: histogram of the inclination of the normal vectors with respect to the vertical axis after leveling the point cloud.
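The iterative leveling procedure can be sketched as follows. This is a hedged illustration, not the implementation used in this work: it operates on unit normal vectors only, averages the normals of horizontal surfaces, and uses a Rodrigues rotation to align that average with the z axis. The helper names, the iteration limit, and the synthetic normals are assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    return math.sqrt(dot(v, v))

def rotate(v, axis, angle):
    """Rodrigues rotation of v about a unit axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    k_x_v, k_dot_v = cross(axis, v), dot(axis, v)
    return tuple(v[i] * c + k_x_v[i] * s + axis[i] * k_dot_v * (1.0 - c)
                 for i in range(3))

def estimate_up(normals, max_angle_deg=45.0):
    """Average the normals lying within max_angle_deg of the +/- z axis."""
    cos_limit = math.cos(math.radians(max_angle_deg))
    acc = [0.0, 0.0, 0.0]
    for n in normals:
        if abs(n[2]) > cos_limit:
            sign = 1.0 if n[2] > 0 else -1.0  # make floor and ceiling agree
            for i in range(3):
                acc[i] += sign * n[i]
    length = norm(acc)
    if length == 0.0:
        raise ValueError("no horizontal surface normals found")
    return tuple(c / length for c in acc)

def level_normals(normals, iterations=5):
    """Return leveled normals and the list of (axis, angle) rotations applied."""
    z_axis, rotations = (0.0, 0.0, 1.0), []
    for _ in range(iterations):
        up = estimate_up(normals)  # classification is redone in every step
        axis = cross(up, z_axis)
        sin_t = norm(axis)
        if sin_t < 1e-9:
            break  # already level
        angle = math.atan2(sin_t, dot(up, z_axis))
        axis = tuple(c / sin_t for c in axis)
        normals = [rotate(n, axis, angle) for n in normals]
        rotations.append((axis, angle))
    return normals, rotations

# Synthetic scene: floor, ceiling and two wall directions, tilted by 10 degrees.
true_normals = ([(0.0, 0.0, 1.0)] * 50 + [(0.0, 0.0, -1.0)] * 30 +
                [(1.0, 0.0, 0.0)] * 40 + [(0.0, 1.0, 0.0)] * 40)
tilted = [rotate(n, (1.0, 0.0, 0.0), math.radians(10.0)) for n in true_normals]
leveled, rotations = level_normals(tilted)
```

The returned rotations would be applied to the point coordinates in the same order. With noisy normals, several iterations can be needed because points near the 45° boundary may change class after each rotation.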

5.1.5. Height Estimation and Furniture Removal

The room height enables the extrusion of the generated 2D models, and therefore their conversion to 3D models. After leveling the point cloud, the room height can be estimated by analyzing the point height histogram. The floor and ceiling can be identified in the histogram as the local maxima at the lowest and highest heights, even if only small parts of them are captured (see figure 5.5). The number of histogram bins corresponds to the voxel size used in the previous downsampling process. Therefore, if the point cloud has been downsampled by the voxel grid filter, the histogram values correspond to surface area rather than to the number of points.
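A possible way to read the floor and ceiling heights from the height histogram is sketched below. It is an illustration under assumptions: the bin width is set equal to the voxel size, a hypothetical `min_count` threshold suppresses insignificant peaks, and the heights are synthetic.

```python
import math

def height_histogram(heights, bin_size):
    """Histogram of point heights; returns the lowest height and bin counts."""
    z_min = min(heights)
    n_bins = int(math.floor((max(heights) - z_min) / bin_size)) + 1
    counts = [0] * n_bins
    for z in heights:
        counts[int(math.floor((z - z_min) / bin_size))] += 1
    return z_min, counts

def floor_and_ceiling(heights, bin_size, min_count):
    """Heights of the lowest and highest significant local maxima."""
    z_min, counts = height_histogram(heights, bin_size)
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i + 1 < len(counts) else 0
        if c >= min_count and c >= left and c > right:
            peaks.append(i)
    if len(peaks) < 2:
        raise ValueError("floor and ceiling peaks not found")
    # Report the center of the corresponding bins.
    floor_z = z_min + (peaks[0] + 0.5) * bin_size
    ceiling_z = z_min + (peaks[-1] + 0.5) * bin_size
    return floor_z, ceiling_z

# Synthetic heights (in meters): floor, a table top, walls and ceiling.
heights = ([0.0] * 200 + [0.76] * 80 + [2.52] * 150 +
           [0.025 + 0.05 * i for i in range(50)] * 5)
floor_z, ceiling_z = floor_and_ceiling(heights, bin_size=0.05, min_count=50)
room_height = ceiling_z - floor_z
```

The table top also produces a significant local maximum, but it lies between the lowest and the highest peak and therefore does not affect the estimated room height.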

As mentioned before, the presented 2D modeling approach is based on the projection of the points onto a horizontal plane. Therefore, similar to Okorn et al. (2010), furniture and clutter are removed by selecting a cross section of the 3D point cloud that is less affected by clutter. No important information is lost in doing so, as the remaining points corresponding to walls provide the required information about the room shape. The height range can be selected based on the typical height of furniture (points with heights of less than e.g. 1 – 1.5m) as well as of lights or ceiling fans (points lying within e.g. 0.5m below the ceiling). Figure 5.6 depicts an example of furniture removal using this concept for a sample room. It should be noted that in practice the furniture removal is recommended to be performed in a supervised (or interactive) way, since any remaining clutter may have a large impact on the subsequent modeling steps, unless the modeling parameters are selected manually.

Figure 5.5 – Point height histograms and the corresponding point clouds.

Figure 5.6 – Furniture removal based on a selective height filter. The top view of the filtered points (bottom-right figure) delivers information about the room shape geometry.
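The selective height filter and the subsequent projection can be sketched as below. The default limits follow the example values given in the text (1.5m above the floor, 0.5m below the ceiling); the function names and the sample points are assumptions made for this illustration.

```python
def select_wall_slice(points, floor_z, ceiling_z,
                      furniture_height=1.5, ceiling_margin=0.5):
    """Keep points above typical furniture and below lights or ceiling fans."""
    lower = floor_z + furniture_height
    upper = ceiling_z - ceiling_margin
    if lower >= upper:
        raise ValueError("height range is empty; adjust the limits")
    return [p for p in points if lower <= p[2] <= upper]

def project_to_plane(points):
    """Project 3D points onto the horizontal plane by dropping the height."""
    return [(x, y) for x, y, _ in points]

# Sample points (x, y, z) in meters for a room with a 2.5 m ceiling.
room = [(0.0, 1.0, 0.4),   # furniture
        (2.0, 1.0, 0.8),   # furniture
        (0.0, 3.0, 1.8),   # wall
        (4.0, 3.0, 1.9),   # wall
        (2.0, 2.0, 2.3)]   # ceiling lamp
wall_slice = select_wall_slice(room, floor_z=0.0, ceiling_z=2.5)
footprint = project_to_plane(wall_slice)
```

In a supervised workflow, the two limits would be adjusted interactively until the projected footprint is free of remaining clutter.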