
5.2 Conclusion & Outlook

This section provides an overview of the conclusions as well as possible future research directions in the domains of environment representation, SLAM, and place recognition/loop closure detection.

5.2.1 Environment representation

The main advantage of the proposed environment representation is that it models the environment using a variable-resolution grid stored in a hierarchy of axis-aligned rectangular cuboids. The approach is flexible in the sense that it allows the user to define the maximum number of children per node of the hierarchy, thereby influencing characteristics such as insertion time, access time, and the number of grid cells required to represent the environment. The evaluation highlights that, in comparison to the state-of-the-art OctoMap approach, the proposed approach requires fewer grid cells and provides faster access times. In addition, the number of inner nodes required to represent the hierarchy is significantly lower. However, the proposed approach requires higher insertion times, as it incrementally generates the hierarchy based on the sensor observations.
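The idea of a hierarchy of axis-aligned cuboids with a user-defined branching limit can be illustrated with a minimal sketch. The class below is an illustrative assumption, not the thesis implementation: it splits only along the longest axis into `max_children` sub-cuboids, whereas an octree-style split would always produce eight children.

```python
class CuboidNode:
    """One axis-aligned cuboid in the hierarchy; subdivides lazily on insertion."""

    def __init__(self, lower, upper, max_children=8):
        self.lower, self.upper = lower, upper   # (x, y, z) corners
        self.max_children = max_children        # user-defined branching limit
        self.children = []                      # empty => leaf cell
        self.occupied = False

    def contains(self, p):
        return all(lo <= c < hi for lo, c, hi in zip(self.lower, p, self.upper))

    def insert(self, p, min_size=0.25):
        """Mark the leaf containing p as occupied, subdividing until the
        longest cell edge falls below min_size."""
        node = self
        while max(hi - lo for lo, hi in zip(node.lower, node.upper)) > min_size:
            if not node.children:
                node._subdivide()
            node = next(c for c in node.children if c.contains(p))
        node.occupied = True
        return node

    def _subdivide(self):
        # Split into max_children equal slices along the longest axis
        # (a simplification of the variable-resolution scheme).
        axis = max(range(3), key=lambda a: self.upper[a] - self.lower[a])
        n = self.max_children
        step = (self.upper[axis] - self.lower[axis]) / n
        for i in range(n):
            lo, hi = list(self.lower), list(self.upper)
            lo[axis] = self.lower[axis] + i * step
            hi[axis] = self.lower[axis] + (i + 1) * step
            self.children.append(CuboidNode(tuple(lo), tuple(hi), n))
```

A smaller `max_children` yields a deeper tree with cheaper per-node scans, while a larger value flattens the hierarchy, which is the trade-off between insertion and access time discussed above.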

The fusion process proposed in this thesis assumes a static environment. Possible future work includes an extension of the fusion process to operate in dynamic environments. In principle, this extension can be carried out by monitoring the occupancy values of the fused grid cells and splitting them if these values fall below the fusion threshold. Another important research direction is to incorporate object-level dynamics into the environment representation. In such a scenario, a classifier would be used to detect objects in the point cloud, approximate them as rectangular cuboids, and add them to the variable-resolution grid. Incorporating object-level hypotheses into the environment representation is an essential step towards semantic mapping and critical for the development of intelligent and autonomous robots.

5.2.2 Laser Intensities for SLAM

In the context of laser-based SLAM, a simple calibration process for extrinsic parameters is proposed that allows the robot to acquire a measure of surface reflectivity. The importance of the proposed calibration process is shown by comparing it to other models that systematically ignore the influence of extrinsic parameters. The results show that extrinsic parameter calibration is essential for acquiring a pose-invariant measure of surface reflectivity.


In addition, this reflectivity measure is used in an extension of Hector SLAM in which a robot simultaneously estimates its own pose and acquires a reflectivity map of the environment. The proposed Hector SLAM extension has been shown to accurately estimate the robot pose and can be useful in cases where geometry information is ambiguous.

The reflectivity maps generated by the proposed approach can be used in a wide variety of robotic applications such as global localization, navigation, and exploration.

The proposed Hector SLAM extension relies on a cost function based only on surface reflectivity information; hence, it would be interesting to consider other cost functions that combine reflectivity and occupancy information and evaluate their performance. In addition, scenarios in which the point density is low can be problematic for normal vector estimation, thereby causing problems for extrinsic parameter correction. An interesting future research direction would therefore be to switch cost functions based on the point density observed by the robot.

5.2.3 Place Recognition/Loop Closure Detection

This thesis evaluates and highlights the advantage of laser intensities for place recognition under challenging lighting conditions and compares their performance with other types of input data, such as camera images or geometry information from laser scanners. The experimental evaluation shows that using intensity images as input, in comparison to other forms of input data, i.e. camera or range images, is beneficial for place recognition algorithms (based on local or global descriptors) operating under challenging lighting conditions. The results also underline the importance of using intensity-textured point clouds for 3D point cloud based place recognition. The evaluation highlights certain design decisions in the context of place recognition algorithms, such as the strong dependence of global descriptors on observer orientation, the effect of the limited field of view of the rectilinear projection model, and the decrease in performance due to downsampling of point clouds. An interesting future research direction would be to develop approaches that combine the advantages of local and global descriptors for place recognition. It will also be interesting to combine different types of input data, such as camera images or intensity information from laser scanners, to take advantage of their properties under different conditions.

In the context of vocabulary generation mechanisms, the proposed loop closure detection approach shows that it is possible to generate binary vocabularies in an online, incremental manner. The proposed vocabulary generation mechanism, coupled with a simple similarity function and a temporal consistency constraint, achieves high precision and recall on real-world datasets in comparison to state-of-the-art loop closure detection algorithms. A drawback of the proposed vocabulary generation mechanism is the linear complexity of the update process. An interesting research direction would be to develop an approach that allows generation of a binary vocabulary tree in an online, incremental manner, thereby reducing the vocabulary update complexity from linear to logarithmic in the number of descriptors present in the vocabulary.
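The linear update cost mentioned above can be made concrete with a minimal sketch of an online, incremental binary vocabulary. The descriptor width, Hamming threshold, and flat-list representation are illustrative assumptions, not the thesis implementation:

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as integers."""
    return bin(a ^ b).count("1")

class BinaryVocabulary:
    """Flat list of binary words; every update scans all words (linear cost)."""

    def __init__(self, merge_threshold=10):
        self.words = []
        self.merge_threshold = merge_threshold

    def update(self, descriptor):
        """Return the index of the matching word; if no existing word is
        within the Hamming threshold, the descriptor becomes a new word."""
        for i, word in enumerate(self.words):          # O(len(words)) scan
            if hamming(word, descriptor) <= self.merge_threshold:
                return i                               # quantized to word i
        self.words.append(descriptor)                  # incremental growth
        return len(self.words) - 1
```

Replacing the flat list with a vocabulary tree, as suggested above, would reduce the scan in `update` from linear to logarithmic in the number of words.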

A.1 Equirectangular projection

Given the laser scanner observations in Cartesian coordinates, the first step is to convert them to spherical coordinates defined by range, azimuth, and elevation. The intensity, azimuth, and elevation of each point observation are used to generate an equirectangular intensity image. It is possible to interpret the elevation and azimuth of the sensor observations as the rows and columns of an image, respectively, and accumulate the intensity values to form a grayscale image as shown in Figure A.1(a). An example of the panoramic grayscale image generated via the above-mentioned projection is shown in Figure A.1(b), whereas Figure A.4 shows the pseudocode for generating it given a point cloud.
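The projection above can be sketched as follows. The image resolution, angle ranges, and the assignment of elevation to rows and azimuth to columns are illustrative assumptions; the thesis pseudocode may differ in these details.

```python
import numpy as np

def equirectangular_intensity_image(points, intensities, rows=64, cols=360):
    """Project an (N, 3) Cartesian point cloud with per-point intensities
    into an equirectangular grayscale image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    # Cartesian -> spherical: range, azimuth in [-pi, pi], elevation in [-pi/2, pi/2].
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.arctan2(y, x)
    elevation = np.arcsin(z / np.maximum(r, 1e-9))

    # Map angles to pixel indices (elevation -> row, azimuth -> column).
    col = ((azimuth + np.pi) / (2 * np.pi) * (cols - 1)).astype(int)
    row = ((elevation + np.pi / 2) / np.pi * (rows - 1)).astype(int)

    img = np.zeros((rows, cols))
    img[row, col] = intensities   # accumulate; last point per bin wins here
    return img
```

A real sensor produces several points per bin; averaging or taking the maximum intensity per bin would be a natural refinement of the last-wins accumulation used in this sketch.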


Fig. A.1: (a) Laser scanner observation of the j-th point in the i-th point cloud Pi. (b) Equirectangular intensity image obtained after projecting the point cloud. The azimuth and elevation of the j-th point are denoted by ηj and λj, respectively.


Fig. A.2: (a) The process of range image generation, in which the range value is accumulated in the relevant elevation, azimuth bin. This range image is then normalized by the maximum range (represented by r̄j in the figure) to generate a matrix of floating-point values between 0 and 1. (b) (Best visualized in color) An example of the generated range image visualized with an HSV colormap.
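The range-image generation described in the caption can be sketched analogously to the intensity-image projection: ranges are binned by elevation and azimuth, then divided by the maximum range so all values lie in [0, 1]. The resolution and bin layout are illustrative assumptions.

```python
import numpy as np

def normalized_range_image(points, rows=64, cols=360):
    """Project an (N, 3) Cartesian point cloud into a range image
    normalized by the maximum observed range."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.arctan2(y, x)
    elevation = np.arcsin(z / np.maximum(r, 1e-9))

    # Same elevation -> row, azimuth -> column binning as the intensity image.
    col = ((azimuth + np.pi) / (2 * np.pi) * (cols - 1)).astype(int)
    row = ((elevation + np.pi / 2) / np.pi * (rows - 1)).astype(int)

    img = np.zeros((rows, cols))
    img[row, col] = r                     # last observation per bin wins
    return img / max(img.max(), 1e-9)     # normalize into [0, 1]
```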