
5.2 Visual Exploration of Classifiers for Hybrid Textual and Geospatial Matching

5.2.4 Decision Tree Hyperplanes

(Figure: decision tree with internal nodes testing the features levenshtein, logDist, partof, bw, hyph, and substLev against thresholds, and yes/no leaf labels.)

Figure 5.13: Decision tree used in a classifier iteration.

match, but since the lowercase German letter ß has no corresponding uppercase counterpart, it is written as SS in uppercase. The matching candidate is therefore a match. This is an example of the effectiveness of the visualization in identifying possible improvements to the underlying representation.



Figure 5.14: Two-dimensional feature vectors, classified as red and blue and separated by 2-space hyperplanes (green line). The right part of the image shows a projection to 1-space; the separating hyperplanes within the domain are mapped to a line segment.

The concept of decision tree hyperplanes is introduced here. Figure 5.14 shows how a decision tree separates two-dimensional feature vectors. The dots represent feature vectors; blue and red color encodes the classification. The decision tree divides the space along the axes and thereby separates the differently classified feature vectors. Since the feature vectors used by the decision tree classifier are high-dimensional, they need to be projected to a lower-dimensional space for visualization, as indicated in Figure 5.14. The hyperplanes defined by the classifier are unbounded, so their projections would cover the whole domain.

However, the hyperplanes can be clipped by the bounding volume of the feature vectors, so that they project to a finite volume, which in the 1D case is a line segment.

Only blue points project to one side of this line segment and only red points to the other; within the segment, blue and red points mix and cannot be visually separated in the projection.
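As a hedged illustration of this clipping step (illustrative names and random data, not the original implementation), the following Python sketch clips a single axis-aligned split to the bounding box of 2D data and projects it to 1D:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))          # 2D feature vectors as in Figure 5.14
t = 0.3                                # hypothetical split: feature 1 <= t

# Unclipped, the hyperplane x1 = t is an infinite line and would project
# onto the whole 1D domain. Clipping it by the bounding box of the
# feature vectors yields a finite segment:
lo, hi = X.min(axis=0), X.max(axis=0)
segment = np.array([[lo[0], t],
                    [hi[0], t]])       # endpoints of the clipped hyperplane

# Project points and segment onto axis 0 (drop feature 1): the clipped
# hyperplane maps to the finite interval [lo[0], hi[0]].
proj_points = X[:, 0]
proj_segment = segment[:, 0]
print(proj_segment)                    # -> [lo[0], hi[0]]
```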

Figure 5.15 visualizes the hyperplanes of the decision tree defined in Figure 5.13.

The feature vectors were projected orthogonally onto the features levenshtein, logDist, and partof. In 3-space, the projected hyperplanes are illuminated and rendered opaque together with the projected feature vectors. Hyperplanes that divide regions based on the selected features project to planes in 3-space; hyperplanes dividing regions based on other features project to volumes containing all feature vectors that they divided.
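Concretely, such an orthogonal projection is just a column selection; a small sketch under assumed names (the feature list follows Figure 5.13, the data is a random placeholder):

```python
import numpy as np

FEATURES = ["levenshtein", "logDist", "partof", "bw", "hyph", "substLev"]
selected = [FEATURES.index(f) for f in ("levenshtein", "logDist", "partof")]

X = np.random.default_rng(3).normal(size=(100, 6))  # placeholder feature vectors
X3 = X[:, selected]                                 # orthogonal 3-space projection

# A split on a selected feature, e.g. logDist <= 7.85, fixes one of the
# three coordinates and projects to an axis-aligned plane; a split on an
# unselected feature, e.g. bw <= 0.9375, constrains none of them and
# projects to a volume covering its whole orthotope.
```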

Figure 5.15: 6-space hyperplanes used by the classifier, lit in 3-space projection.

Decision Tree Hyperplane Rendering

In Figure 5.12, FastMap was used to obtain a reasonably good separation of the differently labeled items. This section presents how to apply the decision tree hyperplane rendering technique not only to projections where individual data dimensions are mapped to the axes, but also to projections that map linear combinations of dimensions to the axes.

Each leaf node of the decision tree defines an orthotope. Whenever differently labeled orthotopes are adjacent, a hyperplane separating the two regions needs to be drawn. A straightforward implementation, sketched below, checks each orthotope against every other, requiring $h(h-1)/2$ checks (h being the number of leaf nodes). For the decision tree presented in Figure 5.13 there are 14 orthotopes and 46 resulting hyperplanes.
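A minimal sketch of this pairwise check (hypothetical data layout; the real implementation extracts the orthotopes from the trained tree):

```python
import numpy as np
from itertools import combinations

def touching_facet(a, b, eps=1e-9):
    """Return the dimension in which orthotopes a and b share a facet
    (touching in exactly one dimension, overlapping in all others),
    or None if they are not adjacent."""
    (alo, ahi), (blo, bhi) = a, b
    overlap = (ahi >= blo - eps) & (bhi >= alo - eps)  # per-dimension overlap
    if not overlap.all():
        return None
    touch = np.isclose(ahi, blo) | np.isclose(bhi, alo)
    dims = np.flatnonzero(touch)
    return int(dims[0]) if len(dims) == 1 else None

def separating_hyperplanes(leaves):
    """leaves: list of ((lower, upper), label) orthotopes from the tree.
    Emits one separating hyperplane per adjacent, differently labeled pair."""
    planes = []
    for (box_a, lab_a), (box_b, lab_b) in combinations(leaves, 2):
        if lab_a != lab_b:
            dim = touching_facet(box_a, box_b)
            if dim is not None:
                planes.append((box_a, box_b, dim))
    return planes

# Two leaf boxes meeting at the split levenshtein <= 1 (cf. Figure 5.13):
a = ((np.array([0.0, 0.0]), np.array([1.0, 5.0])), "yes")
b = ((np.array([1.0, 0.0]), np.array([4.0, 5.0])), "no")
print(separating_hyperplanes([a, b]))    # one hyperplane in dimension 0
```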

It is known that an n-cube has $2^n$ vertices and that there are $2^{n-k}\binom{n}{k}$ k-cubes on its boundary ($k < n$). The triangle count is two times the number of 2-cubes ($k = 2$). In an f-dimensional feature space, each hyperplane of the decision tree is an $n = f - 1$ dimensional orthotope. According to these formulas, each hyperplane has $2^n$ vertices, and $2^{n-2}\binom{n}{2} = 2^n(n^2 - n)/8$ quads are needed for rendering. For the 15-dimensional feature space considered here, $2^{(15-1)-2}\binom{15-1}{2} = 372\,736$ quads are therefore required for each hyperplane. This leads to a total count of over 34 M triangles, which is too large to be drawn interactively even on modern graphics hardware.
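These counts can be checked with a few lines of Python (a plain verification of the formulas above, not part of the original implementation):

```python
# Checking the counts with the boundary-cube formula 2^(n-k) * C(n, k).
from math import comb

f = 15              # dimension of the feature space
n = f - 1           # each hyperplane is an (f-1)-dimensional orthotope
h = 46              # separating hyperplanes of the tree in Figure 5.13

quads = 2 ** (n - 2) * comb(n, 2)
print(quads)                 # 372736 quads per hyperplane
print(2 * quads * h)         # 34291712 triangles in total, i.e. over 34 M
```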


Figure 5.16: Hyperplanes used by the classifier, lit in 3-space projection where linear combinations of dimensions have been mapped to the axes.

However, this huge triangle count can be greatly reduced by considering the visibility of the quads for a given axis mapping. Only a small fraction of the quads is visible, namely those whose vertices all lie on the convex hull of the 3D projection.

The number of vertices on the convex hull is $n^2 - n + 2$, resulting in $n^2 - n$ quads (stated without proof). Note that the number of hull quads does not grow exponentially with the number of dimensions, as the total number of quads does, but only quadratically. For the given case this results in just under 17 k triangles, which can be drawn interactively. The implementation can exploit the fact that whether a vertex is part of the convex hull is independent of the location and size of the orthotope. It is therefore sufficient to project a single orthotope to 3-space, run a convex hull algorithm, and reuse the result for all hyperplanes, as sketched below.
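A minimal sketch of this precomputation, assuming SciPy's qhull wrapper and a generic random projection in place of the actual axis mapping:

```python
import numpy as np
from itertools import product
from scipy.spatial import ConvexHull

n = 14                                                 # f - 1 for f = 15
verts = np.array(list(product((0.0, 1.0), repeat=n)))  # 2^n cube vertices
P = np.random.default_rng(1).normal(size=(n, 3))       # generic n-D -> 3-D map
hull = ConvexHull(verts @ P)

# For a generic projection the hull has n^2 - n + 2 = 184 vertices; the
# hull structure can be reused for every hyperplane, since hull membership
# does not depend on the orthotope's location or size.
print(len(hull.vertices))
```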

Figure 5.16 shows such a decision tree hyperplane rendering; here, PCA has been used to calculate the projection axes. The projection axes therefore do not point in the direction of a single feature but in a linear combination of features, and the hyperplanes accordingly have a more complex geometry. The visualization technique has also been integrated into the Multidimensional Analyzer, see Figure 5.17.
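A brief sketch of how such projection axes can be obtained (generic PCA via SVD on placeholder data; not the original code):

```python
import numpy as np

rng = np.random.default_rng(2)
features = rng.normal(size=(500, 15))      # placeholder 15-D feature vectors

centered = features - features.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
axes = vt[:3]                              # top three principal directions;
                                           # each is a linear combination
                                           # of all features
projected = centered @ axes.T              # 3-D coordinates for rendering

# The same 15x3 mapping is applied to the hyperplane vertices, which is
# why the hyperplanes in Figure 5.16 have a more complex geometry than
# the axis-aligned case in Figure 5.15.
```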

Unfortunately, for the given case the decision tree hyperplanes cover a large volume in 3D, and most of the items lie inside this volume, so a clear separation of the differently labeled items cannot be achieved this way. The results suggest that decision tree hyperplane rendering is best applied when a single data dimension, rather than a linear combination, is assigned to each axis: then only one data dimension contributes to the extent of each hyperplane in each axis direction, which tends to keep the volume of the resulting decision tree hyperplanes small.

Figure 5.17: Data set visualized with Multidimensional Analyzer. Overlay added to illustrate the large number of data items inside the decision tree hyperplanes.

