Feature Selection and Signature Separability


8.7 Feature Selection and Signature Separability

Quadratic classifiers such as the maximum likelihood classifier (MLC) have limitations in handling data sets of high dimensionality (see chapter 2.6). The data set now available for the eastern test area comprises four IKONOS multispectral channels and ten texture features, with the possible inclusion of Landsat multispectral channels and ancillary topographic data channels. This necessitates the selection of a channel subset before the channels are used as input to a classification.

8.7.1 Reduction of Texture Channels after Correlation Analysis

A correlation analysis was conducted as a first step to reduce the large number of generated texture features (before adding the remaining channels to the multispectral channels). The correlation coefficients between the ten texture images were calculated for the eastern test area (table 13). Within each group of highly correlated texture channels (absolute correlation coefficient larger than 0.85 for the eastern test area), the channel that appeared visually most meaningful for forest discrimination was retained and the others were discarded. This left five of the ten texture channels: GLCM Entropy, Standard Deviation, Contrast, Mean, and Correlation. The maximum correlation coefficient between these remaining texture channels is 0.82.

Variance, when calculated in the same 15 m × 15 m size windows as the GLCM texture features (VAR 15), is highly correlated (r = 0.96) with GLCM standard deviation. The 9 × 9 window local variance (VAR 9) is also quite strongly correlated with both the GLCM standard deviation (r = 0.87) and the local variance calculated in the 15 × 15 window (r = 0.89).

Table 13: Correlation coefficients for texture channels (scaled to 8 bit), eastern test area.

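|        | ASM   | ENT   | HOM   | CONT  | DISS  | SD    | MEAN  | CORR | VAR 9 | VAR 15 |
|--------|-------|-------|-------|-------|-------|-------|-------|------|-------|--------|
| ASM    | 1     |       |       |       |       |       |       |      |       |        |
| ENT    | -0.96 | 1     |       |       |       |       |       |      |       |        |
| HOM    | 0.93  | -0.96 | 1     |       |       |       |       |      |       |        |
| CONT   | -0.66 | 0.79  | -0.80 | 1     |       |       |       |      |       |        |
| DISS   | -0.81 | 0.91  | -0.93 | 0.96  | 1     |       |       |      |       |        |
| SD     | -0.71 | 0.83  | -0.72 | 0.82  | 0.81  | 1     |       |      |       |        |
| MEAN   | 0.45  | -0.43 | 0.42  | -0.20 | -0.31 | -0.27 | 1     |      |       |        |
| CORR   | -0.18 | 0.23  | 0.01  | 0.02  | 0.00  | 0.52  | -0.10 | 1    |       |        |
| VAR 9  | -0.50 | 0.63  | -0.53 | 0.73  | 0.68  | 0.87  | -0.17 | 0.44 | 1     |        |
| VAR 15 | -0.52 | 0.67  | -0.53 | 0.74  | 0.68  | 0.96  | -0.17 | 0.56 | 0.89  | 1      |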
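The screening step described in this subsection can be sketched in a few lines of Python. The values are the lower triangle of table 13 and the 0.85 threshold is the one stated in the text. The function names are illustrative only, and the final choice within each correlated group was made visually in the study, so the sketch lists candidate pairs and does not reproduce that choice.

```python
# Lower triangle of table 13 (eastern test area), one row per texture channel.
NAMES = ["ASM", "ENT", "HOM", "CONT", "DISS", "SD", "MEAN", "CORR", "VAR 9", "VAR 15"]
LOWER = [
    [1.00],
    [-0.96, 1.00],
    [0.93, -0.96, 1.00],
    [-0.66, 0.79, -0.80, 1.00],
    [-0.81, 0.91, -0.93, 0.96, 1.00],
    [-0.71, 0.83, -0.72, 0.82, 0.81, 1.00],
    [0.45, -0.43, 0.42, -0.20, -0.31, -0.27, 1.00],
    [-0.18, 0.23, 0.01, 0.02, 0.00, 0.52, -0.10, 1.00],
    [-0.50, 0.63, -0.53, 0.73, 0.68, 0.87, -0.17, 0.44, 1.00],
    [-0.52, 0.67, -0.53, 0.74, 0.68, 0.96, -0.17, 0.56, 0.89, 1.00],
]
THRESHOLD = 0.85  # absolute correlation above which channels count as redundant
RETAINED = {"ENT", "SD", "CONT", "MEAN", "CORR"}  # channels kept in the text


def correlated_pairs(names, lower, threshold):
    """Return (channel_a, channel_b, r) for every pair with |r| > threshold."""
    pairs = []
    for row in range(len(names)):
        for col in range(row):  # strictly below the diagonal
            r = lower[row][col]
            if abs(r) > threshold:
                pairs.append((names[col], names[row], r))
    return pairs


pairs = correlated_pairs(NAMES, LOWER, THRESHOLD)
for a, b, r in pairs:
    print(f"{a:>6} / {b:<6} r = {r:+.2f}")

# No pair made up of two retained channels should exceed the threshold.
conflicts = [(a, b) for a, b, _ in pairs if a in RETAINED and b in RETAINED]
print("retained channels in conflict:", conflicts)
```

Each listed pair contains at most one of the five retained channels, which is consistent with one representative having been kept per correlated group.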

The first three multispectral IKONOS bands (visible) are also highly correlated with each other (r ≥ 0.95). However, as there are only four multispectral IKONOS bands, all of them were retained, at least initially, for the combination with the texture channels. Their correlation was only taken into account as a basis for channel reduction when the multispectral data were combined with texture as well as ancillary topographic data channels.

8.7.2 Feature Selection

These five texture channels combined with the four IKONOS multispectral channels were still more than the assumed optimum number of channels for a maximum likelihood classification (Peddle 1993, Hay et al. 1996). To select the best reduced channel combination for discriminating between different forest types and adjacent land cover types, class signatures were calculated based on training sites for all the forest classes, matorral, calimetal and grassland.

The seven optimal channels out of the four multispectral and five texture channels were selected using the criterion of maximised average transformed divergence between these class signatures.

The same criterion was used to select the best combination of seven channels from the combination of the eight Landsat channels and five texture channels, and to select the best seven out of eight Landsat channels.

The divergence (Gaussian form) between the classes i and j, D(i,j), is calculated from the class sample means and covariance matrices:

$$D(i,j) = \tfrac{1}{2}\,(M_i - M_j)^T \left[\Sigma_i^{-1} + \Sigma_j^{-1}\right] (M_i - M_j) + \tfrac{1}{2}\,\mathrm{Trace}\left[\Sigma_i^{-1}\,\Sigma_j + \Sigma_j^{-1}\,\Sigma_i - 2\,I\right] \tag{16}$$

where

M_i = sample mean vector of class i; this vector has n elements, where n is the number of channels in the current set

Σ_i = covariance matrix for class i, which has n by n elements

( )^T = transpose of matrix

( )^-1 = inverse of matrix

Trace[ ] = trace of matrix (sum of diagonal elements)

I = identity matrix

The transformed divergence between the classes i and j, TD(i,j), is calculated as:

$$TD(i,j) = 2 \cdot \left[1 - \exp\left(-\,D(i,j)/8\right)\right] \tag{17}$$

This restricts the transformed divergence to values between 0 and 2.
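As a minimal sketch of equations 16 and 17, the following Python functions compute the divergence and the transformed divergence for one class pair with NumPy. The function names and the toy signatures are illustrative assumptions; this is not the PCI implementation.

```python
import numpy as np


def divergence(mean_i, cov_i, mean_j, cov_j):
    """Gaussian divergence D(i,j) of equation 16."""
    mean_i, mean_j = np.asarray(mean_i, float), np.asarray(mean_j, float)
    cov_i, cov_j = np.asarray(cov_i, float), np.asarray(cov_j, float)
    diff = mean_i - mean_j
    inv_i, inv_j = np.linalg.inv(cov_i), np.linalg.inv(cov_j)
    identity = np.eye(diff.size)
    # Contribution of the difference between the class means.
    mean_term = 0.5 * diff @ (inv_i + inv_j) @ diff
    # Contribution of the difference between the class covariances.
    cov_term = 0.5 * np.trace(inv_i @ cov_j + inv_j @ cov_i - 2.0 * identity)
    return float(mean_term + cov_term)


def transformed_divergence(d):
    """Transformed divergence TD(i,j) of equation 17, bounded by 0 and 2."""
    return float(2.0 * (1.0 - np.exp(-d / 8.0)))


# Toy example (hypothetical signatures): two channels, unit covariances,
# class means two units apart in the first channel.
d = divergence([0.0, 0.0], np.eye(2), [2.0, 0.0], np.eye(2))
td = transformed_divergence(d)
print(d, td)
```

With equal covariance matrices the trace term vanishes, so the divergence depends only on the distance between the class means. Very large divergences saturate at a transformed divergence of 2.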

Finally, the average transformed divergence (TDV) for all the NCLASS·(NCLASS-1) class pairs is calculated as:

$$TDV = \frac{1}{\mathit{NCLASS} \cdot (\mathit{NCLASS} - 1)} \cdot \sum TD(i,j) \tag{18}$$

where ∑ is the sum over all values of i and j, i < j.

The PCI ‘CHNSEL’ algorithm was used to find the best subset of seven out of nine (or more) channels using the criterion of maximised average transformed divergence. A branch and bound algorithm (optimisation over subproblems) is employed to perform this optimisation task (PCI Geomatics 2003).
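The selection criterion can be illustrated with a small Python sketch. It replaces the branch and bound search with a plain exhaustive search, which yields the same optimum and is feasible here because only 36 subsets of seven channels can be drawn from nine. The normalisation follows equation 18 as printed (division by NCLASS·(NCLASS-1)); the constant factor does not affect the ranking of subsets. All names and the toy signatures are illustrative assumptions, not the CHNSEL code.

```python
from itertools import combinations

import numpy as np


def divergence(mean_i, cov_i, mean_j, cov_j):
    """Gaussian divergence D(i,j) of equation 16."""
    diff = mean_i - mean_j
    inv_i, inv_j = np.linalg.inv(cov_i), np.linalg.inv(cov_j)
    mean_term = 0.5 * diff @ (inv_i + inv_j) @ diff
    cov_term = 0.5 * np.trace(inv_i @ cov_j + inv_j @ cov_i - 2.0 * np.eye(diff.size))
    return float(mean_term + cov_term)


def average_td(means, covs, channels):
    """Average transformed divergence (equation 18) for one channel subset."""
    idx = np.array(channels)
    sub = np.ix_(idx, idx)  # rows and columns of the selected channels
    n_class = len(means)
    total = 0.0
    for i, j in combinations(range(n_class), 2):  # all pairs with i < j
        d = divergence(means[i][idx], covs[i][sub], means[j][idx], covs[j][sub])
        total += 2.0 * (1.0 - np.exp(-d / 8.0))  # equation 17
    return total / (n_class * (n_class - 1))


def best_subset(means, covs, k):
    """Exhaustive search for the k-channel subset with maximum average TD."""
    n_channels = len(means[0])
    return max(combinations(range(n_channels), k),
               key=lambda channels: average_td(means, covs, channels))


# Toy signatures (hypothetical): three classes, four channels, unit covariances.
# Only channels 0 and 2 carry differences between the class means.
means = [np.array([0.0, 0.0, 0.0, 0.0]),
         np.array([3.0, 0.0, 2.0, 0.0]),
         np.array([0.0, 0.0, 4.0, 0.0])]
covs = [np.eye(4) for _ in means]

selected = best_subset(means, covs, 2)
print(selected, average_td(means, covs, selected))
```

A branch and bound search avoids evaluating every subset, which matters as the number of candidate channels grows: choosing seven out of 13 channels already gives 1716 subsets.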

For data sets including topographic data, separability measures were not employed in the overall channel selection because the terrain data cannot be assumed to have a Gaussian distribution for the classes under examination (Benediktsson et al. 1990). In this case, the non-topographic channels as selected before were combined with one or several topographic channels, and their contribution was evaluated through the results of test classifications.

8.7.3 Signature Separability

Signature separability matrices with the pairwise Bhattacharyya distance as the separability measure were calculated in order to evaluate to what extent the 14 classes of the original classification scheme are separable with the help of the IKONOS data and to assess the contribution of the texture features to the separability of the classes. Using the class signatures of the 14 classes, generated from the four multispectral IKONOS channels at 8 m resolution, the signature separability between all pairs of classes was calculated employing the PCI ‘SIGSEP’ algorithm. This was compared to the separability of the class signatures generated from the four multispectral channels and the three texture channels GLCM entropy, GLCM standard deviation and GLCM contrast, which had been selected as the optimal seven channel combination of IKONOS data.

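The Bhattacharyya distance between two classes (class i and class j) is calculated from class means and class covariance matrices using equations 19 and 20:

$$BD(i,j) = 2 \cdot \left[1 - \exp\left(-\,a(i,j)\right)\right] \tag{19}$$

with the exponent a(i,j) in its standard form for Gaussian class distributions:

$$a(i,j) = \tfrac{1}{8}\,(M_i - M_j)^T \left[\frac{\Sigma_i + \Sigma_j}{2}\right]^{-1} (M_i - M_j) + \tfrac{1}{2}\,\ln\left(\frac{\left|\frac{\Sigma_i + \Sigma_j}{2}\right|}{\sqrt{|\Sigma_i| \cdot |\Sigma_j|}}\right) \tag{20}$$

where

M_i = mean vector of class i, where the vector has n elements (n is the number of channels used)

Σ_i = covariance matrix for class i, which has n by n elements

|Σ_i| = determinant of the covariance matrix for class i

( )^T = transpose of matrix

[ ]^-1 = inverse of matrix (PCI Geomatics 2003).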
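The Bhattacharyya distance for one class pair can be sketched in Python as follows. The sketch assumes that the exponent a(i,j) takes the standard Gaussian form of the Bhattacharyya distance, built from the class means, covariance matrices and their determinants as listed above. Function names and toy signatures are illustrative; this is not the SIGSEP code.

```python
import numpy as np


def bhattacharyya_exponent(mean_i, cov_i, mean_j, cov_j):
    """Exponent a(i,j): standard Gaussian Bhattacharyya form (assumed here)."""
    mean_i, mean_j = np.asarray(mean_i, float), np.asarray(mean_j, float)
    cov_i, cov_j = np.asarray(cov_i, float), np.asarray(cov_j, float)
    diff = mean_i - mean_j
    pooled = 0.5 * (cov_i + cov_j)
    # Mean term: 1/8 * diff^T * pooled^-1 * diff (solve avoids an explicit inverse).
    mean_term = 0.125 * diff @ np.linalg.solve(pooled, diff)
    # Covariance term: 1/2 * ln(|pooled| / sqrt(|cov_i| * |cov_j|)),
    # evaluated with log-determinants for numerical stability.
    _, logdet_pooled = np.linalg.slogdet(pooled)
    _, logdet_i = np.linalg.slogdet(cov_i)
    _, logdet_j = np.linalg.slogdet(cov_j)
    cov_term = 0.5 * (logdet_pooled - 0.5 * (logdet_i + logdet_j))
    return float(mean_term + cov_term)


def bhattacharyya_distance(mean_i, cov_i, mean_j, cov_j):
    """Separability BD(i,j) = 2 * (1 - exp(-a(i,j))), bounded by 0 and 2."""
    a = bhattacharyya_exponent(mean_i, cov_i, mean_j, cov_j)
    return float(2.0 * (1.0 - np.exp(-a)))


# Toy example (hypothetical signatures): same covariance, shifted mean.
bd_shift = bhattacharyya_distance([0.0, 0.0], np.eye(2), [2.0, 0.0], np.eye(2))
# Toy example: same mean, different covariance.
a_spread = bhattacharyya_exponent([0.0, 0.0], np.eye(2), [0.0, 0.0], 4.0 * np.eye(2))
print(bd_shift, a_spread)
```

Unlike the mean term, the logarithmic term is non-zero for classes that share a mean but differ in spread, which is why texture channels can raise the separability of spectrally similar classes.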

The resulting signature separability matrices were examined and compared. For informational classes which still had very poor separability even after the inclusion of the texture features, the merging or discarding of classes was considered. The signatures of the class pair with the minimum separability value (dense and open secondary forest) were merged. Most of the later classifications were conducted with secondary forest as one class, resulting in a total of 13 instead of 14 classes.
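The merging step can be sketched as follows: locate the class pair with the smallest value in a separability matrix, then combine the two signatures. The pooling formula below is an assumption about how two Gaussian signatures may be merged from their sample counts, means and covariances; it reproduces the statistics of the combined training samples, but the text does not state how the merge was carried out in PCI. The matrix values are hypothetical.

```python
import numpy as np


def least_separable_pair(separability):
    """Indices (i, j), i < j, of the smallest off-diagonal separability value."""
    separability = np.asarray(separability, float)
    rows, cols = np.triu_indices(separability.shape[0], k=1)
    k = np.argmin(separability[rows, cols])
    return int(rows[k]), int(cols[k])


def merge_signatures(n_a, mean_a, cov_a, n_b, mean_b, cov_b):
    """Pool two class signatures into one (sample covariance, n - 1 divisor)."""
    n = n_a + n_b
    mean = (n_a * mean_a + n_b * mean_b) / n
    shift_a, shift_b = mean_a - mean, mean_b - mean
    # Within-class scatter plus the scatter of the class means about the pooled mean.
    scatter = ((n_a - 1) * cov_a + (n_b - 1) * cov_b
               + n_a * np.outer(shift_a, shift_a)
               + n_b * np.outer(shift_b, shift_b))
    return n, mean, scatter / (n - 1)


# Hypothetical 3-class separability matrix; classes 0 and 1 are least separable.
sep = [[0.0, 0.4, 1.9],
       [0.4, 0.0, 1.7],
       [1.9, 1.7, 0.0]]
pair = least_separable_pair(sep)

# Hypothetical training samples for the two classes to be merged.
rng = np.random.default_rng(0)
samples_a = rng.normal(size=(30, 3))
samples_b = rng.normal(loc=1.0, size=(50, 3))
n, mean, cov = merge_signatures(
    len(samples_a), samples_a.mean(axis=0), np.cov(samples_a, rowvar=False),
    len(samples_b), samples_b.mean(axis=0), np.cov(samples_b, rowvar=False))
print(pair, n)
```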

From: Exploiting the Spatial Information in High Resolution Satellite Data and Utilising Multi-Source Data for Tropical Mountain Forest and Land Cover Mapping (pages 138-142)