

8.8 Classifications of the Eastern Test Area

8.8.4 Object-Oriented Nearest-Neighbour Classification of Segmented Data

A ‘standard nearest neighbour classification’ (StNN) was used to assign the image object primitives of the segmented images to the sampled classes within the eCognition environment. (The term ‘standard nearest neighbour’ means that the same set of features is used for all classes in the calculations of the distance to the nearest neighbours.) For this object-oriented nearest neighbour classification, sample objects are required instead of sample pixels. The segments which were used as sample objects for the classification were chosen to match the training areas in the per-pixel training sample as closely as possible. The eCognition nearest neighbour algorithm computes the distance d in feature space between an image object and a sample object as the Euclidean distance standardised by the standard deviation of all feature values:

$$d = \sqrt{\sum_{f} \left( \frac{v_{f}^{(s)} - v_{f}^{(o)}}{\sigma_{f}} \right)^{2}}$$

where

$d$ = distance between sample object $s$ and image object $o$

$v_{f}^{(s)}$ = feature value of sample object $s$ for feature $f$

$v_{f}^{(o)}$ = feature value of image object $o$ for feature $f$

$\sigma_{f}$ = standard deviation of the feature values for feature $f$

For each image object, a fuzzy class membership value in the range of 0 to 1 is calculated for every class. This is done by finding the image object’s closest sample object of each class in feature space and calculating an exponential membership function z(d) based on the distance d.

$$z(d) = e^{-k d^{2}} \qquad (23)$$

where $k$ was defined as $k = \ln 5$.

The image object is then assigned to the class for which it has the highest degree of membership (Baatz et al. 2002). The membership values for the agroforestry class were multiplied by 0.96 before the class assignment to reduce the assignments to this class. This fulfils the same purpose as the reduction of the class bias in the maximum likelihood classification.
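
A sketch of this membership calculation and the subsequent class assignment is given below; the dictionary of per-class minimum distances and the class name 'agroforestry' are illustrative assumptions, while the exponent k = ln 5 and the 0.96 weight follow the description above:

```python
# A minimal sketch of the fuzzy membership calculation and class assignment,
# assuming `min_dist` holds, for one image object, the distance to the closest
# sample object of each class (class names here are illustrative).
import numpy as np

K = np.log(5)  # exponent of the membership function, k = ln 5 (equation 23)


def assign_class(min_dist: dict[str, float], agroforestry_weight: float = 0.96) -> str:
    # Exponential membership z(d) = exp(-k * d^2) for every class.
    membership = {cls: float(np.exp(-K * d ** 2)) for cls, d in min_dist.items()}
    # Reduce the class bias for agroforestry before the assignment.
    if "agroforestry" in membership:
        membership["agroforestry"] *= agroforestry_weight
    # Assign the class with the highest degree of membership.
    return max(membership, key=membership.get)
```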

The eCognition software offers many possibilities to fine-tune a classification using the intrinsic features of the image object primitives, such as layer values, texture and object shape, as well as additional class-specific conditions, including relationships to other objects. However, testing these many options, for example in iterative loops of classifications at different hierarchical levels, would have been very time-consuming. Therefore, and because the classification of the segmented data should remain comparable to the classifications of the data sets created with the other types of spatial integration, only one-step (data-driven) classifications were conducted in eCognition, using only the same features (channel values) that were also used in the classifications of the other data sets.

8.9 Post-Classification Processing

The classification results (except for those of the segmented data) were mode filtered with filter sizes of 3 × 3, 5 × 5 and 7 × 7. The mode filter computes the mode, i.e. the value that occurs most frequently in the filter kernel, and assigns it to the central pixel. Isolated pixel values are thus replaced by whatever value constitutes the majority in their neighbourhood. This reduces the noise in a classification and can be regarded as a kind of post-classification spatial integration.
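
A minimal sketch of such a mode filter is given below, assuming the classification result is available as a 2-D numpy array of non-negative integer class labels; it is an illustration, not the filter implementation actually used:

```python
# Mode (majority) filtering of a classified raster with a square moving window.
# Window sizes of 3, 5 and 7 correspond to the 3x3, 5x5 and 7x7 filters above.
import numpy as np
from scipy import ndimage


def mode_filter(class_map: np.ndarray, size: int) -> np.ndarray:
    """Replace each pixel by the most frequent class in a size x size window."""

    def window_mode(values: np.ndarray) -> int:
        # Assumes non-negative integer class labels; ties go to the lower id.
        counts = np.bincount(values.astype(int))
        return int(np.argmax(counts))

    return ndimage.generic_filter(class_map, window_mode, size=size, mode="nearest")


# Example: apply the three filter sizes used in the text.
# classified = ...  # hypothetical 2-D array of class labels
# filtered = {s: mode_filter(classified, s) for s in (3, 5, 7)}
```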

As an alternative to the post-classification processing with a mode filter, a sieve filter (removing class polygons containing fewer than a specified number of pixels by merging them with their largest neighbour) was also employed in some cases. Sieve filtering did not, however, lead to the same amount of accuracy improvement as mode filtering, so it was abandoned in favour of the simple and straightforward mode filtering.
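
For illustration, a sieve filter of this kind could be applied with GDAL's SieveFilter utility as sketched below; the file name and the threshold of 11 pixels are hypothetical, and the text does not state which tool was actually used:

```python
# A minimal sketch of sieve filtering with GDAL: class polygons smaller than
# the threshold are merged into their largest neighbouring polygon.
from osgeo import gdal

ds = gdal.Open("classification.tif", gdal.GA_Update)  # hypothetical file name
band = ds.GetRasterBand(1)

# Remove 4-connected polygons of fewer than 11 pixels (example threshold),
# writing the result back into the same band.
gdal.SieveFilter(band, None, band, 11, connectedness=4)

band.FlushCache()
ds = None  # close the dataset so the changes are written to disk
```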

A post-classification sorting with the help of the DEM data, as for the Landsat classification of the whole UCRYN (chapter 7.3), was also taken into consideration. There is, however, only a relatively small range of elevations in the eastern test area (800 m to 1565 m), offering fewer possibilities to define decision rules involving absolute thresholds.

8.10 Accuracy Assessment

The classification results were displayed in the form of thematic maps and visually evaluated and compared to each other and to the ground knowledge. A sample-based accuracy assessment based on error matrices was conducted to calculate quantitative accuracy measures.

A total of 900 points were distributed over the classified eastern test area using stratified random sampling. The stratification made sure that the “number of samples chosen randomly from each class are proportional to the percentage of the image occupied by each class” (PCI Geomatics 2003). (The stratification was based partly on the 14-class MLC result of data set 14 and partly on the MLC result of data set 26, see table 14.) This was done after the classification and also after the last field work campaign, so that the determination of the land cover classes had to rely on the previously collected data. Not all the sample sites had been visited on the ground. (And some parts of the eastern test area, particularly in the Scientific Reserve Ebano Verde, were not accessible on the ground anyway.)
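
The proportional allocation behind such a stratified random sample could look like the following sketch, assuming the classified map is a 2-D numpy array of class labels; this is an illustration, not the PCI Geomatics routine that was actually used:

```python
# Stratified random sampling with class sample sizes proportional to the
# percentage of the image occupied by each class.
import numpy as np

rng = np.random.default_rng(seed=0)


def stratified_sample(class_map: np.ndarray, n_points: int = 900):
    rows, cols = [], []
    classes, counts = np.unique(class_map, return_counts=True)
    shares = counts / counts.sum()
    for cls, share in zip(classes, shares):
        n_cls = int(round(share * n_points))  # proportional allocation
        r, c = np.nonzero(class_map == cls)
        idx = rng.choice(r.size, size=min(n_cls, r.size), replace=False)
        rows.extend(r[idx])
        cols.extend(c[idx])
    # Note: rounding may make the total deviate slightly from n_points.
    return np.array(rows), np.array(cols)
```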

The land cover classes of 583 of these 900 sample points could be determined using a combination of the knowledge and data collected on the ground during field work (e.g. annotations on the paper copies of the IKONOS image, ground photographs and notes relating to GPS waypoints) and visual interpretation of the IKONOS image itself (using several channel combinations, including pan-sharpened, for the display) and of the oblique and the historical aerial photographs. In some cases, the DEM, the Progressio (1995) land use map and a vegetation map of the Scientific Reserve Ebano Verde (García et al. 1994: fig. 5) also helped to determine the true land cover class. Even for sample sites that had not been visited during field work, the land cover class could in many cases still be identified using the other available data. If a class could be clearly identified in the IKONOS image itself, this had the advantage that there could not be any geometric registration error or temporal difference between reference data and classified data. The historical aerial photographs helped to differentiate between mature cloud forest (forest in all the images) and secondary forest (treeless areas in 1966). The 1984 aerial photographs, showing the young pine plantations of that time, helped to distinguish pine plantations from other kinds of forest. The reference class was assigned if a majority land cover type could be identified in both the 4 m × 4 m and the 8 m × 8 m area to which the sample point referred. However, for a third of the points, the land cover class could not be determined with confidence, so they were dropped from the sample. This affected mostly sample points in areas without ground truth where the manual interpretation of the remote sensing data did not deliver unambiguous results. By contrast, sample points in transitional land cover areas or stages, e.g. between grassland and matorral or between matorral and secondary forest, were not dropped; instead, one of the classes was assigned to them. In all cases, the reference class was assigned blindly, i.e. without knowing the result of any of the classifications for the point in question.

The reference data were originally produced for the 14-class classification scheme. Both open and closed secondary forest reference points were later allocated to the merged class of secondary forest to test the accuracy of the 13-class classifications.

Figure 25: The reference points used for the accuracy assessment in the eastern test area.

The resulting 583 reference points were distributed over all classes and over the whole classified area (figure 25), including points close to class boundaries. For some kinds of spatially integrated data, up to six of the reference points ended up in non-classified areas and were then not used in the accuracy calculations. The testing data are independent of the training data. Using this set of reference points, accuracy assessments were conducted for all classifications of the eastern test area, comparing the reference classes and the classes predicted by the classifier (and possibly modified during post-classification processing). The results of the comparisons were reported in the form of error matrices.

The overall classification accuracy (percentage correct) was calculated for all classifications, as well as the class-specific user’s and producer’s accuracies. To make the class-specific accuracies comparable to each other without having to differentiate between user’s and producer’s accuracies, these two values were multiplied for each class. The 95 % confidence interval was calculated for the overall classification accuracy and the per-class user’s and producer’s accuracies as a function of the number of samples and the proportion of correctly classified samples.
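
A sketch of these measures, computed from an error matrix with classified classes in the rows and reference classes in the columns, is given below; the normal approximation used for the 95 % confidence interval is an assumption, since the exact formula applied in the study is not stated:

```python
# Overall, user's and producer's accuracies from an error (confusion) matrix,
# plus the product of user's and producer's accuracy per class and a 95 %
# confidence interval for the overall accuracy (normal approximation).
import numpy as np


def accuracy_measures(error_matrix: np.ndarray):
    n = error_matrix.sum()
    diag = np.diag(error_matrix)

    overall = diag.sum() / n
    users = diag / error_matrix.sum(axis=1)      # row totals: classified as class i
    producers = diag / error_matrix.sum(axis=0)  # column totals: reference class i
    combined = users * producers                 # product used to compare classes

    ci95 = 1.96 * np.sqrt(overall * (1 - overall) / n)
    return overall, ci95, users, producers, combined
```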

Additionally, the Kappa index of agreement (Cohen 1960) was calculated as the actually observed agreement $A_{\mathrm{correct}}$ adjusted by the expected chance agreement $A_{\mathrm{chance}}$:

$$\mathrm{KIA} = \frac{A_{\mathrm{correct}} - A_{\mathrm{chance}}}{1 - A_{\mathrm{chance}}}$$

where $A_{\mathrm{correct}}$ is the overall accuracy, i.e. the sum of the observations $x_{ii}$ in row $i$ and column $i$ (in the agreement diagonal) of an error matrix for $r$ classes divided by the total number of observations $N$:

$$A_{\mathrm{correct}} = \frac{1}{N} \sum_{i=1}^{r} x_{ii}$$

and $A_{\mathrm{chance}}$ is obtained by multiplying the row and column marginal totals $x_{i+}$ and $x_{+i}$ of each class $i$ and dividing the sum of the resulting chance values in the agreement diagonal by $N^{2}$:

$$A_{\mathrm{chance}} = \frac{1}{N^{2}} \sum_{i=1}^{r} x_{i+} \, x_{+i}$$
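
Expressed as a short function following the formulas above (rows = classified classes, columns = reference classes), this amounts to:

```python
# Kappa index of agreement from an error matrix.
import numpy as np


def kappa(error_matrix: np.ndarray) -> float:
    n = error_matrix.sum()
    a_correct = np.trace(error_matrix) / n
    # Sum of products of row and column marginal totals, divided by N^2.
    a_chance = (error_matrix.sum(axis=1) * error_matrix.sum(axis=0)).sum() / n ** 2
    return (a_correct - a_chance) / (1 - a_chance)
```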

To test the amount of error introduced by mixed pixels and by edge effects in the texture data, all reference points closer than 12 m to a land cover boundary (as interpreted in the pan-sharpened IKONOS image) were deleted from the test sample, reducing it to 333 points. This should eliminate those testing pixels for which the texture values calculated in a 15 m × 15 m window are likely to be influenced by between-class texture. Accuracy assessments for the spectral-textural classifications at 4, 8, 12 and 16 m resolution were conducted with this set of reference points and the overall accuracies were compared to those produced with the complete set of 583 points.
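
One way to implement such a distance criterion is sketched below, assuming the land cover boundaries are taken from a rasterised class map with a 4 m pixel size; in the study the boundaries were interpreted manually in the pan-sharpened IKONOS image, so this is only an approximation of the procedure:

```python
# Keep only reference points that lie at least `min_dist` metres away from the
# nearest land cover boundary of a rasterised class map.
import numpy as np
from scipy import ndimage


def far_from_boundary(class_map, rows, cols, pixel_size=4.0, min_dist=12.0):
    rows, cols = np.asarray(rows), np.asarray(cols)

    # Mark boundary pixels: any pixel whose right or lower neighbour differs.
    boundary = np.zeros(class_map.shape, dtype=bool)
    boundary[:, :-1] |= class_map[:, :-1] != class_map[:, 1:]
    boundary[:-1, :] |= class_map[:-1, :] != class_map[1:, :]

    # Distance (in metres) from every pixel to the nearest boundary pixel.
    dist = ndimage.distance_transform_edt(~boundary) * pixel_size

    keep = dist[rows, cols] >= min_dist
    return rows[keep], cols[keep]
```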

The reference sample for the accuracy assessment of one classification was ‘fuzzified’ by reassessing the misclassified pixels according to the linguistic scale of Woodcock & Gopal (2000).

For each reference point in the subset, all classes were assigned discrete levels of class membership between 1 and 5, according to the following description:

“(5) Absolutely right: No doubt about the match. Perfect.

(4) Good answer: Would be happy to find this answer given on the map.

(3) Reasonable or acceptable answer: Maybe not the best possible answer but it is acceptable; this answer does not pose a problem to the user if it is seen on the map.

(2) Understandable but wrong [...]

(1) Absolutely wrong” (Woodcock & Gopal 2000:156).

Classes with scores of at least 3 (at least acceptable answers) were then counted as ‘right’ for the reference point. This made it possible to calculate a fuzzy overall accuracy, i.e. the percentage of pixels for which the classified value matches at least one reference class with a score between 3 and 5.
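
A sketch of this fuzzy overall accuracy, assuming the linguistic scores are stored in a points × classes array and the classified class of each point is given as an index:

```python
# Fuzzy overall accuracy: a point counts as correct if its classified class
# received a score of at least 3 ('reasonable or acceptable') at that point.
import numpy as np


def fuzzy_overall_accuracy(scores: np.ndarray, classified: np.ndarray) -> float:
    # scores: shape (n_points, n_classes), values 1..5
    # classified: shape (n_points,), class index assigned by the classifier
    acceptable = scores >= 3
    hits = acceptable[np.arange(classified.size), classified]
    return float(hits.mean())
```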

Finally, some classes were aggregated in the error matrix to calculate the overall accuracy for less detailed classifications.
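
Such an aggregation can be expressed as a summation of the error matrix cells over a class mapping, as sketched below with a hypothetical mapping array:

```python
# Aggregate an error matrix to a coarser legend before computing accuracies.
import numpy as np


def aggregate_error_matrix(error_matrix: np.ndarray, mapping: np.ndarray) -> np.ndarray:
    # mapping[i] gives the aggregated class index of original class i
    n_agg = int(mapping.max()) + 1
    aggregated = np.zeros((n_agg, n_agg), dtype=error_matrix.dtype)
    for i in range(error_matrix.shape[0]):
        for j in range(error_matrix.shape[1]):
            aggregated[mapping[i], mapping[j]] += error_matrix[i, j]
    return aggregated


# Overall accuracy of the aggregated legend:
# overall = np.trace(aggregate_error_matrix(em, mapping)) / em.sum()
```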

9 Results and Discussion of Processing Methods and Classifications Involving IKONOS