Training set

Introduction

A sample of a specific class, comprising a number of training cells, forms a cluster in the feature space. See also Supervised Image Classification Algorithm.

Explanation

The clusters selected by the operator:

  • should form a representative data set for the given class. This means that the variability of the class within the image should be taken into account. In addition, each cluster needs a minimum number of observations in absolute terms. The exact number depends on the classifier algorithm used, but a useful rule of thumb is 30 × n observations, where n is the number of bands.

  • should not overlap, or should overlap only partially, with the other clusters; otherwise a reliable separation is not possible. For a specific data set, some classes may show significant spectral overlap, which in principle means that these classes cannot be discriminated by image classification. Possible solutions are to add other spectral bands and/or images acquired at other times.
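The rule of thumb for the minimum number of observations can be sketched in a few lines of Python. This is only an illustration: the function names and the `per_band` parameter are hypothetical, and the factor 30 is the rule of thumb quoted above, not a requirement of any particular classifier.

```python
def min_training_cells(n_bands, per_band=30):
    """Rule-of-thumb minimum number of observations for one cluster (30 x n)."""
    return per_band * n_bands


def has_enough_cells(cluster, n_bands, per_band=30):
    """Return True if the cluster holds at least per_band * n_bands training cells.

    cluster is assumed to be a sequence of training cells, one entry per cell.
    """
    return len(cluster) >= min_training_cells(n_bands, per_band)
```

For a two-band image this gives a minimum of 60 training cells per cluster; a sample of 45 cells would fail the check and should be extended.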

The resulting clusters can be characterized by simple statistics of their point distributions. For a single cluster in a two-band feature space, these are the vector of mean DN values (one for band 1 and one for band 2) and the standard deviations of the DNs (again one for band 1 and one for band 2), where the standard deviations are plotted as crosses.
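As a minimal sketch, the cluster statistics can be computed with the Python standard library. It assumes each training cell is given as a tuple of DNs, one value per band. The function names are hypothetical, and the overlap test (comparing the distance between means with the summed standard deviations in every band) is only a crude heuristic for whether two crosses touch, not a formal separability measure.

```python
from statistics import mean, stdev


def cluster_stats(cells):
    """Return (means, stds) per band for one cluster.

    cells: list of tuples, one DN per band, e.g. [(dn_band1, dn_band2), ...].
    stdev is the sample standard deviation, so at least two cells are needed.
    """
    bands = list(zip(*cells))          # regroup the DNs band by band
    means = [mean(b) for b in bands]
    stds = [stdev(b) for b in bands]
    return means, stds


def crosses_overlap(stats_a, stats_b, k=1.0):
    """Heuristic: True if the mean +/- k*std intervals overlap in every band."""
    (means_a, stds_a), (means_b, stds_b) = stats_a, stats_b
    return all(
        abs(ma - mb) <= k * (sa + sb)
        for ma, sa, mb, sb in zip(means_a, stds_a, means_b, stds_b)
    )


# Illustrative DNs only, not taken from a real image.
water = cluster_stats([(10, 20), (12, 22), (14, 24)])
grass = cluster_stats([(30, 60), (32, 62), (34, 64)])
separable = not crosses_overlap(water, grass)
```

With these illustrative values the two clusters lie far apart in both bands, so `separable` is `True`. Clusters whose crosses overlap in every band are candidates for the remedies mentioned above, such as adding bands or images from other dates.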
