Sampling limits the number of observations that must be collected for statistical analysis. In raster image analysis, various sampling schemes have been proposed for selecting the pixels to test. The choices to be made concern the design of the sampling strategy, the number of samples required, and the area of each sample. Recommended sampling strategies for land cover data are simple random sampling and stratified random sampling. In accuracy assessment, the number of samples can be framed in two ways: (1) the number of samples that must be taken in order to reject a data set as inaccurate; or (2) the number of samples required to determine the true accuracy of a data set within some error bounds. Sampling theory is used to determine the number of samples required. The number of samples must be traded off against the area covered by a sample unit. A sample unit can be a point, but it can also be an area of some size; it can be a single raster element, but it may also include the surrounding raster elements. Among other considerations, the “optimal” sample-area size depends on the heterogeneity of the class.
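As an illustration of case (2), the sketch below estimates how many samples are needed to determine the accuracy within a given error bound. It is a minimal example that assumes the common normal approximation to the binomial distribution, n = z² p (1 − p) / d², which the description above does not prescribe. The function name and its parameters (expected_accuracy, half_width, confidence) are hypothetical and chosen only for this example.

```python
import math
from statistics import NormalDist


def sample_size_for_accuracy(expected_accuracy, half_width, confidence=0.95):
    """Approximate number of samples needed to estimate map accuracy.

    Assumption: normal approximation to the binomial distribution,
    n = z^2 * p * (1 - p) / d^2, where p is the expected accuracy and
    d is the allowed error (half-width of the confidence interval).
    """
    if not 0.0 < expected_accuracy < 1.0:
        raise ValueError("expected_accuracy must be between 0 and 1")
    if not 0.0 < half_width < 1.0:
        raise ValueError("half_width must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")

    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = expected_accuracy
    n = z ** 2 * p * (1.0 - p) / half_width ** 2
    # Round up: a fractional sample cannot be collected.
    return math.ceil(n)


if __name__ == "__main__":
    # Expected accuracy of 85 %, allowed error of +/- 5 %, 95 % confidence.
    print(sample_size_for_accuracy(0.85, 0.05))
```

With an expected accuracy of 0.5, the most conservative choice, the same error bound requires more samples, because p (1 − p) is largest at that value.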
The similar concept "[IP3-4-9] Sampling strategies" has been merged into AM8-1 and deleted.
Draft is here:
https://docs.google.com/document/d/1czz84dsrjNMF3aIbcYCnu_0ADFLemEwyAPpfDNZ0wUY/edit
Description of IP3-4-9 Sampling Strategies:
A sampling strategy, or sampling pattern, specifies the arrangement of observations used for training and/or validation purposes. Typically, a simple random sample of a geographic region is defined by first dividing the region to be studied into a network of cells. Each row and column in the network is numbered, and a random number table is then used to select values that, taken two at a time, form coordinate pairs defining the locations of observations. Because the coordinates are selected at random, the locations they define are positioned at random. The random sample is probably the most powerful sampling strategy available, as it yields data that can be analyzed using inferential statistics. A stratified sampling pattern assigns observations to subregions of the image to ensure that the sampling effort is distributed in a rational manner. For example, a stratified sampling plan might assign a specific number of observations to each category on the map to be evaluated. This procedure ensures that every category is sampled. Systematic sampling positions observations at equal intervals according to a specific strategy. Because the selection of the starting point predetermines the positions of all subsequent observations, data derived from systematic samples do not meet the requirements of inferential statistics for randomly selected observations.
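The three patterns described above can be sketched on a small raster as follows. This is a minimal illustration, not a reference implementation: the function names, the toy class_map, and the choice to draw cells without replacement are assumptions made for this example, and Python's pseudo-random generator stands in for the random number table.

```python
import random


def simple_random_sample(n_rows, n_cols, n, rng):
    """Draw n distinct (row, col) cells uniformly at random."""
    # Assumption: cells are drawn without replacement.
    cell_ids = rng.sample(range(n_rows * n_cols), n)
    return [divmod(cell_id, n_cols) for cell_id in cell_ids]


def stratified_random_sample(class_map, n_per_class, rng):
    """Draw up to n_per_class random cells from each map category."""
    cells_by_class = {}
    for r, row in enumerate(class_map):
        for c, label in enumerate(row):
            cells_by_class.setdefault(label, []).append((r, c))
    # Every category present on the map receives observations.
    return {
        label: rng.sample(cells, min(n_per_class, len(cells)))
        for label, cells in cells_by_class.items()
    }


def systematic_sample(n_rows, n_cols, step, start=(0, 0)):
    """Place observations at equal intervals from a starting cell."""
    # The starting cell fixes the positions of all later observations.
    r0, c0 = start
    return [
        (r, c)
        for r in range(r0, n_rows, step)
        for c in range(c0, n_cols, step)
    ]


if __name__ == "__main__":
    # Hypothetical 4 x 6 land cover map with three categories.
    class_map = [
        ["water", "water", "forest", "forest", "forest", "urban"],
        ["water", "forest", "forest", "forest", "urban", "urban"],
        ["forest", "forest", "forest", "forest", "urban", "urban"],
        ["forest", "forest", "forest", "forest", "forest", "urban"],
    ]
    rng = random.Random(42)  # fixed seed so the draw can be repeated
    print(simple_random_sample(4, 6, 5, rng))
    print(stratified_random_sample(class_map, 2, rng))
    print(systematic_sample(4, 6, 2, start=(1, 1)))
```

In this toy map, a simple random sample of five cells may miss the small "water" category entirely, whereas the stratified draw always returns observations for it.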
Discuss how the choice of sampling strategy impacts the classification result
Discuss how the choice of sampling strategy impacts the accuracy assessment for a classification result
In progress (GI-N2K)