1192 - Understand the importance of using spatially independent validation samples to assess the quality of the classification results

Concepts

  • [IP4-2-1] Accuracy assessment
    A growing set of EO services and applications produces EO products that describe various aspects of the land, ocean and atmosphere. These products include, for example, image products at different processing levels, geometric measurements such as digital elevation models, semantic labelling products such as land cover classifications, and EO-derived attribute products concerning air quality or other geophysical and biophysical parameters. Like any geospatial data, EO products are not free of error and require accompanying documentation of their quality. One term for describing different quality dimensions of an EO product is accuracy. Accuracy is a measure of the uncertainty that originates from errors. An error is the deviation of a map value from the true value. The concept of error assumes well-defined phenomena, where deviations result from imperfections of the measurement equipment, environmental effects, or imperfections of the observer. These sources cause gross errors and blunders, systematic errors, and random errors, each of which calls for a different approach to minimise it. Ideally, only random error remains, which is probabilistic in nature and can be assessed with statistical approaches. For poorly defined phenomena, the concept of vagueness applies instead. For example, for thematic maps based on fuzzy sets, the accuracy assessment requires a fuzzy approach as well. Judging error requires reference data of higher accuracy (by an order of magnitude) against which the map value can be compared. Accuracy-related quality dimensions of EO products include thematic accuracy, spatial accuracy (both horizontal and vertical), radiometric accuracy, and the accuracy of biophysical/geophysical parameter measurements. The corresponding equipment and approaches for reference data collection include ground verification for thematic maps, GNSS positioning devices, field spectrometers, air quality sensors and in-situ biomass estimation.
Ideally, reference data is collected in the field. For inaccessible areas of interest, or if the service requirements allow it, approaches may rely on proxy reference data. The accuracy assessment procedure should be designed together with the EO product so that it matches the requirements of the EO service. For example, a thematic accuracy assessment consists of three main components: response design, analysis, and sampling design. The response design ensures that reference data and map data are comparable at a location and specifies the cases in which they agree or disagree. The analysis, usually performed with an error matrix, specifies which quality indicators will be calculated to quantify accuracy. The sampling design specifies the subset of locations at which the response design will be applied. Depending on the classification process and application case, different sampling strategies can be suitable (e.g. clustered sampling, stratified random sampling). Other accuracy dimensions have their own assessment procedures, e.g. the root mean squared error (RMSE) for positional accuracy assessment. Once an accuracy assessment has been performed and the uncertainty in the EO product is understood, the challenge is to clarify how that uncertainty affects subsequent spatial analyses with the EO product. Strategies range from ignoring error completely to accounting for it by modelling uncertainty in the analysis outcomes. If uncertainty is judged low enough (or, more hazardously, if users are unaware of the limited accuracy), subsequent analyses accept the EO product as true and ignore the accuracy value. If uncertainty is incorporated in subsequent analyses through uncertainty modelling, the results describe the range of possible outcomes, potentially supported by appropriate visualisations of uncertainty.
The uncertainty modelling approach may greatly enhance the usability of the EO product, because it better informs users how error affects the EO information and how much confidence they should have in it. With a new generation of EO products on the horizon and a much larger user community, a large number of new applications is to be expected. These applications may also give rise to innovative accuracy assessment approaches. For example, the availability of EO archives with long time series of EO data has led to response design protocols tailored to collecting time series of reference data. The use of volunteered geographic information (VGI) as reference data has great potential, provided that approaches are implemented to ensure its reliability. Methods for object-based accuracy assessment continue to be developed. Further, the increasing number of EO parameter products based on continuous variables creates the need to describe their accuracy. Finally, the focus on validation of EO products during EO service development and operation will make feedback from users available to service providers, ultimately leading to more meaningful EO products with more meaningful accuracy metrics and other quality indicators.
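The error-matrix analysis and the RMSE mentioned above can be illustrated with a minimal Python sketch. The text does not say which quality indicators to calculate, so the choice of overall, user's and producer's accuracy is an assumption, as are the function names `accuracy_metrics` and `rmse`, the matrix orientation (rows are map classes, columns are reference classes), and all sample values. The plain count-based ratios also assume that every sample location had the same chance of selection. A stratified design would generally need area-weighted estimates instead.

```python
import math


def accuracy_metrics(matrix):
    """Derive accuracy indicators from a square error matrix.

    Assumed layout: matrix[i][j] is the number of sample locations
    mapped as class i whose reference label is class j.
    Every class is assumed to occur at least once in both the map
    and the reference sample (otherwise a division by zero occurs).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    # Overall accuracy: share of sample locations where map and reference agree.
    overall = correct / total

    # User's accuracy (row-wise): of all locations mapped as class i,
    # the share that the reference data confirms as class i.
    users = [matrix[i][i] / sum(matrix[i]) for i in range(n)]

    # Producer's accuracy (column-wise): of all locations whose reference
    # label is class j, the share that the map also labels as class j.
    producers = [
        matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)
    ]
    return overall, users, producers


def rmse(measured, reference):
    """Root mean squared error between measured and reference values."""
    squared = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical two-class example (e.g. forest / non-forest), 100 samples.
error_matrix = [
    [45, 5],   # mapped as class 0
    [10, 40],  # mapped as class 1
]
overall, users, producers = accuracy_metrics(error_matrix)
print("overall accuracy:", overall)
print("user's accuracy:", users)
print("producer's accuracy:", producers)

# Hypothetical easting coordinates (metres) of four check points:
# values read from the EO product versus GNSS reference measurements.
product_x = [101.0, 199.0, 301.0, 399.0]
gnss_x = [100.0, 200.0, 300.0, 400.0]
print("RMSE (m):", rmse(product_x, gnss_x))
```

In this example 85 of the 100 samples lie on the diagonal of the matrix, so the overall accuracy is 0.85. The per-class values show which classes are over- or under-mapped, which a single overall figure hides.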