Spatial Data Collection

Data collection

Introduction

Spatial data can be obtained from various sources. They can be collected from scratch, using direct spatial-data acquisition techniques, or obtained indirectly, by making use of existing spatial data collected by others. Direct sources include field survey data and remotely sensed images. Indirect sources include printed maps and existing digital data sets.

Explanation

One way to obtain spatial data is by direct observation of relevant geographic phenomena. This can be done through ground-based field surveys or by using remote sensors on satellites or aircraft. Many Earth science disciplines have developed their own survey techniques, because ground-based approaches remain the most important source of reliable data in many cases.

Data that are captured directly from the environment are called primary data. To understand the properties of primary data, one needs to know the process by which they were captured, the parameters of any instruments used, and the rigour with which quality requirements were observed.

In practice, it is not always feasible to obtain spatial data by direct capture. Cost and time constraints may be a hindrance, and sometimes previous projects have already acquired data that fit a current project’s purpose.

In contrast to direct methods of data capture, spatial data can also be sourced indirectly. This includes data derived by scanning existing printed maps, data digitized from a satellite image, processed data purchased from data-capture firms or international agencies, and so on. This type of data is known as secondary data. Secondary data are derived from existing sources and have been collected for other purposes, often not connected with the investigation at hand.
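The distinction between primary and secondary data can be recorded alongside a data set so that later users know how much scrutiny it needs. The following is a minimal sketch under stated assumptions, not an established schema: the names `SourceType`, `Lineage` and `needs_fitness_review`, and every field, are hypothetical and chosen only to mirror the properties discussed above (capture process, instrument parameters, original purpose).

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    PRIMARY = "primary"      # captured directly from the environment
    SECONDARY = "secondary"  # derived from existing sources


@dataclass(frozen=True)
class Lineage:
    """Hypothetical provenance record; all field names are illustrative."""
    source_type: SourceType
    capture_process: str        # e.g. "ground-based field survey"
    instrument_parameters: str  # free-text summary; empty if unknown
    original_purpose: str       # why the data were collected originally


def needs_fitness_review(lineage: Lineage, current_purpose: str) -> bool:
    """Flag data whose suitability for the current purpose is not evident.

    Assumption of this sketch: secondary data are always reviewed, and
    primary data are reviewed when they were collected for another purpose.
    """
    if lineage.source_type is SourceType.SECONDARY:
        return True
    return lineage.original_purpose != current_purpose
```

For example, a field survey carried out for the current project would not be flagged, whereas a scanned printed map produced for another purpose would be.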

Over the past two decades, spatial data have been collected in digital form at an increasing rate and stored in various databases by individual producers, both for their own use and for commercial purposes. More and more of these data are being shared among GIS users, for several reasons. High-quality data remain costly and time-consuming to collect and verify, and more and more GIS applications look at national or even global processes rather than only local ones. Some data are freely available, while other data are only available commercially, as is the case for most satellite imagery. New technologies have played a key role in the increasing availability of geospatial data. As a result of this increasing availability, we have to be increasingly careful that the data we acquire are of sufficient quality to be used in analyses and decision-making.

Several related initiatives around the world aim to supply base data sets at national, regional and global levels, and others aim to harmonize the data models and definitions of existing data sets. Global initiatives include, for example, the Global Map, the USGS Global GIS database and the Second Administrative Level Boundaries (SALB) project. SALB, for instance, is a UN project that aims to improve the availability of information about administrative boundaries in developing countries.

Data formats and standards

An important problem in any environment involved in digital data exchange is that of data formats and data standards. Different formats have been implemented by various GIS vendors, and different standards have come about under different standardization committees. The phrase “data standard” refers to an agreed way, in terms of content, type and format, of representing data in a system. The good news about both formats and standards is that there are many to choose from; the bad news is that this can lead to a range of conversion problems. Several metadata standards for digital spatial data exist, including those of the International Organization for Standardization (ISO) and the Open Geospatial Consortium (OGC).
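To make the conversion problem concrete, the sketch below translates a single point between two widely used text representations, Well-Known Text (WKT) and a GeoJSON geometry object. It is an illustration under stated assumptions rather than a complete converter: the function names are hypothetical, only plain 2D points in decimal notation are handled, and coordinate reference systems are ignored. In practice an established conversion library would normally be used instead.

```python
import re

# Matches a plain 2D WKT point such as "POINT (30.5 -1.25)".
# Assumption: decimal notation only; exponents and Z/M values are not handled.
_WKT_POINT = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def wkt_point_to_geojson(wkt: str) -> dict:
    """Convert a 2D WKT point into a GeoJSON-style geometry dictionary."""
    match = _WKT_POINT.match(wkt)
    if match is None:
        raise ValueError(f"not a plain 2D WKT point: {wkt!r}")
    x, y = (float(value) for value in match.groups())
    return {"type": "Point", "coordinates": [x, y]}


def geojson_point_to_wkt(geometry: dict) -> str:
    """Convert a GeoJSON-style Point geometry back into WKT."""
    if geometry.get("type") != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    x, y = geometry["coordinates"][:2]
    return f"POINT ({x} {y})"
```

Even this small case shows where conversion problems come from: each format has its own syntax and its own conventions, so anything a converter does not explicitly handle (here, other geometry types or a third coordinate) is rejected or silently lost.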

Learning outcomes

  • 9 - Data entry: data input techniques

    Describe and explain standard spatial (and non-spatial) data input techniques (non-RS), including the management of the data collection process (level 1 and 2).
