Introduction to Spatial Data Analysis

Introduction to Exploratory Spatial Data Analysis

Definition of Spatial Data Analysis

  • Spatial data analysis involves examining locations, attributes, and relationships of features in spatial data using statistical and computational techniques.

Exploratory Data Analysis (EDA)

  • Term coined by John Tukey (Tukey 1977)
  • Set of statistical tools designed to
    • discover “indications of unexpected phenomena”
    • “display the unanticipated”
    • “uncover potentially explicable patterns”

EDA Approach

  • Abductive reasoning
  • Interaction between data exploration and human perception to
    • detect patterns
    • formulate hypotheses

Exploratory Spatial Data Analysis (ESDA)

A collection of techniques to describe and visualize spatial distributions, identify atypical locations or spatial outliers, discover patterns of spatial association, clusters or hot spots and suggest spatial regimes or other forms of spatial heterogeneity.

Anselin (1999)

Exploratory Spatial Data Analysis (ESDA)

  • EDA extended to spatial data
  • Maps play a central role, but the analysis doesn’t end with maps
  • Draws on geovisualization and geospatial visual analytics
  • Combines visualizations with specialized quantitative measures

Importance of Spatial Data Analysis

  • Applications in various fields: urban planning, environmental science, public health, economics, etc.
  • Growing relevance with the rise of Geographic Information Systems (GIS) and spatial technologies.

Historical Context

  • Early use in geography and epidemiology.
  • Evolution with the development of GIS and advanced computational tools.

Snow Map

Arribas-Bel, de Graaff, and Rey (2017)

Scope of Spatial Analysis

Rey et al. (2022)

Types of Spatial Data

Vector Data

  • Definition: Represents spatial features using points, lines, and polygons.
  • Examples:
    • Points: Locations of cities, schools, or hospitals.
    • Lines: Roads, rivers, or pipelines.
    • Polygons: Land parcels, administrative boundaries, or lakes.
  • Applications: Urban planning, transportation networks, cadastral mapping.
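
The three vector geometry types can be sketched with plain coordinate tuples. This is a minimal illustration using only the Python standard library; the coordinates and the helper names `line_length` and `polygon_area` are hypothetical, and a real project would more likely use a geometry library.

```python
import math

# Hypothetical coordinates in an arbitrary planar (projected) system, in metres.
school = (2.0, 3.0)                                         # point: one (x, y) pair
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]                # line: ordered vertices
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # polygon: ring of vertices


def line_length(vertices):
    """Sum of the straight segment lengths between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula."""
    closed = ring + ring[:1]  # repeat the first vertex to close the ring
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(closed, closed[1:])
    )
    return abs(twice_area) / 2.0


road_length = line_length(road)     # segments of length 5 and 6
parcel_area = polygon_area(parcel)  # a 4 x 3 rectangle
```

Points carry only a location, lines add length, and polygons add area, which is why the geometry type constrains the analyses that make sense for a layer.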

Raster Data

  • Definition: Represents spatial phenomena as a grid of cells or pixels, each with a value representing a specific attribute.
  • Examples:
    • Satellite images, digital elevation models (DEMs), land cover maps.
  • Applications: Environmental monitoring, remote sensing, agricultural analysis.
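
A raster can be sketched as a nested list plus the georeferencing needed to map a coordinate to a cell. The grid values, the corner coordinates, the cell size, and the helper `cell_value` below are all assumptions made for illustration.

```python
# Hypothetical 3 x 4 elevation grid (metres); row 0 is the northern edge.
elevation = [
    [120, 125, 130, 128],
    [118, 122, 127, 126],
    [115, 119, 121, 124],
]
x_min, y_max = 500.0, 1300.0   # assumed map coordinates of the upper-left corner
cell_size = 100.0              # assumed cell width and height in map units


def cell_value(grid, x, y):
    """Return the value of the cell that contains map coordinate (x, y)."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)  # rows count downward from the top edge
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("coordinate lies outside the raster extent")
    return grid[row][col]


value = cell_value(elevation, 730.0, 1120.0)  # falls in row 1, column 2
```

Every location inside a cell returns the same value, so the cell size sets the finest detail the raster can express.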

Attribute Data

  • Definition: Non-spatial information associated with spatial features.
  • Examples:
    • Population data linked to census tracts, land use types associated with parcels.
  • Importance: Provides context and meaning to spatial locations and features.
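
Linking attributes to features usually amounts to a join on a shared identifier. The sketch below uses plain dictionaries; the tract identifiers, geometries, and attribute values are invented for illustration.

```python
# Hypothetical census tracts: geometry keyed by a tract identifier.
tract_geometry = {
    "T001": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "T002": [(2, 0), (4, 0), (4, 2), (2, 2)],
}

# Hypothetical attribute table that shares the same identifiers.
tract_attributes = {
    "T001": {"population": 4200, "land_use": "residential"},
    "T002": {"population": 1500, "land_use": "commercial"},
}

# Join on the shared key so each feature carries both geometry and attributes.
tracts = {
    tract_id: {"geometry": ring, **tract_attributes.get(tract_id, {})}
    for tract_id, ring in tract_geometry.items()
}

total_population = sum(t["population"] for t in tracts.values())
```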

Spatio-Temporal Data

  • Definition: Spatial data that includes a time component, showing how spatial phenomena change over time.
  • Examples:
    • Spread of diseases, changes in land use, migration patterns.
  • Applications: Epidemiology, climate change studies, urban development.

Spatio-Temporal Data

Knaap et al. (2019)

Spatial Data Sources and Acquisition

Remote Sensing

  • Definition: The process of collecting data about the Earth’s surface from a distance, typically using satellites or aircraft.
  • Examples: Landsat, MODIS, LiDAR.
  • Applications: Environmental monitoring, disaster management, agricultural assessments.

Geographic Information Systems (GIS)

  • Definition: A system designed to capture, store, manipulate, analyze, manage, and present spatial or geographic data.
  • Components: Hardware, software, data, methods, and people.
  • Applications: Urban planning, transportation, environmental management.

Global Positioning System (GPS)

  • Definition: A satellite-based navigation system that provides location and time information.
  • Applications: Navigation, mapping, field data collection.

Crowdsourced Data

  • Definition: Data collected from a large number of people, often through mobile devices or online platforms.
  • Examples: OpenStreetMap, social media check-ins.
  • Applications: Disaster response, urban planning, public health monitoring.

Key Concepts in Spatial Data Analysis

Spatial Autocorrelation

  • Definition: The degree to which objects close to each other in space are also similar in other attributes.
  • Examples: Clustered patterns of disease, similar land uses in neighboring areas.
  • Measurement: Moran’s I, Geary’s C.
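
As a rough sketch of how Moran's I is computed: it is the weighted sum of cross-products of deviations from the mean over all pairs of locations, scaled by the number of observations over the sum of weights and divided by the sum of squared deviations. The function name `morans_i`, the four-area layout, and the values are assumptions for illustration; the sketch does no inference (no permutation test or p-value).

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Binary contiguity weights for four hypothetical areas arranged in a row:
# each area neighbours only the areas immediately beside it.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

trend = morans_i([1, 2, 3, 4], w)         # neighbours are similar: positive
checkerboard = morans_i([1, 4, 1, 4], w)  # neighbours are dissimilar: negative
```

Positive values indicate that similar values sit next to each other, negative values indicate alternation, and values near the expectation of -1/(n - 1) suggest no spatial pattern.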

Spatial Scale and Resolution

  • Definition: Scale refers to the extent of the area studied; resolution is the level of detail (the smallest distinguishable unit) at which spatial data are observed or represented.
  • Examples: Global, regional, local scales.
  • Implications: Affects the analysis and interpretation of spatial data.
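
One way to see how resolution affects analysis is to aggregate a fine grid to a coarser one. The sketch below averages square blocks of cells; the helper `coarsen` and the grid values are hypothetical.

```python
def coarsen(grid, factor):
    """Aggregate a raster to coarser resolution by averaging factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical fine-resolution grid with strong local contrasts.
fine = [
    [0, 10, 4, 4],
    [10, 0, 4, 4],
    [2, 2, 9, 1],
    [2, 2, 1, 9],
]
coarse = coarsen(fine, 2)
```

In this example the upper-left and lower-right blocks both average to 5.0, even though their cells range from 0 to 10 and from 1 to 9; the coarser grid hides that local variation.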

Modifiable Areal Unit Problem (MAUP)

  • Definition: The issue that the results of spatial analysis can vary depending on the size (scale effect) and boundaries (zoning effect) of the spatial units used to aggregate the data.
  • Examples: Changing the boundaries of districts can change the outcomes of an analysis.
  • Considerations: Important in the design and interpretation of spatial studies.
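
The effect can be shown with a toy example: the same block-level counts aggregated under two different zonings give different pictures. The block counts, zone labels, and the helper `zone_rates` are invented for illustration.

```python
def zone_rates(cases, population, zone_of_block):
    """Aggregate block counts into zones and return the case rate per zone."""
    totals = {}
    for c, p, zone in zip(cases, population, zone_of_block):
        zone_cases, zone_pop = totals.get(zone, (0, 0))
        totals[zone] = (zone_cases + c, zone_pop + p)
    return {zone: zc / zp for zone, (zc, zp) in totals.items()}


# Six hypothetical blocks in a row, each with 100 residents.
cases = [9, 1, 1, 9, 1, 1]
population = [100] * 6

# Two alternative ways of grouping the same blocks into districts.
zoning_a = ["A1", "A1", "A2", "A2", "A3", "A3"]
zoning_b = ["B1", "B1", "B1", "B2", "B2", "B2"]

rates_a = zone_rates(cases, population, zoning_a)
rates_b = zone_rates(cases, population, zoning_b)
```

Under zoning A the rates are 0.05, 0.05, and 0.01, so one district looks much healthier than the others; under zoning B both districts have the same rate (11 cases per 300 residents) and the contrast disappears, although the underlying data are identical.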

Spatial Interpolation

  • Definition: The process of estimating unknown values at certain locations based on known values at other locations.
  • Examples: Estimating temperature or pollution levels across a region.
  • Methods: Kriging, Inverse Distance Weighting (IDW).
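
Of the two methods, IDW is the simpler to sketch: the estimate at a target location is a weighted average of the known values, with weights proportional to one over the distance raised to a power. The station coordinates, values, the function name `idw`, and the default power of 2 are assumptions for illustration; kriging additionally requires a fitted variogram model and is not shown.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (x, y, value) samples."""
    numerator = denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return float(value)  # target coincides with a sample: return it directly
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical monitoring stations: (x, y, measured value).
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]

estimate = idw(stations, 5.0, 0.0)  # halfway between the first two stations
```

The estimate always lies between the smallest and largest sample values, and a larger `power` makes nearby samples dominate more strongly.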

Conclusion

Recap of Key Points

  • Definitions of EDA and ESDA
  • Types and Sources of Spatial Data
  • Key Concepts in Spatial Data Analysis

Questions

References

Anselin, L. 1999. “Interactive Techniques and Exploratory Spatial Data Analysis.” In Geographical Information Systems: Principles, Techniques, Management and Applications, edited by P. A. Longley, M. Goodchild, D. J. Maguire, and D. W. Rhind, 251–64.
Arribas-Bel, Daniel, Thomas de Graaff, and Sergio J. Rey. 2017. “Looking at John Snow’s Cholera Map from the Twenty First Century: A Practical Primer on Reproducibility and Open Science.” In Regional Research Frontiers - Vol. 2: Methodological Advances, Regional Systems Modeling and Open Sciences, edited by Randall Jackson and Peter Schaeffer, 283–306. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-50590-9_17.
Knaap, Elijah, Wei Kang, Sergio Rey, Levi John Wolf, Renan Xavier Cortes, and Su Han. 2019. “Geosnap: The Geospatial Neighborhood Analysis Package.” Zenodo. https://doi.org/10.5281/ZENODO.3526163.
Rey, Sergio J., Luc Anselin, Pedro Amaral, Dani Arribas-Bel, Renan Xavier Cortes, James David Gaboardi, Wei Kang, et al. 2022. “The PySAL Ecosystem: Philosophy and Implementation.” Geographical Analysis 54 (3): 467–87. https://doi.org/10.1111/gean.12276.
Tukey, J. W. 1977. Exploratory Data Analysis. Reading, MA: Addison-Wesley.