Spatial data is data that contains information about location, shape, and the relationship between objects in space. Unlike ordinary tabular data, spatial data answers not only “what” but also “where”.
Spatial data is the foundation of Geographic Information Systems (GIS), spatial statistics, remote sensing, urban planning, environmental analysis, transportation systems, and modern spatial data science.
Why Spatial Data Matters
Most real-world phenomena are influenced by location.
Examples:
- House prices vary by neighborhood.
- Flood risk depends on elevation and river proximity.
- Traffic congestion differs across roads and regions.
- Disease outbreaks often cluster geographically.
- Population density changes spatially.
This pattern is summarized by Tobler’s First Law of Geography:
“Everything is related to everything else, but near things are more related than distant things.”
Main Components of Spatial Data
Spatial data generally consists of two components:
| Component | Description |
|---|---|
| Spatial Component | Describes location and geometry |
| Attribute Component | Describes characteristics of objects |
Example:
| City | Population | Longitude | Latitude |
|---|---|---|---|
| Jakarta | 10.6 million | 106.8456 | -6.2088 |
| Bandung | 2.5 million | 107.6191 | -6.9175 |
- Longitude and latitude → spatial component
- Population → attribute component
Types of Spatial Data
Spatial data is commonly divided into two major types:
1. Vector Data
Vector data represents discrete objects using geometry.
Geometry Types
| Geometry | Description | Example |
|---|---|---|
| Point | Single location | ATM, school, hospital |
| Line | Connected path | Road, river |
| Polygon | Closed area | Province, lake, land parcel |
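The three geometry types in the table can be sketched with the sf package, which this article uses later on. The coordinates of the line and the polygon are made-up illustrative values, and no CRS is assigned to them.

```r
library(sf)

# Point: a single coordinate pair (here longitude, latitude)
pt <- st_point(c(106.8456, -6.2088))

# Line: an ordered sequence of vertices, one per matrix row
ln <- st_linestring(rbind(c(0, 0), c(1, 1), c(2, 1)))

# Polygon: a closed ring, so the first and last vertex must match
ring <- rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))
pg <- st_polygon(list(ring))

# Bundle the geometries into one geometry column (no CRS set)
geoms <- st_sfc(pt, ln, pg)
print(st_geometry_type(geoms))
```

Mixing geometry types in one column is done here only for display. Real layers usually hold a single type.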
2. Raster Data
Raster data represents space using grids or pixels.
Each cell contains a value representing information such as:
- elevation,
- temperature,
- rainfall,
- land cover,
- reflectance recorded in satellite imagery.
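A raster can be pictured as a matrix of values plus enough metadata to place each cell on the ground. The sketch below uses only base R. The elevation values, cell size, and grid origin are all assumed for illustration. Dedicated packages such as terra or stars would normally handle real rasters.

```r
# Hypothetical 3 x 4 elevation grid in meters (illustrative values)
elev <- matrix(
  c(12, 15, 19, 25,
    10, 13, 18, 22,
     8, 11, 15, 20),
  nrow = 3, byrow = TRUE
)

cell_size <- 30     # assumed cell width and height in meters
x_min <- 500000     # assumed x coordinate of the grid's left edge
y_max <- 9300000    # assumed y coordinate of the grid's top edge

# Cell-center coordinates: columns run west to east, rows north to south
x_center <- x_min + (seq_len(ncol(elev)) - 0.5) * cell_size
y_center <- y_max - (seq_len(nrow(elev)) - 0.5) * cell_size

# Value and location of the cell in row 2, column 3
print(elev[2, 3])
print(c(x = x_center[3], y = y_center[2]))
```

The point of the sketch is that a raster stores no geometry per cell. Location is implied by the row and column index together with the origin and cell size.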
Spatial Data vs Non-Spatial Data
| Non-Spatial Data | Spatial Data |
|---|---|
| Only attributes | Attributes + location |
| No coordinates | Has coordinates |
| No spatial relationships | Spatial relationships matter |
| Traditional statistics | Spatial statistics needed |
Example:
A regular dataset may show average income. A spatial dataset shows:
- average income,
- location,
- neighboring regions,
- spatial clustering.
Coordinate Systems in Spatial Data
Spatial data requires a coordinate system to define locations on Earth.
The two most common types are:
| CRS Type | Example | Typical Units |
|---|---|---|
| Geographic CRS | Latitude/Longitude (e.g., WGS 84) | Degrees |
| Projected CRS | UTM | Meters |
Without a proper Coordinate Reference System (CRS):
- distances become incorrect,
- areas become distorted,
- spatial analysis becomes unreliable.
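As a minimal sketch of why the CRS matters, the code below takes the two cities from the earlier table in longitude/latitude (EPSG:4326) and reprojects them to UTM zone 48S (EPSG:32748) so that distances come out in meters. The choice of UTM zone is an assumption that suits western Java. Other regions need a different projected CRS.

```r
library(sf)

# Two cities in geographic coordinates (WGS 84, EPSG:4326, degrees)
cities <- st_as_sf(
  data.frame(
    city = c("Jakarta", "Bandung"),
    lon  = c(106.8456, 107.6191),
    lat  = c(-6.2088, -6.9175)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# Reproject to UTM zone 48S (EPSG:32748), whose units are meters
cities_utm <- st_transform(cities, 32748)

print(st_is_longlat(cities))      # TRUE: coordinates are in degrees
print(st_is_longlat(cities_utm))  # FALSE: coordinates are projected

# Planar distance in meters between the two projected points
d_m <- st_distance(cities_utm[1, ], cities_utm[2, ])
print(d_m)
```

Computing a plain Euclidean distance on the raw degree values would instead give a number in degrees, which is not a usable distance.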
Spatial Relationships
What sets spatial data apart is that the relative positions of objects carry information.
Important spatial relationships include:
| Relationship | Meaning |
|---|---|
| Distance | How far apart objects are |
| Adjacency | Whether regions share boundaries |
| Containment | Whether an object lies inside another |
| Connectivity | Whether objects are linked |
Example:
- houses near city centers often cost more,
- regions near rivers may have higher flood risk.
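Three of the relationships in the table (distance, adjacency, containment) map onto sf functions. The sketch below uses two made-up unit squares that share an edge and one point. The `square()` helper is hypothetical and exists only for this example. Connectivity is not shown because it usually calls for a network data structure.

```r
library(sf)

# Hypothetical helper for this example: an axis-aligned square polygon
square <- function(x0, y0, size = 1) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + size, y0), c(x0 + size, y0 + size),
    c(x0, y0 + size), c(x0, y0)
  )))
}

# Two regions sharing an edge, plus one point inside the first region
regions <- st_sfc(square(0, 0), square(1, 0))
pt <- st_sfc(st_point(c(0.5, 0.5)))

# Adjacency: do the regions share a boundary without overlapping?
adjacent <- st_touches(regions[1], regions[2], sparse = FALSE)[1, 1]

# Containment: which regions contain the point?
contains <- st_contains(regions, pt, sparse = FALSE)[, 1]

# Distance: from the point to each region (0 when the point is inside)
dists <- as.numeric(st_distance(pt, regions))

print(adjacent)
print(contains)
print(dists)
```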
Spatial Dependence
One of the most important characteristics of spatial data is spatial dependence.
Nearby observations tend to be more similar than distant observations.
Examples:
- neighboring districts often have similar poverty rates,
- nearby houses tend to have similar prices,
- adjacent regions may share climate patterns.
This violates the assumption of independent observations that classical methods such as ordinary least squares (OLS) regression rely on.
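Spatial dependence is commonly summarized with Moran's I, a statistic this article lists later among the specialized methods. The base R sketch below computes it by hand for five hypothetical districts arranged in a row, using binary neighbor weights. The values and the neighbor structure are assumptions made for illustration. Packages such as spdep offer tested implementations with significance tests.

```r
# Hypothetical values for five districts arranged in a row
x <- c(10, 12, 11, 30, 32)

# Binary contiguity weights: w[i, j] = 1 when districts i and j are neighbors
n <- length(x)
w <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  w[i, i + 1] <- 1
  w[i + 1, i] <- 1
}

# Moran's I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2),
# where z = x - mean(x) and S0 is the sum of all weights
z <- x - mean(x)
s0 <- sum(w)
morans_i <- (n / s0) * sum(w * outer(z, z)) / sum(z^2)
print(morans_i)
```

For these values the statistic comes out at roughly 0.45. A positive value indicates that neighboring districts tend to have similar values, and a value near the null expectation of -1/(n - 1) would suggest no spatial pattern.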
Spatial Heterogeneity
Spatial processes often vary across locations.
This phenomenon is called spatial heterogeneity.
Example:
- distance to the central business district (CBD) may strongly affect house prices in urban areas,
- but have only a weak influence in rural regions.
This is one reason why methods such as GWR (Geographically Weighted Regression) are important.
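The core idea behind GWR can be sketched in base R: fit a separate weighted regression at each location of interest, giving nearby observations more weight. This is a simplified illustration on simulated data, not a full GWR implementation. The Gaussian kernel, the fixed bandwidth of 1, and all variable names are assumptions. Packages such as GWmodel or spgwr provide complete versions with bandwidth selection.

```r
set.seed(42)

# Simulated data: the effect of distance to the CBD on price is strong
# in the west (x_coord < 5) and weak in the east
n <- 400
x_coord <- runif(n, 0, 10)
y_coord <- runif(n, 0, 10)
dist_cbd <- runif(n, 0, 20)
slope <- ifelse(x_coord < 5, -3, -0.5)
price <- 100 + slope * dist_cbd + rnorm(n, sd = 2)

# Weighted regression centered on one location (px, py).
# Observations closer to the location get larger Gaussian kernel weights.
local_slope <- function(px, py, bandwidth = 1) {
  d <- sqrt((x_coord - px)^2 + (y_coord - py)^2)
  k <- exp(-0.5 * (d / bandwidth)^2)
  fit <- lm(price ~ dist_cbd, weights = k)
  unname(coef(fit)["dist_cbd"])
}

west <- local_slope(2, 5)
east <- local_slope(8, 5)
print(c(west = west, east = east))
```

A single global regression would report one averaged slope and hide the difference between the two halves of the study area.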
Common Spatial Data Formats
| Format | Description |
|---|---|
| Shapefile (.shp) | Classic vector format |
| GeoJSON | JSON-based vector format, common in web mapping |
| GeoPackage (.gpkg) | Modern spatial container |
| GeoTIFF | Raster format |
| WKT | Text representation of geometry |
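Two of these formats can be tried out with sf without any external data. The sketch below parses a WKT string, converts it back to text, and round-trips a one-row layer through a GeoPackage in a temporary file. The city name and the file location are illustrative choices.

```r
library(sf)

# Parse a WKT string into a geometry column with an explicit CRS
geom <- st_as_sfc("POINT (106.8456 -6.2088)", crs = 4326)

# Convert the geometry back to its WKT text form
wkt <- st_as_text(geom)
print(wkt)

# Attach an attribute and write the layer to a temporary GeoPackage
jakarta <- st_sf(city = "Jakarta", geometry = geom)
path <- tempfile(fileext = ".gpkg")
st_write(jakarta, path, quiet = TRUE)

# Read it back; the driver is inferred from the file extension
restored <- st_read(path, quiet = TRUE)
print(restored$city)
```

Writing to GeoJSON or a Shapefile follows the same pattern with a different file extension, assuming the underlying GDAL build includes those drivers.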
Applications of Spatial Data
Spatial data is used in many fields:
| Field | Example |
|---|---|
| Urban Planning | Land-use analysis |
| Transportation | Route optimization |
| Real Estate | Property valuation |
| Environment | Deforestation monitoring |
| Public Health | Disease mapping |
| Agriculture | Precision farming |
| Disaster Management | Flood mapping |
Spatial Data in R
R has become one of the most powerful environments for spatial analysis.
The most widely used modern package for vector data is sf, which stores geometries alongside attributes in a data-frame-like object.
Reading Spatial Data in R
Install Packages
install.packages("sf")
Load Package
library(sf)
Read a Shapefile
library(sf)
data <- st_read("districts.shp")
print(data)
Display Spatial Data
plot(st_geometry(data))
Creating Spatial Point Data in R
library(sf)
df <- data.frame(
  city = c("Jakarta", "Bandung"),
  lon = c(106.8456, 107.6191),
  lat = c(-6.2088, -6.9175)
)
points_sf <- st_as_sf(
  df,
  coords = c("lon", "lat"),
  crs = 4326
)
plot(st_geometry(points_sf), pch = 16)
Why Spatial Data Requires Special Methods
Traditional statistical methods assume:
- observations are independent,
- relationships are constant across space.
Spatial data often violates these assumptions because:
- nearby observations influence each other,
- relationships vary geographically.
As a result, spatial analysis requires specialized methods such as:
- Moran’s I,
- Spatial Regression,
- Geographically Weighted Regression (GWR),
- Kriging,
- Spatial Econometrics.
Conclusion
Spatial data is data that contains geographic location and spatial relationships.
Unlike ordinary data, spatial data allows analysts to understand:
- where phenomena occur,
- how locations interact,
- how spatial patterns emerge.
Spatial data is the foundation of:
- GIS,
- spatial statistics,
- remote sensing,
- spatial econometrics,
- and modern spatial data science.
As spatial technologies continue to grow, understanding spatial data has become an essential skill in data analysis, research, and decision-making.



