Moran’s I Explained - Spatial Statistics Made Simple

Moran’s I is one of the most widely used statistics for measuring spatial autocorrelation. It evaluates whether similar values cluster together across geographic space. In simple terms:

Moran’s I measures whether nearby locations tend to have similar or dissimilar values.

It is a fundamental method in:

GIS,
spatial statistics,
spatial econometrics,
spatial data science.

Why Moran’s I Is Important

Many traditional statistical methods assume that observations are independent. Spatial data often violate this assumption because nearby locations tend to influence one another. Examples:

neighboring districts may have similar poverty rates,
nearby houses often have similar prices,
adjacent regions may share climate conditions.

Moran’s I helps quantify these spatial relationships.

Moran’s I Formula

The global Moran’s I statistic is defined as:

$I=\frac{n}{\sum_i\sum_j w_{ij}}\frac{\sum_i\sum_j w_{ij}(x_i-\bar{x})(x_j-\bar{x})}{\sum_i(x_i-\bar{x})^2}$

Where:

Symbol	Description
$n$	Number of observations
$x_i$	Value at location $i$
$x_j$	Value at neighboring location $j$
$\bar{x}$	Mean of all values
$w_{ij}$	Spatial weight between locations $i$ and $j$

Understanding the Formula

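Moran’s I is a weighted cross-product of deviations from the mean. For each pair of locations $i$ and $j$, the product $(x_i-\bar{x})(x_j-\bar{x})$ is multiplied by the weight $w_{ij}$, so only pairs defined as neighbors contribute to the numerator. The denominator scales the result by the overall variability of the data.

If neighboring observations tend to deviate from the mean in the same direction (both above or both below), the products are positive and the statistic becomes positive. If neighbors tend to deviate in opposite directions, the products are negative and the statistic becomes negative.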
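As a minimal sketch of the formula in base R (the function name morans_i and the four-location example are illustrative assumptions, not part of any package):

```r
# Global Moran's I computed directly from the formula.
# x: numeric vector of values; W: n x n spatial weight matrix (zero diagonal).
morans_i <- function(x, W) {
  n   <- length(x)
  z   <- x - mean(x)             # deviations from the mean
  num <- sum(W * outer(z, z))    # sum_i sum_j w_ij * z_i * z_j
  den <- sum(z^2)                # sum_i z_i^2
  (n / sum(W)) * num / den
}

# Hypothetical example: four locations on a line, each linked to its
# immediate neighbors (1-2, 2-3, 3-4).
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W[cbind(2:4, 1:3)] <- 1

smooth      <- c(1, 2, 3, 4)   # neighbors have similar values
alternating <- c(1, 4, 1, 4)   # neighbors have dissimilar values

morans_i(smooth, W)        # 0.3333333 (positive)
morans_i(alternating, W)   # -1 (negative)
```

The smooth sequence yields a positive value because adjacent deviations share a sign, while the alternating sequence yields a negative value because every neighboring pair lies on opposite sides of the mean.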

Interpretation of Moran’s I

Moran’s I Value	Interpretation
$I > 0$	Positive spatial autocorrelation
$I < 0$	Negative spatial autocorrelation
$I \approx 0$	Spatial randomness

Moran’s I Value Range

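Moran’s I usually falls in the range $-1 \leq I \leq 1$, but this is not a strict bound: the attainable minimum and maximum are determined by the weight matrix, and values slightly outside this range can occur.

Actual values depend on: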

data distribution,
spatial configuration,
weight matrix structure.
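To illustrate how the weight matrix shapes the attainable range, the sketch below bounds Moran’s I using the eigenvalues of the centered, symmetrized weight matrix. The four-region binary adjacency (A-B, A-C, B-D) is a hypothetical example, and the bounds hold for this matrix only.

```r
# Hypothetical binary weight matrix for four regions A, B, C, D.
W <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 0,
              0, 1, 0, 0), nrow = 4, byrow = TRUE)
n <- nrow(W)

M  <- diag(n) - matrix(1 / n, n, n)   # centering matrix: M %*% x = x - mean(x)
Ws <- (W + t(W)) / 2                  # symmetric part of W

# Moran's I is a Rayleigh quotient of M Ws M scaled by n / sum(W),
# so it cannot fall outside the scaled eigenvalue range.
ev <- eigen(M %*% Ws %*% M, symmetric = TRUE, only.values = TRUE)$values
bounds <- (n / sum(W)) * range(ev)

round(bounds, 3)   # -1.079  0.412
```

For this configuration the statistic can drop slightly below -1 but can never come close to +1, which is why the same numeric value of Moran’s I is not directly comparable across different weight matrices.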

Spatial Weight Matrix

Moran’s I requires a spatial weight matrix that defines neighborhood relationships. Examples:

shared boundaries,
geographic distance,
nearest neighbors.

Spatial Weight Matrix Example

Suppose four regions:

Region	Neighbor
A	B, C
B	A, D
C	A
D	B

The spatial weight matrix may look like:

$W=\begin{bmatrix}0&1&1&0\\1&0&0&1\\1&0&0&0\\0&1&0&0\end{bmatrix}$
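The neighbor table and matrix above can be reproduced in base R. This is a small sketch; the object names are illustrative, and the row-standardized version at the end is an added step that many analyses use so that each row sums to 1.

```r
regions   <- c("A", "B", "C", "D")
neighbors <- list(A = c("B", "C"), B = c("A", "D"), C = "A", D = "B")

# Binary weight matrix: 1 if the column region is a neighbor of the row region.
W <- matrix(0, 4, 4, dimnames = list(regions, regions))
for (r in regions) W[r, neighbors[[r]]] <- 1
W

isSymmetric(W)   # TRUE: adjacency is mutual

# Row-standardized weights: each row is divided by its number of neighbors.
W_row <- W / rowSums(W)
rowSums(W_row)   # every row sums to 1
```

After row standardization, region A gives a weight of 0.5 to each of B and C, while C gives its single neighbor A a weight of 1.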

Common Types of Spatial Weights

Method	Description
Queen Contiguity	Neighbors share an edge or a corner
Rook Contiguity	Neighbors share an edge only
Distance-Based	Neighbors lie within a distance threshold
k-Nearest Neighbor	Neighbors are the k closest locations
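The two point-based methods in the table can be sketched in base R without any spatial package. The coordinates, the threshold of 2, and k = 2 below are hypothetical values chosen for illustration.

```r
# Hypothetical point coordinates in projected units (e.g. meters).
coords <- cbind(x = c(0, 1, 2, 5, 6), y = c(0, 0, 1, 5, 5))
D <- as.matrix(dist(coords))   # pairwise Euclidean distances
n <- nrow(D)

# Distance-based: neighbors are all other points within the threshold.
threshold <- 2
W_dist <- (D > 0 & D <= threshold) * 1

# k-nearest neighbor: each point's k closest points.
k <- 2
W_knn <- matrix(0, n, n)
for (i in seq_len(n)) {
  nearest <- order(D[i, ])[2:(k + 1)]   # position 1 is the point itself
  W_knn[i, nearest] <- 1
}

rowSums(W_dist)      # number of neighbors varies by point
rowSums(W_knn)       # always k
isSymmetric(W_knn)   # FALSE: being someone's nearest neighbor is not mutual
```

Distance-based weights can leave isolated points with few or no neighbors, whereas k-nearest-neighbor weights guarantee k neighbors per point at the cost of an asymmetric matrix.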

Expected Value of Moran’s I

Under spatial randomness:

$E(I)=-\frac{1}{n-1}$

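This means that Moran’s I is not centered exactly at zero: its expected value under spatial randomness is slightly negative and approaches zero as $n$ grows. For example, with $n = 10$ the expected value is $-1/9 \approx -0.11$.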
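The expected value can be checked numerically by assigning the same values to locations in every possible order and averaging the resulting statistics. The sketch below does this for a hypothetical four-region example; morans_i and all_perms are illustrative helper functions, and full enumeration is only practical for very small n.

```r
morans_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# All permutations of a vector (n! results, so only for tiny n).
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Hypothetical four-region binary weight matrix and values.
W <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 0,
              0, 1, 0, 0), nrow = 4, byrow = TRUE)
x <- c(12, 7, 3, 9)

I_perm <- sapply(all_perms(x), morans_i, W = W)
length(I_perm)   # 24 permutations
mean(I_perm)     # -0.3333333, i.e. -1 / (n - 1) with n = 4
```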

Hypothesis Testing

Moran’s I is commonly used with hypothesis testing.

Null Hypothesis

$H_0:\ \text{No spatial autocorrelation}$

Alternative Hypothesis

$H_1:\ \text{Spatial autocorrelation exists}$
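One common way to test these hypotheses is a Monte Carlo permutation test: the observed statistic is compared with the values obtained after randomly shuffling the data across locations. The base R sketch below assumes a hypothetical 5 x 5 grid with rook contiguity and a variable that increases with the row index; the seed and the 999 permutations are arbitrary choices.

```r
set.seed(42)   # arbitrary seed so the shuffles are reproducible

morans_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Hypothetical 5 x 5 grid; cells are rook neighbors when they are
# exactly one step apart horizontally or vertically.
grid <- expand.grid(row = 1:5, col = 1:5)
W <- (abs(outer(grid$row, grid$row, "-")) +
      abs(outer(grid$col, grid$col, "-")) == 1) * 1

x <- grid$row   # values increase with the row index: a strong spatial trend

I_obs <- morans_i(x, W)   # 0.75
I_sim <- replicate(999, morans_i(sample(x), W))

# One-sided pseudo p-value for positive spatial autocorrelation.
p_value <- (sum(I_sim >= I_obs) + 1) / (999 + 1)
p_value
```

A small pseudo p-value means that very few random arrangements produce a statistic as large as the observed one, which is evidence against the null hypothesis.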

Moran Scatter Plot

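The Moran scatter plot shows each observation’s standardized value on the horizontal axis against its spatial lag (the weighted average of its neighbors’ values) on the vertical axis. With row-standardized weights, the slope of the fitted line equals the global Moran’s I.

It helps identify clusters, outliers, and spatial dependence patterns.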

[Figure: Moran scatter plot illustration]

Quadrants of Moran Scatter Plot

Quadrant	Meaning
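High-High (HH)	High value surrounded by high values (cluster of high values)
Low-Low (LL)	Low value surrounded by low values (cluster of low values)
High-Low (HL)	High value surrounded by low values (spatial outlier)
Low-High (LH)	Low value surrounded by high values (spatial outlier)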
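The quadrant labels in the table can be derived by comparing each standardized value with its spatial lag. The sketch below uses a hypothetical four-region example with row-standardized weights; the values and object names are assumptions made for illustration.

```r
regions <- c("A", "B", "C", "D")
W <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 0,
              0, 1, 0, 0), nrow = 4, byrow = TRUE,
            dimnames = list(regions, regions))
W_row <- W / rowSums(W)                 # row-standardized weights

x   <- c(A = 10, B = 8, C = 9, D = 2)   # hypothetical values
z   <- (x - mean(x)) / sd(x)            # standardized values (x-axis)
lag <- as.vector(W_row %*% z)           # average of neighbors' z (y-axis)

quadrant <- ifelse(z >= 0 & lag >= 0, "HH",
            ifelse(z <  0 & lag <  0, "LL",
            ifelse(z >= 0 & lag <  0, "HL", "LH")))
quadrant   # A: HH, B: HL, C: HH, D: LH

# The slope of the regression of the lag on z equals global Moran's I
# when the weights are row-standardized.
slope <- unname(coef(lm(lag ~ z))[2])
slope
```

Region B is labeled HL because its own value is above the mean while the average of its neighbors (A and D) is pulled below the mean by D.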

Moran’s I in R

R provides powerful tools for spatial autocorrelation analysis. Common packages:

sf
spdep
tmap

Install Required Packages

install.packages(c("sf", "spdep"))

Load Packages

library(sf)
library(spdep)

Read Spatial Data

data <- st_read("districts.shp")

Suppose the dataset contains district boundaries and a poverty rate variable named poverty_rate.

Visualize Spatial Data

plot(data["poverty_rate"])

[Figure: Example choropleth map]

Create Neighbor Structure

neighbors <- poly2nb(data)

This creates spatial neighbors based on polygon adjacency. By default, poly2nb uses queen contiguity (queen = TRUE); set queen = FALSE for rook contiguity.

Create Spatial Weights

weights <- nb2listw(neighbors, style = "W")  # "W" = row-standardized weights (the default)

Compute Moran’s I

moran.test(data$poverty_rate, weights)

Example Output

Moran I statistic standard deviate = 5.12
p-value = 0.000002
Moran's I = 0.47

Interpretation

The results indicate:

Result	Interpretation
Moran’s I = 0.47	Moderate positive spatial autocorrelation
p-value < 0.05	Spatial clustering is statistically significant

Meaning: nearby districts tend to have similar poverty rates.
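Because districts.shp is not available here, the following sketch runs the same workflow on a synthetic 5 x 5 grid built with spdep’s cell2nb, and adds a permutation-based test with moran.mc alongside moran.test. The grid, the variable x, the seed, and nsim = 999 are all assumptions made for illustration.

```r
library(spdep)
set.seed(1)   # arbitrary seed for the permutation test

nb <- cell2nb(5, 5, type = "rook")   # 5 x 5 grid with rook contiguity
lw <- nb2listw(nb, style = "W")      # row-standardized weights

# Hypothetical variable: row index + column index, so values rise smoothly
# toward one corner of the grid.
x <- as.vector(outer(1:5, 1:5, "+"))

analytic <- moran.test(x, lw)             # test based on analytical moments
permuted <- moran.mc(x, lw, nsim = 999)   # test based on random permutations

analytic$estimate[["Moran I statistic"]]  # observed Moran's I
analytic$estimate[["Expectation"]]        # -1 / (n - 1) = -1/24
analytic$p.value
permuted$p.value
```

When the two p-values lead to the same conclusion, as they should for a pattern this strong, the result does not hinge on the distributional assumptions of the analytical test.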

Moran Scatter Plot in R

moran.plot(data$poverty_rate, weights)

Local Moran’s I

Global Moran’s I summarizes the overall spatial pattern in a single number. However, it does not show where clusters or hotspots occur. For that purpose, Local Moran’s I is used. It is the best-known Local Indicator of Spatial Association (LISA) and gives one value per location.

Compute Local Moran’s I

local_moran <- localmoran(data$poverty_rate, weights)
data$Ii <- local_moran[, 1]  # first column holds the local Moran's I values
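To show what the local statistic measures, the sketch below computes it by hand in base R under one common definition: each location’s deviation from the mean multiplied by the weighted average deviation of its neighbors, scaled by the variance. The four-region weights and values are hypothetical, and this is an illustration of the idea rather than a replacement for localmoran, which also returns expectations, variances, and p-values.

```r
W <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 0,
              0, 1, 0, 0), nrow = 4, byrow = TRUE)
W_row <- W / rowSums(W)     # row-standardized weights

x  <- c(10, 8, 9, 2)        # hypothetical values for regions A, B, C, D
n  <- length(x)
z  <- x - mean(x)           # deviations from the mean
m2 <- sum(z^2) / n          # variance with n in the denominator

# Local Moran's I: own deviation times the neighbors' average deviation.
Ii <- (z / m2) * as.vector(W_row %*% z)
sign(Ii)   # 1 -1  1 -1: A and C resemble their neighbors, B and D do not

# The local values add up to the global statistic (after scaling).
global_I <- (n / sum(W_row)) * sum(W_row * outer(z, z)) / sum(z^2)
sum(Ii) / sum(W_row)   # equals global_I
```

Positive local values mark locations that resemble their neighbors, and negative values mark locations that differ from them.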

Visualize Local Moran’s I

plot(data["Ii"])

[Figure: LISA cluster map example]

Moran’s I and Regression Analysis

Moran’s I is often used to test for spatial dependence in regression residuals. If the residuals show spatial autocorrelation, the OLS assumption of independent errors is violated and inference (standard errors, p-values) becomes unreliable.

This motivates spatial regression models such as the Spatial Lag Model (SLM), the Spatial Error Model (SEM), and Geographically Weighted Regression (GWR).
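The sketch below illustrates the residual check on simulated data: a spatially structured variable is left out of the first model, so its residuals inherit the spatial pattern. The grid, coefficients, and seed are hypothetical. The plain formula is applied to the residuals for illustration only; spdep’s lm.morantest is designed for this purpose and adjusts the test for the fitted model.

```r
set.seed(123)   # arbitrary seed for the simulated data

morans_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Hypothetical 6 x 6 grid with rook contiguity.
grid <- expand.grid(row = 1:6, col = 1:6)
W <- (abs(outer(grid$row, grid$row, "-")) +
      abs(outer(grid$col, grid$col, "-")) == 1) * 1

x1    <- rnorm(36)       # ordinary predictor with no spatial pattern
trend <- 2 * grid$row    # spatially structured driver of y
y     <- 1 + 0.5 * x1 + trend + rnorm(36, sd = 0.5)

fit_omitted <- lm(y ~ x1)           # spatial driver left out
fit_full    <- lm(y ~ x1 + trend)   # spatial driver included

I_omitted <- morans_i(residuals(fit_omitted), W)
I_full    <- morans_i(residuals(fit_full), W)
c(omitted = I_omitted, full = I_full)
```

The residuals of the first model should be strongly autocorrelated, while those of the second should be much closer to the value expected under spatial randomness.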

Common Mistakes

1. Ignoring Coordinate Systems

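Distance calculations depend on a proper coordinate reference system (CRS). Distance-based weights computed from unprojected longitude/latitude degrees as if they were planar coordinates can be misleading, so use an appropriate projected CRS or geodesic distances.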

2. Using Incorrect Spatial Weights

Different neighborhood definitions may produce different results.

3. Misinterpreting Clustering

Spatial clustering does not automatically imply causality.
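To illustrate the second mistake, the sketch below computes Moran’s I for the same data under rook and queen contiguity. The 5 x 5 grid and the variable are hypothetical, and binary (unstandardized) weights are used for simplicity.

```r
morans_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Hypothetical 5 x 5 grid of cells.
grid <- expand.grid(row = 1:5, col = 1:5)
dr <- abs(outer(grid$row, grid$row, "-"))   # row distance between cells
dc <- abs(outer(grid$col, grid$col, "-"))   # column distance between cells

W_rook  <- (dr + dc == 1) * 1          # shared edge only
W_queen <- (pmax(dr, dc) == 1) * 1     # shared edge or corner

x <- grid$row   # values increase with the row index

morans_i(x, W_rook)    # 0.75
morans_i(x, W_queen)   # 0.6388889
```

The data are identical in both calls; only the neighborhood definition changes, so the weight specification should be reported alongside any Moran’s I value.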

Applications of Moran’s I

Field	Application
Real Estate	House price clustering
Epidemiology	Disease hotspots
Urban Planning	Regional inequality
Environment	Pollution clustering
Agriculture	Soil variability
Transportation	Traffic congestion

Moran’s I vs Traditional Correlation

Traditional Correlation	Moran’s I
Relationship between variables	Relationship across geographic space
No spatial component	Explicit spatial structure
Independent observations assumed	Spatial dependence considered

Conclusion

Moran’s I is one of the most important statistics in spatial analysis. It measures whether geographic observations are clustered, dispersed, or randomly distributed. Understanding Moran’s I is essential for GIS, spatial statistics, spatial econometrics, and spatial data science. It provides a foundation for hotspot detection, spatial regression, spatial clustering, and advanced spatial modeling.