We were modelling the Public Health Development Index (PHDI) using the Geographically Weighted Regression which generate distinct regression models for each specific location. In the previous article, we have done step 1 of Geographically Weighted Regression (GWR) for Community Health Index which was Initial Exploration: Visualize your data on a map to look for clear patterns. Now we’re going to Step 2 which is Test of Spatial Dependence (Global Moran’s I) on the Dependent Variable: To see if there are clusters of values on the variable you want to explain.
We calculate the Moran’s index using various weight matrices, both distance-based and neighborhood-based. The most popular distance weight matrices are Key Nearest Neighbor (KNN), Radial Distance Weight, Power Distance Weight, and Exponential Distance Weight. Neighborhood weight matrices include Rook Contiguity and Queen Contiguity. The weight matrix that produces the highest Moran’s index value will be selected for testing.
To calculate the weight matrix based on distance, we need to determine the coordinates of each location. The longitude and latitude coordinates can be extracted from the geometry attributes in the .shp file, which can be downloaded in the previous article.
#Extract centroid for each MULTIPOLYGON
centroids_sf <- st_centroid(mapx)
centroid_coords <- st_coordinates(centroids_sf)
head(centroid_coords)
#result:
X Y
[1,] 106.5794 -5.700150
[2,] 106.8100 -6.272549
[3,] 106.9003 -6.255373
[4,] 106.8351 -6.181230
[5,] 106.7483 -6.165469
[6,] 106.8695 -6.129185
next, we can merge the coordinates into the variable mapx.
mapx$long<-centroid_coords[,1]
mapx$lat<-centroid_coords[,2]
mapx
#result:
Simple feature collection with 119 features and 22 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 105.0998 ymin: -8.78036 xmax: 116.2702 ymax: -5.048857
Geodetic CRS: WGS 84
First 10 features:
PROVNO KABKOTNO PROVINSI KABKOT ID2013 geometry Y X1 X2 X3 long lat
1 31 01 DKI JAKARTA KEPULAUAN SERIBU 3101 MULTIPOLYGON (((106.3842 -5... 0.6385 85.00 17.971 6 106.5794 -5.700150
2 31 71 DKI JAKARTA JAKARTA SELATAN 3171 MULTIPOLYGON (((106.8491 -6... 0.6737 90.00 30.138 72 106.8100 -6.272549
3 31 72 DKI JAKARTA JAKARTA TIMUR 3172 MULTIPOLYGON (((106.9719 -6... 0.6804 91.00 31.450 84 106.9003 -6.255373
4 31 73 DKI JAKARTA JAKARTA PUSAT 3173 MULTIPOLYGON (((106.8788 -6... 0.6503 90.00 20.456 35 106.8351 -6.181230
5 31 74 DKI JAKARTA JAKARTA BARAT 3174 MULTIPOLYGON (((106.711 -6.... 0.6494 90.00 24.460 73 106.7483 -6.165469
6 31 75 DKI JAKARTA JAKARTA UTARA 3175 MULTIPOLYGON (((106.969 -6.... 0.6365 90.00 28.309 45 106.8695 -6.129185
7 32 01 JAWA BARAT BOGOR 3201 MULTIPOLYGON (((106.994 -6.... 0.6098 81.06 16.918 101 106.7675 -6.559979
8 32 02 JAWA BARAT SUKABUMI 3202 MULTIPOLYGON (((106.9652 -6... 0.6057 74.58 13.788 58 106.7076 -7.076208
9 32 03 JAWA BARAT CIANJUR 3203 MULTIPOLYGON (((107.2843 -6... 0.6016 93.62 10.065 45 107.1578 -7.133713
10 32 04 JAWA BARAT BANDUNG 3204 MULTIPOLYGON (((107.75 -6.8... 0.6065 77.81 15.152 62 107.6108 -7.099969
we van also save the centroids as a new shapefile (optional):
st_write(centroids_sf, "D:/javamap/jawawithcentroids.shp")
Plot the centroid and calculate the distance.
plot(centroid_coords)

djarak<-dist(longlat)
m.djarak<-as.matrix(djarak)
K-Nearest Neighbor Weight
#k=3
W.knn<-knn2nb(knearneigh(longlat,k=3,longlat=TRUE))
W.knn
#result:
Neighbour list object:
Number of regions: 119
Number of nonzero links: 357
Percentage nonzero weights: 2.521008
Average number of links: 3
Non-symmetric neighbours list
W.knn1 <- nb2listw(W.knn,style='W')
W.knn1
#result:
Radial Distance Weight
#d=110, The dmax threshold value is set at 110 km
W.dmax<-dnearneigh(longlat,0,110,longlat=TRUE)
W.dmax.s <- nb2listw(W.dmax,style='W')
#W is row standardised (sums over all links to n).
W.dmax.s
Power Distance Weight / Invers Distance Weight
alpha1=1
W.idw <-1/(m.djarak^alpha1)
#normalization
#dinormalisasi
diag(W.idw)<-0
rtot<-rowSums(W.idw,na.rm=TRUE)
W.idw.sd<-W.idw/rtot #row-normalized
rowSums(W.idw.sd,na.rm=TRUE)
W.idw.1 = mat2listw(W.idw.sd,style='W')
summary(W.idw.1)
alpha1=2
W.idw2 <-2/(m.djarak^alpha1)
diag(W.idw2)<-0
rtot<-rowSums(W.idw2,na.rm=TRUE)
W.idw.sd2<-W.idw2/rtot #row-normalized
rowSums(W.idw.sd2,na.rm=TRUE)
W.idw.22 = mat2listw(W.idw.sd2,style='W')
summary(W.idw.22)
Exponential Distance Weight
alpha=1
W.e<-exp((-alpha)*m.djarak)
diag(W.e)<-0
rtot<-rowSums(W.e,na.rm=TRUE)
W.e.sd<-W.e/rtot #row-normalized
rowSums(W.e.sd,na.rm=TRUE)
W.ed1 = mat2listw(W.e.sd,style='W')
summary(W.ed1)
#for alpha=2
alpha=2
W.e2<-exp((-alpha)*m.djarak)
diag(W.e2)<-0
rtot<-rowSums(W.e2,na.rm=TRUE)
W.e.sd2<-W.e2/rtot #row-normalized
rowSums(W.e.sd2,na.rm=TRUE)
W.ed2 = mat2listw(W.e.sd2,style='W')
summary(W.ed2)
Matriks Bobot berdasarkan ketetanggaan (Spatial Contiguity Weight)
Rook Contiguity
sf::sf_use_s2(FALSE)
# Spherical geometry (s2) switched off
W.rook<-poly2nb(mapx,queen=FALSE)
W.rook
#result:
Neighbour list object:
Number of regions: 119
Number of nonzero links: 518
Percentage nonzero weights: 3.657934
Average number of links: 4.352941
1 region with no links:
1
3 disjoint connected subgraphs
W.rook.s <- nb2listw(W.rook,style='W', zero.policy = TRUE) #W is row standardised (sums over all links to n). Standardisasi Baris
print(W.rook.s, zero.policy=TRUE)
Queen Contiguity
W.queen<-poly2nb(peta,queen=TRUE)
W.queen
#result:
Neighbour list object:
Number of regions: 119
Number of nonzero links: 522
Percentage nonzero weights: 3.68618
Average number of links: 4.386555
1 region with no links:
1
3 disjoint connected subgraphs
W.queen.s <- nb2listw(W.queen,style='W', zero.policy = TRUE) #W is row standardised (sums over all links to n).
print(W.queen.s, zero.policy=TRUE)
Now we choose weight matrices
#KNN
MI.knn <- moran(dfx$Y, W.knn1, n=length(W.knn1$neighbours), S0=Szero(W.knn1))
MI.knn
#result:
$I
[1] 0.480363
$K
[1] 3.678483
#Radial Distance Weight
MI.radial <- moran(dfx$Y, W.dmax.s, n=length(W.dmax.s$neighbours), S0=Szero(W.dmax.s))
MI.radial
#result:
$I
[1] 0.2453089
$K
[1] 3.678483
#Power Distance Weight
MI.power1 <- moran(dfx$Y, W.idw.1, n=length(W.idw.1$neighbours), S0=Szero(W.idw.1))
MI.power1
#result:
$I
[1] 0.1323094
$K
[1] 3.678483
MI.power2 <- moran(dfx$Y, W.idw.22, n=length(W.idw.22$neighbours), S0=Szero(W.idw.22))
MI.power2
#result:
$I
[1] 0.3299451
$K
[1] 3.678483
#Exponential Distance Weight
MI.exp1 <- moran(dfx$Y, W.ed1, n=length(W.ed1$neighbours), S0=Szero(W.ed1))
MI.exp1
#result:
$I
[1] 0.1238541
$K
[1] 3.678483
MI.exp2 <- moran(dfx$Y, W.ed2, n=length(W.ed2$neighbours), S0=Szero(W.ed2))
MI.exp2
#result:
$I
[1] 0.2328439
$K
[1] 3.678483
#Rook Contiguity
MI.rook <- moran.test(dfx$Y, W.rook.s, randomisation = TRUE, zero.policy = TRUE)
MI.rook$estimate
#result:
Moran I statistic Expectation Variance
0.505966592 -0.008547009 0.004569424
#Queen Contiguity
MI.queen <- moran.test(dfx$Y, W.queen.s, randomisation = TRUE, zero.policy = TRUE)
MI.queen$estimate
#result:
Moran I statistic Expectation Variance
0.504911494 -0.008547009 0.004549995
Comparison of Moran’s Index Values
moranindeks<-data.frame(
"Matriks Bobot"=c("KNN (k=5)", "Radial distance weight", "Power distance weight (alpha=1)", "Power distance weight (alpha=2)", "Exponential distance weight (alpha=1)", "Exponential distance weight (alpha=2)", "Rook Contiguity", "Queen Contiguity"),
"Nilai Indeks Moran"=c(MI.knn$I, MI.radial$I, MI.power1$I, MI.power2$I, MI.exp1$I, MI.exp2$I, -0.008530337, -0.010844530))
moranindeks
#result:
Matriks.Bobot
<chr>
Nilai.Indeks.Moran
<dbl>
KNN (k=5) 0.480363044
Radial distance weight 0.245308916
Power distance weight (alpha=1) 0.132309418
Power distance weight (alpha=2) 0.329945074
Exponential distance weight (alpha=1) 0.123854065
Exponential distance weight (alpha=2) 0.232843892
Rook Contiguity -0.008530337
Queen Contiguity -0.010844530
8 rows
Conclusion:
The weight matrix that produces the largest Moran’s Index value is the knn weight matrix k=3 (W.knn1).
Next, the Moran’s Index will be tested using the knn matrix:
H0: I=0 (no spatial autocorrelation between locations)
H1: I>0 (there is spatial autocorrelation between locations)
Woptimum <- W.knn1
moran.test(dfx$Y, Woptimum, randomisation = TRUE, zero.policy = TRUE)
#result:
Moran I test under randomisation
data: dfx$Y
weights: Woptimum
Moran I statistic standard deviate = 7.176, p-value = 3.59e-13
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.480363044 -0.008474576 0.004640560
Conclusion:
Based on the results above, a p-value <0.05 is obtained, meaning H0 is rejected. The alternative hypothesis is greater, meaning that at the 5% significance level, there is positive spatial autocorrelation.
Moran Plot
moran.plot(dfx$Y, Woptimum, labels=dfx$'city')
Horizontal and vertical lines intersect the average X and Y values, dividing the graph into four quadrants: Quadrant I (top right, High-High / HH)
Areas with high values, surrounded by high-value areas. Example: Magelang, Sleman.
This means there is a positive cluster → high values tend to be close to other high values.
Quadrant II (top left, Low-High / LH)
Areas with low values, but surrounded by high-value areas. This means these areas are negative outliers amidst high-value areas.
Quadrant III (bottom left, Low-Low / LL)
Areas with low values, surrounded by low-value areas. Example: Bangkalan, Tasikmalaya, Garut, Lebak, Pandeglang, Sumenep.
This means there is a positive cluster but for low values.
Quadrant IV (bottom right, High-Low / HL)
Areas with high scores, but surrounded by low-value areas. Example: Banyuwangi, Cilegon City
This means these areas are positive outliers amidst low-value areas.
A positively sloping regression line → there is a global positive spatial autocorrelation, meaning that areas with high values tend to be adjacent to areas with high values, and areas with low values to areas with low values. However, points in quadrants II and IV indicate spatial outliers (areas with different patterns from their surroundings).
Next, OLS Model (Classical Regression): Run an OLS model as a baseline.