Join count statistics are a method of spatial analysis used to assess the degree of association, in particular the autocorrelation, of categorical variables distributed over a spatial map. They were originally introduced by Australian statistician P. A. P. Moran. [1] Join count statistics have found widespread use in econometrics, [2] remote sensing [3] and ecology. [4] They can be computed in a number of software packages including PASSaGE, [5] GeoDa, PySAL [6] and spdep. [7]
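Given binary data $x_i \in \{0, 1\}$ distributed over $N$ spatial sites, where the neighbour relations between regions $i$ and $j$ are encoded in the spatial weight matrix
$$w_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are neighbours} \\ 0 & \text{otherwise} \end{cases}$$
the join count statistics are defined as [8] [4]
$$J_{BB} = \frac{1}{2} \sum_{i \neq j} w_{ij} x_i x_j$$
$$J_{BW} = \frac{1}{2} \sum_{i \neq j} w_{ij} (x_i - x_j)^2$$
$$J_{WW} = \frac{1}{2} \sum_{i \neq j} w_{ij} (1 - x_i)(1 - x_j)$$
where
$$W = \frac{1}{2} \sum_{i \neq j} w_{ij}$$
is the total number of joins. The subscripts refer to 'black'=1 and 'white'=0 sites. The relation $J_{BB} + J_{BW} + J_{WW} = W$ implies only three of the four numbers are independent. Generally speaking, large values of $J_{BB}$ and $J_{WW}$ relative to $J_{BW}$ imply positive autocorrelation, and relatively large values of $J_{BW}$ imply anti-correlation.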
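As an illustration, the three counts can be computed directly from a neighbour matrix. The following is a minimal sketch, assuming a symmetric 0/1 weight matrix with a zero diagonal; the function name `join_counts` and the four-site example are hypothetical and not taken from any of the packages mentioned above. A join is counted as black-black when both neighbouring sites are 1, white-white when both are 0, and black-white otherwise.

```python
def join_counts(x, w):
    """Return (J_BB, J_BW, J_WW, W) for binary values x and a symmetric 0/1 matrix w."""
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    # Each join appears twice among the ordered pairs, as (i, j) and (j, i),
    # so every sum is halved. Integer division is exact because w is symmetric.
    j_bb = sum(w[i][j] * x[i] * x[j] for i, j in pairs) // 2
    j_bw = sum(w[i][j] * (x[i] - x[j]) ** 2 for i, j in pairs) // 2
    j_ww = sum(w[i][j] * (1 - x[i]) * (1 - x[j]) for i, j in pairs) // 2
    total = sum(w[i][j] for i, j in pairs) // 2
    return j_bb, j_bw, j_ww, total


# Hypothetical example: four sites on a line, 0 - 1 - 2 - 3.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(join_counts([1, 1, 0, 0], w))  # clustered pattern   -> (1, 1, 1, 3)
print(join_counts([1, 0, 1, 0], w))  # alternating pattern -> (0, 3, 0, 3)
```

In the clustered pattern the like-coloured joins dominate, while in the alternating pattern every join is black-white. In both cases the three counts add up to the total number of joins.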
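To assess the statistical significance of these statistics, the expectation under various null models has been computed. [9] For example, if the null hypothesis is that each sample is chosen at random according to a Bernoulli process with probability
$$p_B = P(x_i = 1)$$
then Cliff and Ord [8] show that
$$E[J_{BB}] = W p_B^2, \qquad E[J_{BW}] = 2 W p_B p_W, \qquad E[J_{WW}] = W p_W^2$$
where $p_W = 1 - p_B$ and $W$ is the total number of joins.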
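The Bernoulli-null expectations can be checked by brute force on a small map. The sketch below enumerates every black/white configuration of a hypothetical four-site line, weights each by its Bernoulli probability, and compares the result with $W p_B^2$ for black-black joins and $2 W p_B (1 - p_B)$ for black-white joins, where $W$ is the number of joins. The edge list, the value of `p_b` and the helper names are assumptions made for illustration.

```python
from itertools import product
from math import isclose


def j_bb(x, edges):
    """Number of joins whose two sites are both black (1)."""
    return sum(x[i] * x[j] for i, j in edges)


def j_bw(x, edges):
    """Number of joins linking a black site to a white site."""
    return sum((x[i] - x[j]) ** 2 for i, j in edges)


def exact_expectation(stat, n, edges, p_b):
    """Average stat over all 2**n configurations, weighted by Bernoulli probability."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        blacks = sum(x)
        prob = p_b ** blacks * (1 - p_b) ** (n - blacks)
        total += prob * stat(x, edges)
    return total


# Hypothetical example: four sites on a line; each join listed once as (i, j) with i < j.
edges = [(0, 1), (1, 2), (2, 3)]
n, p_b = 4, 0.3
W = len(edges)

e_bb = exact_expectation(j_bb, n, edges, p_b)
e_bw = exact_expectation(j_bw, n, edges, p_b)

print(isclose(e_bb, W * p_b ** 2))             # True
print(isclose(e_bw, 2 * W * p_b * (1 - p_b)))  # True
```

Exhaustive enumeration is only feasible for a handful of sites. It is used here purely as a sanity check of the closed-form expectations.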
However, in practice [10] an approach based on random permutations is preferred, since it requires fewer assumptions.
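A permutation approach can be sketched as follows: the observed values are shuffled across the sites many times, the statistic is recomputed for each shuffle, and the observed value is ranked against the resulting reference distribution. This is a minimal sketch, not the procedure of any particular package. The function names, the one-sided test on the black-black count, the 999 permutations and the fixed seed are all assumptions made for illustration.

```python
import random


def j_bb(x, edges):
    """Number of joins whose two sites are both black (1)."""
    return sum(x[i] * x[j] for i, j in edges)


def permutation_p_value(x, edges, n_perm=999, seed=0):
    """One-sided pseudo p-value for an unusually large black-black join count."""
    rng = random.Random(seed)
    observed = j_bb(x, edges)
    shuffled = list(x)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign the same values to sites at random
        if j_bb(shuffled, edges) >= observed:
            at_least_as_large += 1
    # The observed arrangement is counted as one of the permutations.
    return (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical example: twelve sites on a line, with the six black sites clustered.
edges = [(i, i + 1) for i in range(11)]
clustered = [1] * 6 + [0] * 6

p_value = permutation_p_value(clustered, edges)
print(p_value)
```

Because shuffling keeps the number of black sites fixed, no value of the Bernoulli probability has to be assumed. Here only 7 of the 924 possible arrangements of six black sites reach the observed count, so the pseudo p-value should come out small.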
Anselin and Li introduced [11] [12] the idea of the local join count statistic, following Anselin's general idea of a Local Indicator of Spatial Association (LISA). [13] The local join count is defined by, e.g.,
$$J_{BB,i} = x_i \sum_{j \neq i} w_{ij} x_j$$
with similar definitions for $J_{BW,i}$ and $J_{WW,i}$. This is equivalent to the Getis-Ord statistic computed with binary data. Some analytic results for the expectation of the local statistics are available based on the hypergeometric distribution, [11] but due to the multiple comparisons problem a permutation-based approach is again preferred in practice. [12]
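A local black-black count can be illustrated with a few lines of code. The sketch below assumes the local statistic at a site is the site's own value multiplied by the number of its black neighbours, so it is zero at every white site. The function name `local_bb` and the five-site example are hypothetical.

```python
def local_bb(x, w):
    """Local black-black join count for every site (zero wherever x[i] == 0)."""
    n = len(x)
    return [
        x[i] * sum(w[i][j] * x[j] for j in range(n) if j != i)
        for i in range(n)
    ]


# Hypothetical example: five sites on a line, neighbours are adjacent sites.
n = 5
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
x = [1, 1, 1, 0, 1]

local = local_bb(x, w)
print(local)       # [1, 2, 1, 0, 0]
print(sum(local))  # 4
```

Under this definition each black-black join is seen from both of its end points, so the local values sum to twice the global black-black count (here 2 joins, giving 4). The isolated black site at the end of the line contributes nothing.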
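When there are $k > 2$ categories, join count statistics have been generalised to [4] [8] [9]
$$J_{rr} = \frac{1}{2} \sum_{i \neq j} w_{ij} I_r(x_i) I_r(x_j)$$
$$J_{rs} = \frac{1}{2} \sum_{i \neq j} w_{ij} \left[ I_r(x_i) I_s(x_j) + I_s(x_i) I_r(x_j) \right], \qquad r \neq s$$
where $I_r(x_i)$ is an indicator function for the variable $x_i$ belonging to the category $r$. Analytic results are available, [14] or a permutation approach can be used to test for significance as in the binary case.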
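For more than two categories, the counts can be tallied per unordered pair of categories. The following is a minimal sketch, assuming each join is listed once as a pair of site indices. The function name, the land-cover labels and the five-site line are hypothetical.

```python
from collections import Counter


def categorical_join_counts(labels, edges):
    """Count joins for every unordered pair of categories, e.g. ('forest', 'urban')."""
    counts = Counter()
    for i, j in edges:
        # Sort the two labels so that r-s and s-r joins share one key.
        pair = tuple(sorted((labels[i], labels[j])))
        counts[pair] += 1
    return counts


# Hypothetical example: five sites on a line with three land-cover categories.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
labels = ["forest", "forest", "urban", "water", "water"]

counts = categorical_join_counts(labels, edges)
print(counts[("forest", "forest")])  # 1
print(counts[("forest", "urban")])   # 1
print(sum(counts.values()))          # 4
```

With two categories this reduces to the binary black-black, black-white and white-white counts. As in the binary case, the counts over all category pairs add up to the total number of joins.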