[GPSA] W05 - Detecting and Quantifying Patterns
modifiable areal unit problem
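- conclusions drawn from spatial data can change with the scale and the boundaries of the units used to aggregate it
- point data overlaid on a map might turn out to be a clustered point pattern when zoomed in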
- interpreting dots on a map is a challenge
kernel density estimation
- estimates the intensity of a point pattern from sample observations
- it creates a continuous surface showing the density of features
- ignores administrative boundaries
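A minimal sketch of the idea, assuming a Gaussian kernel and a hand-picked bandwidth (the notes specify neither). The function name `kde_at` and the sample coordinates are hypothetical, chosen only for illustration. Evaluating the function on a grid of locations would give the continuous density surface.

```python
import math

def kde_at(x, y, points, bandwidth):
    """Gaussian kernel density estimate at (x, y) from 2-D sample points."""
    # Normalizing constant of a 2-D Gaussian kernel, averaged over n points.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2  # squared distance to the sample
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    return norm * total

# Hypothetical observations: three close together, one far away.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0)]
near_cluster = kde_at(1.0, 1.0, points, bandwidth=0.5)
far_from_points = kde_at(3.0, 3.0, points, bandwidth=0.5)
```

The estimate is higher near the three clustered points than in the empty area between them and the lone point. A larger bandwidth would smooth the surface more.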
classifying data: visualize a summary of the data
- the distribution is grouped into buckets
Jenks Natural Breaks
- an optimization method that clusters the data to minimize the variance within classes
- and maximize the variance between classes
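The objective can be sketched with a brute-force search that tries every way of cutting the sorted values into k classes and keeps the one with the smallest within-class sum of squares. This is an illustration of the objective only, not the dynamic-programming algorithm used by real implementations, and it is only practical for short lists. The names `within_class_ss` and `natural_breaks` and the sample values are hypothetical.

```python
from itertools import combinations

def within_class_ss(groups):
    """Sum of squared deviations from each class mean."""
    total = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        total += sum((v - mean) ** 2 for v in g)
    return total

def natural_breaks(values, k):
    """Return k classes (lists) minimizing the within-class sum of squares."""
    data = sorted(values)
    best_score, best_groups = None, None
    # Each choice of k-1 cut positions defines one candidate classification.
    for cuts in combinations(range(1, len(data)), k - 1):
        bounds = (0,) + cuts + (len(data),)
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        score = within_class_ss(groups)
        if best_score is None or score < best_score:
            best_score, best_groups = score, groups
    return best_groups

values = [1, 2, 3, 10, 11, 12, 30, 31]
classes = natural_breaks(values, 3)
# classes == [[1, 2, 3], [10, 11, 12], [30, 31]]
```

Because the total variance of the data is fixed, minimizing the variance within classes is the same as maximizing the variance between them.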
equal interval
- the range is split into equal intervals
- each interval may contain a different number of data points
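A short sketch of equal-interval classification, assuming the values are not all identical (otherwise the interval width would be zero). The function name `equal_interval_classes` and the sample values are hypothetical.

```python
def equal_interval_classes(values, k):
    """Return the k+1 class edges and the number of values in each class."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    counts = [0] * k
    for v in values:
        # The maximum value would index one past the end, so clamp it.
        idx = min(int((v - lo) / width), k - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(k + 1)]
    return edges, counts

values = [1, 2, 3, 4, 5, 6, 40, 100]
edges, counts = equal_interval_classes(values, 4)
# counts == [6, 1, 0, 1]
```

With skewed data like this, most values fall into the first class and one class is empty, which is the main weakness of the method.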
quantile interval
- the number of values per class is the same
- the class ranges are adjusted accordingly
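A minimal sketch of quantile classification, assuming classes are formed by cutting the sorted values at evenly spaced positions. The function name `quantile_classes` and the sample values are hypothetical.

```python
def quantile_classes(values, k):
    """Split the sorted values into k classes of (nearly) equal size."""
    data = sorted(values)
    n = len(data)
    # Cut positions are spaced evenly through the sorted list.
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]

values = [1, 2, 3, 4, 5, 6, 40, 100]
classes = quantile_classes(values, 4)
# classes == [[1, 2], [3, 4], [5, 6], [40, 100]]
```

Every class holds two values, but the last class spans 40 to 100 while the others span a single unit, so very different values can end up sharing a class.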
result communication
- analysis results should be communicated effectively
- the classification method should suit the data
- and the analysis that is needed
- so that the story is told effectively
- good spatial analysis should
- be appropriately designed
- use appropriate techniques
- be effectively communicated
Tobler’s first law of geography
- everything is related to everything else
- but near things are more related than distant things
- the concept of spatial dependence is the core of spatial analysis
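Spatial dependence can be quantified in several ways. One common measure, which these notes do not name, is global Moran's I. The sketch below assumes a simple binary weights matrix where neighbouring cells on a line get weight 1. The function names and the sample values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values with a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

def line_weights(n):
    """Binary weights: cells i and j are neighbours if they are adjacent."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

w = line_weights(6)
clustered = morans_i([1, 1, 1, 5, 5, 5], w)    # similar values sit together
alternating = morans_i([1, 5, 1, 5, 1, 5], w)  # neighbours always differ
```

Positive values indicate that near things are alike (here the clustered sequence), negative values that neighbours tend to differ (the alternating sequence).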
advancement of spatial analysis
- epidemiology and econometrics advanced spatial statistics and cluster detection
cluster analysis
- used in many disciplines for various purposes
- identifying the location of clusters and examining what led to those clusters
- to find the density of where something is occurring or has occurred
- 1854: transmission of cholera (John Snow's map of the Broad Street outbreak in London)
- cholera deaths mapped against the locations of water pumps
- none of the 70 brewery men contracted cholera
- they were saved by the beer
- the brewery had its own pump
- only 5 of the 535 inmates at the workhouse died
- the workhouse had its own pump
hot spots
- spots are statistically significant spatial clusters of values
- clusters of high values: hot spots
- clusters of low values: cold spots
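The notes do not say which statistic is meant. One widely used choice is the Getis-Ord Gi* statistic, sketched here under the assumption of binary weights that include each cell itself and its immediate neighbours on a line. The function name `getis_ord_gi_star` and the sample values are hypothetical.

```python
import math

def getis_ord_gi_star(values, weights):
    """Gi* z-score for every location; weights[i][i] should be non-zero."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        w = weights[i]
        sw = sum(w)                          # sum of weights for location i
        sw2 = sum(wij * wij for wij in w)    # sum of squared weights
        num = sum(wij * x for wij, x in zip(w, values)) - mean * sw
        den = s * math.sqrt((n * sw2 - sw ** 2) / (n - 1))
        scores.append(num / den)
    return scores

values = [1, 1, 2, 1, 1, 9, 10, 9]
n = len(values)
# Each cell is a neighbour of itself and of the cells directly beside it.
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]
scores = getis_ord_gi_star(values, weights)
```

Large positive scores mark candidate hot spots (the high values at the end of the line), and negative scores mark areas of low values. In practice the scores are compared against a significance threshold before a cell is called a hot or cold spot.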
- patterns and trends are important in data
- so are missing data
- erroneous data
- outliers
- do not overestimate the value that the data inherently brings
- garbage in, garbage out
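A quick data-quality check before any analysis might look like the sketch below. It assumes missing values are recorded as `None` and flags outliers with the common 1.5 x IQR rule, which is one convention among several. The sample readings are hypothetical.

```python
import statistics

# Hypothetical readings; None marks a missing observation.
readings = [10, 12, 11, 13, None, 12, 95, 11]

present = [v for v in readings if v is not None]
missing_count = len(readings) - len(present)

# Quartiles of the observed values (three cut points: Q1, median, Q3).
q1, _, q3 = statistics.quantiles(present, n=4)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Values outside the fences are flagged for review, not deleted.
outliers = [v for v in present if v < low_fence or v > high_fence]
```

Flagged values are only candidates: an outlier can be an error or a genuine extreme, and that decision needs knowledge of how the data was collected.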