modifiable areal unit problem

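  • results of an analysis change with the size and shape of the areal units used to aggregate the data
  • point data overlaid on a map can look different at different scales
    • zooming in might reveal a clustered point pattern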
  • interpreting dots on a map is a challenge

kernel density estimation

  • measures the intensity of a point pattern based on sample observations
  • it creates a continuous surface showing the density of features
    • ignores administrative boundaries
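
As a rough illustration of the idea above, the sketch below evaluates a Gaussian kernel density estimate on a regular grid using only NumPy. The function name `gaussian_kde_grid`, the sample points, and the bandwidth of 0.5 are all assumptions made for the example, not values from these notes. Dividing by the number of points gives a density; dropping that division would give an intensity (expected points per unit area) instead.

```python
import numpy as np

def gaussian_kde_grid(points, bandwidth, grid_x, grid_y):
    """Evaluate a 2-D Gaussian kernel density estimate on a regular grid."""
    points = np.asarray(points, dtype=float)      # shape (n, 2)
    gx, gy = np.meshgrid(grid_x, grid_y)          # each shape (ny, nx)
    # Squared distance from every grid cell to every point: shape (ny, nx, n)
    d2 = (gx[..., None] - points[:, 0]) ** 2 + (gy[..., None] - points[:, 1]) ** 2
    # One Gaussian bump per observation, normalised to integrate to 1
    kernel = np.exp(-d2 / (2 * bandwidth ** 2)) / (2 * np.pi * bandwidth ** 2)
    # Average the bumps to get a continuous density surface
    return kernel.sum(axis=2) / len(points)

# Hypothetical sample: three points close together and one isolated point
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (4.0, 4.0)]
grid = np.linspace(0.0, 5.0, 51)                  # grid spacing of 0.1
density = gaussian_kde_grid(points, bandwidth=0.5, grid_x=grid, grid_y=grid)
```

The surface is defined at every grid cell, regardless of any boundary drawn on the map. The bandwidth controls how smooth the surface is, and choosing it is a judgment call that depends on the data.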

classifying data

  • visualizes a summary of the data
  • the distribution is grouped into buckets (classes)

Jenks natural breaks

  • an optimizing method that uses data clustering to
    • minimize the variance within classes
    • maximize the variance between classes
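
The optimization described above can be sketched as a small dynamic program over the sorted values. This is a minimal sketch, not the implementation of any particular GIS package: the function name `jenks_breaks` and the sample data are assumptions. It minimizes the total within-class sum of squared deviations, which for a fixed dataset is equivalent to maximizing the between-class sum.

```python
def jenks_breaks(values, n_classes):
    """Return the upper bound of each class for an optimal 1-D partition."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice in constant time.
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(data):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[k][j]: minimal within-class SSD when data[:j] is split into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):      # the last class is data[i:j]
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk back through the split table to recover the class upper bounds.
    breaks = []
    j = n
    for k in range(n_classes, 0, -1):
        breaks.append(data[j - 1])
        j = split[k][j]
    return breaks[::-1]

# Hypothetical sample with three obvious groups
sample = [1, 2, 3, 10, 11, 12, 50, 51, 52]
upper_bounds = jenks_breaks(sample, 3)     # → [3, 12, 52]
```

The triple loop makes this quadratic in the number of values for each class, so it suits small datasets; larger ones would normally use an optimized library routine.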

equal interval

  • the range is split into equal intervals
  • intervals may contain unequal numbers of data points
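
A minimal sketch of equal-interval classification with NumPy follows. The skewed sample values and the choice of three classes are assumptions made for illustration.

```python
import numpy as np

# Hypothetical, skewed sample: eight small values and one large one
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100])
n_classes = 3

# Split the range [min, max] into classes of equal width
edges = np.linspace(values.min(), values.max(), n_classes + 1)   # 1, 34, 67, 100

# Count how many values fall into each class
counts, _ = np.histogram(values, bins=edges)                     # → [8, 0, 1]
```

With skewed data most values land in one class and some classes stay empty, which is the main weakness of this method.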

quantile interval

  • the number of values per class is the same
  • the class boundaries are adjusted accordingly, so class widths vary
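
A minimal sketch of quantile classification with NumPy follows. The sample values and the choice of three classes are assumptions made for illustration, and `np.quantile` is used with its default linear interpolation.

```python
import numpy as np

# Hypothetical, skewed sample: eight small values and one large one
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100])
n_classes = 3

# Place class boundaries at evenly spaced quantiles (0, 1/3, 2/3, 1)
edges = np.quantile(values, np.linspace(0, 1, n_classes + 1))

# Count how many values fall into each class
counts, _ = np.histogram(values, bins=edges)      # → [3, 3, 3]
```

The counts are equal here because nine values divide evenly into three classes. With tied values, or a count that does not divide evenly, the classes are only approximately equal in size.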

result communication

  • analysis results should be communicated effectively
  • the classification method should suit the data
    • and based on the needed analysis
    • to tell the story effectively
  • good spatial analysis should
    • be appropriately designed
    • use appropriate techniques
    • be effectively communicated

Tobler’s first law of geography

  • everything is related to everything else
    • but near things are more related than distant things
  • the concept of spatial dependence is the core of spatial analysis
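
Spatial dependence can be quantified. One widely used measure, which these notes do not name, is Moran's I: positive values suggest that neighbouring locations have similar values, and negative values suggest they differ. The sketch below is a minimal version under stated assumptions: the function name `morans_i`, the four locations on a line, and the binary adjacency weights are all invented for the example.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i][j] > 0 marks locations i and j as neighbours.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                        # deviations from the mean
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

# Hypothetical layout: four locations in a row, each adjacent to the next
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], adjacency)       # neighbours are similar
alternating = morans_i([1, 4, 1, 4], adjacency)  # neighbours are dissimilar
```

Here `smooth` comes out positive and `alternating` negative, matching the intuition that near things being alike is what positive spatial dependence means.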

advancement of spatial analysis

  • epidemiology and econometrics advanced spatial statistics and cluster detection

cluster analysis

  • used in many disciplines for various purposes
  • identifies the location of clusters and examines what led to them
  • finds where something is occurring, or has occurred, most densely

  • 1854: John Snow mapped the transmission of cholera in London
    • cholera deaths were plotted against the locations of water pumps
    • none of the 70 brewery workers contracted cholera
      • they were saved by the beer, which they drank instead of pump water
      • the brewery had its own pump
    • only 5 of the 535 inmates at the workhouse died
      • the workhouse had its own pump

hot spots

  • statistically significant spatial clusters of values are called spots
    • high values: hot spots
    • low values: cold spots
  • patterns and trends are important in data, and so are
    • missing data
    • erroneous data
    • outliers
  • do not overestimate the value that the data inherently brings
  • garbage in, garbage out
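
To make "statistically significant" concrete for hot and cold spots, the sketch below computes a Getis-Ord Gi* z-score per location. This statistic is a common choice for hot spot analysis but is not named in these notes, so treat it as one possible approach. The function name `gi_star`, the ten locations on a line, and the "self plus immediate neighbours" weights are assumptions made for the example.

```python
import numpy as np

def gi_star(values, weights):
    """Getis-Ord Gi* z-score for each location.

    weights[i][j] > 0 marks j as a neighbour of i; the diagonal is
    non-zero because Gi* includes the location itself.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = len(x)
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)     # global standard deviation
    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)
    numerator = w @ x - mean * w_sum             # local sum minus its expectation
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical data: high values at one end of a row of ten locations
values = [10, 10, 10, 1, 1, 1, 1, 1, 1, 1]
idx = np.arange(len(values))
# Each location is weighted with itself and its immediate neighbours
weights = (np.abs(idx[:, None] - idx[None, :]) <= 1).astype(float)

z = gi_star(values, weights)
hot = z > 1.96      # significantly high at roughly the 5% level
cold = z < -1.96    # significantly low at roughly the 5% level
```

In this sample the high-value end is flagged as a hot spot, while the low-value end has negative z-scores that do not pass the threshold. A low value alone does not make a cold spot; the cluster has to be unlikely under spatial randomness.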