[GPSA] W02 - Understanding and Comparing Places
- every analysis has a goal
- analyses start by
- asking questions
- querying data
- spatial data has a location component associated with it
- allows visualizing data with maps
- enables insights and information that would not have been evident otherwise
example application:
- consider the national tree of Chile, monkey puzzle tree, whose population is declining
- goal: regrow monkey puzzle tree
- problem statement: where can it best be regrown?
- identify the area where it can grow best
- known info about monkey puzzle tree:
- likes elevation of over 1000 meters
- likes volcanic soil
- does not like freezing temperatures
- to find places that meet these criteria,
- overlay elevation, soil type and temperature maps onto a map of Chile
- on each layer, query only the data that applies to the monkey puzzle tree
- this filters the map area down to only the places suitable for monkey puzzle tree repopulation
queries
query: a request to select features or records from a database, often written as a statement or logical expression
- the concept of a query is adopted from database systems
- a query language is used to retrieve and manipulate data in data systems
- queries can be applied to filter the specific data needed for a given analysis
- here, the query conditions are:
- elevation is above 1000 m
- soil type is volcanic
- temperature is above freezing
- applying these filters to a map helps to visually zero in on areas suitable for the monkey puzzle tree
- visualizations other than maps, such as plots, charts and graphs, are also useful
- they help convey information using descriptive statistics
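The three query conditions above can be sketched as a simple attribute filter. This is a minimal illustration only: the site records, field names (`elevation_m`, `soil`, `min_temp_c`) and the `is_suitable` helper are all hypothetical, and a real analysis would run the query against map layers in GIS software rather than a Python list.

```python
# Hypothetical candidate sites; every value below is made up for illustration.
sites = [
    {"name": "site_a", "elevation_m": 1250, "soil": "volcanic", "min_temp_c": 2.0},
    {"name": "site_b", "elevation_m": 800, "soil": "volcanic", "min_temp_c": 4.0},
    {"name": "site_c", "elevation_m": 1500, "soil": "clay", "min_temp_c": 1.0},
    {"name": "site_d", "elevation_m": 1100, "soil": "volcanic", "min_temp_c": -3.0},
]


def is_suitable(site):
    """Return True when a site meets all three query conditions."""
    return (
        site["elevation_m"] > 1000        # elevation is above 1000 m
        and site["soil"] == "volcanic"    # soil type is volcanic
        and site["min_temp_c"] > 0        # temperature stays above freezing
    )


# Keep only the sites that pass every condition (a logical AND).
suitable = [site["name"] for site in sites if is_suitable(site)]
print(suitable)
```

Each condition removes some candidates, and only the sites that pass all three remain, which mirrors how stacking the filtered layers narrows the map.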
analysis geometry
- the analysis goal determines the geometric properties of the data
- three basic geometric shapes are
- points
- lines
- polygons
geographic scale
- geometry also depends on the geographic scale of analysis
- for example:
- points represent cities in a city-to-city transport route within a county
- a polygon represents a city in an analysis of city-wide bike pathways
- scale is related to the analytical goal, consequently affecting geometries of data used for analysis
small-scale map: map covers large area, so everything is scaled down
large-scale map: map covers small area, so everything is zoomed in
geometric properties
- they use coordinate information from a digital map
- they include information about
- position
- length
- direction
- area
- proximity
- volume
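As a rough sketch of how some of these properties fall out of coordinate information, the functions below compute length, area and direction from plain (x, y) pairs. The function names and sample coordinates are hypothetical, and the sketch assumes planar (already projected) coordinates in consistent units; GIS software handles this internally.

```python
import math


def line_length(coords):
    """Length of a line: the sum of the lengths of its straight segments."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bearing_deg(start, end):
    """Direction from start to end, in degrees clockwise from north (+y)."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.degrees(math.atan2(dx, dy)) % 360


# Hypothetical features in planar coordinates (for example, metres).
path = [(0, 0), (3, 4), (3, 10)]
parcel = [(0, 0), (4, 0), (4, 3), (0, 3)]

print(line_length(path))            # total length of the path
print(polygon_area(parcel))         # area of the rectangular parcel
print(bearing_deg((0, 0), (1, 0)))  # direction when heading due east
```

Position is simply the coordinate pair itself; proximity is covered in the next section.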
proximity
- allows finding what's nearby
- proximity can be expressed in terms of
- distance
- time
- cost
euclidean distance: the straight-line distance between two points on a plane
- euclidean distance, or distance "as the crow flies", can be calculated using the pythagorean theorem
- buffers can be created to find what's nearby, around
- points
- lines
- polygons
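A minimal sketch of the two ideas above: Euclidean distance via the Pythagorean theorem, and a buffer around a point treated as "everything within a given radius". The helper names, place names and coordinates are hypothetical, and planar coordinates in consistent units are assumed. Buffers around lines and polygons need distance-to-segment logic that is left out here.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points on a plane."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    return math.sqrt(dx * dx + dy * dy)  # Pythagorean theorem


def within_buffer(center, radius, candidates):
    """Names of candidate points that fall inside a circular buffer."""
    return [
        name
        for name, xy in candidates.items()
        if euclidean_distance(center, xy) <= radius
    ]


# Hypothetical places, in metres relative to a station at the origin.
station = (0, 0)
places = {
    "cafe": (300, 400),
    "school": (900, 0),
    "park": (100, 100),
}

print(euclidean_distance(station, places["cafe"]))
print(within_buffer(station, 600, places))
```

Replacing the distance function with travel time or cost would give the other two ways of expressing proximity listed above.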
projections
- spatial data represents the earth
- the earth is neither flat nor perfectly round
- it is a lumpy spheroid
map projection: a mathematical method by which the curved surface of the earth is portrayed on a flat surface
- in a projection, reality gets distorted in some way
- it is impossible to map a spheroid onto a flat surface without some stretching, tearing or shearing
- some popular map projections:
- WGS 1984 World Mercator: preserves direction
- World Mollweide: preserves area
- World Bonne: preserves area
- World Goode Homolosine Land: preserves area
- the chosen projection must be consistent with the type of analysis being performed
- choose a projection that preserves the property (such as area or direction) that is crucial for your analysis
projection and scale
- when analysis is performed on a small area (relative to the world map), like a city, the projection distortion has little effect
- however, when the analysis covers a larger area, like countries or continents, the effect of the projection becomes more pronounced
- results of the same analysis can look very different on maps with different projections
- the larger the geographic area of analysis, the greater the difference
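To put rough numbers on this, the sketch below uses the standard result for a spherical Mercator projection, where areas are exaggerated by a factor of 1 / cos²(latitude). It compares how much that factor changes across a city-sized extent versus a continent-sized one. The spherical formula is a simplifying assumption (WGS 1984 World Mercator is based on an ellipsoid), and the function names and latitude ranges are hypothetical.

```python
import math


def mercator_area_factor(lat_deg):
    """Area exaggeration of a spherical Mercator map at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


def distortion_spread(south_lat, north_lat):
    """Ratio of area exaggeration between the two edges of an extent.

    Assumes both latitudes are in the northern hemisphere, so the
    northern edge is the more exaggerated one.
    """
    return mercator_area_factor(north_lat) / mercator_area_factor(south_lat)


# Hypothetical extents: about 0.2 degrees of latitude vs 50 degrees.
city = distortion_spread(40.0, 40.2)
continent = distortion_spread(10.0, 60.0)

print(f"city-sized extent: {city:.3f}")
print(f"continent-sized extent: {continent:.3f}")
```

Across the city-sized extent the exaggeration changes by well under one percent, so areas stay comparable. Across the continent-sized extent it changes nearly fourfold, which is why an area-preserving projection matters for large-area analysis.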
meaning
- an analysis is only as good as the question asked
- getting the question right is the key
- to get meaningful answers from analysis
- take time to formulate better questions
- understand and clean the data