  • all models are wrong
    • but some models are useful
  • garbage in, garbage out
    • analysis of ambiguous data yields useless interpretations
    • such interpretations are usually used for click-bait marketing

predictive modelling

gravity model: estimates the flow of people, materials, or information between locations
  • used for market analysis
    • attractiveness and cost together yield the demand for a product at a location
    • used to predict and drive customer behaviour
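
As a rough illustration of the gravity model, the sketch below uses the basic form flow = k * mass * attractiveness / cost^beta. The function name `gravity_flow`, the constants `k` and `beta`, and the store figures are all assumptions made up for this example; in practice the constants would be calibrated against observed flows.

```python
def gravity_flow(origin_mass, dest_attractiveness, cost, k=1.0, beta=2.0):
    """Estimate the flow from an origin to a destination.

    Basic gravity form: k * origin_mass * dest_attractiveness / cost**beta.
    k and beta are illustrative defaults, not calibrated values.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return k * origin_mass * dest_attractiveness / cost ** beta


# Hypothetical data: one neighbourhood of 1,000 customers and two stores.
population = 1000
stores = {
    "store_a": {"attractiveness": 50.0, "cost": 2.0},  # e.g. floor area, travel time
    "store_b": {"attractiveness": 80.0, "cost": 4.0},
}

# Estimated demand at each store.
flows = {
    name: gravity_flow(population, s["attractiveness"], s["cost"])
    for name, s in stores.items()
}

# Share of the neighbourhood's demand captured by each store.
total = sum(flows.values())
shares = {name: flow / total for name, flow in flows.items()}

for name in stores:
    print(name, round(flows[name], 1), round(shares[name], 3))
```

In this made-up case the more attractive store captures the smaller share because its cost is twice as high and cost enters with an exponent of 2, which is the trade-off between attractiveness and cost that the model is meant to express.
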
regression analysis: a statistical method for evaluating the relationship between a single dependent variable and one or more independent variables
  • there is a big difference between correlation and causation
    • confusing the two is a trap
    • correlation is a statistical measure of the size and direction of a relationship between two or more variables
    • causation means that one event is the result of the occurrence of another event
  • as an analyst, you’re the expert
    • it is the expert’s responsibility to use the right data
    • and to interpret that data correctly
  • interpolation fills in missing values to get continuous data from point data
    • the accuracy of the filled-in values is questionable, even though the visualization is continuous and smooth
  • knowing how well you predict gives your results real meaning
    • stating how well you predict makes the results more valuable
  • analysis leads to discovery
    • begin analysis with a focussed and meaningful question
    • refine the question
    • understand your data
    • clean your data
    • apply the appropriate tool to answer the question
    • solve problems and make the world better
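
The regression points above can be sketched in a few lines of plain Python: fit a line with one independent variable, measure the correlation, and check how well the fit predicts data it has not seen. The helper names (`fit_line`, `pearson_r`, `rmse`) and all data values are made up for illustration; this is a minimal sketch, not a substitute for a statistics library.

```python
import math


def mean(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def pearson_r(xs, ys):
    """Pearson correlation: size and direction of a linear relationship."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def rmse(actual, predicted):
    """Root mean squared error: a simple measure of how well you predict."""
    return math.sqrt(mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


# Made-up data: fit on the training points, evaluate on held-out points.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.0]
test_x = [6.0, 7.0]
test_y = [12.1, 13.8]

intercept, slope = fit_line(train_x, train_y)
r = pearson_r(train_x, train_y)
predictions = [intercept + slope * x for x in test_x]
error = rmse(test_y, predictions)

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("correlation r:", round(r, 4))
print("held-out RMSE:", round(error, 4))
```

A correlation close to 1 only says that the two variables move together in this sample. It does not show that one causes the other, and the held-out error, not the fit on the training data, is what indicates how well the model predicts.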