- all models are wrong
- but some models are useful
- garbage in, garbage out
- analysis of ambiguous data yields useless interpretations
- such interpretations are often used for click-bait marketing
predictive modelling
- used for market analysis
- attractiveness and cost together determine demand for a product at a location
- used to predict and drive customer behavior
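One way to make the attractiveness-and-cost idea concrete is a gravity-style (Huff-type) share model, where each location's share of demand is proportional to its attractiveness divided by cost raised to a decay power. This is only a sketch under that assumption; the notes do not say which model is meant, and the function name `demand_shares`, the `decay` parameter, and the store numbers are all hypothetical.

```python
def demand_shares(attractiveness, cost, decay=2.0):
    """Split demand across locations (assumed gravity-style form).

    share_j is proportional to attractiveness_j / cost_j ** decay,
    normalised so that all shares sum to 1.
    """
    utilities = [a / (c ** decay) for a, c in zip(attractiveness, cost)]
    total = sum(utilities)
    return [u / total for u in utilities]


# Two hypothetical stores: the numbers are invented for illustration.
# Store A is twice as attractive but costs twice as much to reach.
shares = demand_shares(attractiveness=[100.0, 50.0], cost=[2.0, 1.0])
print(shares)
```

With a decay of 2, the cheaper store wins the larger share despite being less attractive, which shows how sensitive the prediction is to the assumed decay value.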
regression analysis: a statistical method for evaluating the relationship between a single dependent variable and one or more independent variables
- there is a big difference between correlation and causation
- treating one as the other is a trap
- correlation is a statistical measure of the strength and direction of the relationship between two or more variables
- causation means that one event is the result of the occurrence of another event
- as an analyst, you’re the expert
- it is the expert’s responsibility to use the right data
- and interpret data correctly
- interpolation fills in missing values to turn point data into continuous data
- the accuracy of the interpolated values is questionable, even though the visualization looks continuous and smooth
- knowing the accuracy gives your results real meaning
- knowing how well you predict gives more value
- analysis leads to discovery
- begin analysis with a focussed and meaningful question
- refine the question
- understand your data
- clean your data
- apply the appropriate tool to answer the question
- solve problems and make the world better
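The notes on regression, correlation, and interpolation can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed method: the helper names (`pearson_r`, `fit_line`, `interpolate`) and the sample values are invented, the regression uses a single independent variable, and the interpolation is assumed to be piecewise linear. The last step holds out one known point to measure how well the interpolation predicts it.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: strength and direction of a linear relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (one independent variable)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation; xs must be sorted in ascending order."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("x lies outside the sampled range")


# Invented sample: x is the independent variable, y the dependent one.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
slope, intercept = fit_line(xs, ys)

# Hold out the observation at x = 3, estimate it from its neighbours,
# and compare the estimate with the value that was actually measured.
known_x = [1.0, 2.0, 4.0, 5.0]
known_y = [2.1, 3.9, 7.8, 10.1]
estimate = interpolate(3.0, known_x, known_y)
error = abs(estimate - 6.2)

print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"interpolated y(3) = {estimate:.2f}, hold-out error = {error:.2f}")
```

A correlation close to 1 here says only that x and y move together in this sample; it does not show that x causes y. The hold-out error puts a number on the accuracy of the interpolated value, which a smooth plot alone would hide.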