Regression
Attempts to estimate or predict the numerical value of some variable for an individual (e.g. price of Microsoft stock).
Classification
Attempts to predict which of a (small) set of classes an individual belong to.
Usually classes are mutually exclusive (e.g. customers will churn or will not churn).
Regression Mathematical Formula / Rule-based Formula
Mathematical Formula
Rule-based Formula
- Regression Trees
Classification Mathematical Formula / Rule-based Formula
Mathematical Formula
Rule-based Formula
- Classification Trees
Stages of a predictive modeling process
It has to be a quantifiable target.
a. E.g. what will be the stock price of Microsoft tomorrow? 200
b. E.g. will this client default on her loan? -> not specific enough! You want a time frame such as ‘first five years’ (though it depends on what you want to know)
We need data for the same or a related phenomenon
a. E.g. which customers defaulted last quarter? Think about whether or not this is the same phenomenon. Look at stuff that has happened in the past.
The model will look like a set of rules or a mathematical formula that allow establishing a prediction
a. E.g. rule: if (income <50k) then default, else no default.
b. E.g. mathematical formula: MSFTt+1 = 0.9 – APPLt -> predictive stock market value of Microsoft will be 90% of Apple’s.
The model can be applied to any customer. It gives us a prediction of the target variable.
a. E.g. the customer will default because its income is lower than 50k.
b. E.g. MSFT = $212.