is the science of extracting useful knowledge from huge data repositories.
Data Mining
is an open standard process model.
CRISP-DM REFERENCE MODEL
(Cross-Industry Standard Process for Data Mining)
6 TASKS IN CRISP-DM REFERENCE MODEL
2 DATA MINING METHODS
is a method where we find human-interpretable patterns that describe the data.
Descriptive Method
is a method that uses some features (variables) to predict unknown or future values of another variable.
Predictive Method
5 DATA MINING TASKS
is a type of data mining task that predicts the value of a given continuous-valued variable based on the values of other variables.
Regression
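A minimal sketch of the Regression card, assuming the simplest case: one input variable and a straight-line fit by ordinary least squares. The function name fit_line and the spend/sales figures are hypothetical, made up for illustration only.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend (input) and sales (continuous target).
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, sales)

# Predict the continuous target for an input value not seen before.
predicted = slope * 5.0 + intercept
```

The predicted value is a continuous number, which is what separates regression from tasks that predict a category.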
is a type of data mining task that detects significant deviations from normal behavior.
Deviation / Anomaly Detection
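A minimal sketch of the Deviation / Anomaly Detection card, assuming "normal behavior" is summarized by the mean and standard deviation of the data and that a value counts as a deviation when its z-score exceeds a threshold. The z-score rule, the threshold of 2.0, and the readings are assumptions for illustration; many other detection techniques exist.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one value far from the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
anomalies = find_anomalies(readings)
```

Note that a large outlier inflates the standard deviation itself, so this simple rule can miss deviations in small samples.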
5 CHALLENGES OF DATA MINING
3 TYPES OF DATA MINING TOOLS
2 COMMON PROGRAMMING ORIENTED TOOLS
4 INFO ABOUT DATA WAREHOUSE
data warehouses are designed to help you analyze data about a particular subject area (for example, sales).
Subject Oriented
integrates data from disparate sources into a consistent format.
Integrated
data in the data warehouse are never overwritten or deleted.
Nonvolatile
maintains both historical and (nearly) current data.
Time Variant
EXPLAIN EXTRACT, TRANSFORM, LOAD
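A minimal sketch of the three stages, assuming a small CSV source and an in-memory SQLite table standing in for the warehouse. Extract reads raw rows from the source, transform cleans them and converts them to a consistent format, and load writes them into the target. The sample rows, the sales table, and its column names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from files, APIs, or databases.
RAW_CSV = """customer,amount,date
 alice ,10.50,2024-01-03
BOB,4.00,2024-01-04
carol,,2024-01-05
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: tidy names, convert types, drop rows with no amount."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        name = row["customer"].strip().title()
        cleaned.append((name, float(row["amount"]), row["date"]))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
```

The transform step is where data from disparate sources is brought into one consistent format, which is the Integrated property of a data warehouse.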
is a term for data sets that are so large or complex that traditional data processing applications are inadequate to deal with them.
Big Data
4 CHARACTERISTICS OF BIG DATA
is a characteristic of big data that means there are different forms of data.
Variety
is a characteristic of big data that means the uncertainty of the data.
Veracity
is a characteristic of big data that means the analysis of streaming data, that is, the speed at which data arrives and must be processed.
Velocity
is a characteristic of big data that means the scale of data.
Volume