What are the 3 normalization schemes used in class?
min-max
z-score
decimal scaling
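A minimal sketch of the three schemes, assuming a plain list of numbers with at least two distinct values. The function names and the sample data are illustrative, not from class.

```python
import math

def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    # Assumes hi != lo; a constant column would divide by zero here.
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def z_score(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def decimal_scaling(values):
    """Divide by 10**j, where j is the smallest integer that makes every |v| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]

# Illustrative data only.
data = [200, 300, 400, 600, 1000]
print(min_max(data))          # [0.0, 0.125, 0.25, 0.5, 1.0]
print(z_score(data))
print(decimal_scaling(data))  # every value divided by 10**4
```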
What are the 3 preprocessing techniques?
What is spatial autocorrelation?
Objects that are physically close tend to be similar.
What are the 3 types of data sets?
Record
Graph
Ordered
What are the two different types of mappings?
1 of n
n of m
Explain the 1 of n mapping.
Create one new attribute for every ordinal value.
Explain the n of m mapping.
Create one new attribute with a unique representation for each ordinal value.
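A minimal sketch of the 1 of n mapping only, assuming each distinct value becomes its own 0/1 attribute. The helper name `one_of_n` and the sample values are hypothetical. The n of m mapping is not sketched here.

```python
def one_of_n(values):
    """Return the distinct categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Illustrative data only.
sizes = ["small", "large", "medium", "small"]
categories, encoded = one_of_n(sizes)
print(categories)  # ['large', 'medium', 'small']
for value, row in zip(sizes, encoded):
    print(value, row)
```

Each row has exactly one 1, in the column of the category it represents.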
What are the 3 types of missing values?
Missing Completely at Random (MCAR)
Missing at Random (MAR)
Non-Ignorable (missing not at random)
What are 4 sampling schemes?
What is the difference between a training and test set?
Training set is used to build a model.
Test set is used to test a model.
Why is sampling important with respect to a training and test set?
Random sampling reduces bias: the training and test sets should both be representative of the full data set.
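A minimal sketch of a random split, assuming simple random sampling without replacement. The function name, the 30% test fraction, and the seed are illustrative choices.

```python
import random

def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records, then cut it into (train, test)."""
    rng = random.Random(seed)   # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Illustrative data only.
records = list(range(10))
train, test = train_test_split(records)
print(len(train), len(test))  # 7 3
```

Shuffling before cutting keeps any ordering in the original file (for example, records sorted by class) from ending up entirely in one of the two sets.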
What is accuracy?
Used to compare the performance of models.
# of correct predictions / # of predictions
(TP + TN) / (TP + TN + FP + FN)
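A minimal sketch of counting TP, TN, FP, FN from two label lists and computing accuracy, assuming binary labels with 1 as the positive class. The labels are made up for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP, FN for a binary classifier."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Correct predictions divided by all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Illustrative labels only.
actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn)              # 3 3 1 1
print(accuracy(tp, tn, fp, fn))    # 0.75
```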
What is True Positive Rate / sensitivity?
fraction of positive examples predicted correctly by the classifier
TPR = TP / (TP + FN)
What is the True Negative Rate / specificity?
fraction of negative examples predicted correctly by the classifier
TNR = TN / (TN + FP)
What is False Positive Rate?
fraction of negative examples predicted as positive class
FPR = FP / (FP + TN)
What is False Negative Rate?
fraction of positive examples predicted as negative class
FNR = FN / (FN + TP)
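A minimal sketch of the four rates, assuming the confusion-matrix counts are already known. The counts are made up for illustration.

```python
def rates(tp, tn, fp, fn):
    """Return TPR, TNR, FPR and FNR from confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),  # sensitivity
        "TNR": tn / (tn + fp),  # specificity
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
    }

# Illustrative counts only: 50 positive and 50 negative examples.
r = rates(tp=40, tn=45, fp=5, fn=10)
print(r)
```

TPR and FNR share the denominator TP + FN, so they sum to 1. The same holds for TNR and FPR.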
What is precision?
The fraction of true positive examples among the examples the classifier declared positive
p = TP / (TP + FP)
What is recall?
the fraction of positive examples correctly predicted by the classifier
r = TP / (TP + FN)
What is F-measure?
Summarizes precision and recall as their harmonic mean
F = 2rp / (r + p) = 2TP / (2TP + FP + FN)
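A minimal sketch of precision, recall, and F-measure from counts, with a check that the count form of F matches the harmonic mean of precision and recall. The counts are made up for illustration.

```python
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f_measure(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)

# Illustrative counts only.
tp, fp, fn = 40, 5, 10
p = precision(tp, fp)
r = recall(tp, fn)
f = f_measure(tp, fp, fn)
harmonic_mean = 2 * r * p / (r + p)
print(p, r, f, harmonic_mean)
```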
What is the Apriori principle?
If an itemset is frequent, then all of its subsets must also be frequent. Conversely, if an itemset is infrequent, then all of its supersets must be infrequent too.
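A minimal sketch of why this holds, assuming support is counted as the number of transactions containing the itemset. Every transaction that contains an itemset also contains each of its subsets, so a subset's support can never be lower. The transactions are made up for illustration.

```python
from itertools import combinations

# Illustrative transactions only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

itemset = ("beer", "diapers", "milk")
full = support_count(itemset, transactions)
print(itemset, full)
# Every proper non-empty subset has support >= the full itemset's support.
for size in (1, 2):
    for subset in combinations(itemset, size):
        print(subset, support_count(subset, transactions))
```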
Recite the Apriori algorithm.
Let k = 1
Generate frequent itemsets of length 1
Repeat:
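The loop body is not listed in these notes. A minimal sketch of one common formulation of the loop, offered as an assumption rather than the version recited in class: generate (k+1)-candidates from the frequent k-itemsets, prune candidates with an infrequent k-subset, count support, keep the frequent ones, and stop when none remain. The function name `apriori` and the transactions are illustrative.

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Return {frozenset: support count} for every frequent itemset."""
    def keep_frequent(candidates):
        # Count support with one pass over the transactions per candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= minsup}

    items = {item for t in transactions for item in t}
    k = 1
    frequent = keep_frequent({frozenset([i]) for i in items})
    result = dict(frequent)
    while frequent:
        # Join frequent k-itemsets into candidates of length k + 1.
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k + 1}
        # Prune candidates that contain an infrequent k-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k))}
        frequent = keep_frequent(candidates)
        result.update(frequent)
        k += 1
    return result

# Illustrative transactions only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
result = apriori(transactions, minsup=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With minsup = 3 this data yields four frequent 1-itemsets and four frequent 2-itemsets. The only 3-itemset that survives pruning, {bread, diapers, milk}, has support 2 and is dropped.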
What is a frequent itemset?
An itemset that meets the minsup threshold.
What is a maximal frequent itemset?
An itemset is maximal frequent if it is frequent and none of its immediate supersets is frequent.
What is a closed itemset?
An itemset is closed if none of its immediate supersets has the same support as the itemset.
Closed frequent itemsets are a compressed representation of the frequent itemsets that preserves their support counts.
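A minimal sketch of both definitions, assuming support is a raw transaction count and an immediate superset adds exactly one item. The function names and the transactions are illustrative.

```python
# Illustrative transactions only.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "c"},
    {"b"},
]
items = sorted(set().union(*transactions))

def support(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def immediate_supersets(itemset):
    """All itemsets formed by adding exactly one new item."""
    return [itemset | {i} for i in items if i not in itemset]

def is_closed(itemset):
    """Closed: no immediate superset has the same support."""
    return all(support(s) < support(itemset) for s in immediate_supersets(itemset))

def is_maximal_frequent(itemset, minsup):
    """Maximal frequent: frequent, and no immediate superset is frequent."""
    return support(itemset) >= minsup and all(
        support(s) < minsup for s in immediate_supersets(itemset)
    )

ab = frozenset({"a", "b"})
a = frozenset({"a"})
print(is_closed(ab))               # True: {a, b, c} has lower support
print(is_closed(a))                # False: {a, b} has the same support as {a}
print(is_maximal_frequent(ab, 3))  # True: {a, b, c} has support 2 < 3
print(is_maximal_frequent(ab, 2))  # False: {a, b, c} is frequent at minsup 2
```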