given a collection of records, each record is characterized by a tuple (x, y), where x is the attribute set and y is the class label.
Classification
6 CLASSIFICATION TECHNIQUES
DECISION TREE INDUCTION:
The training set is used to induce a model (Learning Algorithm -> Learn Model);
the model takes the form of a decision tree,
which we can then apply to the test set (deduction).
is a type of algorithm that uses attribute tests to split the data recursively, until each split contains records of only a single class.
Hunt’s Algorithm
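The recursive splitting described above can be sketched in Python. This is a minimal illustration of the idea behind Hunt's Algorithm, not a faithful reproduction of any particular implementation; the function and parameter names are my own, and for simplicity it splits on attributes in a fixed order rather than choosing the best split.

```python
from collections import Counter

def hunts_algorithm(records, attributes):
    """Sketch of Hunt's Algorithm (illustrative, simplified).

    records: list of (attribute_dict, label) tuples
    attributes: attribute names still available for splitting
    Returns a class label (leaf) or a nested dict (internal node).
    """
    labels = [y for _, y in records]
    # Stop: all records belong to one class -> leaf with that class
    if len(set(labels)) == 1:
        return labels[0]
    # Stop: no attributes left -> leaf labeled with the majority class
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Split on the next attribute (a full implementation would pick
    # the attribute that gives the best impurity reduction)
    attr = attributes[0]
    groups = {}
    for x, y in records:
        groups.setdefault(x[attr], []).append((x, y))
    return {
        "split_on": attr,
        "children": {v: hunts_algorithm(subset, attributes[1:])
                     for v, subset in groups.items()},
    }
```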
4 TYPES OF ATTRIBUTES
2 TEST CONDITIONS FOR NOMINAL ATTRIBUTES
2 TEST CONDITIONS FOR ORDINAL ATTRIBUTES
is an approach to finding the best split in which nodes with a homogeneous class distribution are preferred.
Greedy Approach
Formula for the general framework when finding the best split:
M0 is the impurity of the parent node.
M12 is the weighted average impurity of child Nodes 1 and 2 (split A).
M34 is the weighted average impurity of child Nodes 3 and 4 (split B).
Gain = M0 - M12 vs. M0 - M34; choose the split with the larger gain.
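The gain comparison above can be sketched as follows. This is my own illustration, using the Gini index as the impurity measure and a made-up two-class parent node; the point is that the split with purer children yields the larger gain M0 minus the weighted child impurity.

```python
def gini(labels):
    # Gini index of a node: 1 - sum of squared class proportions
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_gain(parent, children):
    # Gain = M0 (parent impurity) - weighted average child impurity
    n = len(parent)
    weighted = sum(len(ch) / n * gini(ch) for ch in children)
    return gini(parent) - weighted

# Hypothetical parent node with 6 records of two classes
parent = ["A", "A", "A", "B", "B", "B"]
split_a = [["A", "A", "A"], ["B", "B", "B"]]   # pure children
split_b = [["A", "A", "B"], ["A", "B", "B"]]   # mixed children
# split_gain(parent, split_a) > split_gain(parent, split_b),
# so split A is preferred
```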
3 WAYS TO MEASURE NODE IMPURITY
is a measure of how often a randomly chosen element from the set would be incorrectly labeled if it were labeled randomly according to the distribution of labels in the subset.
Gini Impurity / Index
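A short sketch of the Gini index for a single node, using the standard formula 1 minus the sum of squared class proportions (the function name is my own):

```python
def gini_index(labels):
    """Gini impurity of a node: 1 - sum(p_i^2) over the classes
    present, where p_i is the proportion of class i in the node."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

# A pure node has impurity 0; an evenly split two-class node has 0.5
```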
measures the homogeneity of a node: the uncertainty of a random variable, or the information content of a message.
Entropy
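Entropy for a single node can be sketched the same way, using the standard formula: the negative sum of p log2(p) over the classes present (function name is illustrative):

```python
import math

def entropy(labels):
    # Entropy of a node: -sum(p_i * log2(p_i)) over classes present
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

# A pure node has entropy 0; an evenly split two-class node has 1.0
```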
measures the misclassification error made by a node.
Classification Error
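Classification error for a single node can be sketched as 1 minus the largest class proportion, i.e. the error rate of predicting the node's majority class (function name is my own):

```python
def classification_error(labels):
    # Classification error: 1 - max class proportion in the node
    n = len(labels)
    return 1.0 - max(labels.count(c) / n for c in set(labels))

# A pure node has error 0; an evenly split two-class node has 0.5
```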
3 STOPPING CRITERIA FOR TREE INDUCTION
4 ADVANTAGES OF DECISION TREE BASED CLASSIFICATION