What is linear regression?
A model that predicts a continuous output as a linear combination of the input features plus an intercept, typically fit by least squares.
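A minimal sketch of the one-feature case, assuming ordinary least squares as the fitting method. The function names and the toy data are illustrative only.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares with one input feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_linear(x, slope, intercept):
    """Prediction is a linear combination of the input plus the intercept."""
    return intercept + slope * x


# Toy data lying exactly on y = 2x + 1 (made up for illustration)
slope, intercept = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```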
What is logistic regression?
A classification model that applies the logistic (sigmoid) function to a linear combination of the inputs to predict class probabilities.
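A short sketch of the prediction step only, not the training. The weights, bias, and 0.5 threshold are hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(features, weights, bias):
    """Probability of the positive class: sigmoid of a linear combination."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a class label using a cutoff."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0
```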
How does a decision tree split nodes?
By choosing splits that maximize information gain or minimize Gini impurity.
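The two criteria can be sketched as follows for a single numeric feature. The threshold search over midpoints is one common approach, assumed here for illustration; real libraries differ in detail.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


def best_threshold(values, labels):
    """Try midpoints between distinct values; return (threshold, weighted Gini).

    Returns None when every value is identical (no split is possible).
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = None
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (t, score)
    return best
```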
What is random forest?
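An ensemble of decision trees, each trained on a bootstrap sample of the data and considering a random subset of features at each split; predictions are combined by majority vote (classification) or averaging (regression).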
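A partial sketch of the ingredients that make a forest "random": bootstrap sampling, feature subsetting, and vote aggregation. It deliberately does not train any trees, and the helper names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; some rows repeat, some are left out."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at one split."""
    return sorted(rng.sample(range(n_features), k))


def majority_vote(predictions):
    """Combine the per-tree class predictions into one label."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = [("r1",), ("r2",), ("r3",), ("r4",)]
sample = bootstrap_sample(rows, rng)           # training data for one tree
features = random_feature_subset(10, 3, rng)   # candidate features for one split
label = majority_vote(["cat", "dog", "cat"])   # aggregate three tree outputs
```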
What is gradient boosting?
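An ensemble method that builds models sequentially, each one fit to the negative gradient of the loss for the current ensemble (the residuals, for squared error); this amounts to gradient descent in function space.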
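A minimal sketch for squared-error regression on one feature, assuming one-split regression stumps as the weak learners and a fixed learning rate. All names and defaults are illustrative, not any library's API.

```python
def fit_stump(xs, targets):
    """Best one-split regression stump by squared error.

    Returns (threshold, left_value, right_value). Needs at least two distinct xs.
    """
    best = None
    distinct = sorted(set(xs))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]


def predict_stump(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv


def fit_boosting(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For loss 1/2 * (y - p)^2 the negative gradient is the residual y - p.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_boosting(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((y - predict_boosting(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Each added stump corrects part of what the current ensemble still gets wrong, so the training error shrinks as rounds are added; the learning rate controls how much of each correction is applied.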
Compare XGBoost and LightGBM.
LightGBM uses a histogram-based algorithm with leaf-wise tree growth and is often faster on large datasets; XGBoost is more mature and offers extensive regularization options.
What is SVM?
A support vector machine: a classifier that finds the hyperplane maximizing the margin between classes.
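A sketch of how a linear SVM classifies once the hyperplane is known. It assumes the weights w and bias b have already been found and are in canonical form (support vectors satisfy |w·x + b| = 1), in which case the margin width is 2 / ||w||. Training is not shown.

```python
import math


def decision_value(w, b, x):
    """Signed score w·x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Label +1 or -1 according to the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the two margin boundaries, assuming canonical scaling."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))
```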
What is kNN?
An instance-based method that classifies a sample by the majority label of its k nearest neighbors.
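A brute-force sketch, assuming Euclidean distance and an unweighted majority vote. The function name and the default k are illustrative.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k closest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

There is no training step: the "model" is the stored data, and all the work happens at query time.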
Explain DBSCAN clustering.
A density-based algorithm that groups points having at least a minimum number of neighbors (minPts) within a given radius (eps), and marks points in low-density regions as noise.
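A compact sketch of the procedure, assuming Euclidean distance and a brute-force neighbor search. A point counts itself among its neighbors, and -1 is used here as the noise label by convention.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep growing
    return labels
```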
What is overfitting in decision trees?
When trees become too deep and memorize training data patterns instead of generalizing.