What is object detection?
Object detection locates the objects in an image with bounding boxes and classifies each of them.
How can we do object detection using sliding windows?
Run a classifier on every sliding window, labelling each window as either an object class or background.
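A minimal sketch of the sliding-window idea, assuming a single fixed window size and stride. The `classify` callable is a hypothetical stand-in for whatever classifier is used; it is not part of these notes.

```python
def sliding_windows(img_w, img_h, win_w, win_h, stride):
    """Yield (x, y, w, h) for every window that fits fully inside the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)


def detect(img_w, img_h, classify, win_w=64, win_h=64, stride=32):
    """Classify every window and keep the ones not labelled background."""
    detections = []
    for box in sliding_windows(img_w, img_h, win_w, win_h, stride):
        # classify is a hypothetical function: box -> class name or "background"
        label = classify(box)
        if label != "background":
            detections.append((box, label))
    return detections
```

A real detector would repeat this over several window sizes and aspect ratios, so the number of classifier calls grows quickly.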
Why don’t we usually use the sliding window approach for object detection?
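Too slow: the classifier has to run on a very large number of windows (every position, at several scales and aspect ratios).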
How does R-CNN (Region-based CNN) work?
What is the problem with R-CNN?
Training and inference are slow, and training is ad hoc (a multi-stage pipeline rather than end-to-end).
What is the SPP method for object detection?
What is the main advantage of SPP (spatial pyramid pooling) over R-CNN?
It makes the testing phase faster, as the convolutional layers only need to run once per image. Training is still slow and ad hoc, but faster than R-CNN.
What is the Fast R-CNN algorithm?
What kind of loss does Fast R-CNN use?
Cross-entropy for classification and smooth L1 for bounding-box regression.
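A rough sketch of such a two-part loss for a single region, written in plain Python. The function names, the `weight` parameter, and the convention that class 0 is background are assumptions for illustration; the regression term here is the smooth L1 variant (quadratic near zero, linear elsewhere), and it is skipped for background regions.

```python
import math


def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_class])


def smooth_l1(pred, target):
    """Sum of smooth L1 terms: 0.5*d^2 if |d| < 1, else |d| - 0.5."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def detection_loss(probs, true_class, box_pred, box_target,
                   background=0, weight=1.0):
    """Classification loss plus box loss (box loss only for non-background)."""
    loss = cross_entropy(probs, true_class)
    if true_class != background:
        loss += weight * smooth_l1(box_pred, box_target)
    return loss
```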
How does the Fast R-CNN RoI pooling work?
Divide the region proposal into a 7x7 grid and do max pooling within each grid cell.
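A small sketch of RoI max pooling on one feature-map channel, assuming the RoI is already given in feature-map coordinates with exclusive end indices and is at least one cell wide and high. The bin boundaries (floor for the start, ceiling for the end) are one common choice, not necessarily the one used in any particular implementation.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_size=7):
    """Max-pool the RoI of a 2D feature map into an out_size x out_size grid.

    feature_map: list of rows (H x W); roi: (x0, y0, x1, y1), x1/y1 exclusive.
    """
    x0, y0, x1, y1 = roi
    roi_w, roi_h = x1 - x0, y1 - y0
    pooled = []
    for i in range(out_size):
        # Row range of bin i; floor/ceil guarantees at least one row per bin.
        ys = y0 + (i * roi_h) // out_size
        ye = y0 + ceil_div((i + 1) * roi_h, out_size)
        row = []
        for j in range(out_size):
            xs = x0 + (j * roi_w) // out_size
            xe = x0 + ceil_div((j + 1) * roi_w, out_size)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled
```

Whatever the size of the proposal, the output is always out_size x out_size, which is what lets the following fully connected layers take a fixed-size input.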
What is the main advantage of Fast R-CNN over SPP and R-CNN?
It’s fully trainable, with “fast” training and inference time.
What is the main difference between Fast R-CNN and Faster R-CNN?
Faster R-CNN trains a convolutional network (the region proposal network) to generate the proposals. The proposal network is trained with a classification loss (object / not object) and a bounding-box regression loss.
What is Mask R-CNN?
It adds an FCN branch to Faster R-CNN that predicts a segmentation mask for each detected object (instance segmentation).
Describe the YOLO algorithm
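How are the bounding boxes trained in YOLO?
For each cell that contains an object, find the predicted bounding box that best matches the ground truth, adjust it towards the ground truth, and increase its confidence. For cells without objects, and for the other boxes, reduce the confidence.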
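A hedged sketch of how the confidence targets for one cell could be assigned. It assumes "best" means highest IoU with the ground-truth box, and that the target confidence of that box is its IoU while every other box gets a target of 0; the function names and the box format (x0, y0, x1, y1) are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def confidence_targets(pred_boxes, gt_box):
    """Return (index of the responsible box or None, confidence targets)."""
    if gt_box is None:
        # Cell without an object: push every box's confidence towards 0.
        return None, [0.0] * len(pred_boxes)
    ious = [iou(p, gt_box) for p in pred_boxes]
    best = max(range(len(pred_boxes)), key=lambda k: ious[k])
    targets = [0.0] * len(pred_boxes)
    targets[best] = ious[best]  # assumption: target confidence is the IoU
    return best, targets
```

Only the box at index `best` would also receive a coordinate loss pulling it towards the ground truth.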