Computer vision Flashcards

(18 cards)

1
Q

Why can’t a plain CNN do object detection

A

A plain classification CNN outputs a single label for the whole image, so it can’t identify or locate multiple objects in one photo.

2
Q

What is the goal of an R-CNN

A

The goal of a region-based convolutional neural network (R-CNN) is to take in an image and correctly identify where
the main objects are in the image, marking each one with a bounding box.

3
Q

What is computer vision

A

Computer vision is a field of AI that enables computers to understand and interpret images or video.

4
Q

What is object detection

A

Object detection identifies what objects are in an image and where they are located.

5
Q

What is image segmentation

A

Image segmentation divides an image into regions or objects at the pixel level.

6
Q

What is the pipeline for R-CNN

A

Step 1 — Input image
The original image is given to the model.
Step 2 — Region proposals
Around 2000 candidate regions are generated
Done using Selective Search
Step 3 — CNN feature extraction
Each region is passed through a CNN to extract features.
Step 4 — Classification
Each region is classified (e.g., person, car).
Step 5 — Bounding box refinement
The predicted bounding boxes are adjusted to improve accuracy.
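
Step 5 can be sketched in a few lines. This is a minimal illustration, assuming the usual center/size parameterisation in which the regressor predicts four offsets (dx, dy, dw, dh) for each proposal; the function name refine_box and the box layout (x1, y1, x2, y2) are hypothetical choices made for this example.

```python
import math

def refine_box(box, deltas):
    """Apply regression offsets (dx, dy, dw, dh) to a proposal box.

    box is (x1, y1, x2, y2) in pixels; the result uses the same layout.
    """
    x1, y1, x2, y2 = box
    dx, dy, dw, dh = deltas

    # Convert the proposal to center/size form.
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h

    # Shift the center in proportion to the box size; scale the size in log space.
    new_cx = cx + dx * w
    new_cy = cy + dy * h
    new_w = w * math.exp(dw)
    new_h = h * math.exp(dh)

    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)

proposal = (50.0, 40.0, 150.0, 120.0)
print(refine_box(proposal, (0.0, 0.0, 0.0, 0.0)))  # zero offsets leave the box unchanged
print(refine_box(proposal, (0.1, 0.0, 0.0, 0.0)))  # box shifts right by 10% of its width
```

Predicting offsets relative to the proposal's size, rather than absolute pixel coordinates, keeps the regression targets on a similar scale for small and large objects.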

7
Q

What loss function does R-CNN use

A

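R-CNN optimises two objectives:
Classification loss + Bounding box regression loss
In the original R-CNN the classifier and the box regressor are trained as separate stages; Fast R-CNN later combines the two terms into a single multi-task loss.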
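
A minimal sketch of how the two terms can be added for one region. It assumes cross-entropy for the classification term and smooth L1 for the box term, which is the combination used in Fast R-CNN's multi-task loss. The function names, the box_weight parameter, and the convention that class 0 means background are assumptions made for this example.

```python
import math

def cross_entropy(class_probs, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(class_probs[true_class])

def smooth_l1(pred, target):
    """Quadratic for small errors, linear for large ones, summed over coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def detection_loss(class_probs, true_class, pred_deltas, target_deltas, box_weight=1.0):
    cls = cross_entropy(class_probs, true_class)
    # Background regions (class 0 here, by assumption) have no ground-truth box,
    # so they contribute no regression loss.
    box = smooth_l1(pred_deltas, target_deltas) if true_class != 0 else 0.0
    return cls + box_weight * box

# One foreground region: fairly confident class prediction, small box error.
print(detection_loss([0.1, 0.9], 1, [0.2, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]))
```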

8
Q

What are the cons of R-CNN

A
  • Computationally very expensive as you would have to classify 2000 region
    proposals per image.
  • The selective search algorithm is a fixed algorithm. Therefore, no learning is
    happening at that stage. This could lead to the generation of bad candidate
    region proposals.
  • Very slow
9
Q

Why is R-CNN so slow

A
  • It requires a forward pass of the CNN (AlexNet)
    for every single region proposal for every
    single image (that is around 2k forward passes
    per image!).
  • It has to train three different models
    separately - the CNN to generate image
    features, the classifier that predicts the class,
    and the regression model to tighten the
    bounding boxes.
10
Q

What is Fast R-CNN

A

Fast R-CNN improves the efficiency of R-CNN.
Key idea
Instead of running the CNN for every region:
Run the CNN once for the whole image.
Then extract features from the resulting feature map.

11
Q

What does region of interest (RoI) pooling allow

A

It lets the network:
Take the shared feature map
Extract features for each proposed region
Convert them into fixed-size feature vectors

These vectors are then used for classification.
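
The idea can be sketched in pure Python. This is an illustration rather than a reference implementation: it assumes a single-channel feature map stored as a list of lists, a non-empty region given in feature-map cells as (row0, col0, row1, col1) with exclusive end indices, and max pooling inside each bin. The function name roi_max_pool is hypothetical.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one region of the feature map down to an out_h x out_w grid."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_h):
        # Bin i covers rows [floor(i*h/out_h), ceil((i+1)*h/out_h)) of the region.
        rs = r0 + (i * h) // out_h
        re = r0 + ((i + 1) * h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            cs = c0 + (j * w) // out_w
            ce = c0 + ((j + 1) * w + out_w - 1) // out_w
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled

fmap = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]
print(roi_max_pool(fmap, (0, 0, 4, 6)))  # whole map   -> [[9, 12], [21, 24]]
print(roi_max_pool(fmap, (1, 1, 3, 4)))  # small region -> [[9, 10], [15, 16]]
```

Both regions have different sizes, yet each produces a 2 x 2 output, which is what lets a fixed-size classifier sit on top.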

12
Q

What are the cons of Fast R-CNN

A

Fast R-CNN still uses selective search to generate the region proposals.
Selective search is slow, so it becomes the bottleneck that limits the speed of
the whole network.

13
Q

What are the pros of Fast R-CNN

A

Fast R-CNN is faster than R-CNN because you don’t have to feed
2000 region proposals through the convolutional neural network for every image. Instead, the
convolution operation is done only once per image, and the resulting feature map is shared by
all proposals.

14
Q

What is the Faster R-CNN pipeline

A
  1. CNN extracts a feature map from the whole image
  2. Region Proposal Network (RPN) generates proposals
  3. RoI pooling extracts fixed-size features for each proposal
  4. Classifier predicts object classes and refines the bounding boxes.
15
Q

What is region of interest pooling

A

After proposals are generated:
1. RoI pooling extracts fixed-size feature maps
2. These features are used to classify the object.

This ensures the network can classify objects regardless of their original size.

16
Q

RoI pooling (continued)

A

After the RPN step, we have a set of object proposals with no class assigned to them. The next problem to solve is
how to take these bounding boxes and classify them into the desired categories.
Faster R-CNN addresses this by reusing the existing convolutional feature map. It extracts a fixed-size feature map
for each proposal using region of interest pooling. Fixed-size feature maps are needed because the R-CNN
classification stage expects a fixed-size input in order to classify each proposal into a fixed number of classes.

17
Q

What are the advantages of U-Net for semantic segmentation

A
  • The U-Net combines the location information from
    the downsampling path with the contextual
    information in the upsampling path, so the final
    features carry both localisation and context,
    which is necessary to predict a good
    segmentation map.
  • No dense layer, so images of different sizes can be
    used as input (since the only parameters to learn in
    convolution layers are the kernels, and the size of a
    kernel is independent of the input image’s size).
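
The second point can be checked with a tiny "valid" convolution in pure Python. This is only a sketch of a single convolution, not of U-Net itself; the function name conv2d_valid and the example kernel are assumptions made for this illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(image[i + a][j + b] * kernel[a][b]
             for a in range(kh) for b in range(kw))
         for j in range(out_w)]
        for i in range(out_h)
    ]

# The only learnable parameters are these 9 kernel values.
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

small = [[1] * 5 for _ in range(5)]   # 5 x 5 input
large = [[1] * 9 for _ in range(7)]   # 7 x 9 input

out_small = conv2d_valid(small, kernel)
out_large = conv2d_valid(large, kernel)
print(len(out_small), len(out_small[0]))  # 3 3
print(len(out_large), len(out_large[0]))  # 5 7
```

The same kernel runs on both inputs; only the output size changes. A dense layer, by contrast, has one weight per input element and so fixes the input size.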
18
Q

What is Mask R-CNN

A

Mask R-CNN extends Faster R-CNN to
instance segmentation by adding a
branch that outputs a binary mask saying
whether or not a given pixel is part of an object.
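
A minimal sketch of what "binary mask" means in practice, assuming the mask branch produces a per-pixel probability that is then thresholded. The 0.5 threshold and the function name binarize_mask are assumptions made for this example.

```python
def binarize_mask(mask_probs, threshold=0.5):
    """Turn per-pixel object probabilities into a 0/1 mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in mask_probs]

probs = [
    [0.1, 0.7, 0.9],
    [0.2, 0.8, 0.6],
    [0.0, 0.3, 0.4],
]
for row in binarize_mask(probs):
    print(row)
# [0, 1, 1]
# [0, 1, 1]
# [0, 0, 0]
```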