What is computer vision?
Turns raw image data into higher-level concepts so that it can be interpreted and acted upon. Aims to categorize the visual world as we do.
Outline the traditional machine learning flow. What were the problems with early computer vision techniques?
Input image
Feature Extractor & Features List (manual tasks)
Traditional ML algorithm (automated)
Required strict, complex rules to detect features in images, which took significant manual effort, and was prone to fail if there were small changes in object size, rotation or data size.
How does deep learning solve problems of machine learning?
Given a sufficient number of well-labelled images, the network automatically learns where the edges are and how pixel-level information combines to form features. It determines the patterns that make an image distinct.
What is deep learning?
A subfield of machine learning inspired by how the brain is structured and operates. Processes huge amounts of data, finding patterns humans often cannot detect. The word “deep” refers to the number of hidden layers in the neural network, which provide much of the power to learn.
Give the artificial neural network term for each of the following biological components.
Soma - Neuron
Dendrite - Input
Axon - Output
Synapse - Weight
What is excitatory? What is the opposite?
Excitatory - weights with a positive value that increase the probability of the neuron firing.
Inhibitory - weights with a negative value that decrease the probability of the neuron firing.
What does ANN stand for? What is the ANN approach? What is the name of this function?
Artificial Neural Network
Each neuron computes the weighted sum of its inputs. If the sum is < the threshold value the output is 0, else 1.
Step.
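The step neuron described above can be sketched in a few lines of Python. This is only an illustration: the function name, inputs, weights and threshold are made-up values, not taken from the notes.

```python
def step_neuron(inputs, weights, threshold):
    """Return 0 if the weighted sum is below the threshold, otherwise 1."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 0 if weighted_sum < threshold else 1


# Hypothetical values: the first weight is excitatory (positive),
# the second is inhibitory (negative).
fires = step_neuron([1.0, 1.0], [0.8, -0.3], threshold=0.4)   # weighted sum about 0.5
silent = step_neuron([1.0, 1.0], [0.8, -0.6], threshold=0.4)  # weighted sum about 0.2
print(fires, silent)
```

A stronger inhibitory weight pulls the weighted sum under the threshold, so the second neuron does not fire.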
What is an activation function? Give the formula for sigmoid.
Function employed in the second stage of an artificial neuron; it maps the weighted sum value to an output value.
1 / (1 + e^-x) where x is the weighted sum
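As a quick check of the sigmoid formula, here is a minimal Python sketch. The function name and the sample inputs are illustrative choices.

```python
import math


def sigmoid(x):
    """Map a weighted sum x to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


print(sigmoid(0.0))   # 0.5
print(sigmoid(4.0))   # close to 1
print(sigmoid(-4.0))  # close to 0
```

Unlike the step function, the output changes smoothly as the weighted sum grows.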
What is the purpose of activation functions?
To introduce non-linearities.
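Why non-linearity matters can be shown with a small sketch. Without an activation function, two stacked linear layers collapse into one linear function, so extra layers add nothing. The weights and biases below are arbitrary illustrative numbers, and ReLU is used only as one example of a non-linear activation.

```python
def linear(x, weight, bias):
    return weight * x + bias


def two_linear_layers(x):
    # 3 * (2x + 1) - 4 simplifies to 6x - 1: still a single straight line.
    return linear(linear(x, 2.0, 1.0), 3.0, -4.0)


def one_linear_layer(x):
    return linear(x, 6.0, -1.0)


def two_layers_with_relu(x):
    # Putting a non-linear activation between the layers breaks the collapse.
    hidden = max(0.0, linear(x, 2.0, 1.0))
    return linear(hidden, 3.0, -4.0)


for x in (-2.0, 0.0, 1.5):
    print(x, two_linear_layers(x), one_linear_layer(x), two_layers_with_relu(x))
```

For x = -2.0 the two purely linear versions agree, while the version with the activation gives a different value.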
What is ReLU? Give another name. Why is it the most popular?
The Rectified Linear Unit is an activation function defined as y = max(0,x) where x is the weighted sum.
Rectifier.
Cheap and quick to compute, as it needs no complicated math (just a comparison with zero).
What is bias?
Each neuron has its own bias which is learnable. We pass weighted sum + bias into activation function. Offers increased flexibility.
Show how bias offers increased flexibility. We want to decrease the threshold of a neuron from 0 to -1.
Let's say the neuron's x value (weighted sum) = -0.35
Using ReLU this neuron does not fire.
Now add a bias that is the opposite of the desired threshold of -1, so +1
-0.35 + 1 = 0.65
Using ReLU this neuron now does fire.
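The worked example above can be checked in Python. A minimal sketch: the variable names are illustrative, and the numbers are the ones from the example.

```python
def relu(x):
    """Rectified Linear Unit: y = max(0, x)."""
    return max(0.0, x)


weighted_sum = -0.35
bias = 1.0  # opposite of the desired threshold of -1

without_bias = relu(weighted_sum)      # neuron does not fire
with_bias = relu(weighted_sum + bias)  # neuron fires

print(without_bias)
print(round(with_bias, 2))
```

The bias shifts the point at which the neuron starts to fire without changing any of the weights.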
What was the early drawback of ANNs? The proposed solution?
Major drawback was the process of adjusting the model's weights.
Backpropagation. Propagates the total loss back through the neural network to work out how much of the loss each node is responsible for, then updates the weights to minimize the loss.
Process of backpropagation.
For each neuron, calculate the weighted sum of the inputs into the node (multiply each input by its weight and add the results).
Then pass this sum as x into the activation function to get the output value of the neuron.
Error = target (expected) output of the NN - actual output of the last neuron
Calculate the unit error (delta) for the last neuron: delta = output * (1 - output) * Error. The output * (1 - output) factor is the derivative of the sigmoid.
Then pass this delta back to all preceding neurons, each of which computes: delta = output * (1 - output) * (weight travelled through * passed value). Repeat this step until all levels of neurons are done.
Next, for each weight: change in weight = learning rate * delta of the neuron the weight feeds into * output of the neuron (or input) the weight comes from.
New value for weight = old value + change in weight
We then repeat with a new forward pass using the new weights.
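The steps above can be sketched for a tiny network with 2 inputs, 2 hidden neurons and 1 output neuron. Assumptions, none of which come from the notes: sigmoid activations everywhere, biases left out for brevity, the error taken as target minus output, each weight change using the value entering that weight, and made-up starting weights, input, target and learning rate.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_step(inputs, target, w_hidden, w_output, learning_rate):
    """One forward and one backward pass; returns the updated weights and the error."""
    # Forward pass: weighted sum, then activation, for each neuron.
    hidden = [sigmoid(sum(x * w for x, w in zip(inputs, ws))) for ws in w_hidden]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w_output)))

    # Error and delta for the last neuron.
    error = target - output
    delta_out = output * (1.0 - output) * error

    # Pass the delta back through each output weight to the hidden neurons.
    delta_hidden = [h * (1.0 - h) * (w * delta_out) for h, w in zip(hidden, w_output)]

    # New weight = old weight + learning rate * delta * value entering the weight.
    new_w_output = [w + learning_rate * delta_out * h for w, h in zip(w_output, hidden)]
    new_w_hidden = [
        [w + learning_rate * d * x for w, x in zip(ws, inputs)]
        for ws, d in zip(w_hidden, delta_hidden)
    ]
    return new_w_hidden, new_w_output, error


# Hypothetical starting weights: one row of input weights per hidden neuron.
w_hidden = [[0.1, 0.4], [-0.2, 0.3]]
w_output = [0.5, -0.1]
errors = []
for _ in range(200):
    w_hidden, w_output, error = train_step([1.0, 0.5], 1.0, w_hidden, w_output, 0.5)
    errors.append(error)

print(errors[0], errors[-1])
```

Each call is one forward pass plus one round of weight updates; repeating it shrinks the error on this single training example.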
What is the most basic form of neural network? Name three other types.
A fully connected neural network.
Recurrent Neural Network (RNN)
Convolutional Neural Network (CNN)
Generative Adversarial Network (GAN)
What is RNN?
A Recurrent Neural Network is designed for sequence prediction problems: it processes not only the current input but also the prior inputs across time. Essentially a chain of network steps in which the state from each step feeds into the next. E.g. autofill.
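The idea of carrying prior inputs forward can be sketched with a single recurrent unit. This is a simplified illustration: the tanh activation, the weights and the input sequences are assumptions chosen for the example.

```python
import math


def rnn_forward(sequence, w_input, w_hidden, state=0.0):
    """Run one recurrent unit over a sequence and return the state after each step."""
    states = []
    for x in sequence:
        # The new state depends on the current input AND the previous state.
        state = math.tanh(w_input * x + w_hidden * state)
        states.append(state)
    return states


# Same inputs in a different order give a different final state,
# because earlier inputs are remembered through the state.
first = rnn_forward([1.0, 0.0], w_input=0.5, w_hidden=0.8)
second = rnn_forward([0.0, 1.0], w_input=0.5, w_hidden=0.8)
print(first[-1], second[-1])
```

A fully connected network given only the last input could not tell these two sequences apart; the recurrent state is what makes the order matter.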
What are CNNs? What should it be used for?
Convolutional Neural Networks are designed to map image data to an output variable. They develop an internal representation of a 2-D image, which allows the model to learn position- and scale-invariant structures in the data.
Use for: image data, classification prediction problems, regression prediction problems.
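The core operation of a CNN, sliding a small kernel over a 2-D image, can be sketched in plain Python. Assumptions: no padding, a stride of 1, and a hand-written vertical-edge kernel. In a real CNN the kernel values are learned during training, and the function name here is illustrative.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Tiny image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Kernel that responds where brightness increases from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only at the column where the edge sits. The same kernel would find that edge wherever it appeared in the image, which is where the position invariance comes from.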
What should RNN be used for? And not used for?
Should be used for:
- Text data
- Speech data
- Classification prediction problems
- Regression prediction problems
Should not be used for:
- Tabular data
- Image data
What are GANs?
Generative Adversarial Networks have two neural networks, a generator and a discriminator, compete against each other in a tight feedback loop.
The generator creates lots of images and the discriminator tries to determine which are real and which are generated. The generator then tries to make its images look more real, and so on.
After many iterations the discriminator is no longer needed. This process can also be done with unlabeled data.
Who invented GANs and when?
Ian Goodfellow, 2014
What is the most common hardware used for DL?
GPU (but many companies are making their own)
In general, to what kind of tasks is deep learning applicable?
Works well in domains in which there are a large number of input features and large training datasets are available.
Which neural network is best for speech recognition and translation?
RNN
Which neural network is best for autonomous driving?
Hybrid NN