Deep Q Networks (DQN)
Deep Q Networks (DQN) combine deep learning with Q-Learning, a model-free reinforcement learning algorithm. A DQN uses a deep neural network to approximate the Q-value function, which lets an agent learn to act well, for example to play a game, by choosing actions based on the current state of its environment.
In DQN, a neural network serves as a function approximator for the Q-value function. The input to the network is the current state of the game, and the output is a Q-value for each possible action in that state.
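The state-in, Q-values-out mapping can be sketched with a tiny pure-Python network. The layer sizes, weight ranges, and one-hidden-layer shape here are illustrative assumptions, not the architecture any particular DQN paper uses:

```python
import math
import random

random.seed(0)

# Illustrative sizes: a 4-dimensional state, 8 hidden units, 2 actions.
STATE_DIM, HIDDEN, N_ACTIONS = 4, 8, 2

def init_net():
    """Random weights for the input->hidden and hidden->output layers."""
    w1 = [[random.uniform(-0.5, 0.5) for _ in range(STATE_DIM)] for _ in range(HIDDEN)]
    w2 = [[random.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(N_ACTIONS)]
    return w1, w2

def q_values(net, state):
    """Forward pass: a state vector in, one Q-value per action out."""
    w1, w2 = net
    hidden = [math.tanh(sum(w * s for w, s in zip(row, state))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

net = init_net()
qs = q_values(net, [0.1, -0.2, 0.3, 0.0])  # one Q-value per action
```

In a full DQN the network is a much larger model (convolutional when the input is raw pixels), but the interface is the same: one forward pass yields the Q-value of every action at once.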
DQN uses a technique called experience replay, in which past transitions are stored in a replay memory. During training, minibatches of transitions are sampled from this memory to update the network. This breaks the correlation between consecutive samples, stabilizing the training process.
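A minimal replay memory can be sketched as a bounded queue of (state, action, reward, next_state, done) tuples; the capacity and batch size below are arbitrary illustrative choices:

```python
import random
from collections import deque

random.seed(1)

class ReplayMemory:
    """Minimal experience-replay buffer sketch."""

    def __init__(self, capacity):
        # deque with maxlen silently evicts the oldest transitions.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        """Store one (state, action, reward, next_state, done) tuple."""
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between
        # consecutive gameplay frames.
        return random.sample(self.buffer, batch_size)

memory = ReplayMemory(capacity=10_000)
for t in range(100):                      # fake transitions for illustration
    memory.push((t, 0, 1.0, t + 1, False))

batch = memory.sample(32)
print(len(batch))  # 32
```

Each training step then computes the Q-learning update on such a sampled minibatch rather than on the transition the agent just experienced.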
DQN also incorporates a technique known as a target network: a copy of the main network whose weights are held fixed between periodic synchronizations with the main network, rather than updated at every step. The target network is used to calculate the target Q-value during updates, providing more stable learning targets.
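The target computation and the periodic sync can be sketched as follows; the discount factor, sync interval, and the dict standing in for network weights are all illustrative assumptions:

```python
import copy

GAMMA = 0.99        # discount factor (illustrative value)
SYNC_EVERY = 1000   # steps between target-network syncs (illustrative)

online_net = {"w": 0.5}                 # stand-in for the trained network's weights
target_net = copy.deepcopy(online_net)  # held fixed between syncs

def td_target(reward, next_q_values, done):
    """Bellman target: r + gamma * max_a' Q_target(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + GAMMA * max(next_q_values)

# The target network is only refreshed every SYNC_EVERY steps:
step = 1000
if step % SYNC_EVERY == 0:
    target_net = copy.deepcopy(online_net)

print(td_target(1.0, [0.2, 0.5], done=False))  # 1.0 + 0.99 * 0.5 = 1.495
```

Because the target network changes only occasionally, the regression target for the main network stays put long enough for gradient updates to make steady progress toward it.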
DQN typically employs an epsilon-greedy policy for exploration: with probability epsilon the agent takes a random action instead of the one with the highest estimated Q-value, and epsilon is usually decayed over the course of training. This trade-off between exploration and exploitation allows the agent to learn a more robust policy.
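Epsilon-greedy selection is only a few lines; the annealing schedule below (linear decay from 1.0 to 0.1) is a common illustrative choice, not a fixed part of the algorithm:

```python
import random

random.seed(42)

def epsilon_greedy(q_values, epsilon, n_actions):
    """With probability epsilon explore; otherwise take the greedy action."""
    if random.random() < epsilon:
        return random.randrange(n_actions)                        # explore
    return max(range(n_actions), key=lambda a: q_values[a])       # exploit

def epsilon_at(step, start=1.0, end=0.1, decay_steps=10_000):
    """Linearly anneal epsilon from start to end over decay_steps."""
    return max(end, start - (start - end) * step / decay_steps)

q = [0.1, 0.9, 0.3]
action = epsilon_greedy(q, epsilon=0.0, n_actions=3)  # epsilon=0 always exploits
print(action)  # 1
```

Early in training epsilon is near 1, so the agent mostly explores; as the Q-estimates improve, epsilon shrinks and the agent increasingly exploits what it has learned.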
DQN agents learn from both positive and negative rewards. If an action leads to a higher score in a game, for example, the agent receives a positive reward; if an action causes the game to end, it receives a negative reward (a penalty).
While DQNs have shown remarkable success, particularly in learning to play video games from raw pixel inputs, they are not without challenges. They can be sample inefficient, meaning they require a lot of experience (gameplay) to learn effectively. Also, the choice of reward function can greatly impact the agent’s learning, and designing these reward functions can be non-trivial.
DQN has been notably used by Google’s DeepMind to train an AI to play Atari games to a superhuman level, directly from raw pixel inputs.