The Approximate Posterior Distribution: q(z|x)
What creates it? The encoder.
How is the posterior distribution created?
For a single input x (like an image), the encoder outputs parameters (a mean μ and a variance σ²) that define a specific Gaussian distribution.
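The encoder step above can be sketched in a few lines. This is a minimal NumPy toy, not a trained model: the "learned" weights are just fixed random matrices, and the names (`toy_encoder`, `sample_z`) are hypothetical. It shows the key mechanics: the encoder emits (μ, log σ²) for one input, and a sample z is drawn via the reparameterization trick z = μ + σ·ε.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_encoder(x, latent_dim=2):
    """Toy stand-in for a learned encoder: maps one input vector to the
    parameters (mu, log_var) of a diagonal Gaussian q(z|x).
    Random fixed projections play the role of learned weights."""
    w_mu = rng.standard_normal((x.size, latent_dim))
    w_lv = rng.standard_normal((x.size, latent_dim))
    return x @ w_mu, x @ w_lv

def sample_z(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

x = rng.standard_normal(8)       # a single flattened "image"
mu, log_var = toy_encoder(x)     # parameters of q(z|x) for this input
z = sample_z(mu, log_var)        # one draw from q(z|x)
```

In a real VAE the projections are neural-network layers trained end to end; only the sampling step is identical.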
Purpose: This distribution is the encoder's "fuzzy" representation, or learned code, for that specific input.
The Prior Distribution: p(z)
What creates the prior distribution?
It is fixed before training. It does not depend on any input x.
Typical choice of prior distribution?
The Standard Normal Distribution, N(0, I) (a Gaussian with mean 0 and unit variance in every dimension).
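A quick sanity check of what N(0, I) means in practice: drawing many latent codes from the prior and confirming the empirical mean is ~0 and the per-dimension variance is ~1. This is a small NumPy sketch; the dimension count (2) is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw many latent codes from the fixed prior p(z) = N(0, I).
samples = rng.standard_normal((100_000, 2))

print(samples.mean(axis=0))  # close to [0, 0]
print(samples.var(axis=0))   # close to [1, 1]
```

At generation time, new images are produced by drawing z exactly like this and passing it through the decoder.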
Blurriness
It’s a side effect of the reconstruction loss. The model learns to “average out” variations in the original data to get a good “average” reconstruction. This averaging leads to blurrier, less sharp images compared to the originals.
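The averaging effect can be seen with a tiny illustration, assuming a squared-error (MSE) reconstruction loss and 1-D "images" for simplicity. If two sharp targets are equally likely, the single output minimizing expected MSE is their pointwise mean, which smears the sharp edge:

```python
import numpy as np

# Two equally likely sharp "images": an edge at slightly different positions.
a = np.array([0.0, 0.0, 1.0, 1.0])
b = np.array([0.0, 1.0, 1.0, 1.0])

def expected_mse(c):
    """Expected squared error of output c over the two equally likely targets."""
    return 0.5 * np.sum((c - a) ** 2) + 0.5 * np.sum((c - b) ** 2)

# The MSE-optimal single output is the pointwise mean: a soft, blurry edge.
best = (a + b) / 2
print(best)  # [0.  0.5 1.  1. ]
```

`expected_mse(best)` is lower than for either sharp candidate, which is exactly why MSE-trained decoders prefer blurry averages.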
Latent Space Regularization
VAEs force the latent space to follow a simple, pre-chosen distribution (usually a Gaussian). The problem is that the data’s true distribution might be much more complex. This can prevent the VAE from capturing all the complex details and variations present in the data.
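The regularization is enforced through a KL-divergence penalty, which has a closed form when both q(z|x) and the prior are diagonal Gaussians: KL = ½ Σ (μ² + σ² − 1 − log σ²). A minimal NumPy sketch (the function name is a hypothetical choice):

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over
    latent dimensions: 0.5 * sum(mu^2 + sigma^2 - 1 - log sigma^2)."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var)

# Zero exactly when q(z|x) already equals the prior...
print(kl_to_standard_normal(np.zeros(2), np.zeros(2)))            # 0.0
# ...and growing as the posterior drifts away from N(0, I).
print(kl_to_standard_normal(np.array([2.0, 0.0]), np.zeros(2)))   # 2.0
```

Minimizing this term pulls every per-input Gaussian toward the simple prior, which is the source of the mismatch when the data's true structure is more complex.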
The Balancing Act
It’s the trade-off between two competing goals:
1. Reconstruction Loss: how well the VAE can recreate the input image.
2. KL Divergence: how closely the latent space matches the simple (Gaussian) prior.
Too much weight on reconstruction ignores the latent space structure (leads to poor new samples); too much weight on KL divergence ignores the data’s details (leads to blurry, poor reconstructions).
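The trade-off can be written as a single objective. This NumPy sketch uses an MSE reconstruction term and a `beta` weight on the KL term (as in β-VAE) to make the balance explicit; `beta` and the function name are illustrative assumptions, and `beta=1` recovers the standard objective.

```python
import numpy as np

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Per-example VAE objective: reconstruction error plus beta-weighted
    KL divergence from q(z|x) = N(mu, diag(sigma^2)) to the N(0, I) prior."""
    recon = np.sum((x - x_recon) ** 2)                          # reconstruction loss
    kl = 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var)  # KL to N(0, I)
    return recon + beta * kl

x, x_recon = np.array([1.0, 0.0]), np.zeros(2)
print(vae_loss(x, x_recon, np.zeros(2), np.zeros(2)))  # 1.0  (KL term is zero)
```

Raising `beta` trades reconstruction fidelity for a latent space closer to the prior; lowering it does the opposite, which is exactly the balancing act described above.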