GAN vs Diffusion Models
Both Generative Adversarial Networks (GANs) and Diffusion Models are modern techniques used in machine learning for image generation.
1. Generative Adversarial Networks (GAN)
GANs consist of two neural networks:
- Generator (G) – generates fake images
- Discriminator (D) – distinguishes real vs fake images
They compete in a minimax game.
GAN Objective Function
\[
\min_G \max_D V(D,G) =
\mathbb{E}_{x \sim p_{data}(x)}[\log D(x)]
+
\mathbb{E}_{z \sim p_z(z)}[\log (1 - D(G(z)))]
\]
Where:
- \(x\) = real data sample
- \(z\) = random noise vector
- \(G(z)\) = generated image
- \(D(x)\) = the discriminator's estimated probability that \(x\) is real
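To make the objective concrete, the value function can be estimated from a batch of discriminator outputs. The following is a minimal sketch, assuming NumPy and hand-picked discriminator outputs; the function name `value_function` and the sample values are illustrative, not part of any library.

```python
import numpy as np

def value_function(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) from discriminator outputs.

    d_real: values of D(x) on real samples, each in (0, 1)
    d_fake: values of D(G(z)) on generated samples, each in (0, 1)
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A discriminator that cannot tell real from fake outputs 0.5 everywhere,
# which gives V = log(0.5) + log(0.5) = -log(4).
v_confused = value_function([0.5, 0.5], [0.5, 0.5])

# A confident, correct discriminator pushes V toward 0 from below.
v_confident = value_function([0.99, 0.98], [0.01, 0.02])
```

The discriminator tries to raise this value, while the generator tries to lower it by making \(D(G(z))\) larger.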
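Generator Loss
In practice the generator usually minimizes the non-saturating loss below instead of the \(\log(1 - D(G(z)))\) term from the minimax objective, because it gives stronger gradients early in training, when the discriminator easily rejects generated images: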
\[
L_G = -\mathbb{E}_{z \sim p_z(z)}[\log D(G(z))]
\]
Discriminator Loss
The discriminator minimizes the negative of \(V(D,G)\):
\[
L_D =
-\mathbb{E}_{x \sim p_{data}(x)}[\log D(x)]
-
\mathbb{E}_{z \sim p_z(z)}[\log (1-D(G(z)))]
\]
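The two losses above can be sketched in a few lines. This is an illustrative NumPy version that takes discriminator outputs as plain arrays; the clipping constant `eps` is an assumption added here for numerical safety and is not part of the formulas.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """L_D = -E[log D(x)] - E[log(1 - D(G(z)))]."""
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))

def generator_loss(d_fake, eps=1e-12):
    """L_G = -E[log D(G(z))]."""
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return -np.mean(np.log(d_fake))

# The generator's loss drops as the discriminator is fooled more often.
loss_when_caught = generator_loss([0.1, 0.2])
loss_when_fooling = generator_loss([0.9, 0.8])
```

In a real training loop these quantities would be computed inside an automatic-differentiation framework so that gradients can flow back into the two networks; the sketch only shows the arithmetic.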
Training alternates between updating the discriminator and the generator. It succeeds when the generator produces images that the discriminator cannot distinguish from real images.
When GANs Are Useful
- Fast generation
- Image-to-image translation
- Low-compute environments
- Real-time applications
2. Diffusion Models
Diffusion models are trained by gradually adding noise to data and learning to reverse that process. New images are then generated by running the learned reverse process starting from pure noise.
Forward Diffusion Process
Gaussian noise is added to the data step by step:
\[
q(x_t|x_{t-1}) =
\mathcal{N}
(
x_t;
\sqrt{1-\beta_t}x_{t-1},
\beta_t I
)
\]
Where:
- \(x_t\) = noisy image at time step \(t\)
- \(\beta_t \in (0,1)\) = variance of the noise added at step \(t\) (the noise schedule)
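A single forward step can be sketched as a direct sample from the Gaussian above. The example assumes NumPy, a linear schedule from 1e-4 to 0.02 over 1000 steps (a common choice, not something the formula requires), and a small array of ones standing in for an image.

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """Sample x_t ~ N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
x0 = np.ones((4, 4))                     # stand-in "image"
betas = np.linspace(1e-4, 0.02, 1000)    # assumed linear noise schedule

# Apply the step repeatedly; the signal shrinks while noise accumulates.
x = x0
for beta_t in betas:
    x = forward_step(x, beta_t, rng)
```

After enough steps the result carries almost no trace of `x0` and is close to a sample from a standard normal distribution.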
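With \(\alpha_t = 1-\beta_t\) and \(\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s\), \(x_t\) can be sampled directly from \(x_0\) in closed form: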
\[
q(x_t|x_0) =
\mathcal{N}
(
x_t;
\sqrt{\bar{\alpha}_t}x_0,
(1-\bar{\alpha}_t)I
)
\]
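Here \(\bar{\alpha}_t\) is the running product of \(1-\beta_s\) for \(s = 1, \dots, t\). The sketch below, assuming NumPy and the same illustrative linear schedule, computes these products, samples \(x_t\) in one shot, and checks that the step-by-step variance recursion agrees with \(1-\bar{\alpha}_t\). The name `q_sample` is a label chosen here.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)    # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)          # alpha_bar_t for t = 1..T

def q_sample(x0, t, noise):
    """Sample x_t directly from x_0; t is a 1-based step index."""
    a_bar = alpha_bars[t - 1]
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise

# Variance of the noise part when steps are applied one at a time:
# v_t = (1 - beta_t) * v_{t-1} + beta_t, starting from v_0 = 0.
var = 0.0
for beta_t in betas:
    var = (1.0 - beta_t) * var + beta_t

closed_form_var = 1.0 - alpha_bars[-1]   # should match var
```

The closed form matters for training: any step \(t\) can be reached with a single draw of noise instead of \(t\) sequential steps.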
Reverse Diffusion Process
The model learns to remove noise one step at a time, approximating each reverse transition with a Gaussian:
\[
p_\theta(x_{t-1}|x_t)
=
\mathcal{N}
(
x_{t-1};
\mu_\theta(x_t,t),
\Sigma_\theta(x_t,t)
)
\]
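The formula above leaves \(\mu_\theta\) and \(\Sigma_\theta\) unspecified. One common parameterization, assumed here for illustration rather than taken from this text, writes \(\alpha_t = 1-\beta_t\) and \(\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s\), uses a noise-prediction network \(\epsilon_\theta\) (the quantity trained in the Training Objective section), and fixes the variance to the schedule:

\[
\mu_\theta(x_t,t) =
\frac{1}{\sqrt{\alpha_t}}
\left(
x_t -
\frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}
\epsilon_\theta(x_t,t)
\right),
\qquad
\Sigma_\theta(x_t,t) = \beta_t I
\]

Under this choice the network only has to estimate the noise in \(x_t\); the mean of the previous step follows from it.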
Training Objective
The network \(\epsilon_\theta\) is trained to predict the noise that was added to \(x_0\) to produce \(x_t\):
\[
L =
\mathbb{E}_{t,x_0,\epsilon}
\left[
\left\|
\epsilon -
\epsilon_\theta(x_t,t)
\right\|^2
\right]
\]
Where:
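- \(\epsilon \sim \mathcal{N}(0,I)\) = true noise
- \(\epsilon_\theta\) = predicted noise
- \(x_t\) = noisy image drawn from \(q(x_t|x_0)\) using \(\epsilon\), with \(t\) sampled uniformly from the time steps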
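One evaluation of this objective can be sketched as follows. The example assumes NumPy, a linear schedule, a uniformly sampled time step, and a hypothetical placeholder `eps_model` that always predicts zero; a real model would be a trained neural network.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)       # assumed linear noise schedule
alpha_bars = np.cumprod(1.0 - betas)

def eps_model(x_t, t):
    """Hypothetical stand-in for the noise-prediction network."""
    return np.zeros_like(x_t)

def training_loss(x0_batch, model, rng):
    """Monte Carlo estimate of E || eps - eps_theta(x_t, t) ||^2."""
    batch = x0_batch.shape[0]
    t = rng.integers(1, T + 1, size=batch)        # one step per sample, 1..T
    eps = rng.standard_normal(x0_batch.shape)     # true noise
    # Broadcast alpha_bar_t over the non-batch dimensions.
    a_bar = alpha_bars[t - 1].reshape((batch,) + (1,) * (x0_batch.ndim - 1))
    x_t = np.sqrt(a_bar) * x0_batch + np.sqrt(1.0 - a_bar) * eps
    err = (eps - model(x_t, t)).reshape(batch, -1)
    return np.mean(np.sum(err ** 2, axis=1))

x0_batch = np.ones((256, 8))                      # toy batch of 8-dim "images"
loss = training_loss(x0_batch, eps_model, np.random.default_rng(0))
```

With a trainable model, this scalar would be minimized by gradient descent over many random batches, time steps, and noise draws.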
Generation Process
Image generation starts from pure noise:
\[
x_T \sim \mathcal{N}(0,I)
\]
The learned reverse step is then applied for \(t = T, \dots, 1\), removing a little noise each time, until \(x_0\) is produced.
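The generation loop can be sketched as below. This assumes NumPy, a linear schedule, and a widely used form of the reverse step in which the mean is computed from a predicted noise term and the step variance is set to \(\beta_t\); these are choices made for the example, not details given in this text. The noise predictor `toy_eps_model` is a hypothetical oracle for a dataset containing the single image `target`, used so that the loop can run without a trained network.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)       # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def sample(eps_model, shape, rng):
    """Start from x_T ~ N(0, I) and apply the reverse step for t = T..1."""
    x = rng.standard_normal(shape)
    for t in range(T, 0, -1):
        beta_t, alpha_t, a_bar = betas[t - 1], alphas[t - 1], alpha_bars[t - 1]
        eps_hat = eps_model(x, t)
        mean = (x - beta_t / np.sqrt(1.0 - a_bar) * eps_hat) / np.sqrt(alpha_t)
        # No fresh noise is added on the final step.
        noise = rng.standard_normal(shape) if t > 1 else np.zeros(shape)
        x = mean + np.sqrt(beta_t) * noise
    return x

target = np.full((2, 2), 3.0)            # the only "image" in the toy dataset

def toy_eps_model(x_t, t):
    """Exact noise estimate when every training image equals `target`."""
    a_bar = alpha_bars[t - 1]
    return (x_t - np.sqrt(a_bar) * target) / np.sqrt(1.0 - a_bar)

x0_hat = sample(toy_eps_model, target.shape, np.random.default_rng(0))
```

The loop calls the noise predictor once per time step, so generating one image costs \(T\) network evaluations, which is the source of the slow generation noted in the comparison.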
3. Comparison
| Feature | GAN | Diffusion |
|---|---|---|
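| Training Stability | Less stable (adversarial training) | More stable (regression-style loss) |
| Generation Speed | Very fast (one forward pass) | Slow (many denoising steps) |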
| Image Quality | Good | Excellent |
| Diversity | Lower | Higher |
| Compute Cost | Lower | Higher |
4. Summary
- Need fast generation → Use GAN
- Need highest quality → Use Diffusion