GAN vs Diffusion Models

Generative Adversarial Networks (GANs) and diffusion models are two widely used families of deep generative models for image generation. They differ in how they are trained and in how they produce a sample.

1. Generative Adversarial Networks (GANs)

GANs consist of two neural networks:

Generator (G) – generates fake images
Discriminator (D) – distinguishes real vs fake images

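The two networks are trained against each other in a minimax game: the discriminator tries to maximize the value function \(V(D,G)\), while the generator tries to minimize it.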

GAN Objective Function

\[ \min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log (1 - D(G(z)))] \]

Where:

\(x\) = real data sample
\(z\) = random noise vector
\(G(z)\) = generated image
\(D(x)\) = probability the discriminator assigns to \(x\) being real

Generator Loss

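\[ L_G = -\mathbb{E}_{z \sim p_z(z)}[\log D(G(z))] \]

This is the non-saturating form commonly used in practice. Instead of minimizing \(\log(1 - D(G(z)))\) as the minimax objective prescribes, the generator maximizes \(\log D(G(z))\), which gives stronger gradients early in training when the discriminator easily rejects generated samples.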

Discriminator Loss

\[ L_D = -\mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] - \mathbb{E}_{z \sim p_z(z)}[\log (1-D(G(z)))] \]

GANs learn by making the generator produce images that the discriminator cannot distinguish from real images.
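The generator and discriminator losses above can be evaluated directly from the discriminator's output probabilities. The following is a minimal sketch in plain Python, not a training loop: the function names, the `eps` guard against `log(0)`, and the example probabilities are all assumptions made for illustration.

```python
import math

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """L_D = -mean(log D(x)) - mean(log(1 - D(G(z))))."""
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return -real_term - fake_term

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating L_G = -mean(log D(G(z)))."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs on a small batch.
d_real = [0.9, 0.8, 0.95]   # D(x) on real images
d_fake = [0.1, 0.3, 0.2]    # D(G(z)) on generated images

loss_d = discriminator_loss(d_real, d_fake)
loss_g = generator_loss(d_fake)
print(f"L_D = {loss_d:.4f}, L_G = {loss_g:.4f}")
```

In this example the discriminator is winning, so its loss is small and the generator's loss is large. As the generator improves and \(D(G(z))\) rises, \(L_G\) falls.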

When GANs Are Useful

Fast generation
Image-to-image translation
Low-compute environments
Real-time applications

2. Diffusion Models

Diffusion models are trained by gradually adding noise to data and learning to reverse that corruption. New images are then generated by running the learned reverse process, starting from noise.

Forward Diffusion Process

Noise is added to data step-by-step:

\[ q(x_t|x_{t-1}) = \mathcal{N} ( x_t; \sqrt{1-\beta_t}x_{t-1}, \beta_t I ) \]

Where:

\(x_t\) = noisy image at time step \(t\)
\(\beta_t\) = variance of the noise added at step \(t\), set by the noise schedule

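Because every step is Gaussian, \(x_t\) can be sampled directly from \(x_0\) without simulating the intermediate steps. With \(\alpha_t = 1-\beta_t\) and \(\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s\), the closed form is:

\[ q(x_t|x_0) = \mathcal{N} ( x_t; \sqrt{\bar{\alpha}_t}x_0, (1-\bar{\alpha}_t)I ) \]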
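To make the forward process concrete, here is a small sketch that builds a noise schedule, accumulates \(\bar{\alpha}_t = \prod_{s \le t}(1-\beta_s)\), and draws \(x_t\) from \(x_0\) in a single step. The linear schedule with endpoints `1e-4` and `0.02`, the function names, and the four-element toy "image" are assumptions for illustration, not something the text above prescribes.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced beta_1..beta_T (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s = 1..t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0); t is 1-indexed. Returns (x_t, eps)."""
    a_bar = alpha_bars[t - 1]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
           for x, e in zip(x0, eps)]
    return x_t, eps

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

rng = random.Random(0)
x0 = [1.0, -1.0, 0.5, 0.0]           # toy stand-in for image pixels
x_mid, _ = q_sample(x0, 500, alpha_bars, rng)
x_last, _ = q_sample(x0, 1000, alpha_bars, rng)
print("alpha_bar at t=1:", alpha_bars[0])
print("alpha_bar at t=T:", alpha_bars[-1])
```

With this schedule \(\bar{\alpha}_t\) starts close to 1 and decays towards 0, so early \(x_t\) look like the data and late \(x_t\) are almost pure Gaussian noise.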

Reverse Diffusion Process

The model learns to remove noise:

\[ p_\theta(x_{t-1}|x_t) = \mathcal{N} ( x_{t-1}; \mu_\theta(x_t,t), \Sigma_\theta(x_t,t) ) \]

Training Objective

The network predicts the noise added:

\[ L = \mathbb{E}_{t,x_0,\epsilon} \left[ \left\| \epsilon - \epsilon_\theta(x_t,t) \right\|^2 \right] \]

Where:

\(\epsilon\) = true noise
\(\epsilon_\theta\) = predicted noise
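The training objective can be estimated by Monte Carlo: pick a random step \(t\), noise a clean sample with fresh \(\epsilon\), and measure the squared error of the predicted noise. The sketch below assumes a hypothetical `eps_model(x_t, t)` callable standing in for the neural network \(\epsilon_\theta\); here it is a placeholder that always predicts zero, and the 100-step linear schedule is likewise an assumption.

```python
import math
import random

def noise_prediction_loss(eps_true, eps_pred):
    """Squared L2 distance ||eps - eps_theta||^2 for one sample."""
    return sum((e - p) ** 2 for e, p in zip(eps_true, eps_pred))

def training_loss_estimate(x0_batch, alpha_bars, eps_model, rng):
    """Average the loss over a batch, drawing a random t and eps per sample."""
    total = 0.0
    for x0 in x0_batch:
        t = rng.randint(1, len(alpha_bars))          # uniform over 1..T
        a_bar = alpha_bars[t - 1]
        eps = [rng.gauss(0.0, 1.0) for _ in x0]
        # Noisy input built with the closed-form forward process.
        x_t = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
               for x, e in zip(x0, eps)]
        total += noise_prediction_loss(eps, eps_model(x_t, t))
    return total / len(x0_batch)

# Assumed 100-step linear schedule and its cumulative products.
betas = [1e-4 + i * (0.02 - 1e-4) / 99 for i in range(100)]
alpha_bars, running = [], 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bars.append(running)

def zero_model(x_t, t):
    """Placeholder for a trained network: always predicts zero noise."""
    return [0.0] * len(x_t)

batch = [[1.0, -1.0, 0.5, 0.0], [0.2, 0.4, -0.3, 0.9]]
loss = training_loss_estimate(batch, alpha_bars, zero_model, random.Random(0))
print(f"loss with the zero predictor: {loss:.3f}")
```

In a real setup the placeholder would be replaced by a network whose parameters are updated by gradient descent on this quantity.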

Generation Process

Image generation starts from pure noise:

\[ x_T \sim \mathcal{N}(0,I) \]

Then noise is gradually removed until \(x_0\) is produced.
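The generation loop can be sketched as follows. The text above does not spell out the per-step update, so this sketch assumes the standard DDPM ancestral sampling rule, with mean \(\frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\epsilon_\theta(x_t,t)\right)\), where \(\alpha_t = 1-\beta_t\) and \(\bar{\alpha}_t = \prod_{s \le t}\alpha_s\), and variance \(\beta_t\). In place of a trained network it uses a hypothetical exact noise predictor for a toy data distribution concentrated on a single point `mu`, which makes the result easy to check.

```python
import math
import random

def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s = 1..t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def ddpm_sample(eps_model, betas, dim, rng):
    """Start at x_T ~ N(0, I) and denoise step by step down to x_0."""
    alpha_bars = cumulative_alphas(betas)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]         # x_T
    for t in range(len(betas), 0, -1):                     # t = T, ..., 1
        beta, a_bar = betas[t - 1], alpha_bars[t - 1]
        eps_hat = eps_model(x, t)
        coef = beta / math.sqrt(1.0 - a_bar)
        mean = [(xi - coef * ei) / math.sqrt(1.0 - beta)
                for xi, ei in zip(x, eps_hat)]
        sigma = math.sqrt(beta) if t > 1 else 0.0          # no noise on the last step
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x

def point_mass_eps_model(mu, betas):
    """Toy stand-in for a trained network: exact noise when all data equals mu."""
    alpha_bars = cumulative_alphas(betas)
    def eps_model(x_t, t):
        a_bar = alpha_bars[t - 1]
        return [(x - math.sqrt(a_bar) * m) / math.sqrt(1.0 - a_bar)
                for x, m in zip(x_t, mu)]
    return eps_model

betas = [1e-4 + i * (0.02 - 1e-4) / 99 for i in range(100)]   # assumed schedule
mu = [0.5, -1.0, 2.0]
sample = ddpm_sample(point_mass_eps_model(mu, betas), betas, dim=3,
                     rng=random.Random(0))
print(sample)   # lands on mu up to floating-point error
```

The loop calls the noise predictor once per step, which is why diffusion sampling is much slower than a GAN's single forward pass.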

3. Comparison

Feature	GAN	Diffusion
Training Stability	Less stable	More stable
Generation Speed	Fast (one forward pass)	Slow (many denoising steps)
Image Quality	Good	Excellent
Diversity	Lower	Higher
Compute Cost	Lower	Higher

4. Summary

Need fast generation → Use GAN
Need highest quality → Use Diffusion
