GAN vs Diffusion Models

Generative Adversarial Networks (GANs) and diffusion models are two families of generative models widely used for image generation. They differ in how they are trained and in how they produce a sample.

1. Generative Adversarial Networks (GAN)

GANs consist of two neural networks:

  • Generator (G) – generates fake images
  • Discriminator (D) – distinguishes real vs fake images

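The two networks are trained against each other in a minimax game: D tries to maximize the value function, while G tries to minimize it.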

GAN Objective Function

\[ \min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log (1 - D(G(z)))] \]

Where:

  • \(x\) = real data sample
  • \(z\) = random noise vector
  • \(G(z)\) = generated image
  • \(D(x)\) = discriminator's estimated probability that \(x\) is real

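Generator Loss

Taken literally, the minimax objective has the generator minimize \(\mathbb{E}_{z \sim p_z(z)}[\log (1 - D(G(z)))]\). In practice the generator is usually trained with the non-saturating variant, which maximizes \(\log D(G(z))\) instead and gives stronger gradients early in training:

\[ L_G = -\mathbb{E}_{z \sim p_z(z)}[\log D(G(z))] \]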

Discriminator Loss

\[ L_D = -\mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] - \mathbb{E}_{z \sim p_z(z)}[\log (1-D(G(z)))] \]

GANs learn by making the generator produce images that the discriminator cannot distinguish from real images.
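
The two losses above can be evaluated directly from discriminator outputs. The following is a minimal sketch in plain Python, assuming the discriminator has already produced probabilities in (0, 1); the function names and the sample values are made up for illustration and stand in for what a real D and G would produce.

```python
import math

def discriminator_loss(d_real, d_fake):
    """L_D: mean of -log D(x) over real samples plus mean of -log(1 - D(G(z))) over fakes."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """L_G = -mean(log D(G(z))): small when D rates the fakes as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (illustrative values, not from a trained model).
d_real = [0.9, 0.8, 0.95]   # D(x) on real images
d_fake = [0.1, 0.3, 0.2]    # D(G(z)) on generated images

print("L_D =", discriminator_loss(d_real, d_fake))
print("L_G =", generator_loss(d_fake))
```

In this example the discriminator is confident and mostly right, so L_D is small and L_G is large. If D outputs 0.5 for every input, L_D equals 2 log 2 and L_G equals log 2.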

When GANs Are Useful

  • Fast generation (a sample takes a single forward pass through G)
  • Image-to-image translation
  • Low-compute environments
  • Real-time applications

2. Diffusion Models

Diffusion models are trained by gradually adding noise to data and learning to reverse that process. New images are then generated by running the learned reverse process starting from noise.

Forward Diffusion Process

Noise is added to data step-by-step:

\[ q(x_t|x_{t-1}) = \mathcal{N} ( x_t; \sqrt{1-\beta_t}x_{t-1}, \beta_t I ) \]

Where:

  • \(x_t\) = noisy image at time step \(t\)
  • \(\beta_t\) = variance of the noise added at step \(t\), set by the noise schedule

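With \(\alpha_t = 1 - \beta_t\) and \(\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s\), \(x_t\) can be sampled directly from \(x_0\) in closed form: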

\[ q(x_t|x_0) = \mathcal{N} ( x_t; \sqrt{\bar{\alpha}_t}x_0, (1-\bar{\alpha}_t)I ) \]
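
The closed form means a noisy sample at any step can be drawn in one shot, without looping over earlier steps. Below is a minimal sketch in plain Python. It assumes a linear noise schedule with illustrative endpoints (1e-4 to 0.02) and a toy four-value "image"; neither comes from the text above, and the function names are hypothetical.

```python
import math
import random

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Assumed linear schedule for beta_1..beta_T (endpoints are illustrative)."""
    return [beta_start + (beta_end - beta_start) * i / (T - 1) for i in range(T)]

def cumulative_alpha_bars(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0) directly; t is 1-indexed. Returns (x_t, eps)."""
    a_bar = alpha_bars[t - 1]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
           for x, e in zip(x0, eps)]
    return x_t, eps

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alpha_bars(betas)

rng = random.Random(0)
x0 = [1.0, -0.5, 0.25, 0.0]          # toy "image" with four pixel values
x_t, eps = q_sample(x0, 500, alpha_bars, rng)

print("alpha_bar at t=1:   ", alpha_bars[0])
print("alpha_bar at t=1000:", alpha_bars[-1])
print("x_500:", x_t)
```

Under this schedule \(\bar{\alpha}_t\) starts near 1 and decays toward 0, so early steps stay close to \(x_0\) and late steps are almost pure noise.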

Reverse Diffusion Process

The model learns to remove noise:

\[ p_\theta(x_{t-1}|x_t) = \mathcal{N} ( x_{t-1}; \mu_\theta(x_t,t), \Sigma_\theta(x_t,t) ) \]

Training Objective

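The network \(\epsilon_\theta\) is trained to predict the noise \(\epsilon \sim \mathcal{N}(0, I)\) that was mixed into \(x_0\) to produce \(x_t\):

\[ L = \mathbb{E}_{t,x_0,\epsilon} \left[ \left\| \epsilon - \epsilon_\theta(x_t,t) \right\|^2 \right] \]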

Where:

  • \(\epsilon\) = true noise
  • \(\epsilon_\theta\) = predicted noise
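
To make the objective concrete, the sketch below estimates the loss for a single \(x_0\) by Monte Carlo: pick a random step, add noise, and measure the squared error of the prediction. It is plain Python under stated assumptions: a linear noise schedule with illustrative endpoints, and a hypothetical `zero_model` that stands in for an untrained network. A real \(\epsilon_\theta\) would be a neural network trained by gradient descent on this quantity.

```python
import math
import random

def make_alpha_bars(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear schedule."""
    alpha_bars, running = [], 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / (T - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def estimate_loss(x0, alpha_bars, eps_model, rng, num_draws=1000):
    """Monte Carlo estimate of E_{t, eps} ||eps - eps_model(x_t, t)||^2 for one x0."""
    total = 0.0
    for _ in range(num_draws):
        t = rng.randint(1, len(alpha_bars))        # uniform time step, 1-indexed
        a_bar = alpha_bars[t - 1]
        eps = [rng.gauss(0.0, 1.0) for _ in x0]    # true noise
        x_t = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
               for x, e in zip(x0, eps)]           # noisy input to the model
        eps_pred = eps_model(x_t, t)
        total += sum((e - p) ** 2 for e, p in zip(eps, eps_pred))
    return total / num_draws

def zero_model(x_t, t):
    """Hypothetical untrained stand-in: always predicts zero noise."""
    return [0.0] * len(x_t)

alpha_bars = make_alpha_bars(100)
x0 = [1.0, -0.5, 0.25, 0.0]
loss = estimate_loss(x0, alpha_bars, zero_model, random.Random(0))
print("loss of the zero predictor:", loss)
```

The zero predictor's loss lands near the data dimension (here 4), because the expected squared norm of standard Gaussian noise equals the number of components. A model that recovers \(\epsilon\) exactly would drive the loss to zero.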

Generation Process

Image generation starts from pure noise:

\[ x_T \sim \mathcal{N}(0,I) \]

The learned reverse step \(p_\theta(x_{t-1}|x_t)\) is then applied for \(t = T, \dots, 1\), removing a little noise each time until \(x_0\) is produced.
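
The generation loop can be sketched as follows. The text above does not give a formula for \(\mu_\theta\) or \(\Sigma_\theta\), so this sketch assumes the common DDPM-style choice: \(\mu_\theta = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}} \epsilon_\theta(x_t,t) \right)\) with fixed variance \(\beta_t I\). The linear schedule and the "oracle" noise predictor are also assumptions; the oracle is a hypothetical stand-in for a trained network and is exact only for a dataset containing the single point `target`.

```python
import math
import random

def make_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Assumed linear beta schedule and its cumulative alpha_bar values."""
    betas = [beta_start + (beta_end - beta_start) * i / (T - 1) for i in range(T)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars

def sample(eps_model, dim, betas, alpha_bars, rng):
    """Start from x_T ~ N(0, I) and apply the reverse step for t = T, ..., 1."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in range(len(betas), 0, -1):
        beta, a_bar = betas[t - 1], alpha_bars[t - 1]
        eps_hat = eps_model(x, t)
        coef = beta / math.sqrt(1.0 - a_bar)
        # Assumed DDPM-style mean; alpha_t = 1 - beta_t.
        mean = [(xi - coef * e) / math.sqrt(1.0 - beta) for xi, e in zip(x, eps_hat)]
        if t > 1:
            sigma = math.sqrt(beta)                 # assumed fixed variance beta_t
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean                                # no noise is added at the last step
    return x

betas, alpha_bars = make_schedule(200)
target = [0.5, -1.0]                                # the only "image" in the toy dataset

def oracle_eps_model(x_t, t):
    """Hypothetical perfect noise predictor for data concentrated at `target`."""
    a_bar = alpha_bars[t - 1]
    return [(xi - math.sqrt(a_bar) * c) / math.sqrt(1.0 - a_bar)
            for xi, c in zip(x_t, target)]

x0 = sample(oracle_eps_model, len(target), betas, alpha_bars, random.Random(0))
print("generated x_0:", x0)
```

With this oracle the loop lands on `target` up to floating-point error, which is a useful sanity check for the update rule. The loop also shows why sampling is slow: every one of the T steps needs its own call to the network.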

3. Comparison

Feature            | GAN       | Diffusion
Training Stability | Hard      | Easier
Generation Speed   | Very Fast | Slow
Image Quality      | Good      | Excellent
Diversity          | Lower     | Higher
Compute Cost       | Lower     | Higher

4. Summary

  • Need fast generation → Use GAN
  • Need highest quality → Use Diffusion