Interview Preparation
AutoEncoder · VAE · GAN
Brief notes prepared for technical interviews

These notes cover the family of latent-variable and adversarial generative models — how to compress data into representations, how to make those representations probabilistic, and how to train generators by competing against a discriminator.

AutoEncoder


Architecture

An encoder compresses the input x into a low-dimensional latent z; a decoder reconstructs \(\hat{x}\) from z. The bottleneck forces the model to learn a compact representation.

Objective

Minimize the reconstruction error, typically mean squared error:

\[\mathcal{L}_{\text{AE}} = \|x - \hat{x}\|_2^2\]

Deterministic latent representation

Each input maps to a single point in latent space. There is no prior over z, so a plain autoencoder compresses and reconstructs but does not directly support sampling new data.
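A minimal NumPy sketch of this pipeline (random untrained weights and illustrative shapes; a real model would learn We and Wd by backpropagating the MSE):

```python
import numpy as np

def relu(a):
    return np.maximum(0.0, a)

def autoencoder_forward(x, We, Wd):
    """Encode x into a latent z, then decode back to a reconstruction x_hat."""
    z = relu(x @ We)   # encoder: project through the bottleneck
    x_hat = z @ Wd     # decoder: linear reconstruction
    return z, x_hat

def mse(x, x_hat):
    return np.mean((x - x_hat) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 32))               # batch of 8 inputs, 32-dim
We = rng.normal(scale=0.1, size=(32, 4))   # 32 -> 4 bottleneck
Wd = rng.normal(scale=0.1, size=(4, 32))   # 4 -> 32
z, x_hat = autoencoder_forward(x, We, Wd)
loss = mse(x, x_hat)
```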

Variational AutoEncoder (VAE)


Encoder

Outputs the parameters \(\mu(x)\) and \(\sigma(x)\) of the approximate posterior \(q(z \mid x) = \mathcal{N}(\mu(x), \operatorname{diag}(\sigma^2(x)))\).

Decoder

Models \(p(x \mid z)\), reconstructing the input from a latent sample z.

Reparameterization Trick

Sample \(z = \mu + \sigma \odot \epsilon\) with \(\epsilon \sim \mathcal{N}(0, I)\), so the stochastic node becomes a deterministic, differentiable function of \(\mu\) and \(\sigma\), and gradients can flow through the sampling step.
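A minimal NumPy sketch of the trick (shapes chosen for illustration; log-variance parameterization as is conventional):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """z = mu + sigma * eps, eps ~ N(0, I): sampling becomes a
    deterministic, differentiable function of (mu, sigma)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))   # sigma = 1
z = reparameterize(mu, log_var, rng)
```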

Objective

Maximize the ELBO: a reconstruction term plus a KL regularizer that pulls \(q(z \mid x)\) toward the prior \(p(z)\).

ELBO (Evidence Lower BOund)

\[\log p(x) \ge \mathbb{E}_{q(z \mid x)}[\log p(x \mid z)] - \text{KL}\!\left(q(z \mid x) \,\|\, p(z)\right)\]

Maximizing the right-hand side tightens a tractable lower bound on the data log-likelihood.
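With the usual Gaussian choices (prior \(p(z) = \mathcal{N}(0, I)\), diagonal-Gaussian posterior), the KL term has a closed form, sketched here in NumPy:

```python
import numpy as np

def kl_diag_gaussian(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims:
    0.5 * sum(sigma^2 + mu^2 - 1 - log sigma^2)."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

# KL is zero exactly when q matches the prior N(0, I)
print(kl_diag_gaussian(np.zeros(3), np.zeros(3)))  # 0.0
```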

Drawbacks

Reconstructions tend to be blurry, because pixel-wise likelihoods average over modes; with a powerful decoder, the KL term can push \(q(z \mid x)\) onto the prior so the latent is ignored (posterior collapse).

Vector-Quantized VAE (VQ-VAE)

The encoder output is snapped to the nearest entry of a learned codebook, giving a discrete latent code; gradients are passed through the non-differentiable quantization step with the straight-through estimator.

Generative Model

Because the latents are discrete code indices, generation requires a separately learned prior over those indices (e.g. an autoregressive model such as PixelCNN); sampling codes from the prior and decoding them yields new data.
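The nearest-neighbor codebook lookup can be sketched in NumPy as follows (codebook size and dimensions are illustrative; training would also need the codebook and commitment losses, and the straight-through gradient):

```python
import numpy as np

def quantize(z_e, codebook):
    """Replace each encoder output vector with its nearest codebook entry."""
    # pairwise squared distances between z_e rows and codes: (N, K)
    d = np.sum((z_e[:, None, :] - codebook[None, :, :]) ** 2, axis=-1)
    idx = np.argmin(d, axis=1)
    return codebook[idx], idx

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))                  # K=16 codes, 8-dim
z_e = codebook[3] + 0.01 * rng.normal(size=(5, 8))   # encodings near code 3
z_q, idx = quantize(z_e, codebook)
```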

Evaluation Metrics

These metrics compare reconstructions or generated samples against real data, either pixel-wise (PSNR, SSIM) or in the feature space of a pretrained network (FID, IS, LPIPS).

Peak Signal-to-Noise Ratio (PSNR)

MAX is the largest possible pixel value (255 for 8-bit images). Higher PSNR means lower pixel-wise error; identical images give infinite PSNR.

\[\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2, \qquad \text{PSNR} = 10 \log_{10}\!\left(\frac{\text{MAX}^2}{\text{MSE}}\right)\]
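The formula translates directly to NumPy (example values are illustrative):

```python
import numpy as np

def psnr(x, x_hat, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher is better."""
    mse = np.mean((x.astype(np.float64) - x_hat.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(max_val**2 / mse)

x = np.full((4, 4), 100.0)
x_hat = x + 10.0              # uniform error of 10 gray levels -> MSE = 100
```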

Structural Similarity Index (SSIM)

The three factors compare luminance \(l\), contrast \(c\), and structure \(s\), computed from local means, variances, and the covariance of the two images (in practice over a sliding window, with the local scores averaged).

\[\text{SSIM}(x, \hat{x}) = l(x, \hat{x}) \cdot c(x, \hat{x}) \cdot s(x, \hat{x})\]
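A simplified sketch using global image statistics rather than a sliding window (the standard constants \(C_1 = (0.01 L)^2\), \(C_2 = (0.03 L)^2\) stabilize the ratios; real implementations average local windowed scores):

```python
import numpy as np

def ssim_global(x, y, max_val=255.0):
    """SSIM from global statistics; the l*c*s product collapses to this
    standard two-factor form."""
    C1 = (0.01 * max_val) ** 2
    C2 = (0.03 * max_val) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx**2 + my**2 + C1) * (vx + vy + C2))

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, size=(16, 16))
```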

Fréchet Inception Distance (FID)

\(\mu_r, \Sigma_r\) and \(\mu_g, \Sigma_g\) are the mean and covariance of Inception activations for real and generated samples, each set modeled as a Gaussian; lower FID is better.

\[\text{FID} = \|\mu_r - \mu_g\|^2 + \text{Tr}\!\left(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\right)\]
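A NumPy sketch of the formula given precomputed statistics (the matrix square root uses the identity \(\operatorname{Tr}((\Sigma_r \Sigma_g)^{1/2}) = \operatorname{Tr}((\Sigma_g^{1/2} \Sigma_r \Sigma_g^{1/2})^{1/2})\) so everything stays symmetric PSD; extracting the Inception activations themselves is assumed done elsewhere):

```python
import numpy as np

def sqrtm_psd(A):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def fid(mu_r, cov_r, mu_g, cov_g):
    sg_half = sqrtm_psd(cov_g)
    covmean = sqrtm_psd(sg_half @ cov_r @ sg_half)
    diff = mu_r - mu_g
    return diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean)

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
cov = A @ A.T + np.eye(4)     # any symmetric PSD covariance
mu = rng.normal(size=4)
```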

Inception Score

A high score requires each sample to be classified confidently (a sharp \(p(y \mid x)\)) while the marginal \(p(y)\) across samples remains diverse; unlike FID, it does not compare against real data at all.

\[\text{IS} = \exp\!\left(\mathbb{E}_x[\text{KL}(p(y \mid x) \,\|\, p(y))]\right), \qquad p(y) = \mathbb{E}_x[p(y \mid x)]\]
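Given a matrix of classifier outputs (one row of \(p(y \mid x_i)\) per sample, assumed to come from a pretrained Inception network), the score is a few lines of NumPy:

```python
import numpy as np

def inception_score(p_yx, eps=1e-12):
    """p_yx: (N, C) array; row i is p(y | x_i) from a pretrained classifier."""
    p_y = p_yx.mean(axis=0, keepdims=True)   # marginal p(y)
    kl = np.sum(p_yx * (np.log(p_yx + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))
```

Identical (uninformative) conditionals give the minimum score of 1; confident, evenly spread one-hot conditionals over C classes give the maximum score of C.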

Learned Perceptual Image Patch Similarity (LPIPS)

\(\hat{\phi}_l(\cdot)\) are channel-normalized activations at layer \(l\) of a pretrained network (e.g. AlexNet or VGG) and \(w_l\) are learned layer weights; lower LPIPS means the images are more perceptually similar.

\[\text{LPIPS}(x, \hat{x}) = \sum_l w_l \, \|\hat{\phi}_l(x) - \hat{\phi}_l(\hat{x})\|_2^2\]
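A sketch of the aggregation step only, assuming the per-layer feature maps have already been extracted; the real metric learns per-channel weights, whereas one scalar \(w_l\) per layer is used here for simplicity:

```python
import numpy as np

def lpips_distance(feats_x, feats_y, weights, eps=1e-10):
    """feats_*: list of (C, H, W) activations per layer; weights: one w_l each."""
    total = 0.0
    for fx, fy, w in zip(feats_x, feats_y, weights):
        # unit-normalize each spatial position across the channel axis
        nx = fx / (np.linalg.norm(fx, axis=0, keepdims=True) + eps)
        ny = fy / (np.linalg.norm(fy, axis=0, keepdims=True) + eps)
        # squared difference, averaged over spatial positions
        total += w * np.mean(np.sum((nx - ny) ** 2, axis=0))
    return total

rng = np.random.default_rng(0)
feats = [rng.normal(size=(8, 4, 4)), rng.normal(size=(16, 2, 2))]
weights = [1.0, 1.0]
```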

Generative Adversarial Network (GAN)


Adversarial training

A generator G maps noise z to samples while a discriminator D learns to separate real data from G's outputs; the two networks are optimized against each other in a minimax game.

Objective (standard)

\[\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]\]

Strengths

Sharp, high-fidelity samples; no explicit likelihood model is required.

Weaknesses

Unstable training, mode collapse, vanishing generator gradients when D becomes too strong, and no likelihood for principled evaluation.
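The two loss terms as they are usually implemented, including the non-saturating generator loss commonly used in place of \(\log(1 - D(G(z)))\) (a NumPy sketch on raw logits; batch shapes are illustrative):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def d_loss(d_real_logits, d_fake_logits):
    """Discriminator maximizes log D(x) + log(1 - D(G(z)));
    written as a loss to minimize."""
    return -np.mean(np.log(sigmoid(d_real_logits)) +
                    np.log(1.0 - sigmoid(d_fake_logits)))

def g_loss_nonsaturating(d_fake_logits):
    """-log D(G(z)): gives stronger gradients early in training
    than minimizing log(1 - D(G(z)))."""
    return -np.mean(np.log(sigmoid(d_fake_logits)))
```

At D's fixed point of maximal confusion (logits = 0, i.e. D outputs 0.5 everywhere) the discriminator loss is \(2\log 2\).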

Wasserstein GAN (WGAN)

The discriminator is replaced by a critic f that outputs an unbounded real-valued score instead of a probability. Motivation: when \(p_r\) and \(p_g\) have (nearly) disjoint supports, the JS divergence saturates and the generator's gradients vanish, while the Wasserstein-1 distance varies smoothly.

Objective (Kantorovich–Rubinstein duality)

\[\min_G \max_{\|f\|_L \le 1} \; \mathbb{E}_{x \sim p_r}[f(x)] - \mathbb{E}_{z \sim p(z)}[f(G(z))]\]

The critic must be 1-Lipschitz: the original WGAN enforces this crudely by clipping weights, while WGAN-GP instead adds a gradient penalty \(\lambda \, \mathbb{E}_{\hat{x}}\!\left[(\|\nabla_{\hat{x}} f(\hat{x})\|_2 - 1)^2\right]\) evaluated on interpolates \(\hat{x}\) between real and generated samples.
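A NumPy sketch of the WGAN-GP losses using a linear critic, chosen so the input gradient is just w and no autodiff is needed; a real implementation would evaluate the gradient at the interpolates \(\hat{x}\) via automatic differentiation:

```python
import numpy as np

def critic(x, w, b):
    """Linear critic f(x) = w . x + b; its gradient w.r.t. x is w everywhere."""
    return x @ w + b

def gradient_penalty(w, lam=10.0):
    """WGAN-GP term lam * (||grad_x f||_2 - 1)^2, exact for a linear critic."""
    return lam * (np.linalg.norm(w) - 1.0) ** 2

def wgan_losses(x_real, x_fake, w, b):
    # critic maximizes E[f(real)] - E[f(fake)]; both written as minimized losses
    c_loss = -(np.mean(critic(x_real, w, b)) - np.mean(critic(x_fake, w, b))) \
             + gradient_penalty(w)
    g_loss = -np.mean(critic(x_fake, w, b))
    return c_loss, g_loss

c_loss_val, g_loss_val = wgan_losses(np.ones((4, 2)), np.zeros((4, 2)),
                                     np.array([0.6, 0.8]), 0.0)
```

The penalty vanishes exactly when the critic's gradient norm is 1, i.e. when f is on the 1-Lipschitz boundary.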