Interview Preparation
Optimization Theory
Brief notes prepared for technical interviews
Convexity · Lipschitz Smoothness · Lagrangian Duality

These notes cover the theoretical conditions that make optimization problems tractable: when local solutions are global (convexity), how smoothness lets us bound step sizes, and how constrained problems are reformulated via duality.

Convexity

Intuition

A function is convex when the chord between any two points on its graph lies on or above the graph; there are no dips for a descent method to get trapped in.

Convex Function (Jensen's inequality)

\[f(\theta x + (1 - \theta) y) \leq \theta f(x) + (1 - \theta) f(y) \quad \forall x, y, \; \theta \in [0, 1]\]

Hessian-Based Characterization

A twice-differentiable function is convex if and only if its Hessian is positive semidefinite everywhere:

\[\nabla^2 f(x) \succeq 0 \quad \forall x\]

Why Convexity Matters in ML

For a convex objective, every local minimum is a global minimum, so any stationary point found by a gradient method is optimal. Linear regression, logistic regression, and soft-margin SVMs all have convex training objectives.

vs. non-convex problems: neural-network losses are non-convex, so guarantees weaken from "global optimum" to "stationary point," and initialization and optimizer choice matter far more.
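
Where the Hessian is available in closed form, the characterization above is easy to test numerically. A minimal NumPy sketch (the matrices A and B below are arbitrary examples):

```python
import numpy as np

def is_psd(H, tol=1e-10):
    """Hessian test: convex iff all eigenvalues are (numerically) nonnegative."""
    return np.linalg.eigvalsh(H).min() >= -tol

# f(x) = 0.5 * x^T A x has constant Hessian A, so f is convex
# exactly when A is positive semidefinite.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])  # both eigenvalues positive -> convex
B = np.diag([1.0, -1.0])    # one negative eigenvalue -> not convex (saddle)

print(is_psd(A))  # True
print(is_psd(B))  # False
```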

Lipschitz Continuity & Smoothness

Definition

\[\|f(x) - f(y)\| \leq L \, \|x - y\| \quad \forall x, y\]

Intuition

The output cannot change faster than L times the change in the input; L is a uniform bound on the function's rate of change. For example, sin is 1-Lipschitz, since |sin x − sin y| ≤ |x − y| for all x, y.

Lipschitz Continuous Gradient (Smoothness)

\[\|\nabla f(x) - \nabla f(y)\| \leq L \|x - y\|\]

Interpretation

The gradient itself changes at a bounded rate, so f admits a quadratic upper bound at every point (the descent lemma):

\[f(y) \leq f(x) + \nabla f(x)^\top (y - x) + \frac{L}{2} \|y - x\|^2\]

Why Lipschitz continuity matters in optimization

Minimizing the quadratic upper bound in y yields the gradient step with fixed step size 1/L, and the descent lemma then guarantees a per-step decrease proportional to the squared gradient norm. Without a smoothness constant, the step size has to be tuned by hand or found by line search.
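
A minimal sketch of this step-size rule on a quadratic, where the smoothness constant L is the largest Hessian eigenvalue (the particular A and b are arbitrary examples):

```python
import numpy as np

# f(x) = 0.5 * x^T A x - b^T x has gradient A x - b, which is
# Lipschitz with constant L = lambda_max(A).
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])
b = np.array([1.0, -2.0])

L = np.linalg.eigvalsh(A).max()  # smoothness constant
eta = 1.0 / L                    # fixed step size with guaranteed descent

x = np.zeros(2)
for _ in range(500):
    x = x - eta * (A @ x - b)    # gradient step

print(x)                      # ~ the minimizer
print(np.linalg.solve(A, b))  # closed-form minimizer A^{-1} b for comparison
```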

Lagrangian Duality

Constrained optimization problem

\[\min_x f(x) \quad \text{s.t.} \quad g(x) \leq 0, \quad h(x) = 0\]

Lagrangian

\[\mathcal{L}(x, \lambda, \nu) = f(x) + \sum_i \lambda_i g_i(x) + \sum_j \nu_j h_j(x), \quad \lambda_i \geq 0\]

Dual function

\[D(\lambda, \nu) = \inf_x \mathcal{L}(x, \lambda, \nu)\]
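
Maximizing over the multipliers gives the dual problem. For any λ ≥ 0, D(λ, ν) lower-bounds the primal optimal value p* (weak duality):

\[\max_{\lambda \geq 0, \, \nu} D(\lambda, \nu) \leq p^*\]

A small worked example: minimize x^2 subject to x ≥ 1 (so g(x) = 1 − x ≤ 0, with primal optimum p* = 1 at x* = 1). The Lagrangian is

\[\mathcal{L}(x, \lambda) = x^2 + \lambda (1 - x), \quad \lambda \geq 0\]

Setting the derivative in x to zero (2x − λ = 0) gives x = λ/2, so

\[D(\lambda) = \lambda - \frac{\lambda^2}{4}\]

which is maximized at λ* = 2 with D(2) = 1 = p*. Strong duality holds here because the problem is convex and strictly feasible (Slater's condition).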
