Commit 4f9252e

committed
update week 1
1 parent 12a4c45 commit 4f9252e

File tree

149 files changed

+37424
-3007
lines changed


doc/pub/week1/html/week1-bs.html
Lines changed: 215 additions & 465 deletions

doc/pub/week1/html/week1-reveal.html
Lines changed: 187 additions & 420 deletions

doc/pub/week1/html/week1-solarized.html
Lines changed: 202 additions & 429 deletions

doc/pub/week1/html/week1.html
Lines changed: 202 additions & 429 deletions

-419 KB
Binary file not shown.

doc/pub/week1/ipynb/week1.ipynb
Lines changed: 483 additions & 953 deletions

doc/pub/week1/pdf/week1.pdf
-656 KB
Binary file not shown.
Lines changed: 161 additions & 0 deletions
\documentclass[11pt,a4paper]{article}

\usepackage{amsmath,amssymb,amsfonts}
\usepackage{geometry}
\usepackage{hyperref}
\usepackage{physics}
\usepackage{graphicx}

\geometry{margin=1in}

\title{\textbf{Discriminative and Generative Deep Learning Models:\\
A Mathematical and Computational Study}}
\author{}
\date{}

\begin{document}
\maketitle

\section*{Project Overview}

Deep learning methods can broadly be divided into \emph{discriminative models}, which learn decision boundaries for labeled data, and \emph{generative models}, which learn probability distributions over data. Convolutional neural networks (CNNs) dominate modern classification tasks, while generative models such as variational autoencoders (VAEs), Boltzmann machines, and diffusion models provide probabilistic descriptions of data and enable synthesis, uncertainty quantification, and representation learning.

The goal of this project is to develop a unified mathematical and computational understanding of these model classes. Students will analyze classification and generative learning as optimization problems over high-dimensional function spaces, emphasizing probabilistic modeling, variational principles, and numerical optimization.

\section{Classification with Convolutional Neural Networks}

A convolutional neural network defines a parametric mapping
\begin{equation}
f_\theta : \mathbb{R}^{H \times W \times C} \rightarrow \{1,\dots,K\},
\end{equation}
where inputs are structured data (e.g.\ images) and outputs are class labels.

Mathematically, CNNs combine:
\begin{itemize}
\item Convolutional linear operators with local receptive fields,
\item Nonlinear activation functions,
\item Pooling and subsampling operations.
\end{itemize}

Students will analyze:
\begin{itemize}
\item Convolutions as structured sparse linear maps,
\item Translation equivariance and symmetry reduction,
\item Parameter sharing and its effect on sample complexity.
\end{itemize}

The classification problem is formulated as empirical risk minimization with cross-entropy loss,
\begin{equation}
\mathcal{L}_{\text{clf}}(\theta) = -\frac{1}{N}\sum_{i=1}^N \log p_\theta(y_i \mid x_i),
\end{equation}
and optimized using stochastic gradient descent.

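The empirical-risk-minimization loop above can be sketched in a few lines of NumPy. This is a minimal illustration only: a linear softmax classifier stands in for the CNN, and the synthetic data, array sizes, learning rate, and batch size are all arbitrary assumptions, not part of the project specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: N flattened inputs, K classes (illustrative only).
N, D, K = 200, 64, 3
X = rng.normal(size=(N, D))
y = rng.integers(0, K, size=N)

W = np.zeros((D, K))  # parameters theta of a linear softmax classifier

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def cross_entropy(W, X, y):
    # L_clf(theta) = -(1/N) * sum_i log p_theta(y_i | x_i)
    p = softmax(X @ W)
    return -np.mean(np.log(p[np.arange(len(y)), y]))

def gradient(W, X, y):
    # Gradient of the cross-entropy loss with respect to W on a mini-batch.
    p = softmax(X @ W)
    p[np.arange(len(y)), y] -= 1.0
    return X.T @ p / len(y)

initial_loss = cross_entropy(W, X, y)
for step in range(300):                   # stochastic gradient descent
    idx = rng.integers(0, N, size=32)     # random mini-batch
    W -= 0.5 * gradient(W, X[idx], y[idx])
final_loss = cross_entropy(W, X, y)
print(initial_loss, final_loss)
```

With $W = 0$ the predictive distribution is uniform, so the initial loss equals $\log K$; SGD then drives the training loss below that baseline.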
\section{Probabilistic Generative Modeling}

Generative models aim to learn an approximation $p_\theta(x)$ to an unknown data distribution. This project considers three complementary paradigms.

\subsection{Variational Autoencoders}

VAEs introduce latent variables $z$ and define
\begin{equation}
p_\theta(x,z) = p_\theta(x \mid z)p(z),
\end{equation}
with training based on variational inference.

Students will derive the evidence lower bound (ELBO),
\begin{equation}
\mathcal{L}_{\text{VAE}} = \mathbb{E}_{q_\phi(z\mid x)}[\log p_\theta(x\mid z)] - \mathrm{KL}(q_\phi(z\mid x)\|p(z)),
\end{equation}
and analyze:
\begin{itemize}
\item Encoder--decoder architectures,
\item Reparameterization trick and differentiability,
\item Trade-offs between reconstruction accuracy and latent regularization.
\end{itemize}

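As a guide for the derivation, the ELBO follows from Jensen's inequality applied to the marginal likelihood:
\begin{equation}
\log p_\theta(x) = \log \mathbb{E}_{q_\phi(z\mid x)}\!\left[\frac{p_\theta(x,z)}{q_\phi(z\mid x)}\right]
\geq \mathbb{E}_{q_\phi(z\mid x)}\!\left[\log \frac{p_\theta(x,z)}{q_\phi(z\mid x)}\right] = \mathcal{L}_{\text{VAE}},
\end{equation}
with the gap given exactly by $\mathrm{KL}(q_\phi(z\mid x)\,\|\,p_\theta(z\mid x)) \geq 0$, so maximizing the ELBO simultaneously tightens the bound and increases the likelihood.
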
\subsection{Boltzmann Machines}

Boltzmann machines define energy-based models
\begin{equation}
p_\theta(x) = \frac{1}{Z_\theta} e^{-E_\theta(x)},
\end{equation}
where $Z_\theta$ is the partition function.

The project examines:
\begin{itemize}
\item Energy landscapes and statistical mechanics analogies,
\item Maximum likelihood learning and gradient structure,
\item Approximate inference methods such as contrastive divergence.
\end{itemize}

Connections between Boltzmann machines and variational principles are emphasized.

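A useful intermediate step when analyzing the gradient structure is the standard two-term likelihood gradient of energy-based models,
\begin{equation}
\nabla_\theta \log p_\theta(x) = -\nabla_\theta E_\theta(x) + \mathbb{E}_{x' \sim p_\theta}\!\left[\nabla_\theta E_\theta(x')\right],
\end{equation}
whose second (``model'') expectation involves the intractable partition function $Z_\theta$ and motivates sampling-based approximations such as contrastive divergence.
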
\subsection{Diffusion Models}

Diffusion models define a forward noising process and a learned reverse-time denoising process. Training minimizes a denoising objective equivalent to variational inference.

Students will analyze:
\begin{itemize}
\item Stochastic differential equations and discretization,
\item Score matching and reverse-time dynamics,
\item Relations to Langevin sampling and thermodynamics.
\end{itemize}

Diffusion models are interpreted as iterative numerical solvers for sampling from complex distributions.

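The view of sampling as an iterative numerical solver can be illustrated with plain (unadjusted) Langevin dynamics for a standard Gaussian target, whose score $\nabla \log p(x) = -x$ is known in closed form. The target, step size, and chain counts below are illustrative choices, not prescribed by the project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Langevin sampling from p(x) proportional to exp(-x^2/2); score(x) = -x.
def score(x):
    return -x

eps, n_steps, n_chains = 0.05, 2000, 5000
x = rng.normal(loc=5.0, size=n_chains)  # start far from the target mode
for _ in range(n_steps):
    # Euler discretization: dx = score(x) dt + sqrt(2 dt) dW
    x = x + eps * score(x) + np.sqrt(2 * eps) * rng.normal(size=n_chains)

# After many steps the chains should equilibrate near N(0, 1).
print(x.mean(), x.std())
```

Despite starting all chains at $x \approx 5$, the empirical mean and standard deviation relax toward those of the target distribution, which is exactly the mechanism diffusion models exploit with a learned score.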
\section{Discriminative vs Generative Learning}

A central theme of the project is the comparison between discriminative and generative objectives.

Topics include:
\begin{itemize}
\item Decision boundaries vs density estimation,
\item Sample efficiency and representation learning,
\item Uncertainty quantification and out-of-distribution detection.
\end{itemize}

Students will explore how latent representations learned by generative models can support downstream classification tasks.

\section{Implementation and Experiments}

The practical component consists of implementing:
\begin{itemize}
\item A CNN for image classification,
\item At least one generative model (VAE, Boltzmann machine, or diffusion model).
\end{itemize}

Experiments will use standard labeled datasets and focus on:
\begin{itemize}
\item Classification accuracy and confusion structure,
\item Quality of generated samples,
\item Latent-space geometry and interpolation.
\end{itemize}

Computational results are interpreted through the lens of the mathematical models.

\section*{Expected Outcomes}

By completing this project, students will:
\begin{itemize}
\item Understand deep learning through probabilistic and variational principles,
\item Connect classification and generation within a unified framework,
\item Analyze neural networks as numerical optimization problems,
\item Gain insight applicable to physics-inspired machine learning, statistical inference, and scientific data analysis.
\end{itemize}

\end{document}
Lines changed: 163 additions & 0 deletions
\documentclass[11pt,a4paper]{article}

\usepackage{amsmath,amssymb,amsfonts}
\usepackage{geometry}
\usepackage{hyperref}
\usepackage{physics}
\usepackage{graphicx}

\geometry{margin=1in}

\title{\textbf{Reinforcement Learning, Generative Models, and PDEs:\\
A Mathematical Project in Control and Inference}}
\author{}
\date{}

\begin{document}
\maketitle

\section*{Project Overview}

Reinforcement learning (RL) and modern generative models are increasingly understood through the lens of partial differential equations (PDEs), stochastic processes, and variational principles. Reinforcement learning is closely related to optimal control and Hamilton--Jacobi--Bellman (HJB) equations, while generative models such as diffusion models and score-based methods are connected to Fokker--Planck equations, stochastic differential equations (SDEs), and gradient flows in probability space.

The goal of this project is to develop a unified mathematical understanding of reinforcement learning and generative learning as PDE-driven optimization problems. Students will analyze value functions, policies, and probability densities as solutions to PDEs, and compare how control and inference emerge from related mathematical structures.

\section{Reinforcement Learning and Optimal Control}

Reinforcement learning problems are commonly formulated as Markov decision processes, but in the continuous-state and continuous-time limit they are naturally described by stochastic control theory.

Consider a controlled stochastic differential equation
\begin{equation}
dX_t = f(X_t,u_t)\,dt + \sigma(X_t)\,dW_t,
\end{equation}
where $u_t$ is a control policy. The objective is to minimize the expected cost functional
\begin{equation}
J(u) = \mathbb{E}\left[ \int_0^T \ell(X_t,u_t)\,dt + g(X_T) \right].
\end{equation}

The associated value function
\begin{equation}
V(x,t) = \inf_u \mathbb{E}_{x,t} \left[ \int_t^T \ell(X_s,u_s)\,ds + g(X_T) \right]
\end{equation}
satisfies the Hamilton--Jacobi--Bellman (HJB) equation
\begin{equation}
\partial_t V + \min_u \left\{ \ell(x,u) + \nabla V \cdot f(x,u) \right\}
+ \frac{1}{2}\mathrm{Tr}\!\left(\sigma\sigma^T \nabla^2 V\right) = 0.
\end{equation}

\subsection*{Derivation Task 1}
Derive the HJB equation from the dynamic programming principle for the continuous-time control problem.

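A natural starting point for Derivation Task 1 is the dynamic programming principle over a short horizon $h$,
\begin{equation}
V(x,t) = \inf_u \mathbb{E}_{x,t}\!\left[ \int_t^{t+h} \ell(X_s,u_s)\,ds + V(X_{t+h},\,t+h) \right];
\end{equation}
expanding $V(X_{t+h},t+h)$ with It\^o's formula, dividing by $h$, and letting $h \to 0$ yields the HJB equation above.
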
\section{Deep Reinforcement Learning as PDE Approximation}

In practical reinforcement learning, the value function $V(x)$ or action-value function $Q(x,u)$ is approximated by a neural network $V_\theta(x)$. Learning corresponds to minimizing a residual of the Bellman equation,
\begin{equation}
\mathcal{L}(\theta) = \mathbb{E}\left[ \left( \mathcal{T}V_\theta - V_\theta \right)^2 \right],
\end{equation}
where $\mathcal{T}$ denotes the Bellman operator.

From a PDE perspective:
\begin{itemize}
\item Neural networks act as nonlinear trial spaces,
\item Training corresponds to a Galerkin or collocation method,
\item Instabilities arise from nonlinearity and bootstrapping.
\end{itemize}

\subsection*{Derivation Task 2}
Show that the Bellman operator is a contraction in the discounted case and explain why this property is generally lost under nonlinear function approximation.

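The contraction property in Derivation Task 2 can be checked numerically on a random finite MDP. This is a minimal NumPy sketch; the state and action counts, discount factor, and random transition kernel are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random finite MDP: S states, A actions, discount gamma < 1.
S, A, gamma = 20, 4, 0.9
P = rng.random((A, S, S))
P /= P.sum(axis=2, keepdims=True)  # row-stochastic transition kernels
R = rng.random((A, S))             # rewards R(a, s)

def bellman(V):
    # Optimal Bellman operator: (T V)(s) = max_a [ R(a,s) + gamma * E[V(s')] ]
    return (R + gamma * P @ V).max(axis=0)

V1, V2 = rng.normal(size=S), rng.normal(size=S)

# Contraction in the sup norm: ||T V1 - T V2|| <= gamma * ||V1 - V2||
lhs = np.abs(bellman(V1) - bellman(V2)).max()
rhs = gamma * np.abs(V1 - V2).max()
print(lhs, "<=", rhs)
```

The inequality holds for any pair of value functions, which is what guarantees convergence of value iteration in the tabular case; composing $\mathcal{T}$ with a nonlinear projection onto a neural-network function class is what breaks this guarantee.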
\section{Generative Models and Forward--Backward PDEs}

Generative models aim to learn a probability density $\rho(x)$ rather than an optimal control. Many modern generative models are governed by diffusion processes
\begin{equation}
dX_t = b(X_t,t)\,dt + \sqrt{2\beta^{-1}}\,dW_t,
\end{equation}
whose probability density evolves according to the Fokker--Planck equation
\begin{equation}
\partial_t \rho = -\nabla \cdot (b\rho) + \beta^{-1}\Delta \rho.
\end{equation}

Diffusion models learn the \emph{reverse-time dynamics}, which can be written as
\begin{equation}
dX_t = \left[ b(X_t,t) - 2\beta^{-1}\nabla \log \rho_t(X_t) \right]dt + \sqrt{2\beta^{-1}}\,d\bar{W}_t,
\end{equation}
where $\bar{W}_t$ denotes a Brownian motion running backward in time.

\subsection*{Derivation Task 3}
Derive the reverse-time SDE associated with the Fokker--Planck equation and explain its connection to score matching.

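A helpful step toward Derivation Task 3 is to rewrite the diffusion term of the Fokker--Planck equation using the identity $\beta^{-1}\Delta \rho = \nabla \cdot \left( \rho\,\beta^{-1}\nabla \log \rho \right)$, which gives the equivalent transport form
\begin{equation}
\partial_t \rho = -\nabla \cdot \big( (b - \beta^{-1}\nabla \log \rho)\,\rho \big)
\end{equation}
and makes the score $\nabla \log \rho_t$ appear explicitly in the effective drift.
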
\section{Variational and Entropic Perspectives}

Both reinforcement learning and generative modeling admit variational formulations.

In entropy-regularized RL, the objective becomes
\begin{equation}
J(\pi) = \mathbb{E}_\pi \left[ \sum_t r_t - \alpha \sum_t \log \pi(a_t \mid s_t) \right],
\end{equation}
leading to a modified HJB equation with a log-sum-exp structure.

Similarly, diffusion and score-based models can be interpreted as minimizing free-energy or Kullback--Leibler functionals over probability paths.

\subsection*{Derivation Task 4}
Show that entropy-regularized reinforcement learning leads to a soft HJB equation and compare it to the variational objective of diffusion models.

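In the discrete-action case, the log-sum-exp structure arises because the entropy-regularized maximization over the policy has a closed-form soft-max solution. The resulting soft value can be inspected numerically; the action values and temperatures below are illustrative assumptions only.

```python
import numpy as np

Q = np.array([1.0, 2.0, 3.5])  # illustrative action values Q(s, a)

def soft_value(Q, alpha):
    # Soft maximum alpha * log sum_a exp(Q_a / alpha): the log-sum-exp
    # structure of the entropy-regularized Bellman backup.
    return alpha * np.logaddexp.reduce(Q / alpha)

# The soft value upper-bounds the hard maximum and converges to it
# as the temperature alpha -> 0.
for alpha in (1.0, 0.1, 0.01):
    print(alpha, soft_value(Q, alpha))
```

Using `np.logaddexp` keeps the computation in log space, so small temperatures do not overflow the exponential.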
\section{Control vs Inference: A PDE Comparison}

A central comparison explored in this project is:
\begin{center}
\begin{tabular}{l l}
\textbf{Reinforcement Learning} & \textbf{Generative Learning} \\
\hline
Optimal control & Probabilistic inference \\
HJB equation & Fokker--Planck equation \\
Backward PDE & Forward--backward PDE \\
Policy optimization & Density evolution \\
\end{tabular}
\end{center}

Students will analyze how:
\begin{itemize}
\item Policies correspond to optimal drift fields,
\item Value functions resemble logarithmic transforms of densities,
\item Control and sampling differ mathematically but share PDE structure.
\end{itemize}

\subsection*{Derivation Task 5}
Demonstrate the formal correspondence between a logarithmic transformation of the value function and a density-based formulation.

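For Derivation Task 5, a classical entry point is the exponential (Hopf--Cole) transformation $\psi = e^{-\beta V}$. Under suitable structural assumptions (quadratic control cost, with noise and control acting through the same channels), it converts the nonlinear HJB equation into a \emph{linear} backward Kolmogorov equation for $\psi$, so that the value function becomes a logarithmic transform of a density-like quantity, $V = -\beta^{-1}\log\psi$.
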
\section{Computational Experiments}

The computational component consists of:
\begin{itemize}
\item Solving a low-dimensional HJB equation numerically,
\item Implementing a reinforcement learning agent approximating the same solution,
\item Training a diffusion or score-based model on a related stochastic system.
\end{itemize}

Results are compared in terms of convergence, stability, and approximation quality.

\section*{Expected Outcomes}

By completing this project, students will:
\begin{itemize}
\item Understand reinforcement learning and generative models as PDE problems,
\item Connect stochastic control, inference, and variational principles,
\item Analyze neural networks as numerical solvers,
\item Gain tools relevant to scientific machine learning, control, and physics-informed AI.
\end{itemize}

\end{document}
