- Building blocks of a Neural Network
- Neuron : The basic unit; computes a weighted sum of its inputs plus a bias, usually followed by an activation.
- Layer : A group of neurons operating on the same inputs in parallel.
- Group of layers : Multiple layers are combined into blocks, forming the repeating patterns of larger models.
- Network : The entire model. (A rough PyTorch mapping of these pieces follows this list.)
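- As a rough mapping of these pieces onto PyTorch (a sketch; the layer sizes are arbitrary, and `nn.Sequential` / `nn.Linear` are introduced just below):
import torch
from torch import nn
neuron = nn.Linear(20, 1)  # one neuron: 20 inputs -> 1 output
layer = nn.Linear(20, 256)  # one layer: 256 neurons over the same 20 inputs
block = nn.Sequential(nn.Linear(20, 256), nn.ReLU())  # a group of layers
network = nn.Sequential(block, nn.Linear(256, 10))  # the entire model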
- `torch.nn.Sequential` is a sequential container used to combine different layers into a feed-forward network.
- `nn.Linear(n, m)` is a module that creates a single-layer feed-forward network with `n` inputs and `m` outputs.
import torch
from torch import nn
net = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
X = torch.rand(2, 20)
net(X)
# [Out] >>
tensor([[ 0.0780, -0.5502, 0.0677, 0.0065, -0.1885, 0.0720, -0.2333, 0.1405,
0.2586, -0.2608],
[-0.0240, -0.3782, 0.1303, -0.0035, -0.0373, 0.1256, -0.3652, 0.1166,
0.5024, -0.1961]], grad_fn=<AddmmBackward>)
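- If you want named (rather than numbered) submodules, `nn.Sequential` also accepts an `OrderedDict`; a small sketch (the layer names here are arbitrary):
from collections import OrderedDict
named_net = nn.Sequential(OrderedDict([
    ('hidden', nn.Linear(20, 256)),
    ('act', nn.ReLU()),
    ('out', nn.Linear(256, 10)),
]))
named_net.hidden  # submodules are then accessible by name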
- Check §5.1.2 in the textbook for the coding details of `nn.Sequential`.
- The `nn.Sequential` class makes model construction easy, but not all architectures are simple daisy chains. When greater flexibility is required, we will want to define our own blocks.
- Basic neural network modules are created in PyTorch with `nn.Module`, the base class used to develop all neural network models.
- Modules can be composed together to create complex functions. (e.g., here we combine two `nn.Linear` modules to create a bigger MLP module.)
import torch
from torch import nn
from torch.nn import functional as F

class MLP(nn.Module):
    # Declare a layer with model parameters. Here, we declare two fully
    # connected layers
    def __init__(self, input_size, hidden_size, output_size):
        # Call the constructor of the `MLP` parent class `Module` to perform
        # the necessary initialization. In this way, other function arguments
        # can also be specified during class instantiation, such as the model
        # parameters, `params` (to be described later)
        super().__init__()
        self.hidden = nn.Linear(input_size, hidden_size)  # Hidden layer
        self.out = nn.Linear(hidden_size, output_size)  # Output layer

    # Define the forward propagation of the model, that is, how to return the
    # required model output based on the input `X`
    def forward(self, X):
        # Note here we use the functional version of ReLU defined in the
        # nn.functional module.
        return self.out(F.relu(self.hidden(X)))

net = MLP(input_size=20, hidden_size=256, output_size=10)
X = torch.rand(2, 20)  # mock input: 2 examples, each with 20 features
net(X)
# [Out] >>
tensor([[-0.0388, 0.2331, 0.0764, 0.0713, -0.1126, -0.0475, 0.0977, -0.0520,
0.1215, 0.3064],
[ 0.0100, 0.1782, 0.1803, -0.0358, -0.0426, 0.0025, 0.0782, -0.1170,
0.0353, 0.1929]], grad_fn=<AddmmBackward>)
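- Because `self.hidden` and `self.out` are assigned as attributes in `__init__`, `nn.Module` registers them and tracks their parameters automatically. A quick way to see this (a sketch using the `net` defined above):
for name, param in net.named_parameters():
    print(name, tuple(param.shape))
# [Out] >>
hidden.weight (256, 20)
hidden.bias (256,)
out.weight (10, 256)
out.bias (10,)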
- In the `forward` function of an `nn.Module` subclass, you define how your model runs, from input to output.
class FixedHiddenMLP(nn.Module):
    def __init__(self):
        super().__init__()
        # Random weight parameters that will not compute gradients and
        # therefore remain constant during training
        self.rand_weight = torch.rand((20, 20), requires_grad=False)
        self.linear = nn.Linear(20, 20)

    def forward(self, X):
        X = self.linear(X)
        # Use the created constant parameters, as well as the `relu` and `mm`
        # functions
        X = F.relu(torch.mm(X, self.rand_weight) + 1)
        # Reuse the fully connected layer. This is equivalent to sharing
        # parameters between two fully connected layers
        X = self.linear(X)
        # Control flow: halve X until its absolute sum is at most 1
        while X.abs().sum() > 1:
            X /= 2
        return X.sum()

net = FixedHiddenMLP()
X = torch.rand(2, 20)
net(X)
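- Note that `rand_weight` is a plain tensor, not an `nn.Parameter`, so `nn.Module` never registers it; the only trainable parameters are the 20×20 weight matrix and the 20-element bias of the single shared `nn.Linear`. A quick check (using the `net` instance above):
sum(p.numel() for p in net.parameters())
# [Out] >>
420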
- We can mix and match various ways of assembling blocks together. In the following example, we nest blocks in some creative ways.
class NestMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                                 nn.Linear(64, 32), nn.ReLU())
        self.linear = nn.Linear(32, 16)

    def forward(self, X):
        return self.linear(self.net(X))

chimera = nn.Sequential(NestMLP(), nn.Linear(16, 20), FixedHiddenMLP())
chimera(X)
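- To see what `nn.Sequential` is doing under the hood, we can sketch a minimal version ourselves (an illustration in the spirit of §5.1.2, not PyTorch's actual implementation):
class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for idx, module in enumerate(args):
            # `add_module` registers each submodule so that `nn.Module`
            # tracks its parameters
            self.add_module(str(idx), module)

    def forward(self, X):
        # Pass the input through each registered submodule in order
        for module in self.children():
            X = module(X)
        return X

net = MySequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
net(torch.rand(2, 20))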