5.1 Model Construction

  • Building blocks of a neural network
    • Neuron
    • Layer
    • Group of layers: multiple layers are combined into blocks, which form the repeating patterns of larger models.
    • Network: the entire model

1. The Sequential Block

  • torch.nn.Sequential is a sequential container used to chain layers together into a feed-forward network.

  • nn.Linear(n, m) is a module that creates a single fully connected (linear) layer with n inputs and m outputs.

import torch 
from torch import nn

net = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
X = torch.rand(2, 20)
net(X)
# [Out] >> 
        tensor([[ 0.0780, -0.5502,  0.0677,  0.0065, -0.1885,  0.0720, -0.2333,  0.1405,
                  0.2586, -0.2608],
                [-0.0240, -0.3782,  0.1303, -0.0035, -0.0373,  0.1256, -0.3652,  0.1166,
                  0.5024, -0.1961]], grad_fn=<AddmmBackward>)
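
  • nn.Sequential also supports indexing, which is handy for inspecting individual layers after construction (e.g. net[0] is the first nn.Linear):

net[0]               # Linear(in_features=20, out_features=256, bias=True)
net[0].weight.shape  # torch.Size([256, 20]), the hidden layer's weight matrix
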
  • Check §5.1.2 in the textbook for the coding details of nn.Sequential (a minimal reimplementation sketch follows the custom-block example below).

  • The nn.Sequential class makes model construction easy, but not all architectures are simple daisy chains. When greater flexibility is required, we will want to define our own blocks.

2. A Custom Block (§5.1.1)

  • Basic neural network modules are created in PyTorch by subclassing nn.Module, the base class from which all neural network models are built.

  • Modules can be composed to create more complex functions (e.g., here we combine two nn.Linear modules to build a larger MLP module).

import torch
from torch import nn
from torch.nn import functional as F

class MLP(nn.Module):
    # Declare layers with model parameters. Here, we declare two fully
    # connected layers
    def __init__(self, input_size, hidden_size, output_size):
        # Call the constructor of the `MLP` parent class `Module` to perform
        # the necessary initialization. In this way, other function arguments
        # can also be specified during class instantiation, such as the model
        # parameters, `params` (to be described later)
        super().__init__()
        self.hidden = nn.Linear(input_size, hidden_size)  # Hidden layer
        self.out = nn.Linear(hidden_size, output_size)  # Output layer

    # Define the forward propagation of the model, that is, how to return the
    # required model output based on the input `X`
    def forward(self, X):
        # Note here we use the functional version of ReLU defined in the
        # nn.functional module.
        return self.out(F.relu(self.hidden(X)))

net = MLP(input_size=20, hidden_size=256, output_size=10)
X = torch.rand(2, 20) # mock input : 2 examples, each example has 20 features.
net(X)
# [Out] >> 
        tensor([[-0.0388,  0.2331,  0.0764,  0.0713, -0.1126, -0.0475,  0.0977, -0.0520,
                  0.1215,  0.3064],
                [ 0.0100,  0.1782,  0.1803, -0.0358, -0.0426,  0.0025,  0.0782, -0.1170,
                  0.0353,  0.1929]], grad_fn=<AddmmBackward>)
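
  • As §5.1.2 shows, nn.Sequential itself can be written as a custom block: register each child module so that nn.Module tracks it, then apply the children in order in forward. A minimal sketch in that spirit (details may differ from the textbook's exact code):

class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for idx, module in enumerate(args):
            # Register each child so nn.Module tracks it (and its parameters)
            self.add_module(str(idx), module)

    def forward(self, X):
        # Apply the registered children in order
        for module in self.children():
            X = module(X)
        return X

net = MySequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
net(torch.rand(2, 20))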

3. Executing Code in the Forward Propagation Function

  • In the forward method of an nn.Module subclass, you define how the model runs, from input to output.

class FixedHiddenMLP(nn.Module):
    def __init__(self):
        super().__init__()
        # Random weight parameters that will not compute gradients and
        # therefore remain constant during training
        self.rand_weight = torch.rand((20, 20), requires_grad=False)
        self.linear = nn.Linear(20, 20)

    def forward(self, X):
        X = self.linear(X)
        # Use the created constant parameters, as well as the `relu` and `mm`
        # functions
        X = F.relu(torch.mm(X, self.rand_weight) + 1)
        # Reuse the fully connected layer. This is equivalent to two
        # fully connected layers sharing parameters
        X = self.linear(X)
        # Control flow
        while X.abs().sum() > 1:
            X /= 2
        return X.sum()

net = FixedHiddenMLP()

X = torch.rand(2, 20)
net(X)
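
  • Because rand_weight is a plain tensor rather than an nn.Parameter, it is not registered with the module; only the shared linear layer's parameters are tracked (and therefore trained):

[name for name, _ in net.named_parameters()]
# [Out] >> ['linear.weight', 'linear.bias']
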
  • We can mix and match various ways of assembling blocks. In the following example, we nest blocks in some creative ways.

class NestMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                                 nn.Linear(64, 32), nn.ReLU())
        self.linear = nn.Linear(32, 16)

    def forward(self, X):
        return self.linear(self.net(X))

chimera = nn.Sequential(NestMLP(), nn.Linear(16, 20), FixedHiddenMLP())
chimera(X)
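
  • Printing a module displays its nested structure, which is a quick way to verify how the blocks were composed:

print(chimera)  # shows the Sequential container with NestMLP, Linear, and FixedHiddenMLP nested inside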