In general, a neural network consists of an input layer, hidden layers, and an output layer. Real-world networks are nonlinear because they include activation functions that introduce nonlinearity.
In our case, the model takes a single input feature, which is passed to a linear layer with 13 outputs, followed by a Tanh activation applied to those 13 values. The result is then passed through another linear layer that reduces the 13 values to a single output.
Linear layers contain weights and biases, following the equation y = Wx + b (a weight matrix applied to the input plus a bias vector), while the tanh activation function outputs values in the range [-1, 1].
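A minimal sketch of that architecture (the variable name seq_model matches the snippets below):

```python
import torch
import torch.nn as nn

# 1 input feature -> 13 hidden units -> Tanh -> 1 output value
seq_model = nn.Sequential(
    nn.Linear(1, 13),
    nn.Tanh(),
    nn.Linear(13, 1))
```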
1. What’s happening overall
This section explains how PyTorch stores and updates the weights and biases (called parameters) of a small neural network built using nn.Sequential. A parameter is simply a number the model can learn during training, such as a weight or a bias.
2. model.parameters()
When you call model.parameters(), PyTorch collects all the weights
and biases from every layer in your model.
[param.shape for param in seq_model.parameters()]
# Output:
[torch.Size([13, 1]), torch.Size([13]), torch.Size([1, 13]), torch.Size([1])]
This means:
- First layer weights: [13, 1] (13 outputs × 1 input)
- First layer bias: [13] (one bias per hidden unit)
- Second layer weights: [1, 13] (1 output × 13 inputs)
- Second layer bias: [1]
These are the exact numbers the optimizer (like SGD or Adam) will update during training.
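For example, a minimal sketch of handing those parameters to an optimizer (the learning rate is just a placeholder value):

```python
import torch.optim as optim

# The optimizer keeps a reference to every parameter tensor the
# generator yields and updates them in place on each step() call
optimizer = optim.SGD(seq_model.parameters(), lr=1e-3)
```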
3. After backward()
When you run loss.backward(), PyTorch calculates how each parameter
should change (the gradient).
The typical sequence is:
- Compute the loss
- Call loss.backward() → get the gradients
- Call optimizer.step() → update the weights
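Putting those steps together, a minimal training step might look like this (the input x, target y, and loss function are placeholders; seq_model and optimizer come from the sketches above):

```python
x = torch.randn(20, 1)   # hypothetical input batch
y = torch.randn(20, 1)   # hypothetical targets
loss_fn = nn.MSELoss()

optimizer.zero_grad()        # clear gradients left over from the previous step
y_pred = seq_model(x)        # forward pass
loss = loss_fn(y_pred, y)    # compute the loss
loss.backward()              # fill .grad for every parameter
optimizer.step()             # update all parameters using those gradients
```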
4. named_parameters()
This method yields each parameter together with its name, which makes the parameters easier to tell apart.
for name, param in seq_model.named_parameters():
print(name, param.shape)
# Output:
0.weight torch.Size([13, 1])
0.bias torch.Size([13])
2.weight torch.Size([1, 13])
2.bias torch.Size([1])
Here 0 and 2 are the positions of the layers inside nn.Sequential; the Tanh layer at position 1 has no parameters, so it does not appear in the list.
5. Using OrderedDict for readable names
from collections import OrderedDict
seq_model = nn.Sequential(OrderedDict([
('hidden_linear', nn.Linear(1, 8)),
('hidden_activation', nn.Tanh()),
('output_linear', nn.Linear(8, 1))
]))
Now the parameter names are more descriptive:
hidden_linear.weight torch.Size([8, 1])
hidden_linear.bias torch.Size([8])
output_linear.weight torch.Size([1, 8])
output_linear.bias torch.Size([1])
6. Accessing specific parameters
seq_model.output_linear.bias
# Output:
Parameter containing:
tensor([-0.0173], requires_grad=True)
requires_grad=True means PyTorch tracks gradients for this tensor, so this bias will be updated during training.
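Since the layers were registered by name through the OrderedDict, the same attribute-style access works for any of them; a small sketch:

```python
# Submodules named via OrderedDict become attributes of the Sequential
print(seq_model.hidden_linear.weight.shape)   # torch.Size([8, 1])
print(seq_model.output_linear.bias)           # the Parameter shown above
```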
7. Checking gradients
After calling loss.backward(), you can inspect the gradient of each parameter through its .grad attribute:
seq_model.hidden_linear.weight.grad
This shows the gradient of the loss with respect to each weight in the hidden layer, i.e. the direction and size of change the optimizer will use at the next step.
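A small sketch, reusing the placeholder tensors x, y, and loss_fn from the training-step example above together with the renamed 1 → 8 → 1 model:

```python
loss = loss_fn(seq_model(x), y)
loss.backward()

print(seq_model.hidden_linear.weight.grad.shape)  # torch.Size([8, 1])
print(seq_model.hidden_linear.bias.grad)          # one gradient per hidden unit

# Gradients accumulate across backward() calls, so they are normally
# cleared before the next step (optimizer.zero_grad() or model.zero_grad())
seq_model.zero_grad()
```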
Summary Table
| Concept | Meaning |
|---|---|
| parameters() | Collects all weights and biases of the model |
| named_parameters() | Same, but includes names for easier identification |
| loss.backward() | Calculates gradients (how much each parameter should change) |
| optimizer.step() | Updates all parameters using those gradients |
| OrderedDict | Lets you name your layers instead of using numbers |
| .grad | Shows the gradient of a parameter after backpropagation |
In short: PyTorch tracks every learnable weight and bias in your model, computes their gradients when you train, and updates them using the optimizer to make the model perform better.