What’s happening inside a Convolutional Neural Network (CNN)?


In general, a Convolutional Neural Network (CNN) consists of an input layer, hidden layers, and an output layer. Real-world networks are nonlinear because their hidden layers include activation functions that introduce nonlinearity. Note that the model examined below is actually a small fully connected network rather than a convolutional one, but the same ideas about parameters apply to both.

In our case, the model takes a single input value, which is passed to a linear layer with 13 output features, followed by a 13-dimensional tanh activation layer. The output of the activation is then passed through a second linear layer that reduces those 13 features to a single output value.

Linear layers hold weights and biases and compute y = Wx + b, while the tanh activation squashes each value into the range [-1, 1].
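The model described above can be sketched directly in PyTorch (the variable names here are illustrative; only the layer sizes come from the text):

```python
import torch
import torch.nn as nn

# One input feature -> 13 hidden units -> tanh -> one output value.
seq_model = nn.Sequential(
    nn.Linear(1, 13),   # weights of shape [13, 1], bias of shape [13]
    nn.Tanh(),          # elementwise nonlinearity, no learnable parameters
    nn.Linear(13, 1),   # weights of shape [1, 13], bias of shape [1]
)

x = torch.ones(5, 1)    # a batch of 5 one-dimensional inputs
y = seq_model(x)
print(y.shape)          # torch.Size([5, 1])
```

Each sample flows through all three layers, so a batch of 5 inputs produces a batch of 5 outputs.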


1. What’s happening overall

The text explains how PyTorch stores and updates the weights and biases (called parameters) of a small neural network built using nn.Sequential.

A “parameter” is just a number that the model can learn, like weights and biases.


2. model.parameters()

When you call model.parameters(), PyTorch collects all the weights and biases from every layer in your model.


[param.shape for param in seq_model.parameters()]
# Output:
[torch.Size([13, 1]), torch.Size([13]), torch.Size([1, 13]), torch.Size([1])]
  

This means:

  • First layer weights: [13, 1]
  • First layer bias: [13]
  • Second layer weights: [1, 13]
  • Second layer bias: [1]

These are the exact numbers the optimizer (like SGD or Adam) will update during training.


3. After backward()

When you run loss.backward(), PyTorch computes the gradient of the loss with respect to each parameter, i.e. how much a small change in that parameter would change the loss.

  1. Call optimizer.zero_grad() → clear gradients left over from the previous step
  2. Compute loss
  3. Call loss.backward() → get gradients
  4. Call optimizer.step() → update weights
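These steps can be sketched as one training iteration. The data, loss function, and learning rate here are illustrative, not from the text:

```python
import torch
import torch.nn as nn

# Same architecture as before: 1 input -> 13 hidden units -> 1 output.
seq_model = nn.Sequential(nn.Linear(1, 13), nn.Tanh(), nn.Linear(13, 1))
optimizer = torch.optim.SGD(seq_model.parameters(), lr=1e-2)

x = torch.randn(8, 1)        # toy inputs
target = 2.0 * x + 1.0       # toy regression target

optimizer.zero_grad()                                  # clear old gradients
loss = nn.functional.mse_loss(seq_model(x), target)    # compute loss
loss.backward()                                        # fill each parameter's .grad
optimizer.step()                                       # update parameters using those gradients
```

Without zero_grad(), gradients from successive backward() calls would accumulate rather than be replaced.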

4. named_parameters()

This function yields the name of each parameter along with the parameter tensor itself.


for name, param in seq_model.named_parameters():
    print(name, param.shape)

# Output:
0.weight torch.Size([13, 1])
0.bias torch.Size([13])
2.weight torch.Size([1, 13])
2.bias torch.Size([1])
  

Here 0 and 2 are the indices of the layers inside nn.Sequential. Index 1 belongs to the Tanh layer, which has no learnable parameters and therefore does not appear.
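A quick sketch confirms that the activation layer contributes no entries:

```python
import torch.nn as nn

seq_model = nn.Sequential(nn.Linear(1, 13), nn.Tanh(), nn.Linear(13, 1))

# Only the two Linear layers (indices 0 and 2) have parameters,
# so index 1 (the Tanh layer) never shows up in the names.
names = [name for name, _ in seq_model.named_parameters()]
print(names)   # ['0.weight', '0.bias', '2.weight', '2.bias']
```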


5. Using OrderedDict for readable names


from collections import OrderedDict
seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1))
]))
  

Now the parameters look more descriptive:


hidden_linear.weight torch.Size([8, 1])
hidden_linear.bias torch.Size([8])
output_linear.weight torch.Size([1, 8])
output_linear.bias torch.Size([1])

  

6. Accessing specific parameters


seq_model.output_linear.bias

# Output:
Parameter containing:
tensor([-0.0173], requires_grad=True)
  

The requires_grad=True flag means PyTorch tracks gradients for this tensor, so this bias will be updated during training.


7. Checking gradients

After a backward pass, you can inspect the gradient stored for each parameter through its .grad attribute:


seq_model.hidden_linear.weight.grad
  

This shows the gradient of the loss with respect to each weight in the hidden layer, i.e. the direction and magnitude the optimizer uses for the next update. Note that .grad is the computed gradient, not the amount the weight actually changed, and it is None until the first backward() call.
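A minimal sketch of inspecting a gradient, using the named model from section 5 (the loss here is illustrative):

```python
import torch
import torch.nn as nn
from collections import OrderedDict

seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1)),
]))

# Before any backward pass, .grad is None.
print(seq_model.hidden_linear.weight.grad)   # None

loss = seq_model(torch.ones(1, 1)).sum()
loss.backward()

# After backward(), .grad holds one gradient entry per weight,
# so it has the same shape as the weight tensor itself.
print(seq_model.hidden_linear.weight.grad.shape)   # torch.Size([8, 1])
```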


Summary Table

  • parameters() — collects all of the model’s weights and biases
  • named_parameters() — the same, but pairs each parameter with its name for easier identification
  • loss.backward() — computes gradients (how much each parameter should change)
  • optimizer.step() — updates all parameters using those gradients
  • OrderedDict — lets you name your layers instead of relying on numeric indices
  • .grad — holds the gradient of a parameter after backpropagation

In short: PyTorch tracks every learnable weight and bias in your model, computes their gradients when you train, and updates them using the optimizer to make the model perform better.


