PyTorch RuntimeError: Stack expects each tensor to be equal size
Error Message:
RuntimeError: stack expects each tensor to be equal size, but got [1, 64, 64, 3] at entry 0 and [1, 64, 64] at entry 1
What Happened?
This error occurs during DataLoader batch collation when PyTorch tries to stack multiple tensors into a single batch. In PyTorch, all tensors in a batch must have the same shape.
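The collation failure can be reproduced directly with torch.stack, which is what the default collate function calls under the hood (the shapes below mirror the error message above; this is a minimal sketch, not the original code):

```python
import torch

# Two "images" with mismatched shapes, as in the error message
rgb = torch.zeros(1, 64, 64, 3)   # channel dim plus a stray trailing RGB axis
gray = torch.zeros(1, 64, 64)     # single-channel grayscale

try:
    # default_collate does essentially this for every batch
    torch.stack([rgb, gray])
except RuntimeError as e:
    print(e)
```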
Understanding the Tensor Shapes
- [1, 64, 64]: a single-channel (grayscale) image, 64 pixels high and 64 pixels wide. The leading 1 is the channel dimension.
- [1, 64, 64, 3]: a multi-channel image (likely RGB), where the trailing 3 holds the three color channels (Red, Green, Blue).
Since PyTorch expects image tensors in [channels, height, width] layout, a proper single-channel image has shape [1, 64, 64]. A 4D tensor like [1, 64, 64, 3] means the channel dimension was added on top of an RGB array that still carries its trailing color axis, so the tensors in the batch no longer share a shape and the DataLoader fails.
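The layout difference is easy to see on toy tensors. A minimal sketch (variable names are illustrative): converting a height-width-channels array to PyTorch's channels-first convention, either keeping the 3 RGB channels or collapsing them to 1 grayscale channel.

```python
import torch

rgb_hwc = torch.zeros(64, 64, 3)      # [H, W, C] layout, as many loaders return it
chw = rgb_hwc.permute(2, 0, 1)        # reorder to PyTorch's [C, H, W]
gray = rgb_hwc.mean(dim=-1)[None]     # average RGB, then add the channel dim

print(chw.shape)   # expect [3, 64, 64]
print(gray.shape)  # expect [1, 64, 64]
```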
Why Do Some Images Have 3 Channels?
Even though PET/CT images are usually grayscale, some DICOMs may have multiple channels due to:
- Scanner export settings
- Color overlays or RGB annotations
How to Fix It?
Convert any RGB image to single-channel grayscale inside the dataset, before adding the channel dimension and returning the tensor:

if img.ndim == 3:
    img = img.mean(axis=-1)  # average the RGB channels into one grayscale channel
After this, all images will have the shape [1, 64, 64], and the DataLoader can stack them correctly.
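Putting the fix in context, here is a minimal sketch of a Dataset whose __getitem__ applies the conversion (the class name and the in-memory arrays are illustrative stand-ins for real DICOM loading):

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class SliceDataset(Dataset):
    """Toy dataset mixing grayscale [64, 64] and RGB [64, 64, 3] slices."""

    def __init__(self):
        self.slices = [np.zeros((64, 64)), np.zeros((64, 64, 3))]

    def __len__(self):
        return len(self.slices)

    def __getitem__(self, idx):
        img = self.slices[idx].astype(np.float32)
        if img.ndim == 3:                   # RGB slice: collapse color channels
            img = img.mean(axis=-1)
        return torch.from_numpy(img)[None]  # add channel dim -> [1, 64, 64]

# Both items now share shape [1, 64, 64], so default collation succeeds
batch = next(iter(DataLoader(SliceDataset(), batch_size=2)))
print(batch.shape)  # expect [2, 1, 64, 64]
```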
Summary
- PyTorch DataLoader requires all tensors in a batch to have the same shape.
- The error occurs when some images are RGB (multi-channel) and some are grayscale (single-channel).
- Always ensure all images are converted to a consistent shape before batching.