Using Neural Networks for Tabular and Image Data: A Practical Guide
Neural networks are versatile models that can learn patterns from a wide variety of data types. In this article, we explore how to feed tabular data and image data into neural networks for classification tasks, with practical examples in PyTorch.
1. Tabular Data: Features and Targets
Tabular data is structured in rows and columns, where:
- Features: Input variables used by the model to learn patterns (e.g., age, salary, scores).
- Target: The output variable the model predicts (e.g., personality type, class label).
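To make the features/target split concrete, here is a toy three-row table (the column names and values below are illustrative, not from the actual dataset):

```python
import pandas as pd

# Toy illustration: three feature columns and one target column
df = pd.DataFrame({
    "age": [25, 32, 41],
    "salary": [50000, 64000, 72000],
    "score": [0.8, 0.5, 0.9],
    "personality_type": ["Introvert", "Extrovert", "Ambivert"],
})

X = df.drop("personality_type", axis=1).values  # features: everything except the target
y = df["personality_type"].values               # target: the column to predict
```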
Example: Personality Classification
Suppose we have a synthetic dataset with 29 features representing various attributes of individuals and a target column personality_type.
import pandas as pd

my_df = pd.read_csv('personality_synthetic_dataset.csv')

# Split features and target
X = my_df.drop('personality_type', axis=1).values  # 29 features
y = my_df['personality_type'].values               # target
Data Preparation
Before feeding data into a neural network:
- Train-test split: To evaluate model performance.
- Normalization (here, standardization): rescales each feature to zero mean and unit variance so all features are on a similar scale.
- Label encoding: Converts categorical targets into numeric form.
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=10
)

# Convert features to tensors
X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)

# Encode string labels as integers (fit on the training set only)
label_encoder = LabelEncoder()
y_train = torch.LongTensor(label_encoder.fit_transform(y_train))
y_test = torch.LongTensor(label_encoder.transform(y_test))

# Standardize with training-set statistics to avoid data leakage
X_train_mean = X_train.mean(dim=0)
X_train_std = X_train.std(dim=0)
X_train = (X_train - X_train_mean) / X_train_std
X_test = (X_test - X_train_mean) / X_train_std
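A quick sanity check on the standardization: after dividing by the training statistics, each feature column should have (approximately) zero mean and unit standard deviation. Here is a self-contained sketch using random data in place of the real features:

```python
import torch

# Toy stand-in for the training features (any float tensor works)
X_train = torch.rand(100, 29) * 50 + 10

# Standardize using training statistics, exactly as above
mean, std = X_train.mean(dim=0), X_train.std(dim=0)
X_train_scaled = (X_train - mean) / std

# Each column now has mean ~0 and std ~1 (up to floating-point error)
print(X_train_scaled.mean(dim=0).abs().max())
print(X_train_scaled.std(dim=0))
```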
Defining the Neural Network
A simple feedforward network with multiple fully connected layers can learn patterns from tabular data:
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self, in_features=29, h1=64, h2=32, h3=16, out_features=3):
        super().__init__()
        self.fc1 = nn.Linear(in_features, h1)
        self.fc2 = nn.Linear(h1, h2)
        self.fc3 = nn.Linear(h2, h3)
        self.fc4 = nn.Linear(h3, out_features)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        return self.fc4(x)
Training involves defining a loss function (e.g., cross-entropy for classification) and an optimizer (e.g., Adam):
model = Model()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
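A minimal training loop for this setup might look like the following. To keep the sketch self-contained it uses random stand-in data and a small `nn.Sequential` in place of the `Model` class; the full-batch updates and epoch count are illustrative choices, not requirements:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins for the prepared training tensors (29 features, 3 classes)
X_train = torch.randn(200, 29)
y_train = torch.randint(0, 3, (200,))

# Stand-in for the Model class defined above
model = nn.Sequential(nn.Linear(29, 32), nn.ReLU(), nn.Linear(32, 3))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

losses = []
for epoch in range(100):
    optimizer.zero_grad()                      # reset gradients from the previous step
    loss = criterion(model(X_train), y_train)  # forward pass + cross-entropy loss
    loss.backward()                            # backpropagate
    optimizer.step()                           # update the weights
    losses.append(loss.item())
```

For larger datasets the same loop would iterate over mini-batches from a DataLoader rather than the full tensor.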
After training, the network can predict the personality type of unseen individuals.
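Prediction is the argmax over the output logits, and `LabelEncoder.inverse_transform` maps the numeric index back to the original label. A sketch, using an untrained stand-in model and a made-up label set:

```python
import torch
import torch.nn as nn
from sklearn.preprocessing import LabelEncoder

# Toy label set standing in for the encoder fitted earlier
label_encoder = LabelEncoder()
label_encoder.fit(["Ambivert", "Extrovert", "Introvert"])

model = nn.Sequential(nn.Linear(29, 3))  # stand-in for the trained Model
model.eval()

new_person = torch.randn(1, 29)  # one standardized feature row
with torch.no_grad():            # no gradients needed at inference time
    logits = model(new_person)

pred_idx = logits.argmax(dim=1).item()
pred_label = label_encoder.inverse_transform([pred_idx])[0]
```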
2. Image Data: Classes and Pretrained Models
Images are high-dimensional data, and convolutional neural networks (CNNs) are the standard choice for extracting spatial patterns.
Dataset Structure
For PyTorch, image datasets are often organized as:
dataset/
    class_1/
        img1.jpg
        img2.jpg
    class_2/
        img1.jpg
        img2.jpg
- Each folder represents a class.
- Images are fed to the network in batches using a DataLoader.
from torch.utils.data import DataLoader
from torchvision import datasets

# `transform` is a torchvision preprocessing pipeline defined beforehand
dataset = datasets.ImageFolder(root="dataset", transform=transform)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
label_names = dataset.classes
num_classes = len(label_names)
Using Pretrained Models
Pretrained models like ResNet18 can accelerate training:
import torch
import torch.nn as nn
from torchvision import models
from torchvision.models import ResNet18_Weights

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = models.resnet18(weights=ResNet18_Weights.DEFAULT)

# Freeze all layers except the final fully connected layer
for param in model.parameters():
    param.requires_grad = False

# Replace the head so its output size matches our classes
model.fc = nn.Linear(model.fc.in_features, num_classes)
for param in model.fc.parameters():
    param.requires_grad = True

model.to(device)
Training Loop
Only the last layer is optimized to adapt the pretrained model to our dataset:
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)

for epoch in range(5):
    running_loss = 0.0
    model.train()
    for images, labels in dataloader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch+1}, Loss: {running_loss / len(dataloader):.4f}")
After training, the model can classify unseen images into the learned categories; its accuracy should be measured on a held-out test set rather than assumed.
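An evaluation pass is a loop over a held-out loader with gradients disabled. This sketch uses a toy linear model and random tensors in place of the fine-tuned ResNet and real images; the `evaluate` helper name is illustrative:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, dataloader, device="cpu"):
    """Return classification accuracy of `model` over `dataloader`."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for images, labels in dataloader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)  # predicted class per sample
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

# Toy stand-ins: flattened "images" with 2 classes, linear classifier
torch.manual_seed(0)
images = torch.randn(16, 3 * 8 * 8)
labels = torch.randint(0, 2, (16,))
loader = DataLoader(TensorDataset(images, labels), batch_size=8)
acc = evaluate(nn.Linear(3 * 8 * 8, 2), loader)
```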
3. Summary
- Tabular Data:
  - Identify features and target.
  - Normalize features and encode labels.
  - Use feedforward networks for classification.
- Image Data:
  - Organize images by class folders.
  - Use pretrained CNNs to leverage transfer learning.
  - Replace the final layer to match the number of classes.
- Practical Use:
  - The code can be applied to any tabular dataset for classification.
  - Image classification can be performed on datasets ranging from medical images to object recognition.
This framework shows how different data types—structured vs. unstructured—can be processed for neural networks, enabling practical machine learning applications.