Getting Started with Image Classification Using PyTorch and the CIFAR-10 Dataset: Boosting Accuracy with Improved Models
We will first train a model on the original CIFAR-10 dataset without making any changes to the data. The code below is implemented and explained step by step.
First, you will need to import the necessary libraries and set some parameters for the training process:
import torch
import torchvision
import torchvision.transforms as transforms
batch_size = 128
num_epochs = 10
learning_rate = 0.001
Next, you will need to load the CIFAR-10 dataset and apply any necessary preprocessing:
# Load the CIFAR-10 dataset
train_dataset = torchvision.datasets.CIFAR10(root='path/to/data', train=True,
                                             download=True, transform=transforms.ToTensor())
test_dataset = torchvision.datasets.CIFAR10(root='path/to/data', train=False,
                                            download=True, transform=transforms.ToTensor())

# Create data loaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size,
                                           shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size,
                                          shuffle=False)
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to path/to/data/cifar-10-python.tar.gz
Extracting path/to/data/cifar-10-python.tar.gz to path/to/data
Files already downloaded and verified
Then you will need to define your model, in this case a convolutional neural network:
import torch.nn as nn

# Define the model: two convolutional blocks followed by a single linear classifier
class CIFAR10Model(nn.Module):
    def __init__(self):
        super(CIFAR10Model, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(2, 2)  # halves the spatial resolution
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(128 * 8 * 8, 10)  # 32x32 input pooled twice -> 8x8 feature maps, 10 classes

    def forward(self, x):
        x = self.pool(nn.functional.relu(self.conv1(x)))  # 3x32x32 -> 64x16x16
        x = self.pool(nn.functional.relu(self.conv2(x)))  # 64x16x16 -> 128x8x8
        x = x.view(-1, 128 * 8 * 8)                       # flatten for the linear layer
        x = self.fc1(x)
        return x

# Create an instance of the model
model = CIFAR10Model()
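Before moving on, you can optionally sanity-check the architecture by passing a random batch through it. This is just a quick verification sketch; the dummy tensor is an illustrative assumption, not part of the training pipeline:
# Quick shape check with a fake batch of four 3x32x32 images (illustrative only)
dummy_batch = torch.randn(4, 3, 32, 32)
print(model(dummy_batch).shape)  # expected: torch.Size([4, 10]) - one logit per CIFAR-10 class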
Then you will need to define a loss function and an optimizer:
# Define a loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
Now you can train your model:
# Train the model
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Print the loss of the last batch in each epoch
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')
Epoch [1/10], Loss: 1.2209
Epoch [2/10], Loss: 1.0396
Epoch [3/10], Loss: 0.7750
Epoch [4/10], Loss: 0.7497
Epoch [5/10], Loss: 0.8293
Epoch [6/10], Loss: 0.9280
Epoch [7/10], Loss: 0.6827
Epoch [8/10], Loss: 0.7405
Epoch [9/10], Loss: 0.6626
Epoch [10/10], Loss: 0.7509
Once the model is trained, you can evaluate its performance on the test dataset:
# Test the model
model.eval()  # evaluation mode; has no effect on this model, but matters once dropout/batch norm are added
with torch.no_grad():  # no gradients needed for evaluation
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)  # class with the highest score
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy of the model on the test images: {100 * correct / total}%')
Accuracy of the model on the test images: 71.3%
You can increase the number of epochs to give the model more training iterations and improve accuracy. You can also experiment with other techniques that improve model accuracy, such as:
Increase the amount of training data: More data can help the model learn better representations of the images.
Use data augmentation: Techniques such as flipping, rotating, and cropping images can help the model become more robust to variations in the input (see the augmentation sketch after these suggestions).
Use a pre-trained model: Transfer learning can leverage the knowledge learned by a model trained on a large dataset to improve performance on a different task (see the transfer-learning sketch below).
Use a more powerful model architecture: Models such as ResNet and Inception can improve the performance of image classification tasks.
Fine-tune the model: Once you have a model that is working well, you can fine-tune it by training it on a smaller dataset or by adjusting the model's hyperparameters.
Use techniques like batch normalization, dropout, and early stopping to improve the performance of the model (see the regularized model sketch below).
Use a larger image resolution, which gives the model more information to work with.
Finally, use a good-quality dataset and make sure the model generalizes well to unseen data.
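As an illustration of the data augmentation suggestion above, here is a minimal sketch of how the training transform could be extended with torchvision. The specific augmentations and the normalization statistics are illustrative assumptions, not part of the original walkthrough:
# Augmented training transform (a sketch; the flip/crop choices and normalization stats are assumptions)
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),       # randomly mirror images left-right
    transforms.RandomCrop(32, padding=4),    # random 32x32 crop from a zero-padded image
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),   # commonly quoted CIFAR-10 channel means
                         (0.2470, 0.2435, 0.2616)),  # and standard deviations
])

# Keep the test transform deterministic
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2470, 0.2435, 0.2616)),
])

train_dataset = torchvision.datasets.CIFAR10(root='path/to/data', train=True,
                                             download=True, transform=train_transform)
test_dataset = torchvision.datasets.CIFAR10(root='path/to/data', train=False,
                                            download=True, transform=test_transform)
Augmentation is applied only to the training set, so the reported test accuracy still reflects performance on unmodified images.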
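The pre-trained-model and stronger-architecture suggestions could look roughly like the sketch below, which loads a torchvision ResNet-18 and replaces its final layer for 10 classes. The frozen backbone and the weights argument are illustrative assumptions (the weights API needs torchvision 0.13 or newer; older versions use pretrained=True), not the article's prescribed method:
import torchvision.models as models
import torch.nn as nn

# Load a ResNet-18 pre-trained on ImageNet (assumes torchvision >= 0.13 for the weights argument)
resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Replace the final fully connected layer so it outputs 10 CIFAR-10 classes
resnet.fc = nn.Linear(resnet.fc.in_features, 10)

# Optionally freeze the pre-trained backbone and train only the new classifier head
for name, param in resnet.named_parameters():
    if not name.startswith('fc'):
        param.requires_grad = False

optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, resnet.parameters()),
                             lr=learning_rate)
ImageNet-pretrained weights were learned on much larger images, so resizing the CIFAR-10 inputs (for example with transforms.Resize(224)) usually helps when using them.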
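Batch normalization and dropout can be added directly to the model defined earlier; the placement and dropout probability below are one possible choice, sketched for illustration. Early stopping, by contrast, is a change to the training loop: track the loss or accuracy on a held-out validation set and stop training when it stops improving.
import torch.nn as nn

# Variant of CIFAR10Model with batch normalization and dropout (illustrative placement)
class CIFAR10ModelRegularized(nn.Module):
    def __init__(self):
        super(CIFAR10ModelRegularized, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(64)      # normalize activations after the first conv layer
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(128)
        self.pool = nn.MaxPool2d(2, 2)
        self.dropout = nn.Dropout(0.5)     # randomly zero half the features to reduce overfitting
        self.fc1 = nn.Linear(128 * 8 * 8, 10)

    def forward(self, x):
        x = self.pool(nn.functional.relu(self.bn1(self.conv1(x))))
        x = self.pool(nn.functional.relu(self.bn2(self.conv2(x))))
        x = x.view(-1, 128 * 8 * 8)
        x = self.dropout(x)
        return self.fc1(x)
With dropout and batch normalization in the network, remember to call model.train() before the training loop and model.eval() before evaluation, since both layers behave differently in the two modes.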