Getting Started with Image Classification Using PyTorch and the CIFAR-10 Dataset
In this blog post, we will discuss how to load and preprocess the CIFAR-10 dataset using PyTorch, and how to train and evaluate a convolutional neural network (CNN) for image classification.
The CIFAR-10 dataset is a widely used dataset for image classification, which consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images. The 10 classes are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
First, we load the CIFAR-10 dataset and apply data augmentation techniques such as random horizontal flipping and random cropping. Augmentation increases the diversity of the training set and improves the generalization of the model; it is applied only to the training images, while the test images are simply converted to tensors and normalized. Normalizing each channel with mean 0.5 and standard deviation 0.5 maps pixel values from [0, 1] to [-1, 1].
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# Augmentation is applied only to the training set; the test set is
# just converted to tensors and normalized.
transform_train = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=100,
                                          shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=100,
                                         shuffle=False, num_workers=2)
Files already downloaded and verified
Files already downloaded and verified
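As a quick sanity check (not part of the original pipeline, just a hedged sketch using the loaders defined above), we can pull one batch and confirm the shapes and value range:

# Inspect one augmented batch from the training loader
images, labels = next(iter(trainloader))
print(images.shape)   # torch.Size([100, 3, 32, 32]) -- batch of 100 RGB 32x32 images
print(labels.shape)   # torch.Size([100]) -- one class index per image
print(images.min().item(), images.max().item())  # roughly -1.0 and 1.0 after normalization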
Next, we define the CNN architecture using PyTorch's nn module: two convolutional layers, each followed by a ReLU activation and 2x2 max pooling, and then three fully connected layers. We train with the cross-entropy loss and the SGD optimizer with a learning rate of 0.001 and momentum of 0.9.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)   # 3 input channels -> 6 feature maps, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)    # 2x2 max pooling, halves the spatial size
        self.conv2 = nn.Conv2d(6, 16, 5)  # 6 -> 16 feature maps, 5x5 kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)      # 10 output logits, one per class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 32x32 -> 28x28 -> 14x14
        x = self.pool(F.relu(self.conv2(x)))  # 14x14 -> 10x10 -> 5x5
        x = x.view(-1, 16 * 5 * 5)            # flatten to (batch, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
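The 16 * 5 * 5 input size of fc1 follows from tracing the spatial dimensions: each 5x5 convolution (no padding) shrinks a 32x32 input to 28x28, pooling halves it to 14x14, the second convolution gives 10x10, and the final pooling leaves 16 feature maps of 5x5. As a quick check (a small sketch, not part of the training code), we can push a dummy input through the network:

# Verify the forward pass with a random fake image
dummy = torch.randn(1, 3, 32, 32)  # batch of one RGB 32x32 image
print(net(dummy).shape)            # torch.Size([1, 10]) -- one logit per class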
Finally, we train the model for 50 epochs and evaluate it on the test set. The model reaches an accuracy of about 60%, which is reasonable for such a small CNN (random guessing across 10 classes would give 10%).
for epoch in range(50):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # accumulate the loss for the per-epoch average
        running_loss += loss.item()
    print('Epoch %d loss: %.3f' %
          (epoch + 1, running_loss / (i + 1)))
print('Finished Training')
Epoch 1 loss: 2.303
Epoch 2 loss: 2.297
Epoch 3 loss: 2.260
Epoch 4 loss: 2.100
Epoch 5 loss: 1.982
Epoch 6 loss: 1.886
Epoch 7 loss: 1.799
Epoch 8 loss: 1.738
Epoch 9 loss: 1.692
Epoch 10 loss: 1.654
Epoch 11 loss: 1.616
Epoch 12 loss: 1.580
Epoch 13 loss: 1.552
Epoch 14 loss: 1.529
Epoch 15 loss: 1.506
Epoch 16 loss: 1.487
Epoch 17 loss: 1.475
Epoch 18 loss: 1.446
Epoch 19 loss: 1.431
Epoch 20 loss: 1.411
Epoch 21 loss: 1.394
Epoch 22 loss: 1.376
Epoch 23 loss: 1.363
Epoch 24 loss: 1.347
Epoch 25 loss: 1.334
Epoch 26 loss: 1.320
Epoch 27 loss: 1.306
Epoch 28 loss: 1.296
Epoch 29 loss: 1.278
Epoch 30 loss: 1.273
Epoch 31 loss: 1.262
Epoch 32 loss: 1.251
Epoch 33 loss: 1.238
Epoch 34 loss: 1.230
Epoch 35 loss: 1.218
Epoch 36 loss: 1.218
Epoch 37 loss: 1.200
Epoch 38 loss: 1.199
Epoch 39 loss: 1.186
Epoch 40 loss: 1.176
Epoch 41 loss: 1.172
Epoch 42 loss: 1.166
Epoch 43 loss: 1.165
Epoch 44 loss: 1.155
Epoch 45 loss: 1.146
Epoch 46 loss: 1.141
Epoch 47 loss: 1.135
Epoch 48 loss: 1.131
Epoch 49 loss: 1.120
Epoch 50 loss: 1.119
Finished Training
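A note before evaluating: 50 epochs on the CPU takes a while. If a CUDA GPU is available, the same loop runs much faster by moving the model and each batch to the device. This is a minimal sketch of the standard pattern; the run above did not use it:

# Optional: train on a GPU if one is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net.to(device)
# and inside the training loop, move each batch to the same device:
#     inputs, labels = inputs.to(device), labels.to(device)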
Now we evaluate the trained model on the test set:
correct = 0
total = 0
with torch.no_grad():  # no gradients needed for evaluation
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest logit per image
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the test images: %d %%' % (
    100 * correct / total))
Accuracy of the network on the test images: 59 %
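The overall number hides how the model does on individual classes. A per-class breakdown (a sketch in the spirit of the official PyTorch tutorial, using the class names listed earlier) looks like this:

# Per-class accuracy on the test set
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')
class_correct = [0] * 10
class_total = [0] * 10
with torch.no_grad():
    for images, labels in testloader:
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        for label, pred in zip(labels, predicted):
            class_total[label.item()] += 1
            class_correct[label.item()] += int(pred == label)
for i in range(10):
    print('Accuracy of %10s: %.1f %%'
          % (classes[i], 100 * class_correct[i] / class_total[i]))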
In conclusion, we have loaded and preprocessed the CIFAR-10 dataset with PyTorch and trained and evaluated a simple CNN for image classification. This is only a starting point: deeper architectures such as ResNet or DenseNet, and stronger data augmentation techniques such as Cutout, can push accuracy much higher.
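As a taste of the latter, torchvision ships transforms.RandomErasing, which is closely related to Cutout. A hedged sketch of a training transform using it (RandomErasing operates on tensors, so it must come after ToTensor):

# Cutout-style augmentation via torchvision's RandomErasing
transform_train = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    transforms.RandomErasing(p=0.5),  # zeroes out a random rectangle per image
])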