Using Pre-trained Models: PyTorch and Keras¶
In this post, we will use pre-trained models to do image classification with two popular deep learning frameworks, PyTorch and Keras. Let's walk through the workflow of using pre-trained models in each of them.
PyTorch pre-trained models¶
Let's first look at the pre-trained models in PyTorch. We can find all of them in torchvision.models.
from torchvision import models
import torch

# list the model architectures available in torchvision
dir(models)
Step 1: Load the pre-trained model¶
# load the pretrained alexnet
alexnet = models.alexnet(pretrained=True)
# view the alexnet
print(alexnet)
Step 2: Specify transformations of images¶
Once we have the model, the next step is to transform the input image so that it has the right shape and other characteristics, such as mean and standard deviation. These values should match the ones used while training the model.
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )])
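As a side note, on newer torchvision releases (roughly 0.13 and later) the pre-trained weights ship together with their matching preprocessing and class names. The following is only a sketch of that alternative API (the variable names here are just examples); the rest of this post sticks with the pretrained=True style shown above.

# optional alternative on newer torchvision (>= 0.13): the weights enum
# bundles the checkpoint, its matching preprocessing, and the class names
from torchvision.models import alexnet, AlexNet_Weights

weights = AlexNet_Weights.DEFAULT
alexnet_new = alexnet(weights=weights)        # replaces pretrained=True
preprocess = weights.transforms()             # resize/crop/normalize matched to the weights
imagenet_classes = weights.meta["categories"]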
Step 3: Load and transform the input image¶
%matplotlib inline
from PIL import Image
import matplotlib.pyplot as plt
img = Image.open("cat1.jpg")
plt.imshow(img)
# transform the image and prepare a batch to be passed to the alexnet
img_t = transform(img)
batch_t = torch.unsqueeze(img_t, 0)
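The unsqueeze call adds a leading batch dimension, which is what the network expects. Printing the shapes makes this explicit:

print(img_t.shape)    # torch.Size([3, 224, 224]) - a single image: channels, height, width
print(batch_t.shape)  # torch.Size([1, 3, 224, 224]) - a batch containing one image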
Step 4: Model Inference¶
# put the model in evaluation (inference) mode
alexnet.eval()
# forward pass
out = alexnet(batch_t)
print(out.shape)
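Since we only need predictions and not gradients, the forward pass can also be wrapped in torch.no_grad(). This is optional and gives the same output, but it skips building the autograd graph and saves memory:

# inference without gradient tracking (optional, same result)
with torch.no_grad():
    out = alexnet(batch_t)
print(out.shape)  # torch.Size([1, 1000]) - one score per ImageNet class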
The output is a vector of scores for the 1000 ImageNet classes. To map the scores to human-readable labels, we will first read and store the labels from a text file containing all 1000 class names.
# read in the ImageNet class labels
with open("imagenet_classes.txt") as f:
    classes = [line.strip() for line in f.readlines()]
# index of the highest score and its softmax confidence (as a percentage)
_, index = torch.max(out, 1)
percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
print(classes[index[0]], percentage[index[0]].item())
We got the 'Siamese cat' class with over 99% confidence. Let's see which other labels the model considers likely.
_, indices = torch.sort(out, descending=True)
[(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]
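Since the next experiments repeat the same load / eval / forward / top-5 steps, a small helper keeps the comparisons tidy. This is just a convenience sketch (top5 is a name chosen here, not part of torchvision); the cells below still spell the steps out explicitly.

def top5(model, batch):
    """Run a model in eval mode on a batch and return its top-5 (label, confidence %) pairs."""
    model.eval()
    with torch.no_grad():
        out = model(batch)
    percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
    _, indices = torch.sort(out, descending=True)
    return [(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]

# example usage:
# top5(models.resnet50(pretrained=True), batch_t)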
Let's try out ResNet-50 and ResNet-101.¶
# ResNet-50
# first, load the model
resnet_50 = models.resnet50(pretrained=True)
# then put the model in eval mode
resnet_50.eval()
# next, run the forward pass
out = resnet_50(batch_t)
# finally, print the top-5 classes predicted by the model
_, indices = torch.sort(out, descending=True)
percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
[(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]
# ResNet-101
# first, load the model
resnet_101 = models.resnet101(pretrained=True)
# then put the model in eval mode
resnet_101.eval()
# next, run the forward pass
out = resnet_101(batch_t)
# finally, print the top-5 classes predicted by the model
_, indices = torch.sort(out, descending=True)
percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100
[(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]
Keras workflow for pre-trained models¶
Let's try out an example using Keras. First, let's find out which pre-trained models are available in Keras.
import keras
dir(keras.applications)
In this case, we will only try out MobileNetV2. Other models work in a similar way.
from keras.preprocessing.image import load_img, img_to_array
from keras.applications.imagenet_utils import decode_predictions
from keras.applications import mobilenet_v2
from keras.applications.mobilenet_v2 import preprocess_input
import numpy as np
# first, load the image and resize it to the 224x224 ImageNet input size
original_image = load_img("cat1.jpg", target_size=(224, 224))
# second, convert the PIL image to a NumPy array
numpy_image = img_to_array(original_image)
# third, expand the array into a 4D tensor (samples, height, width, channels)
input_image = np.expand_dims(numpy_image, axis=0)
print('PIL image size = ', original_image.size)
print('NumPy image size = ', numpy_image.shape)
print('Input image size = ', input_image.shape)
plt.imshow(np.uint8(input_image[0]))
# fourth, normalize the image the way MobileNetV2 expects
processed_image = preprocess_input(input_image.copy())
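For MobileNetV2, preprocess_input scales pixel values from the [0, 255] range to roughly [-1, 1]. A quick sanity check on the arrays confirms the normalization (the exact numbers depend on the image):

# pixel range before and after preprocessing
print(numpy_image.min(), numpy_image.max())          # roughly 0.0 255.0
print(processed_image.min(), processed_image.max())  # roughly -1.0 1.0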
Now, we are ready to make predictions.
# load MobileNetV2 with ImageNet weights and run a prediction
mobilenet_model = mobilenet_v2.MobileNetV2(weights="imagenet")
prediction = mobilenet_model.predict(processed_image)
# decode the raw prediction vector into human-readable labels
label = decode_predictions(prediction)
print('label = ', label[0][:5])
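decode_predictions returns, for each image in the batch, a list of (WordNet ID, class name, score) tuples. A short loop prints the top-5 more readably:

# print the top-5 predictions, one per line
for wnid, name, score in label[0][:5]:
    print(f"{name:20s} {score * 100:5.2f}%")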
Now we have seen the workflows for using pre-trained models in PyTorch and Keras. Using these pre-trained models is very convenient, but in many cases they may not satisfy the specifications of our applications and we may want a more specialized model. This opens up another topic, transfer learning: fine-tuning these pre-trained models to meet our demands. In a following post, we will focus on transfer learning using these models.