Multi-class Semantic Segmentation of Satellite Images using U-Net


Essentially, what we want to do is take a bunch of satellite imagery and label each pixel as belonging to one of several categories (e.g., water, forest, urban).

To accomplish this task, we use a deep learning architecture called U-Net. It consists of two main components: an encoder that extracts features from the input image, and a decoder that upsamples those features back to the input resolution while preserving spatial information through skip connections. The output is passed through a softmax function to produce per-class probabilities at every pixel in the image.
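To make the architecture concrete, here is a minimal U-Net sketch in PyTorch. This is a simplified two-level version; the channel widths and the six-class output are illustrative assumptions, not values from any specific paper:

import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU: the basic U-Net building block
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    def __init__(self, in_ch=3, n_classes=6):
        super().__init__()
        self.enc1 = conv_block(in_ch, 64)        # encoder, level 1
        self.enc2 = conv_block(64, 128)          # encoder, level 2
        self.pool = nn.MaxPool2d(2)              # downsampling
        self.bottleneck = conv_block(128, 256)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)  # upsampling
        self.dec2 = conv_block(256, 128)         # 256 = 128 skip + 128 up
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = conv_block(128, 64)
        self.head = nn.Conv2d(64, n_classes, 1)  # per-pixel class logits

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return self.head(d1)

model = MiniUNet(n_classes=6)
logits = model(torch.randn(1, 3, 256, 256))   # shape (1, 6, 256, 256)
probs = logits.softmax(dim=1)                 # per-pixel class probabilities

Each decoder level concatenates the upsampled features with the matching encoder features (the skip connections), which is what lets the network recover fine spatial detail after downsampling.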

Here’s how it works step by step:
1. Load your satellite imagery dataset and split it into training, validation, and test sets. Make sure you have enough data to train the model effectively (as a rule of thumb, at least 500 images per category).
2. Preprocess the data by resizing all images to a standard size (e.g., 1024×1024 pixels) and, if necessary, converting them to grayscale. This helps speed up training and can improve performance.
3. Train your U-Net model using a loss function that encourages accurate segmentation of each class, such as cross-entropy or Dice loss. You can implement this step in a popular deep learning framework like TensorFlow or PyTorch (a minimal training sketch follows this list).
4. Once you've trained your model, test it on a held-out set of images and evaluate its performance using metrics like mean intersection over union (mIoU) or F1 score. This helps you identify where the model struggles and fine-tune your training parameters accordingly (see the evaluation sketch below).
5. Finally, use your trained U-Net model to segment new satellite imagery in real time! You can deploy it as a web service or integrate it into an existing GIS system for easy access by users (a minimal serving sketch is included below).
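For steps 1 through 3, a minimal training sketch in PyTorch might look like the following. `SatelliteDataset`, the data directory, and all hyperparameters are hypothetical placeholders (any Dataset returning image/label-map pairs works), and `MiniUNet` refers to the sketch above:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, random_split

def dice_loss(logits, targets, n_classes, eps=1e-6):
    # Soft Dice loss: 1 minus the mean Dice overlap across classes
    probs = logits.softmax(dim=1)
    one_hot = F.one_hot(targets, n_classes).permute(0, 3, 1, 2).float()
    intersection = (probs * one_hot).sum(dim=(2, 3))
    union = probs.sum(dim=(2, 3)) + one_hot.sum(dim=(2, 3))
    return 1 - ((2 * intersection + eps) / (union + eps)).mean()

# SatelliteDataset is a placeholder: any Dataset yielding
# (image_tensor, long_label_map) pairs will do here
dataset = SatelliteDataset('data/')
n_val = len(dataset) // 5
train_set, val_set = random_split(dataset, [len(dataset) - n_val, n_val])
train_loader = DataLoader(train_set, batch_size=8, shuffle=True)

model = MiniUNet(n_classes=6)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()

for epoch in range(20):
    model.train()
    for images, masks in train_loader:
        logits = model(images)
        # Combine cross entropy and Dice loss, as suggested in step 3
        loss = ce(logits, masks) + dice_loss(logits, masks, n_classes=6)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()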
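For step 4, mIoU can be computed by accumulating a confusion matrix over the test set. A minimal sketch, assuming a `test_loader` built the same way as the loaders above:

import torch

@torch.no_grad()
def mean_iou(model, loader, n_classes):
    # Accumulate a confusion matrix conf[true_class, predicted_class]
    conf = torch.zeros(n_classes, n_classes, dtype=torch.long)
    model.eval()
    for images, masks in loader:
        preds = model(images).argmax(dim=1)
        idx = masks.flatten() * n_classes + preds.flatten()
        conf += torch.bincount(idx, minlength=n_classes ** 2).reshape(n_classes, n_classes)
    tp = conf.diag().float()
    # union = true positives + false positives + false negatives
    union = conf.sum(dim=0) + conf.sum(dim=1) - conf.diag()
    iou = tp / union.clamp(min=1).float()
    return iou.mean().item()

print('mIoU:', mean_iou(model, test_loader, n_classes=6))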
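For step 5, one common option (an assumption on my part; the original doesn't name a serving framework) is to wrap the model in a small FastAPI service. The checkpoint path and module name are hypothetical:

# Run with: uvicorn serve:app
import io
import torch
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import Response
from PIL import Image
from torchvision import transforms
from model import MiniUNet  # hypothetical module holding the U-Net sketch above

app = FastAPI()

# Load the trained model once at startup ('unet.pt' is a hypothetical path)
model = MiniUNet(n_classes=6)
model.load_state_dict(torch.load('unet.pt'))
model.eval()
preprocess = transforms.Compose([transforms.Resize((256, 256)),
                                 transforms.ToTensor()])

@app.post('/segment')
async def segment(file: UploadFile = File(...)):
    # Decode the uploaded image and run the model on it
    img = Image.open(io.BytesIO(await file.read())).convert('RGB')
    with torch.no_grad():
        mask = model(preprocess(img).unsqueeze(0)).argmax(dim=1)
    # Return the per-pixel label map as a grayscale PNG
    buf = io.BytesIO()
    Image.fromarray(mask.squeeze(0).byte().numpy()).save(buf, format='PNG')
    return Response(content=buf.getvalue(), media_type='image/png')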

Here’s some example inference code using PyTorch. Note that, for convenience, it loads torchvision’s pretrained DeepLabv3 (ResNet-101) model rather than a U-Net trained from scratch; the inference pipeline is the same either way:

# Import necessary libraries
import torch
import numpy as np
import matplotlib.pyplot as plt
from torchvision import transforms, models
from PIL import Image

# Load a pretrained segmentation model from the torchvision library:
# DeepLabv3 with a ResNet-101 backbone, pretrained on a 21-class
# (PASCAL VOC) label set
model = models.segmentation.deeplabv3_resnet101(weights='DEFAULT')  # pretrained=True on older torchvision

# Freeze the parameters of the model to prevent them from being updated
# during any later fine-tuning
for param in model.parameters():
    param.requires_grad = False

# Set the model to evaluation mode
model.eval()

# Load the input image with PIL and convert it to a normalized tensor.
# The normalization statistics are the ImageNet values the pretrained
# model expects
img = Image.open('input.jpg').convert('RGB')
transform = transforms.Compose([transforms.Resize(256),
                                transforms.CenterCrop(224),
                                transforms.ToTensor(),
                                transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                     std=[0.229, 0.224, 0.225])])
inp = transform(img).unsqueeze(0)  # add a batch dimension

# Pass the input through the model. torchvision's segmentation models
# return a dict; the per-class logits live under the 'out' key,
# shaped (batch, classes, height, width)
with torch.no_grad():
    output = model(inp)['out']

# Take the argmax over the class dimension to get an (H, W) label map
predicted = output.argmax(dim=1).squeeze(0).cpu().numpy()

# Map each class index to an RGB color for visualization, using the
# standard 21-entry PASCAL VOC color map
label_colors = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0],
                         [0, 0, 128], [128, 0, 128], [0, 128, 128], [128, 128, 128],
                         [64, 0, 0], [192, 0, 0], [64, 128, 0], [192, 128, 0],
                         [64, 0, 128], [192, 0, 128], [64, 128, 128], [192, 128, 128],
                         [0, 64, 0], [128, 64, 0], [0, 192, 0], [128, 192, 0],
                         [0, 64, 128]], dtype=np.uint8)
rgb = label_colors[predicted]

# Display the segmented image
plt.imshow(rgb)
plt.axis('off')
plt.show()

![output2](https://user-images.githubusercontent.com/123977559/215648026-cbfa5dcb-fcae-4bff-a5d3-a0f5920c8e9c.JPG)

In this example we ran inference with a pretrained torchvision model; the Deep Learning for Semantic Segmentation with Python & Pytorch course walks through training your own segmentation models on a custom dataset of satellite imagery following the steps above. Either way, the output is visualized as a color map for easy interpretation by users.
