Repo for analysis of aerial images semantic segmentation problem

MKaczkow/aerial_segmentation

Aerial Segmentation

Repo for TWM (Machine Vision Techniques) project @ WUT 24L semester

Code style: black Imports: isort

Introduction

The following instructions describe:

  1. Setting up the environment and running notebooks
  2. Functions and purpose of the notebooks
  3. Sources and brief description of the data

Setting up the environment and running notebooks

  1. Create a virtual environment (conda, venv, etc.), e.g.: python -m venv aerial_images
  2. Activate the virtual environment (depending on the operating system):
    • Windows: .\aerial_images\Scripts\activate
    • Linux: source aerial_images/bin/activate
  3. Download and install the torch library according to the instructions on pytorch.org
  4. Install the remaining required libraries: pip install -r requirements.txt
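Once the steps above are complete, a quick check can confirm that the notebooks will run in the intended environment. The sketch below uses only the standard library. The helper names `in_virtual_env` and `missing_packages` are illustrative and not part of the repository, and the check only detects venv-style environments (a conda environment reports `False`).

```python
import importlib.util
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_env())
    # torch and torchvision are the libraries named in this README;
    # extend the list with anything else from requirements.txt as needed.
    print("missing packages:", missing_packages(["torch", "torchvision"]))
```

An empty "missing packages" list suggests that steps 3 and 4 succeeded. It does not verify package versions.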

Code and notebooks

The repository is organized so that the data directory contains only raw data, the src directory contains source code and helper functions, and the notebooks directory contains the notebooks. For clarity, the notebooks are not stored in the root directory, but they should be moved there before running. Each sub-directory contains what its name indicates; a detailed description follows:

  • data - directory containing only data; sub-directories ending with Patches contain datasets divided into patches
  • src - directory containing source code, including helper functions
    • callbacks - directory containing callback functions, assisting in model training management
    • datasets - directory containing dataset classes (inheriting from torch.utils.data.Dataset)
      • utils - directory containing helper functions for datasets, mainly for converting masks to labels of type {0, 1, 2, ...} and transformations (torchvision.transforms)
    • evaluation - directory containing helper functions for model evaluation
    • models - directory containing a helper baseline model class (inheriting from torch.nn.Module)
    • utils.py - general helper functions
  • notebooks - directory containing notebooks with code
    • datasets_to_patches - notebooks demonstrating the division of datasets into patches
    • masks_conversion - notebooks demonstrating the conversion of masks to labels of type {0, 1, 2, ...}
    • no_finetune - notebooks demonstrating attempts to use models without training for aerial image segmentation (baseline; weights trained on ImageNet do not transfer to the new datasets)
    • sanity_checks - notebooks demonstrating sanity checks for datasets
    • with_finetune - notebooks demonstrating actual training of models on new datasets

Data

Downloading

Number of classes in datasets

  • INRIA: 2 (binary: building and non-building) source
  • Dubai: 6 source
    • Building: #3C1098
    • Land (unpaved area): #8429F6
    • Road: #6EC1E4
    • Vegetation: #FEDD3A
    • Water: #E2A929
    • Unlabeled: #9B9B9B
  • Aerial Drone: 20 (tree, grass, other vegetation, dirt, gravel, rocks, water, paved area, pool, person, dog, car, bicycle, roof, wall, fence, fence-pole, window, door, obstacle) source

Warning

The Aerial Drone dataset appears to actually contain 23 classes rather than the 20 listed above.

  • UAVid: 8 source
    1. building: living houses, garages, skyscrapers, security booths, and buildings under construction.
    2. road: road or bridge surface that cars can run on legally. Parking lots are not included.
    3. tree: tall trees that have canopies and main trunks.
    4. low vegetation: grass, bushes and shrubs.
    5. static car: cars that are not moving, including static buses, trucks, automobiles, and tractors. Bicycles and motorcycles are not included.
    6. moving car: cars that are moving, including moving buses, trucks, automobiles, and tractors. Bicycles and motorcycles are not included.
    7. human: pedestrians, bikers, and all other humans occupied by different activities.
    8. clutter: all objects not belonging to any of the classes above.
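The colour codes listed for the Dubai dataset can be turned into integer labels of type {0, 1, 2, ...}. The following is a minimal pure-Python sketch of that idea, not the conversion code from src/datasets/utils. The index order (taken from the list order above), the helper names, and the choice to map unknown colours to the Unlabeled index are all assumptions.

```python
# Hex colours as listed for the Dubai dataset; the label order is an assumption.
DUBAI_HEX_COLORS = {
    "Building": "#3C1098",
    "Land (unpaved area)": "#8429F6",
    "Road": "#6EC1E4",
    "Vegetation": "#FEDD3A",
    "Water": "#E2A929",
    "Unlabeled": "#9B9B9B",
}


def hex_to_rgb(hex_color):
    """Convert a '#RRGGBB' string to an (R, G, B) tuple of ints."""
    value = hex_color.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


# Map each RGB colour to its class index: Building -> 0, ..., Unlabeled -> 5.
RGB_TO_LABEL = {
    hex_to_rgb(color): index
    for index, color in enumerate(DUBAI_HEX_COLORS.values())
}
UNLABELED = RGB_TO_LABEL[hex_to_rgb(DUBAI_HEX_COLORS["Unlabeled"])]


def mask_to_labels(rgb_mask):
    """Convert a mask given as rows of (R, G, B) pixels to rows of class indices.

    Colours that are not in the table fall back to the Unlabeled index (assumption).
    """
    return [
        [RGB_TO_LABEL.get(tuple(pixel), UNLABELED) for pixel in row]
        for row in rgb_mask
    ]


# Tiny 2 x 2 example mask: Building, Water / unknown colour, Unlabeled.
example = [[(60, 16, 152), (226, 169, 41)], [(0, 0, 0), (155, 155, 155)]]
print(mask_to_labels(example))  # [[0, 4], [5, 5]]
```

For full-size masks, a vectorised array implementation would be much faster. The nested lists here only keep the example dependency-free.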

Number of images in datasets

  • INRIA
    • train: 180 (labels present, need to manually split into train and val)
    • test: 144 (no labels)
  • Dubai
    • train: 72 (labels present, need to manually split into train and val)
  • Aerial Drone
    • train: 400 (labels present, need to manually split into train and val)
  • UAVid
    • train: 200
    • val: 70
    • test: 10
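For the datasets that need a manual train/val split, a reproducible split can be sketched as follows. The function name `split_train_val`, the 20% validation fraction, the seed, and the file names are assumptions for illustration; the repository may split the data differently.

```python
import random


def split_train_val(items, val_fraction=0.2, seed=42):
    """Split items into (train, val) lists using a seeded, reproducible shuffle."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(items)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical file names standing in for the 180 labelled INRIA training images.
names = [f"image_{i:03d}.tif" for i in range(180)]
train, val = split_train_val(names)
print(len(train), len(val))  # 144 36
```

Fixing the seed keeps the validation set identical across runs, which makes results from different notebooks comparable.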

Resources