Introduction to Deep Learning

This notebook introduces the fastai library, model training, and the aiking utilities.


Compatibility Block

Check Platform

Platform & Environment Configuration

Imports

Public Imports

from fastai.vision.all import *
from fastcore.all import *

Private Imports

from aiking.data.external import *
from aiking.core import aiking_settings
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[7], line 1
----> 1 from aiking.data.external import *
      2 from aiking.core import aiking_settings

File ~/rahuketu/programming/aiking/aiking/data/external.py:28
     26 from bs4 import BeautifulSoup
     27 import time
---> 28 import kaggle
     29 from ..core import aiking_settings
     30 from .utils import download_named_images

File /opt/homebrew/Caskroom/miniforge/base/envs/aiking/lib/python3.9/site-packages/kaggle/__init__.py:23
     20 from kaggle.api_client import ApiClient
     22 api = KaggleApi(ApiClient())
---> 23 api.authenticate()

File /opt/homebrew/Caskroom/miniforge/base/envs/aiking/lib/python3.9/site-packages/kaggle/api/kaggle_api_extended.py:403, in KaggleApi.authenticate(self)
    401         config_data = self.read_config_file(config_data)
    402     else:
--> 403         raise IOError('Could not find {}. Make sure it\'s located in'
    404                       ' {}. Or use the environment method.'.format(
    405                           self.config_file, self.config_dir))
    407 # Step 3: load into configuration!
    408 self._load_config(config_data)

OSError: Could not find kaggle.json. Make sure it's located in /Users/rahul1.saraf/.kaggle. Or use the environment method.
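
The traceback above shows that the kaggle package calls `api.authenticate()` at import time, so importing `aiking.data.external` fails on any machine without Kaggle credentials. A minimal sketch of a pre-import check follows. The function name `kaggle_credentials_available` is hypothetical, and the environment variable names (`KAGGLE_USERNAME`, `KAGGLE_KEY`, `KAGGLE_CONFIG_DIR`) are assumed from the Kaggle API's "environment method" mentioned in the error message.

```python
import os
from pathlib import Path


def kaggle_credentials_available(env=None, home=None):
    """Return True if Kaggle credentials look reachable (hypothetical helper).

    Checks the environment-variable method first, then looks for
    kaggle.json in KAGGLE_CONFIG_DIR or, failing that, in ~/.kaggle.
    """
    env = os.environ if env is None else env
    # Environment method: both variables must be set and non-empty.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True
    # File method: kaggle.json inside the configuration directory.
    default_dir = Path(home) if home is not None else Path.home()
    config_dir = Path(env.get("KAGGLE_CONFIG_DIR") or default_dir / ".kaggle")
    return (config_dir / "kaggle.json").is_file()
```

Calling `kaggle_credentials_available()` before the private imports gives a clear yes/no answer instead of a traceback from deep inside the kaggle package.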

Is It a Bird or Not?

Bird recognition was considered impossible to solve 5 years ago. With modern advances in software, algorithms, and computing, it can now be solved in very few lines of code on your local computer.

XKCD

Problem Setup

We solve a classification problem using image recognition algorithms and deep learning. A person can easily tell whether a picture shows a bird, but how do we teach a computer what is not a bird? The best we can do is take another class of images (here, Forest) and teach the computer to distinguish them from bird pictures.

Workflow

The workflow for this notebook is shown below:

flowchart TB
    subgraph Data

        subgraph Create[Data Scraping and Upload to Datasette]
        A[create_image_dataset] --> B[Make Sqlite db from created image.csv \n and upload to Datasette]
        end

        subgraph ReproducibleData
        C[data_frm_datasette] --> D[Laptop]
        C --> E[Kaggle]
        C --> F[Colab]
        C --> G[RemoteServer]
        end
    end
    subgraph DeepLearning
        subgraph Datablock
        H[Define Blocks] --- I[get_items] --- J[splitter] --- K[parent_label]---L[item_tfms]
        end

        subgraph Learner
        M[Vision Learner] ---N[fine_tune] ---O[predict]
        end
    end

    B --> ReproducibleData
    ReproducibleData --> DeepLearning
    Datablock --> Learner

Download Data from Datasette

  • Keeping the data reference in Datasette keeps my data consistent once it has been scraped initially.
  • It also lets the notebook run on various platforms (local computer, remote GPU machine, and/or Colab) while iterating on the same dataset.
  • It provides an automatic API for the datasets, which can be used for dashboard creation and for integration with Observable and other third-party tools.
dsname = 'BirdsvsForest'
datasette_base_url = "https://datasette.zealmaker.com"
path = data_frm_datasette(dsname, datasette_base_url); path
Path('/mnt/d/rahuketu/programming/AIKING_HOME/data/BirdsvsForest')
Some useful links:
  • The data_frm_datasette API is available here.
  • The BirdsvsForest dataset is available here for reproducibility and visualization.
  • The code for creating the dataset is available here. The function construct_image_dataset is used; its reference API is here.
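
The "automatic API" mentioned earlier can be illustrated with Datasette's JSON endpoints, which serve a table at `/{database}/{table}.json`. The sketch below only builds such a URL. The database name `BirdsvsForest` and the table name `image` are assumptions made for illustration (the notebook does not state them), and `datasette_table_url` is a hypothetical helper, not part of aiking.

```python
from urllib.parse import quote, urlencode


def datasette_table_url(base_url, database, table, size=5):
    """Build a Datasette JSON URL that returns table rows as a plain array."""
    # _shape=array asks for a list of row objects; _size limits the row count.
    query = urlencode({"_shape": "array", "_size": size})
    return f"{base_url.rstrip('/')}/{quote(database)}/{quote(table)}.json?{query}"


# Database and table names here are illustrative guesses, not confirmed values.
url = datasette_table_url("https://datasette.zealmaker.com", "BirdsvsForest", "image")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client, provided the database and table names match what the Datasette instance actually exposes.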

path.ls().map(lambda p : p.name)
(#3) ['Bird','Forest','image.csv']
(path/"Bird").ls()[:2]
(#2) [Path('/mnt/d/rahuketu/programming/AIKING_HOME/data/BirdsvsForest/Bird/018632bc-4ac6-4479-a821-3bff4d8f4919.jpg'),Path('/mnt/d/rahuketu/programming/AIKING_HOME/data/BirdsvsForest/Bird/02b56c6f-36f4-452d-8173-d97716069c7a.jpg')]
(path/"Forest").ls()[:2]
(#2) [Path('/mnt/d/rahuketu/programming/AIKING_HOME/data/BirdsvsForest/Forest/012b9515-0b1c-4feb-9353-364af6fbe946.jpg'),Path('/mnt/d/rahuketu/programming/AIKING_HOME/data/BirdsvsForest/Forest/01940593-c864-4241-8329-718bac60fed6.jpg')]
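
Before building the DataLoaders it can help to check how many images each class folder holds, since a strongly imbalanced dataset affects how accuracy should be read. This is a minimal standard-library sketch. The helper name `count_images_per_class` and the list of image suffixes are assumptions, not part of fastai or aiking.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; extend as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Count image files under `root`, grouped by their parent folder name."""
    counts = Counter()
    for p in Path(root).rglob("*"):
        # Non-image files such as image.csv are skipped by the suffix check.
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES:
            counts[p.parent.name] += 1
    return dict(counts)


# Usage in this notebook would be: count_images_per_class(path)
```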

DataBlocks and DataLoaders

  • DataLoaders -> object that contains the training and validation sets
  • DataBlock -> fastai object that describes how to build the DataLoaders
doc(ImageBlock)

ImageBlock

ImageBlock(cls:fastai.vision.core.PILBase=<class 'fastai.vision.core.PILImage'>)

A `TransformBlock` for images of `cls`


doc(CategoryBlock)

CategoryBlock

CategoryBlock(vocab:collections.abc.MutableSequence|pandas.core.series.Series=None, sort:bool=True, add_na:bool=False)

`TransformBlock` for single-label categorical targets


dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(path, bs=32)

dls.show_batch(max_n=6)
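
To make two of the DataBlock arguments concrete, here is a plain-Python sketch of what `get_y=parent_label` and `splitter=RandomSplitter(valid_pct=0.2, seed=42)` do conceptually: the label is the name of the folder containing the image, and a seeded shuffle sets aside 20% of the items for validation. The functions `parent_folder_label` and `random_split` are illustrative stand-ins. fastai shuffles with PyTorch, so the actual indices it picks will differ from this sketch.

```python
import random
from pathlib import Path


def parent_folder_label(item):
    """Label an item by the name of its parent folder, e.g. 'Bird' or 'Forest'."""
    return Path(item).parent.name


def random_split(items, valid_pct=0.2, seed=42):
    """Return (train_idxs, valid_idxs) from a reproducible shuffle of the indices."""
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)   # same seed gives the same split every run
    cut = int(valid_pct * len(items))   # number of validation items
    return idxs[cut:], idxs[:cut]


items = [f"data/Bird/{i}.jpg" for i in range(5)] + [f"data/Forest/{i}.jpg" for i in range(5)]
train_idxs, valid_idxs = random_split(items)
print(len(train_idxs), len(valid_idxs), parent_folder_label(items[0]))
```

Fixing the seed is what keeps the validation set identical across runs, so error rates from different experiments stay comparable.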

Model Training

learn = vision_learner(dls, resnet18, metrics=[error_rate, accuracy]); learn
/home/rahuketu86/mambaforge/envs/aiking/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/home/rahuketu86/mambaforge/envs/aiking/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
<fastai.learner.Learner>
learn.fine_tune(3)
epoch  train_loss  valid_loss  error_rate  accuracy  time
0      0.666977    0.174406    0.055046    0.944954  00:22

epoch  train_loss  valid_loss  error_rate  accuracy  time
0      0.177000    0.190055    0.055046    0.944954  00:20
1      0.111666    0.165208    0.064220    0.935780  00:21
2      0.081644    0.178287    0.045872    0.954128  00:19
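
Reading the output above: `fine_tune(3)` first trains one epoch with the pretrained body frozen (the first table, fastai's default), then three epochs with all layers unfrozen (the second table). The two metrics are complementary, since `error_rate` is `1 - accuracy`. The sketch below checks that relation on the reported numbers and searches for the smallest validation-set size whose misclassification counts reproduce the rounded error rates. The resulting size is an inference from rounding, not a figure stated in the notebook, and `smallest_consistent_size` is a hypothetical helper.

```python
# (error_rate, accuracy) pairs copied from the three unfrozen epochs above.
rows = [(0.055046, 0.944954), (0.064220, 0.935780), (0.045872, 0.954128)]

# error_rate and accuracy should sum to 1 for every epoch.
assert all(abs(err + acc - 1.0) < 1e-9 for err, acc in rows)


def smallest_consistent_size(rates, max_n=1000, tol=5e-7):
    """Smallest n such that every rate equals k/n (to 6 decimals) for an integer k."""
    for n in range(1, max_n + 1):
        if all(abs(round(r * n) / n - r) < tol for r in rates):
            return n
    return None


error_rates = [err for err, _ in rows]
n_valid = smallest_consistent_size(error_rates)
misclassified = [round(r * n_valid) for r in error_rates]
print(n_valid, misclassified)
```

Under this reading, the error rates are consistent with 109 validation images and 6, 7, and 5 misclassified images in the three epochs, so the epoch-to-epoch differences amount to one or two pictures.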