Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Concepts
Image Classification
Object Localization
Image Classification: Predict the type or class of an object in an image. Input: An image with a single object, such as a photograph. Output: A class label (e.g. one or more integers that are mapped to class labels).
Object Localization: Locate the presence of objects in an image and indicate their location with a bounding box. Input: An image with one or more objects, such as a photograph. Output: One or more bounding boxes (e.g. defined by a point, width, and height).
Object Detection: Locate the presence of objects with a bounding box and predict the types or classes of the located objects in an image. Input: An image with one or more objects, such as a photograph. Output: One or more bounding boxes (e.g. defined by a point, width, and height), and a class label for each bounding box.
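To make the three output types concrete, here is a minimal sketch in plain Python. The names BoundingBox, Detection and CLASS_NAMES are hypothetical, and taking the "point" to be the top-left corner in pixel coordinates is an assumption, since the notes do not say which point defines a box.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """A box given by a point, width, and height.

    Assumption: the point (x, y) is the top-left corner, in pixels.
    """
    x: float
    y: float
    width: float
    height: float

    def to_corners(self):
        """Return the box as (x_min, y_min, x_max, y_max)."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)


@dataclass
class Detection:
    """Object detection output for one object: a box plus a class label."""
    box: BoundingBox
    label: int


# Hypothetical mapping from integer labels to class names.
CLASS_NAMES = {0: "background", 1: "beagle"}

# Image classification: one label for the whole image.
classification_output = 1

# Object localization: boxes only, no class labels.
localization_output = [BoundingBox(40, 30, 120, 80)]

# Object detection: a box and a class label for each located object.
detection_output = [Detection(BoundingBox(40, 30, 120, 80), label=1)]
```

The corner form returned by to_corners is the layout many detection libraries expect, so converting from the point/width/height form is a common first step when preparing annotations.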
Image Segmentation - Semantic Segmentation - Instance Segmentation
https://en.wikipedia.org/wiki/Image_segmentation
R-CNN
Fast R-CNN
Faster R-CNN
Mask R-CNN
YOLO
Models for Object Detection
Region-Based Convolutional Neural Networks, or R-CNNs, are a family of techniques for object localization and recognition, designed to prioritize model performance (detection accuracy). You Only Look Once, or YOLO, is a second family of techniques for object recognition, designed for speed and real-time use.
Fine-tune Object Detection Models in PyTorch
Fine-tuning Faster-RCNN using pytorch
Beagle Detector: Fine-tune Faster-RCNN
TorchVision Object Detection Finetuning Tutorial
References
A Gentle Introduction to Object Recognition With Deep Learning