# microsoft/computervision-recipes

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/microsoft-computervision-recipes).**

9,866 stars · 1,208 forks · Jupyter Notebook · MIT

## Links

- GitHub: https://github.com/microsoft/computervision-recipes
- awesome-repositories: https://awesome-repositories.com/repository/microsoft-computervision-recipes.md

## Topics

`artificial-intelligence` `azure` `computer-vision` `convolutional-neural-networks` `data-science` `deep-learning` `image-classification` `image-processing` `jupyter-notebook` `kubernetes` `machine-learning` `microsoft` `object-detection` `operationalization` `python` `similarity` `tutorial`

## Description

This project collects deep learning model recipes, code samples, and step-by-step guides for computer vision tasks. It organizes complex workflows into modular recipes that make image and video analysis models easier to build.

Specialized capabilities include an image similarity framework for fast retrieval and re-ranking, human pose estimation, and video action recognition, along with tools for crowd density estimation and document image cleaning.
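To make the image similarity retrieval idea concrete, here is a minimal sketch that ranks gallery images by cosine similarity to a query feature vector. It is not the repository's API: the function names `cosine_similarity` and `retrieve`, the file names, and the tiny 3-dimensional embeddings are all assumptions made for illustration, and a real system would use embeddings produced by a trained network.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(query, gallery, top_k=3):
    """Return the top_k (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, feat)) for name, feat in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional embeddings; real network features are much longer.
gallery = {
    "cat_1.jpg": [0.9, 0.1, 0.0],
    "cat_2.jpg": [0.8, 0.2, 0.1],
    "car_1.jpg": [0.0, 0.1, 0.9],
}
results = retrieve([1.0, 0.0, 0.0], gallery, top_k=2)
```

This brute-force scan compares the query against every gallery item. Fast retrieval at scale typically replaces it with an approximate nearest neighbor index, and a re-ranking step can then refine the top results.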

The project covers a broad range of development and deployment capabilities, including image classification, object detection, and image segmentation. It provides utilities for data annotation, model training with hyperparameter optimization, and the orchestration of models using containers and Kubernetes for REST API inference.
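As an illustration of hyperparameter optimization, the sketch below runs an exhaustive grid search over two parameters. It assumes a hypothetical `toy_evaluate` function standing in for "train a model and return validation accuracy"; the parameter names and values are invented for the example and do not come from the repository.

```python
import itertools


def grid_search(param_grid, evaluate):
    """Evaluate every combination in param_grid; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    # Stand-in for a real training run (an assumption for this sketch).
    # By construction the score peaks at learning_rate=0.001, image_size=300.
    lr_penalty = abs(params["learning_rate"] - 0.001) * 100
    size_penalty = abs(params["image_size"] - 300) / 1000
    return 1.0 - lr_penalty - size_penalty


grid = {"learning_rate": [0.0001, 0.001, 0.01], "image_size": [224, 300, 500]}
best_params, best_score = grid_search(grid, toy_evaluate)
```

Each combination is independent of the others, which is why sweeps like this are commonly run in parallel across machines.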

The implementation is centered around a PyTorch vision workflow using notebook-driven prototyping.

## Tags

### Education & Learning Resources

- [Computer Vision Tutorials](https://awesome-repositories.com/f/education-learning-resources/computer-vision-tutorials.md) — Offers a comprehensive collection of code samples and best practices for building deep learning image and video analysis models.
- [Implementation Recipes](https://awesome-repositories.com/f/education-learning-resources/implementation-recipes.md) — Organizes complex computer vision tasks into modular recipes and guided workflows for reproducible model building.

### Artificial Intelligence & ML

- [Object Detection and Tracking](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-detection-tracking.md) — Provides implementations for identifying objects via bounding boxes and tracking their movement across video frames.
- [Object Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-detection-tracking/object-detection.md) — Locates and identifies items within an image by generating bounding boxes and class labels. ([source](https://github.com/microsoft/computervision-recipes/tree/staging/scenarios))
- [Image Segmentation](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/image-segmentation.md) — Provides techniques for partitioning images into precise object boundaries and masks. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/detection))
- [Segmentation Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/image-segmentation/segmentation-model-training.md) — Develops segmentation systems by fine-tuning pre-trained backbones on custom annotated datasets. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/segmentation))
- [Image Classification](https://awesome-repositories.com/f/artificial-intelligence-ml/image-classification.md) — Implements supervised machine learning techniques to assign category labels to images. ([source](https://github.com/microsoft/computervision-recipes/tree/staging/scenarios))
- [Classification Training](https://awesome-repositories.com/f/artificial-intelligence-ml/image-classification/classification-training.md) — Provides implementations for building single or multi-label image classification models. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/classification))
- [Fine-Tuning Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/fine-tuning-pipelines.md) — Implements workflows for adapting pre-trained neural network backbones to custom datasets.
- [Vision Workflows](https://awesome-repositories.com/f/artificial-intelligence-ml/pytorch-training-frameworks/vision-workflows.md) — Provides Python-based guides for training, fine-tuning, and validating neural networks for visual data processing.
- [Model Deployment](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-models/model-deployment.md) — Provides a workflow for packaging trained vision models into containers for scalable inference.
- [Object Pose Estimations](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-pose-estimations.md) — Locates anatomical keypoints on the human body to determine posture and orientation. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/detection))
- [Representation Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-training/representation-learning.md) — Trains deep neural networks to compute image representations for identifying similar images. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/similarity))
- [Crowd](https://awesome-repositories.com/f/artificial-intelligence-ml/density-estimation/crowd.md) — Provides tools to estimate human density and count individuals within varied scene environments. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/README.md))
- [Human Activity Recognition](https://awesome-repositories.com/f/artificial-intelligence-ml/human-activity-recognition.md) — Analyzes video sequences to identify, categorize, and timestamp specific human activities. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/action_recognition))
- [Image Retrieval Systems](https://awesome-repositories.com/f/artificial-intelligence-ml/image-retrieval-systems.md) — Executes rapid searches for similar images using nearest neighbor search algorithms. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/similarity))
- [Keypoint Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/keypoint-detection.md) — Identifies points of interest on objects using models that detect both the object and its keypoints. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/keypoints))
- [Training Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/keypoint-detection/training-pipelines.md) — Builds custom models that localize specific points of interest on objects using a mask-based framework. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/keypoints))
- [Model Fine-Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/model-fine-tuning.md) — Adjusts pre-trained models using custom datasets to improve accuracy for specific tasks. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/README.md))
- [Vision Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/training-frameworks/model-training-frameworks/vision-model-training.md) — Trains high-accuracy models for identifying and locating objects in custom datasets. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/detection))
- [Hyperparameter Optimization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/training-efficiency/hyperparameter-optimization.md) — Utilizes grid search and parallel sweeping to find optimal model parameters for accuracy and speed. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/classification))
- [Retrieval Re-ranking](https://awesome-repositories.com/f/artificial-intelligence-ml/retrieval-re-ranking.md) — Implements k-reciprocal re-ranking to refine the accuracy of image retrieval results.
- [Hard Negative Mining](https://awesome-repositories.com/f/artificial-intelligence-ml/sampling-strategies/negative/hard-negative-mining.md) — Increases model precision by sampling difficult negative examples during the training process. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/detection))
- [Video Object Tracking](https://awesome-repositories.com/f/artificial-intelligence-ml/video-object-tracking.md) — Identifies and follows multiple distinct objects across video frames using tracking algorithms. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/tracking))
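
The hard negative mining entry in the list above can be sketched as follows. This is one simple variant under stated assumptions, not the repository's implementation: `select_hard_negatives`, the image names, and the scores are hypothetical. The idea is to run the current detector on images known to contain no objects, then add the images with the most confident false detections to the training set.

```python
def select_hard_negatives(negative_scores, top_k):
    """Pick the top_k negative images the detector is most wrongly confident about.

    negative_scores maps an image name to the highest detection score the
    current model produced on that image, which contains no objects.
    """
    ranked = sorted(negative_scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Hypothetical scores from running a detector on object-free images.
negative_scores = {
    "empty_shelf.jpg": 0.92,
    "blank_wall.jpg": 0.05,
    "reflection.jpg": 0.71,
    "floor.jpg": 0.10,
}
train_images = ["bottle_1.jpg", "bottle_2.jpg"]
# Add the hardest negatives to the training set before the next training round.
train_images += select_hard_negatives(negative_scores, top_k=2)
```

Repeating this select-and-retrain loop for a few rounds focuses training on the backgrounds the model finds most confusing.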

### Part of an Awesome List

- [Action Recognition Training](https://awesome-repositories.com/f/awesome-lists/ai/action-recognition-models/action-recognition-training.md) — Builds and evaluates models for activity classification using custom datasets or benchmark fine-tuning. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/action_recognition))
- [Model Recipes](https://awesome-repositories.com/f/awesome-lists/ai/deep-learning-models/model-recipes.md) — Provides modular, step-by-step implementations for common vision tasks like object detection and action recognition.

### Development Tools & Productivity

- [Notebook Execution Environments](https://awesome-repositories.com/f/development-tools-productivity/code-execution-environments/notebook-execution-environments.md) — Uses notebook-driven prototyping to iteratively develop and validate computer vision algorithms.

### Graphics & Multimedia

- [Image Similarity Estimation](https://awesome-repositories.com/f/graphics-multimedia/image-similarity-estimation.md) — Implements a framework for computing visual similarity and executing fast image retrieval. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/utils_cv))

### DevOps & Infrastructure

- [GPU Provisioning Services](https://awesome-repositories.com/f/devops-infrastructure/cloud-infrastructure/cloud-computing-serverless/gpu-provisioning-services.md) — Automates the provisioning of GPU-enabled virtual machines pre-configured with necessary vision libraries. ([source](https://github.com/microsoft/computervision-recipes/tree/staging/contrib))
- [Containerized Training Environments](https://awesome-repositories.com/f/devops-infrastructure/deployment-management-strategies/execution-platforms-and-targets/deployment-environments/containerized-training-environments.md) — Creates containerized images with standardized dependencies for CPU and GPU-based model training and testing. ([source](https://github.com/microsoft/computervision-recipes/tree/staging/docker))
- [Kubernetes Deployments](https://awesome-repositories.com/f/devops-infrastructure/kubernetes-deployments.md) — Provides configurations for orchestrating containerized vision models on Kubernetes for REST API inference.
- [Kubernetes Application Deployments](https://awesome-repositories.com/f/devops-infrastructure/kubernetes-deployments/kubernetes-application-deployments.md) — Packages trained vision models into containers for automated deployment to managed Kubernetes clusters. ([source](https://github.com/microsoft/computervision-recipes/blob/master/scenarios/classification/22_deployment_on_azure_kubernetes_service.ipynb))

### Web Development

- [Model Inference APIs](https://awesome-repositories.com/f/web-development/service-hosting/model-inference-apis.md) — Hosts trained models on cloud containers to expose them as REST APIs for scalable inference. ([source](https://github.com/microsoft/computervision-recipes/blob/staging/scenarios/classification))
