# Computer Vision Object Detection Libraries

> Search results for `computer vision library for object detection` on awesome-repositories.com. 113 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/computer-vision-library-for-object-detection

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/computer-vision-library-for-object-detection).**

## Results

- [jbhuang0604/awesome-computer-vision](https://awesome-repositories.com/repository/jbhuang0604-awesome-computer-vision.md) (23,074 ⭐) — This project is a comprehensive, community-driven repository that serves as a centralized catalog for computer vision research and development. It functions as a structured index of academic papers, open-source software libraries, public datasets, and educational tutorials, providing a navigation point for the complex landscape of modern vision technology.

The repository distinguishes itself through a taxonomy-based indexing system that maps the relationships between foundational research, influential academic figures, and their corresponding software implementations. By utilizing a lightweig
- [d2l-ai/d2l-en](https://awesome-repositories.com/repository/d2l-ai-d2l-en.md) (29,001 ⭐) — This project is an educational platform and research toolkit designed to teach deep learning through a combination of mathematical theory, visual diagrams, and executable code. It provides a comprehensive environment for building, training, and evaluating neural networks, grounding complex concepts in interactive computational notebooks that allow for hands-on experimentation.

The framework distinguishes itself by interleaving theoretical foundations—including linear algebra, calculus, and probability—with practical implementations across multiple industry-standard libraries. It supports flex
- [abhineet123/deep-learning-for-tracking-and-detection](https://awesome-repositories.com/repository/abhineet123-deep-learning-for-tracking-and-detection.md) (2,508 ⭐) — This project is a curated research repository and structured index focused on deep learning techniques for object detection and tracking. It serves as a centralized archive for academic papers, datasets, and software implementations, providing a cohesive resource for studying methodologies used in image and video analysis.

The repository distinguishes itself through a systematic approach to knowledge management, utilizing hierarchical file organization and metadata-driven tagging to categorize technical literature. By indexing domain-specific datasets and cross-referencing academic resources,
- [ailab-cvc/yolo-world](https://awesome-repositories.com/repository/ailab-cvc-yolo-world.md) (6,425 ⭐) — YOLO-World is a vision-language framework and open-vocabulary object detection model. It identifies objects in images and video based on free-form text prompts without requiring predefined category labels.

The system enables the identification of arbitrary objects by fusing image features with text embeddings. It includes a specialized tool for automated image labeling, which generates bounding box annotations for custom datasets using text-based prompts.

The project provides a deployment pipeline for converting models into quantized ONNX and TFLite formats, supporting real-time inference on
- [microsoft/airsim](https://awesome-repositories.com/repository/microsoft-airsim.md) (17,956 ⭐) — AirSim is a high-fidelity simulation platform designed for the development and testing of autonomous vehicles. Built as a plugin for game engines, it provides a physics-based environment that models vehicle dynamics and sensor data, serving as a foundation for robotics research, computer vision training, and reinforcement learning.

The platform distinguishes itself through its support for hardware-in-the-loop and software-in-the-loop testing, allowing developers to validate control logic and firmware against real-world signals or concurrent processes. It offers extensive programmatic control
- [google-research/vision_transformer](https://awesome-repositories.com/repository/google-research-vision-transformer.md) (12,584 ⭐) — This project is a research library and toolkit for deep learning computer vision, focused on implementing transformer and mixer-based architectures for image classification. It processes visual data by converting images into sequences of patches, allowing standard attention mechanisms to capture global dependencies without relying on traditional convolutional operations.

The framework distinguishes itself through its support for multimodal embedding analysis, which maps images and text into a shared latent vector space. This capability enables zero-shot classification and cross-modal retrieva
- [paddlepaddle/paddlex](https://awesome-repositories.com/repository/paddlepaddle-paddlex.md) (6,163 ⭐) — PaddleX is a PaddlePaddle-based framework for building, deploying, and fine-tuning AI model pipelines, with pre-built support for computer vision, OCR, document analysis, and time series tasks. It offers a toolkit of ready-to-use pipelines for image classification, object detection, segmentation, and pose estimation, alongside an end-to-end OCR document analysis pipeline that extracts text, tables, formulas, and layout information. The platform also includes a dedicated time series forecasting pipeline for analyzing historical data to detect anomalies, classify patterns, and predict future val
- [msracver/relation-networks-for-object-detection](https://awesome-repositories.com/repository/msracver-relation-networks-for-object-detection.md) (1,104 ⭐) — Relation Networks for Object Detection
- [virus01official/object-library](https://awesome-repositories.com/repository/virus01official-object-library.md) (6 ⭐) — An Object Library for LOVE2D
- [keras-team/keras](https://awesome-repositories.com/repository/keras-team-keras.md) (64,094 ⭐) — Keras is a high-level deep learning framework designed for constructing and training neural networks through the composition of modular, functional layers. It serves as a comprehensive modeling toolkit that provides standardized procedures for defining, evaluating, and deploying complex architectures. By utilizing a directed acyclic graph approach, the framework allows users to build intricate models with multiple inputs, outputs, and shared layers, ensuring consistent numerical execution through functional state management.

The project distinguishes itself as a multi-backend machine learning
- [open-mmlab/mmdetection](https://awesome-repositories.com/repository/open-mmlab-mmdetection.md) (32,756 ⭐) — This project is a modular research toolkit designed for developing, training, and evaluating deep learning models for object detection, segmentation, and video instance tracking. It provides a flexible training engine that manages complex neural network execution, including distributed training, custom lifecycle hooks, and weight optimization. The framework is built around a hierarchical configuration system that allows users to define architectures, data pipelines, and training hyperparameters through composable, inheritable files.

The project distinguishes itself through its highly modular
- [anuragreddygv323/computer-vision-projects](https://awesome-repositories.com/repository/anuragreddygv323-computer-vision-projects.md) (107 ⭐) — Computer Vision Basics - Building Your Own Custom Object Detector - Content-Based Image Retrieval - Image Classification and Machine Learning - Face Recognition - Automatic License Plate Recognition - Hadoop + Big Data - Deep Learning - Raspberry Pi Projects - Image Descriptors - Computer Vision…
- [roboflow/trackers](https://awesome-repositories.com/repository/roboflow-trackers.md) (2,565 ⭐) — This project is a multi-object tracking library and computer vision toolkit designed to maintain consistent identity IDs for objects across video frames. It provides a motion-based object tracking system that converts raw detections into stable temporal tracks, enabling the analysis of object movement and behavior over time.

The toolkit distinguishes itself through advanced identity maintenance, utilizing Kalman filters for linear motion tracking and sparse optical flow for camera motion estimation. It features multi-stage object association to recover occluded objects and non-linear motion t
- [bjarten/computer-vision-nd](https://awesome-repositories.com/repository/bjarten-computer-vision-nd.md) (134 ⭐) — Projects and exercises for the Udacity Computer Vision Nanodegree
- [facebookresearch/sam2](https://awesome-repositories.com/repository/facebookresearch-sam2.md) (19,389 ⭐) — This project is a foundation model and research toolkit designed for promptable object segmentation and temporal tracking. It provides a unified framework for isolating specific regions or objects within both static images and dynamic video sequences.

The system distinguishes itself through a streaming memory architecture that maintains temporal consistency by storing and retrieving object features across frames. This mechanism allows the model to resolve occlusions and preserve object identity even when targets move out of view or change appearance. By utilizing a shared backbone for both im
- [amusi/awesome-object-detection](https://awesome-repositories.com/repository/amusi-awesome-object-detection.md) (7,499 ⭐) — A curated list of object detection resources, based on handong1587's GitHub page: https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html
- [cocodataset/cocoapi](https://awesome-repositories.com/repository/cocodataset-cocoapi.md) (6,377 ⭐) — This project is a toolkit and API designed for parsing, manipulating, and visualizing image annotations for computer vision tasks. It provides a programming interface to load and organize Common Objects in Context annotations, specifically for object detection, image segmentation, and keypoint estimation.

The library includes tools for converting formatted JSON files into data structures that support the analysis of pixel-level masks and skeletal markers. It enables the visual verification of ground truth accuracy by rendering bounding boxes, segmentation masks, and keypoint markers directly
- [fastai/fastai](https://awesome-repositories.com/repository/fastai-fastai.md) (27,862 ⭐) — Fastai is a high-level deep learning library built on PyTorch that provides a unified interface for managing the entire machine learning lifecycle. It functions as a comprehensive training toolkit, abstracting hardware management and automating complex training loops to simplify the construction and execution of neural network models.

The framework is distinguished by its notebook-centric development environment and a type-dispatching data pipeline that automatically applies transformations based on input data formats. It emphasizes transfer learning through discriminative layer-wise optimiza
- [rafaelpadilla/object-detection-metrics](https://awesome-repositories.com/repository/rafaelpadilla-object-detection-metrics.md) (5,098 ⭐) — This project is an object detection evaluation library and benchmarking tool designed to calculate precision, recall, and average precision for computer vision models. It provides a suite of utilities for parsing bounding box coordinates from text files and calculating spatial overlap to determine detection accuracy.

The toolkit features a command line interface for comparing ground truth files against model predictions. It includes a precision-recall curve generator to visualize the relationship between precision and recall across different confidence thresholds and an intersection over unio
- [gyhandy/text2image-for-detection](https://awesome-repositories.com/repository/gyhandy-text2image-for-detection.md) (21 ⭐) — DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection
- [d2l-ai/d2l-zh](https://awesome-repositories.com/repository/d2l-ai-d2l-zh.md) (78,493 ⭐) — This project is an open-source, interactive educational platform designed to teach deep learning through a comprehensive, code-first curriculum. It provides a structured learning path that covers foundational mathematics, modern neural network architectures, and practical optimization techniques, enabling practitioners to master complex artificial intelligence concepts through hands-on experimentation.

The platform distinguishes itself by integrating technical explanations with executable Jupyter notebooks. This design allows readers to modify code and hyperparameters in real-time, facilitati
- [qubvel/segmentation_models](https://awesome-repositories.com/repository/qubvel-segmentation-models.md) (4,917 ⭐) — This is an image segmentation framework and masking toolkit for constructing binary and multi-class neural network architectures. It serves as a deep learning encoder wrapper that integrates pre-trained convolutional neural network architectures into semantic segmentation models.

The library enables the use of pre-trained backbones to isolate complex patterns and leverages transfer learning to accelerate training. It provides a collection of overlap-based loss functions and precision metrics specifically designed to evaluate and refine the accuracy of image masks.

The toolkit covers the full
- [microsoft/onnxruntime](https://awesome-repositories.com/repository/microsoft-onnxruntime.md) (19,347 ⭐) — This project is a cross-platform machine learning inference engine designed to execute pre-trained models across diverse operating systems and hardware environments. It functions as a standardized execution framework that manages the entire lifecycle of model inference, from loading and graph optimization to hardware-accelerated execution and generative sequence management.

The runtime distinguishes itself through a highly modular architecture that decouples model logic from hardware-specific kernels. By utilizing an execution provider abstraction, it enables developers to offload computation
- [ashishpatel26/500-ai-machine-learning-deep-learning-computer-vision-nlp-projects-with-code](https://awesome-repositories.com/repository/ashishpatel26-500-ai-machine-learning-deep-learning-computer-vision-nlp-projects.md) (34,579 ⭐) — This repository serves as a comprehensive, curated collection of open-source implementations focused on artificial intelligence, machine learning, and computer vision. It functions as a centralized knowledge base and technical resource index, providing students and professional engineers with a structured directory of code examples for educational and practical reference.

The project distinguishes itself through a community-driven curation model, relying on manual updates and contributions to maintain a relevant and expansive archive. By organizing these resources into categorized lists, the
- [charmve/computer-vision-in-action](https://awesome-repositories.com/repository/charmve-computer-vision-in-action.md) (2,851 ⭐) — A computer vision closed-loop learning platform where code can be run interactively online. A closed learning loop: the Chinese e-book "Computer Vision in Action: Algorithms and Applications", its source code, and a reader community (continuously updated ...) 📘 Online e-book: https://charmve.github.io/computer-vision-in-action/   👇 Project home page
- [megvii-basedetection/yolox](https://awesome-repositories.com/repository/megvii-basedetection-yolox.md) (10,504 ⭐) — YOLOX is a high-performance anchor-free YOLO that outperforms YOLOv3 through YOLOv5, with support for MegEngine, ONNX, TensorRT, ncnn, and OpenVINO. Documentation: https://yolox.readthedocs.io/
- [alicevision/meshroom](https://awesome-repositories.com/repository/alicevision-meshroom.md) (12,562 ⭐) — Meshroom is a node-based photogrammetry software designed to transform collections of two-dimensional images into three-dimensional models and scene geometry. It provides a visual interface for constructing and managing modular data pipelines, allowing users to automate complex computer vision tasks such as feature extraction, depth map estimation, and mesh generation.

The software distinguishes itself through a distributed computational framework that dispatches resource-intensive tasks across local hardware or remote render farms. By utilizing a directed acyclic graph execution model, it en
- [kuanhungchen/awesome-tiny-object-detection](https://awesome-repositories.com/repository/kuanhungchen-awesome-tiny-object-detection.md) (0 ⭐) — A curated list of Tiny Object Detection papers and related resources.
- [packtpublishing/opencv-computer-vision-projects-with-python](https://awesome-repositories.com/repository/packtpublishing-opencv-computer-vision-projects-with-python.md) (128 ⭐) — OpenCV-Computer-Vision-Projects-with-Python
- [paddlepaddle/paddledetection](https://awesome-repositories.com/repository/paddlepaddle-paddledetection.md) (14,243 ⭐) — PaddleDetection is an object detection framework designed for the end-to-end development, training, and deployment of computer vision models. It provides a comprehensive library of modular neural network architectures and pipelines that support object detection, instance segmentation, and multi-object tracking tasks.

The project distinguishes itself through a configuration-driven approach that decouples model components like backbones and heads, allowing for the flexible assembly of custom vision workflows. It incorporates advanced techniques such as anchor-free detection logic, joint detecti
- [clearml/clearml](https://awesome-repositories.com/repository/clearml-clearml.md) (6,740 ⭐) — ClearML is a comprehensive MLOps platform designed to manage the end-to-end machine learning lifecycle, from initial experimentation to production deployment. It provides a suite of integrated tools including a pipeline orchestrator for automating workflows, an experiment tracking tool for logging hyperparameters and metrics, and a metadata-driven data versioning system for managing large-scale datasets and model artifacts.

The platform is distinguished by its advanced compute management and serving capabilities. It features a GPU compute manager that supports fractional resource slicing and
- [the-ai-summer/gans-in-computer-vision](https://awesome-repositories.com/repository/the-ai-summer-gans-in-computer-vision.md) (78 ⭐) — GANs in computer vision AI Summer article series
- [allegroai/clearml](https://awesome-repositories.com/repository/allegroai-clearml.md) (6,733 ⭐) — ClearML is a comprehensive MLOps platform designed to manage the entire machine learning lifecycle. It functions as an experiment tracking tool, a data versioning system, and a pipeline orchestrator, while providing infrastructure for GPU cluster management and model serving.

The platform is distinguished by its ability to handle hybrid-cloud compute scheduling and fractional GPU allocation, allowing multiple workloads to share a single hardware accelerator. It employs a metadata-based approach to data versioning, using virtual views to track large datasets and artifacts without duplicating r
- [nerox8664/awesome-computer-vision-models](https://awesome-repositories.com/repository/nerox8664-awesome-computer-vision-models.md) (543 ⭐) — A list of popular deep learning models related to classification, segmentation and detection problems
- [itseez/opencv](https://awesome-repositories.com/repository/itseez-opencv.md) (89,221 ⭐) — OpenCV is an open-source computer vision library and visual analysis toolkit. It provides a framework for processing static images and dynamic video frames to analyze visual data and extract information using deep learning.

The project functions as a real-time image processing framework, enabling the execution of vision algorithms on live video streams for immediate analysis and data processing.

The toolkit covers a broad range of capabilities including image pattern recognition, real-time video analysis, and visual data extraction. It also supports automated visual inspection for detecting
- [ultralytics/ultralytics](https://awesome-repositories.com/repository/ultralytics-ultralytics.md) (58,468 ⭐) — Ultralytics is a comprehensive computer vision framework designed for training, validating, and deploying deep learning models across a wide range of visual recognition tasks. It provides a unified interface for core operations including object detection, instance segmentation, pose estimation, and image classification. By utilizing a modular architecture, the platform allows users to swap model components to balance inference speed and accuracy requirements for diverse applications.

The framework distinguishes itself through its support for real-time processing and flexible deployment. It in
- [facebookresearch/detectron](https://awesome-repositories.com/repository/facebookresearch-detectron.md) (26,370 ⭐) — Detectron is a Caffe2-based object detection framework and computer vision research platform. It provides implementations of neural network architectures for locating and identifying objects in images, including Mask R-CNN for generating instance segmentation masks and RetinaNet for one-stage detection.

The platform supports computer vision prototyping and object detection research through the deployment of pre-trained baseline models. This allows for the rapid implementation and evaluation of visual recognition systems.

Its capabilities cover image object localization and instance segmentation w
- [maudzung/super-fast-accurate-3d-object-detection](https://awesome-repositories.com/repository/maudzung-super-fast-accurate-3d-object-detection.md) (1,125 ⭐) — Super Fast and Accurate 3D Object Detection based on 3D LiDAR Point Clouds (The PyTorch implementation)
- [cmu-perceptual-computing-lab/openpose](https://awesome-repositories.com/repository/cmu-perceptual-computing-lab-openpose.md) (34,145 ⭐) — OpenPose is a real-time pose estimation engine designed to detect and track human body, face, hand, and foot landmarks. It functions as a multi-person motion tracker, identifying the spatial coordinates of multiple individuals simultaneously within video streams or static images. Beyond two-dimensional detection, the software acts as a three-dimensional kinematics processor, reconstructing spatial movement data from single or multiple synchronized camera perspectives.

The system distinguishes itself through a bottom-up approach that utilizes part-affinity fields to associate body parts across
- [googlechrome/lighthouse](https://awesome-repositories.com/repository/googlechrome-lighthouse.md) (30,355 ⭐) — Lighthouse is an automated diagnostic tool that evaluates web pages against industry standards for performance, accessibility, and search engine optimization. It functions as a programmatic analysis engine and a command-line utility, allowing developers to integrate comprehensive web quality checks directly into continuous integration pipelines and local development workflows.

The project distinguishes itself through a modular architecture that utilizes artifact-based data collection to ensure consistent analysis across different environments. It supports a headless execution mode for automat
- [boostorg/compute](https://awesome-repositories.com/repository/boostorg-compute.md) (1,654 ⭐) — A C++ GPU Computing Library for OpenCL
- [accumulatemore/cv](https://awesome-repositories.com/repository/accumulatemore-cv.md) (21,907 ⭐) — This project is a comprehensive deep learning framework and educational platform designed for constructing, training, and evaluating neural network architectures. It provides a modular environment for building models through tensor operations and automatic differentiation, supporting a wide range of tasks from image classification and object detection to sequential data processing.

Beyond its core technical capabilities, the project distinguishes itself by integrating professional career development resources directly into its learning ecosystem. It offers structured guidance, resume reviews,
- [expo/expo](https://awesome-repositories.com/repository/expo-expo.md) (50,111 ⭐) — Expo is a universal mobile framework designed to build native iOS and Android applications from a single codebase using web-standard technologies. It provides a comprehensive development environment that includes a unified runtime for testing, cloud-based infrastructure for compiling and signing native binaries, and automated tools for managing the entire mobile release lifecycle, including app store submission.

The framework distinguishes itself through a plugin-based native configuration engine that programmatically modifies project files, allowing developers to integrate native modules wit
- [afshinea/stanford-cs-230-deep-learning](https://awesome-repositories.com/repository/afshinea-stanford-cs-230-deep-learning.md) (7,028 ⭐) — This repository collects illustrated single-page cheat sheets that compress the core topics of Stanford's CS 230 deep learning course into visual reference summaries. The collection covers convolutional neural networks, recurrent neural networks, and practical training techniques, pairing schematic diagrams with mathematical notation to bridge intuition and formal understanding.

The cheat sheets are organized by subject area and link related concepts across topics, such as connecting vanishing gradients to LSTM gates, to reinforce the full deep learning workflow. Practical training advice on
- [kylelutz/compute](https://awesome-repositories.com/repository/kylelutz-compute.md) (1,655 ⭐) — Boost.Compute is a GPU/parallel-computing library for C++ based on OpenCL.
- [pytorch/vision](https://awesome-repositories.com/repository/pytorch-vision.md) (17,743 ⭐) — This project is a comprehensive computer vision library for the PyTorch ecosystem, providing a standardized collection of neural network architectures, datasets, and high-performance transformation utilities. It serves as a foundational framework for building, training, and deploying deep learning models, offering a centralized model registry that allows developers to instantiate architectures with pre-trained weights for tasks such as image classification, object detection, and semantic segmentation.

The library distinguishes itself through its modular approach to data and compute management
- [haifengl/smile](https://awesome-repositories.com/repository/haifengl-smile.md) (6,387 ⭐) — Smile is a comprehensive JVM machine learning library and statistical computing toolkit. It provides a suite of algorithms for classification, regression, and clustering, implemented natively for Java, Scala, and Kotlin. The project also functions as a deep learning framework, a natural language processing library, and an inference engine for large language models.

The library distinguishes itself through GPU acceleration via LibTorch bindings and support for the ONNX model interchange format. It includes specialized capabilities for large language model inference, featuring Byte-Pair Encodin
- [aladdinpersson/machine-learning-collection](https://awesome-repositories.com/repository/aladdinpersson-machine-learning-collection.md) (8,465 ⭐) — This project is a machine learning educational repository providing a collection of implementations and guides for machine learning and deep learning algorithms. It serves as a deep learning model library and a reference for training workflows, covering foundational machine learning, convolutional, recurrent, and transformer architectures.

The collection includes a generative adversarial network suite for synthesizing realistic images and performing image-to-image translation. It also functions as a computer vision implementation guide for object detection and semantic segmentation, alongside
- [apple/corenet](https://awesome-repositories.com/repository/apple-corenet.md) (6,999 ⭐) — Corenet is a deep learning training framework and computer vision model library designed for developing neural networks across vision, text, and audio modalities. It functions as a distributed training orchestrator for scaling workloads across multiple compute nodes and provides a multimodal data pipeline for processing image, text, and video data.

The project includes a model conversion toolkit for transforming weights and architectures between different machine learning frameworks. It also provides tools for optimizing model performance on Apple Silicon and reducing response latency in gene
- [jrobchin/computer-vision-basics-with-python-keras-and-opencv](https://awesome-repositories.com/repository/jrobchin-computer-vision-basics-with-python-keras-and-opencv.md) (435 ⭐) — Full tutorial of computer vision and machine learning basics with OpenCV and Keras in Python.
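
Several of the results above (for example the object detection metrics entry) describe scoring detections by spatial overlap and then deriving precision and recall. As a rough illustration of that idea, here is a minimal pure-Python sketch. It is not code from any listed repository. The function names `iou` and `precision_recall`, the `(x1, y1, x2, y2)` box format, the greedy one-to-one matching, and the 0.5 threshold are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # assumed format: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    ground_truth: List[Box], predictions: List[Box], threshold: float = 0.5
) -> Tuple[float, float]:
    """Greedy matching: each prediction claims its best unmatched ground truth.

    Assumes `predictions` is already sorted by descending confidence.
    """
    matched = set()
    true_positives = 0
    for pred in predictions:
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(pred, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    gt = [(0, 0, 2, 2), (10, 10, 12, 12)]
    preds = [(0, 0, 2, 2), (50, 50, 52, 52)]
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(precision_recall(gt, preds))
```

Real evaluation tools typically sweep the confidence threshold and integrate the resulting precision-recall curve to obtain average precision; this sketch only evaluates a single operating point.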