AlphaPose is a PyTorch-based deep learning framework for detecting and tracking human body, face, hand, and foot keypoints in images and videos. It estimates skeletal posture for multiple people and includes a multi-person pose tracker that maintains each person's identity across consecutive video frames. The project also implements tools for three-dimensional human pose reconstruction, generating joint positions and body mesh shapes from two-dimensional image data. The framework covers a broad
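As a rough illustration of what maintaining identity across consecutive frames involves, the sketch below greedily matches each new detection to the previous frame's tracks by bounding-box overlap (IoU). This is a generic simplification, not AlphaPose's tracker: the function names `iou` and `assign_ids`, the box format `(x1, y1, x2, y2)`, and the `threshold` value are all assumptions made for the example, and a real pose tracker may match on keypoints or appearance features instead.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_ids(tracks, detections, next_id, threshold=0.3):
    """Greedily match detections to existing tracks by IoU (hypothetical helper).

    tracks: dict mapping track id -> box from the previous frame.
    Returns (updated tracks dict, next unused id).
    """
    # Score every (track, detection) pair and visit the best overlaps first.
    pairs = sorted(
        ((iou(box, det), tid, di)
         for tid, box in tracks.items()
         for di, det in enumerate(detections)),
        reverse=True,
    )
    new_tracks, used_tracks, used_dets = {}, set(), set()
    for score, tid, di in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to be the same person
        if tid in used_tracks or di in used_dets:
            continue
        new_tracks[tid] = detections[di]
        used_tracks.add(tid)
        used_dets.add(di)
    # Unmatched detections start new identities.
    for di, det in enumerate(detections):
        if di not in used_dets:
            new_tracks[next_id] = det
            next_id += 1
    return new_tracks, next_id


frame1 = [(0, 0, 10, 10), (50, 50, 60, 60)]
frame2 = [(51, 50, 61, 60), (1, 0, 11, 10), (100, 100, 110, 110)]

tracks, next_id = assign_ids({}, frame1, next_id=0)
tracks, next_id = assign_ids(tracks, frame2, next_id)
```

In this toy run the two people from the first frame keep ids 0 and 1 even though the detections arrive in a different order, and the newcomer in the second frame receives id 2.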
MMPose is a PyTorch-based pose estimation toolbox and deep learning training pipeline designed for detecting 2D and 3D keypoints on humans, animals, and faces. It serves as a computer vision model zoo and a framework for both 2D pose estimation and 3D pose lifting. The project is distinguished by its modular architecture and extensibility, employing a registry-based system and hierarchical configurations to allow for custom algorithm integration and model pipeline customization. It supports diverse estimation paradigms, including top-down, bottom-up, and two-stage pose lifting workflows. The
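To make the idea of a registry-based system with hierarchical configurations concrete, here is a minimal, generic sketch of the pattern. It does not reproduce MMPose's actual API: `Registry`, `MODELS`, `merge_config`, `ToyTopDownEstimator`, and the config keys are hypothetical names chosen for illustration.

```python
class Registry:
    """Maps string names to classes so configs can refer to components by name."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register(self, cls):
        # Used as a class decorator; the class name becomes the lookup key.
        if cls.__name__ in self._modules:
            raise KeyError(f"{cls.__name__} is already registered in {self.name}")
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        # Copy so the caller's config dict is not mutated.
        args = dict(cfg)
        cls = self._modules[args.pop("type")]
        return cls(**args)


def merge_config(base, override):
    """Recursively overlay `override` on `base`, returning a new dict."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


MODELS = Registry("models")


@MODELS.register
class ToyTopDownEstimator:
    """Placeholder component; a real estimator would hold network layers."""

    def __init__(self, num_keypoints=17, input_size=(256, 192)):
        self.num_keypoints = num_keypoints
        self.input_size = input_size


# A child config inherits from a base config and overrides only what differs.
base_cfg = {"model": {"type": "ToyTopDownEstimator", "num_keypoints": 17}}
child_cfg = {"model": {"input_size": (384, 288)}}

cfg = merge_config(base_cfg, child_cfg)
model = MODELS.build(cfg["model"])
```

The point of the pattern is that a custom algorithm only has to be registered under a name; after that, swapping it into a pipeline is a config change and needs no edits to the code that builds the model.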
Detectron2 is a PyTorch computer vision framework for training and deploying object detection, image segmentation, and other visual recognition models. It provides a research-oriented environment for training complex vision models with multi-GPU acceleration. The project includes a specialized object detection library for identifying and locating multiple objects via bounding boxes, as well as an image segmentation toolkit for creating pixel-level masks through instance, semantic, and panoptic segmentation. Additionally, it features a human pose estimati
This project is a modular PyTorch framework for training and evaluating object detection and instance segmentation models. It serves as a computer vision research tool and a deep learning inference engine designed to identify object locations, classes, and pixel-level masks within images. The framework implements a two-stage inference pipeline that utilizes region proposal networks and a symmetric mask-head architecture. It provides specialized capabilities for instance segmentation, object bounding box detection, and human pose estimation via anatomical keypoint detection. The system includ