EasyMocap is a markerless 3D human motion capture system that recovers body, hand, and face poses from single-view or multi-view video, with no physical markers or suits. It fits parametric models such as SMPL, SMPL-X, and MANO. In the single-view case it can use a mirror reflection to resolve depth ambiguity: the mirror's surface normal is computed from vanishing points, which improves the accuracy of the estimated pose.
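The vanishing-point step can be illustrated with a short sketch. It relies on a standard geometric fact: the 3D segment joining a point to its mirror image is parallel to the mirror normal, so the image lines through each real/mirrored keypoint pair meet at one vanishing point v, and the normal direction is proportional to K^-1 v. The function name `mirror_normal_from_pairs` is hypothetical and not part of EasyMocap's API. The sketch also assumes a known intrinsic matrix `K`, an undistorted pinhole camera, and already-matched keypoint pairs.

```python
import numpy as np

def mirror_normal_from_pairs(K, pts, pts_mirror):
    """Estimate the mirror normal in the camera frame (up to sign).

    Hypothetical helper for illustration only.
    K          : (3, 3) camera intrinsic matrix (assumed known).
    pts        : (N, 2) pixel coordinates of keypoints on the real person.
    pts_mirror : (N, 2) pixel coordinates of the same keypoints in the mirror.
    Needs N >= 2 pairs whose connecting lines are not all identical.
    """
    pts = np.asarray(pts, dtype=float)
    pts_mirror = np.asarray(pts_mirror, dtype=float)
    ones = np.ones((len(pts), 1))
    p = np.hstack([pts, ones])          # homogeneous pixels, shape (N, 3)
    q = np.hstack([pts_mirror, ones])

    # Homogeneous image line through each real/mirrored pair.
    lines = np.cross(p, q)
    lines /= np.linalg.norm(lines, axis=1, keepdims=True)

    # The vanishing point v lies on every line (lines @ v = 0), so it is the
    # right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(lines)
    v = vt[-1]

    # Back-project the vanishing point to a 3D direction: n ~ K^-1 v.
    n = np.linalg.solve(K, v)
    return n / np.linalg.norm(n)
```

The result is defined only up to sign, since a vanishing point fixes a direction and not an orientation. With noisy detections, more pairs and coordinate normalization would likely be needed for a stable estimate.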
The system's distinguishing feature is mirror-assisted depth disambiguation, which enables accurate 3D pose reconstruction from a single RGB image or video that includes a mirror reflection. For synchronized camera setups it also supports multi-view triangulation and calibration by bundle adjustment, and it can fit parametric models to 2D keypoints and silhouettes for robust 3D pose recovery. Reconstructed motion data can be exported to standard animation formats such as BVH and ASF/AMC.
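For readers unfamiliar with multi-view triangulation, the following is a minimal sketch of the textbook linear (DLT) method for a single keypoint. It is not taken from EasyMocap's code: the function name `triangulate_dlt` is hypothetical, and the sketch assumes calibrated 3x4 projection matrices and undistorted pixel coordinates. Practical pipelines often also weight each view by its 2D detection confidence, which is omitted here.

```python
import numpy as np

def triangulate_dlt(projections, points_2d):
    """Triangulate one 3D point from two or more calibrated views.

    Hypothetical helper for illustration only.
    projections : sequence of (3, 4) projection matrices P = K [R | t].
    points_2d   : sequence of (u, v) pixel observations, one per view.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        P = np.asarray(P, dtype=float)
        # Each view gives two linear constraints on the homogeneous point X:
        #   (u * P[2] - P[0]) @ X = 0  and  (v * P[2] - P[1]) @ X = 0
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)

    # Least-squares solution: right singular vector of the smallest
    # singular value, then dehomogenize.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

The linear solution minimizes an algebraic error, not the reprojection error, so it is usually treated as an initial estimate that a nonlinear refinement step can improve.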
Additional capabilities include CNN-based pose initialization, deformable mesh tracking, and a real-time visualization pipeline for immediate feedback during capture. The project also provides a manual annotation tool for labeling bounding boxes, keypoints, and segmentation masks to create ground-truth data.