30 open-source projects similar to xlite-dev/lite.ai.toolkit, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Lite.ai.toolkit alternative.
CompreFace is a facial recognition system designed for human face detection, identification, and biometric identity verification. It provides a registry of known people and the ability to match faces in images against this database to determine a specific identity. The system extracts facial landmarks to map geometry and analyzes physical attributes including age, gender, and head pose. It can also verify whether two different images belong to the same individual. The project is implemented as a microservice-based deployment utilizing a REST API gateway and a PostgreSQL metadata store. It in
Deepface is a comprehensive deep learning library for facial recognition and demographic analysis. It provides a modular pipeline that handles the entire lifecycle of facial processing, including detection, geometric alignment, and the transformation of facial images into high-dimensional numerical vector embeddings for identity verification and similarity comparison. The library distinguishes itself through a model ensemble approach, which combines predictions from multiple pre-trained neural networks to improve classification accuracy and reduce bias. It also integrates advanced security fe
This is a Python facial recognition library designed to detect, encode, and identify human faces in images and video. It functions as a biometric identification tool that converts facial features into numerical encodings to compare and match identities. The library provides a command-line interface for batch processing face detection and recognition tasks across image directories. It also supports a GPU-accelerated vision API that uses CUDA and NVIDIA hardware to increase the speed of facial analysis and identification. Its capabilities cover human face detection and faci
tensorrtx is a computer vision inference engine and model implementation library designed for graphics processor acceleration. It provides a framework for optimizing deep learning models through a GPU inference optimizer, a deep learning model converter for transforming weights from frameworks like TensorFlow and PyTorch, and a custom plugin library to implement operations not natively supported by the TensorRT API. The project distinguishes itself through a comprehensive collection of pre-defined network implementations, ranging from various YOLO versions and DETR transformers for object det
face-api.js is a TensorFlow.js face recognition library and browser-based computer vision API. It provides tools for performing face detection, recognition, and landmark prediction within browsers and Node.js. The library includes a biometric identity descriptor generator that creates numerical vectors to compare identity and similarity between images. It features a facial landmark detection tool for mapping sixty-eight specific coordinate points on a face, as well as an age and gender estimation model. Its capabilities cover real-time facial analysis, including the recognition of facial exp
Human is a TensorFlow.js computer vision library used for face, body, and hand tracking within the browser or Node.js. It provides a framework for human pose and gesture tracking, facial recognition, and biometric liveness detection to verify a live human presence. The project distinguishes itself through a full suite of identity and motion tools, including a facial recognition framework that generates embeddings for similarity matching and a background segmenter for separating humans from their environment. It incorporates a liveness detector to prevent spoofing during facial analysis. The
Faceai is a computer vision toolkit designed for facial analysis, identity recognition, and image processing. It provides integrated engines for detecting human faces in static images and live video streams, matching facial encodings against identity databases, and mapping facial landmarks to understand geometric structure and alignment. The project enables real-time augmented reality applications, such as applying virtual makeup and digital accessories by scaling assets to detected facial coordinates. It also includes a suite for digital image restoration capable of removing noise, erasing w
PaddleX is a PaddlePaddle-based framework for building, deploying, and fine-tuning AI model pipelines, with pre-built support for computer vision, OCR, document analysis, and time series tasks. It offers a toolkit of ready-to-use pipelines for image classification, object detection, segmentation, and pose estimation, alongside an end-to-end OCR document analysis pipeline that extracts text, tables, formulas, and layout information. The platform also includes a dedicated time series forecasting pipeline for analyzing historical data to detect anomalies, classify patterns, and predict future val
jetson-inference is a set of libraries and tools for executing optimized deep learning models on embedded GPU hardware. Its primary purpose is to enable real-time computer vision and AI inference at the edge with low latency and high throughput. The project distinguishes itself through high-performance streaming analytics and the ability to execute concurrent AI pipelines on auto-grade silicon. It provides specialized support for multi-sensor stream processing, utilizing zero-copy data transport to load camera frames directly into GPU memory. The codebase covers a broad surface of capabiliti
PaddleDetection is an object detection framework designed for the end-to-end development, training, and deployment of computer vision models. It provides a comprehensive library of modular neural network architectures and pipelines that support object detection, instance segmentation, and multi-object tracking tasks. The project distinguishes itself through a configuration-driven approach that decouples model components like backbones and heads, allowing for the flexible assembly of custom vision workflows. It incorporates advanced techniques such as anchor-free detection logic, joint detecti
Ultralytics is a comprehensive computer vision framework designed for training, validating, and deploying deep learning models across a wide range of visual recognition tasks. It provides a unified interface for core operations including object detection, instance segmentation, pose estimation, and image classification. By utilizing a modular architecture, the platform allows users to swap model components to balance inference speed and accuracy requirements for diverse applications. The framework distinguishes itself through its support for real-time processing and flexible deployment. It in
This repository serves as a comprehensive collection of reference implementations for the PyTorch machine learning library. It provides practical examples for building, training, and deploying deep learning models, functioning as a toolkit for developers to explore neural network architectures and training workflows. The project distinguishes itself by offering concrete demonstrations of complex machine learning operations, ranging from computer vision tasks like object detection and depth estimation to the training of large-scale transformer models. These examples illustrate how to implement
Face-recognition.js is a computer vision software development kit for Node.js that provides tools for detecting, mapping, and identifying human faces within images and video streams. It functions as a bridge to high-performance native libraries, enabling developers to perform complex facial analysis tasks directly within JavaScript and TypeScript environments. The library distinguishes itself by combining deep learning inference with geometric landmark mapping. It utilizes pre-trained neural networks to extract facial feature vectors and employs Euclidean distance calculations to determine th
This project is a comprehensive instructional resource and course for building neural networks using PyTorch. It covers the fundamental building blocks of deep learning, including tensor manipulation, automatic differentiation, and the construction of modular neural network components. The repository serves as a technical guide for several specialized domains. It provides implementation details for computer vision tasks such as image classification, object detection, and semantic segmentation, as well as natural language processing workflows involving transformers, recurrent networks, and gen
mmagic is a multimodal training pipeline and framework for generative AI, focusing on visual synthesis and restoration. It provides the infrastructure to build and train models for tasks such as text-to-image and text-to-video generation, 3D-aware content synthesis, and high-fidelity image translation using diffusion models and generative adversarial networks. The project distinguishes itself through specialized capabilities for generative model personalization, including techniques for fine-tuning subjects and styles. It also supports advanced visual manipulations such as latent space interp
ImageAI is a Python computer vision library providing a suite of tools for image classification, object detection, and video analytics. It functions as an integrated framework for locating and labeling objects in static images and video streams, utilizing deep learning models for identification and categorization. The project includes a model training toolkit that allows for the creation of custom classifiers and detectors through scratch training or transfer learning. It features a GPU-accelerated inference engine to increase processing speed for vision tasks and includes specialized utiliti
This project is a PyTorch implementation of the YOLOv4 object detection framework. It provides a system for training and deploying neural networks that identify and locate multiple objects within images and video streams. The framework includes tools for converting trained weights into universal formats and hardware-specific optimized engines, specifically supporting ONNX and TensorRT. It features a TensorRT inference optimizer to reduce latency and increase throughput, as well as a model architecture compatible with NVIDIA DeepStream streaming analytics pipelines. The system covers model tr
YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning to high-speed inference and deployment. The framework utilizes a modular neural architecture, allowing users to swap backbone and head components to tailor models for specific visual tasks. What distinguishes this project is its focus on production-ready deployment and model ef
facenet-pytorch is a facial recognition library for PyTorch that provides pretrained neural networks for detecting faces and extracting facial embeddings. It includes an MTCNN face detector for locating faces and landmarks, alongside an InceptionResnet face encoder to convert facial images into high-dimensional vectors for identity verification. The project provides tools for identity recognition by comparing facial embeddings using cosine similarity. It also supports facial video tracking to maintain identity consistency across consecutive frames and allows for the fine-tuning of pretrained
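As a rough illustration of the comparison step this entry mentions, the sketch below scores two embedding vectors by cosine similarity using only the Python standard library. The function names and the 0.7 threshold are hypothetical choices for the example; they are not part of facenet-pytorch's API, and a real threshold would need tuning on labeled data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_identity(a, b, threshold=0.7):
    """Treat two embeddings as one identity when similarity reaches the threshold.

    The default threshold is an illustrative assumption, not a library value.
    """
    return cosine_similarity(a, b) >= threshold
```

Scores near 1 mean the vectors point in almost the same direction, and scores near 0 mean they are unrelated.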
jeelizFaceFilter is a browser-based computer vision engine and WebGL face tracking library designed for AR filters and real-time facial movement tracking. It functions as a neural network face detector that identifies multiple faces and monitors mouth movements and rotation within a web browser. The system distinguishes itself through a model-swappable detection pipeline, allowing the exchange of neural network weights to balance accuracy and performance across different camera angles and devices. It features real-time lighting synchronization to match the illumination of 3D overlays with the
YOLOv9 is a real-time computer vision framework and deep learning model designed for image classification, object detection, and instance segmentation. It functions as both a vision model and a trainer, allowing for the optimization of neural network weights on custom datasets using single or multiple GPUs. The framework utilizes programmable gradient information to perform high-speed identification and location of multiple objects within images and video streams. It extends beyond bounding box detection to provide instance segmentation and panoptic segmentation, which labels every pixel in a
Gluon-CV is an MXNet computer vision library that provides a comprehensive collection of pre-implemented vision architectures and training pipelines. It serves as a deep learning research toolkit and a model zoo containing state-of-the-art pre-trained weights for image and video analysis. The project includes a specialized human pose estimation library and a model compression toolkit. These tools allow for the pruning and quantization of deep learning models to increase inference speed and facilitate deployment on constrained edge hardware. The library covers a broad range of vision capabili
Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users to map facial identities between source and destination datasets while maintaining structural alignment and lighting consistency across video frames. The project distinguishes itself through a highly extensible plugin-based architecture that handles hardware-accelerated process
YOLOv6 is a single-stage deep learning framework designed for industrial object detection. It serves as a computer vision model trainer for identifying and locating objects within images, as well as an instance segmentation tool that delineates precise object boundaries using masks. The project includes a specialized mobile inference optimizer and a model quantization toolkit. These components focus on reducing model size and resolution to improve execution speed on ARM-based chipsets and converting models to low-precision formats to decrease file size. The framework covers a broad range of
This project is a computer vision system designed for the detection and identification of human faces within live video streams. It functions as a facial analysis pipeline that processes visual data to locate facial boundaries and match individuals against a stored database of known identities. The system utilizes a multi-stage neural network framework to isolate facial regions and extract unique identity characteristics. By converting facial image data into compact numerical vectors, it performs geometric similarity calculations to verify or identify subjects as they appear in motion. The s
AutoGluon is an automated machine learning framework and multimodal library designed to automate the end-to-end pipeline from data preprocessing to high-accuracy model training and validation. It functions as an automated model trainer for tabular, image, text, and time series data, as well as a tool for time series forecasting and foundation model finetuning. The project is distinguished by its ability to jointly process and fuse different data types, allowing for the construction of multimodal neural networks that integrate images, text, and structured tables. It supports zero-shot inferenc
Deep Java Library is a Java deep learning framework and JVM model inference engine. It provides a high-level API for building and deploying deep learning models within the Java ecosystem, acting as a cross-platform runtime for executing models across CPUs, GPUs, and mobile devices. The library is engine-agnostic, allowing users to switch between different deep learning engines such as PyTorch, TensorFlow, and MXNet while maintaining a single unified API. This enables the deployment of the same model across different backends without changing the application code. The framework supports the f
Detectron2 is a PyTorch computer vision framework designed for training and deploying models for object detection, image segmentation, and other visual recognition tasks. It provides a research-oriented environment for training complex vision models with multi-GPU acceleration. The project includes a specialized object detection library for identifying and locating multiple objects via bounding boxes, as well as an image segmentation toolkit for creating pixel-level masks through instance, semantic, and panoptic segmentation. Additionally, it features a human pose estimati
SAHI is a sliced inference framework and computer vision pipeline designed to detect small objects in high-resolution images. It provides a system for dividing large images into overlapping patches to prevent the detail loss that typically occurs during standard model downscaling, alongside an image tiling utility and a COCO dataset toolkit. The project distinguishes itself by offering a model-agnostic prediction wrapper that standardizes different machine learning frameworks into a unified interface. This allows it to implement sliced inference and object detection across various model backe
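The idea of dividing a large image into overlapping patches can be sketched generically as below. This is a minimal standalone illustration, not SAHI's own API: the function name `slice_boxes`, the 512-pixel tile size, and the 20% overlap are all assumptions made for the example.

```python
def slice_boxes(width, height, tile=512, overlap=0.2):
    """Return (x0, y0, x1, y1) boxes that cover an image with overlapping tiles."""
    if tile <= 0:
        raise ValueError("tile must be positive")
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between the left (or top) edges of neighbouring tiles.
    step = max(1, int(tile * (1 - overlap)))

    def starts(size):
        # An image smaller than one tile needs a single, clipped tile.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        # Place the final tile flush with the far edge so nothing is missed.
        positions.append(size - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

Each box would then be cropped and passed to a detector, with the resulting detections shifted back by the box's `(x0, y0)` offset and merged, for example with non-maximum suppression.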