30 open-source projects similar to lightningpixel/modly, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best Modly alternative.
Hunyuan3D-2.1 is a generative 3D framework and image-to-3D pipeline that transforms single 2D images into textured 3D geometries. It functions as an asset generator that produces high-quality 3D meshes and textures using a flow-matching system. The project includes a specialized synthesizer for creating photorealistic textures with physically based rendering properties. These tools allow for the simulation of metallic reflections and light interactions on generated models. The system covers 3D asset pipeline automation through a sequence of shape generation and mesh refinement. It also provi
InstantMesh is a neural 3D reconstruction tool and single-image 3D mesh generator. It utilizes a sparse-view large reconstruction model to convert a single two-dimensional image into a three-dimensional object mesh. The system functions as a textured 3D mesh exporter, saving generated objects with either vertex colors or full texture maps for use in external rendering software. The framework covers a range of capabilities including feed-forward geometry inference, single-image depth estimation, and neural radiance fields. It also supports differentiable mesh rendering and workflows for spars
Threestudio is a 3D generative AI framework designed to create three-dimensional assets from text prompts and images. It provides specialized pipelines for text-to-3D generation and image-to-3D reconstruction, utilizing a neural radiance field trainer to produce geometry and textures. The framework is distinguished by its support for hybrid geometry backends, including signed distance functions, tetrahedra grids, and volume grids. It employs score distillation sampling to guide the generation process and features a modular plugin system for loading custom modules and nodes. The system covers
ComfyUI-3D-Pack is a suite of custom nodes for ComfyUI that enables 3D asset generation and rendering within a node-based workflow. It provides a set of tools for reconstructing textured three-dimensional meshes and volumetric scenes from single images, multi-view images, or text prompts. The system includes a Gaussian splatting generator for creating high-fidelity volumetric 3D scene representations and a multi-view image generator to produce consistent image sets for reconstruction. It also features a single image 3D mesh tool to build geometry from a single 2D source. The toolset covers 3
jetson-inference is a set of libraries and tools for executing optimized deep learning models on embedded GPU hardware. Its primary purpose is to enable real-time computer vision and AI inference at the edge with low latency and high throughput. The project distinguishes itself through high-performance streaming analytics and the ability to execute concurrent AI pipelines on auto-grade silicon. It provides specialized support for multi-sensor stream processing, utilizing zero-copy data transport to load camera frames directly into GPU memory. The codebase covers a broad surface of capabiliti
TRELLIS is a 3D generative AI model and latent diffusion framework designed to transform natural language descriptions or reference images into textured 3D assets. It operates as a text-to-3D asset generator that utilizes structured latent representations to produce high-quality 3D meshes, Gaussians, and radiance fields. The system functions as a multi-format 3D decoder, converting internal representations into standard exchange formats such as GLB and PLY. It also serves as a 3D asset editing tool, enabling the modification of specific regions of generated objects through targeted text or im
Unity MCP is a plugin that connects the Unity Editor to AI assistants through the Model Context Protocol, enabling natural language control over scene manipulation, object creation, and editor workflows. It allows developers to generate C# scripts, modify GameObjects and components, create UI layouts, and manage assets by issuing commands through an AI interface, effectively turning the editor into a conversational development environment. The plugin distinguishes itself through a comprehensive automation system that can execute multi-step tasks from a design document, record and replay edito
Positron is a data science integrated development environment and AI-powered code editor designed for polyglot development, specifically supporting Python and R. It functions as a remote compute workspace that separates the user interface from the execution kernel via SSH or container integration. The environment features a deep integration of large language models that provide context-aware suggestions and automated data analysis by accessing real-time interpreter state, in-memory objects, and plot outputs. It distinguishes itself through a polyglot runtime bridge that enables cross-language
WaveTerm is a cross-platform terminal emulator that integrates artificial intelligence, graphical widgets, and remote session management into a unified, block-based workspace. By rendering the interface through a web-based engine, it allows users to organize their development environment into a grid of resizable, independent blocks that can host shells, interactive web content, and system monitoring tools. The platform distinguishes itself by embedding intelligent models directly into the command-line interface, enabling automated code generation, terminal output analysis, and multimodal file
Kilocode is an autonomous engineering platform designed to orchestrate AI agents for complex software development tasks. It functions as a comprehensive system for automating coding, testing, and repository management by integrating directly with your codebase and terminal. The platform provides a unified gateway for model orchestration, allowing for the management of agentic workflows, event-driven automation, and persistent session state across distributed development environments. The platform distinguishes itself through its federated task management and policy-based access control, which
pifuhd is a 3D human reconstruction framework that generates high-resolution 3D meshes of people from a single 2D image. It utilizes pixel-aligned implicit functions to map image pixels to 3D space, predicting surface occupancy and distance to create detailed geometry. The system includes a pipeline for creating digital human assets, moving from 2D image feature projection to the extraction of discrete triangular meshes. It features specialized tools for refining these models, including a post-processor that removes geometric artifacts by isolating the largest connected component of the mesh.
Terrain3D is a high-performance toolkit for creating, sculpting, and painting editable 3D landscapes within the Godot game engine. It provides a set of tools for generating landmasses and converting external geography data into optimized 3D terrain meshes. The system features a foliage manager that places vegetation across landscapes using levels of detail and shadow impostors to maintain performance. It also includes a sculpting tool that supports the creation of landmasses with holes and multi-layered texture painting. The project covers a broad capability surface including heightmap impor
Lorax is a GPU-accelerated inference server and multi-adapter engine designed for serving large language models. It functions as a high-throughput system capable of deploying models via Kubernetes and managing the dynamic swapping of Low-Rank Adaptation adapters per request. The server distinguishes itself through multi-adapter dynamic batching, which allows requests using different adapter weights to be processed in a single GPU forward pass. It employs just-in-time adapter loading and weighted adapter merging to maximize throughput and enable multi-tasking without sacrificing performance.
pgai is a PostgreSQL AI toolkit and framework designed to integrate large language models and vector embeddings directly into a database. It serves as a bridge for executing machine learning model requests and performing text-to-SQL translations within standard database queries. The project provides an automated vector embedding pipeline that handles the loading, parsing, and chunking of text from tables and unstructured documents. This system utilizes a background worker to synchronize embeddings automatically as source data changes and includes specialized tools for building retrieval-augme
DreamGaussian is a generative system and converter designed to create textured three-dimensional models from text or images using Gaussian Splatting. It functions as a pipeline for transforming two-dimensional inputs into high-fidelity 3D assets. The project provides specific workflows for converting 3D Gaussian point clouds into standard textured mesh formats compatible with external 3D software. It supports the generation of textured meshes from single images via volumetric refinement and UV texture optimization, as well as the creation of 3D models from text prompts through intermediate im
Shap-E is a generative 3D modeling system that creates three-dimensional digital assets, represented as implicit functions, from natural language descriptions or two-dimensional images. The project includes a 3D latent encoder that converts trimeshes and 3D models into latent representations using point clouds and multiview renders. It utilizes an image-to-3D generator to produce assets from synthetic view images and a text-to-3D generator to build shapes from text prompts. The system implements a pipeline involving latent
This project is a diffusion-based 3D generator and image-to-3D reconstruction system. It translates natural language descriptions or two-dimensional images into three-dimensional assets using neural radiance fields and diffusion models. The system utilizes score-distillation sampling and diffusion-based guidance to refine 3D shapes without requiring 3D training data. It includes specialized tools for transforming neural representations into exportable meshes with texture and material data, as well as a pipeline for iterative optimization of geometry and textures. The project covers a broad r
AITemplate is an ahead-of-time deep learning compiler that translates PyTorch neural networks into standalone C++ source code. It functions as a PyTorch to C++ compiler and a GPU kernel fusion engine, producing self-contained executable binaries that run inference without requiring a Python interpreter or deep learning framework runtime. The project generates optimized CUDA and HIP C++ code specifically for NVIDIA TensorCores and AMD MatrixCores. It focuses on maximizing throughput for half-precision floating-point operations through a system that combines multiple neural network operators in
FlashMLA is an LLM attention kernel library and inference acceleration library providing a collection of high-performance CUDA kernels. It implements multi-head latent attention mechanisms designed to reduce memory overhead and increase throughput during the forward and backward passes of large language models. The library utilizes quantized cache attention kernels to improve computation efficiency across both sparse and dense token processing. It specifically optimizes the prefill and decoding phases of model inference through these latent attention implementations. The project cov
TRELLIS.2 is a generative image-to-3D system that creates high-resolution 3D assets with physically based rendering materials from 2D images. It utilizes a sparse voxel representation to handle complex topologies and internal structures without relying on iso-surface fields. The project features a structured latent space representation that maps geometry and texture attributes to maintain visual fidelity. It employs an optimization-free geometry reconstruction process to decode latent representations directly into voxel grids and includes a PBR texture generator for synthesizing base color, r
BentoML is a machine learning model serving framework and GPU-accelerated inference server designed to package, deploy, and scale AI models as production-ready REST APIs. It functions as an AI model lifecycle manager and an inference graph orchestrator, enabling multiple models and custom logic to be chained into multi-step pipelines. The framework distinguishes itself through a dynamic batching engine that optimizes GPU throughput and an artifact-based packaging system that bundles model weights and dependencies into immutable archives for consistent deployment. It
Vercel is a cloud platform for building, deploying, and scaling web applications. It provides a unified infrastructure that automates the build process by detecting project frameworks and distributing static and dynamic content through a global content delivery network. The platform executes application logic using serverless functions that scale automatically based on real-time traffic demand. The platform distinguishes itself through a centralized AI gateway that proxies requests to multiple model providers, enabling standardized authentication, observability, and cost tracking. It supports
Hunyuan3D-2 is a machine learning framework designed to convert two-dimensional images into fully realized, textured three-dimensional meshes. It utilizes a generative artificial intelligence model to perform both shape construction and surface texture synthesis, enabling the automated creation of digital assets. The system distinguishes itself through a modular generative pipeline that separates geometry reconstruction from texture mapping. It employs multi-view image projection and latent diffusion techniques to ensure geometric consistency, while providing a plugin-based bridge architectur
Text Embeddings Inference is a high-performance inference server designed to host text embedding and sequence classification models as scalable API endpoints. It provides a vector embedding API to convert text into dense representations and a cross-encoder reranking server for scoring the relevance of document sequences against a query. The project features a GPU-accelerated inference engine that utilizes dynamic batching and specialized kernels to maximize throughput. It offers a high-performance binary interface via gRPC as an alternative to standard HTTP to reduce network latency and seria
This project is a framework for developing and orchestrating autonomous software agents within JVM-based applications. It provides a toolkit for embedding artificial intelligence directly into business logic, enabling agents to perform complex tasks through dynamic, goal-oriented planning rather than rigid state machines. By leveraging declarative annotations, the framework allows developers to define agent capabilities and integrate them into existing object-oriented domain models. The framework distinguishes itself through a vendor-neutral abstraction layer that allows for the seamless swap
AlphaFold3 is a biomolecular structure prediction model and bioinformatics structural analysis tool. It uses a deep learning system to predict the three-dimensional shapes of proteins, DNA, RNA, and ligands. The system functions as a diffusion-based protein folding model that predicts the spatial coordinates of biomolecular atoms and interactions. It utilizes a GPU-accelerated inference pipeline to process genetic sequences and structural templates for molecular modeling. The project covers structural bioinformatics analysis and protein interaction modeling to determine the physical arrangem
ipex-llm is an acceleration library and inference engine designed to optimize the execution and finetuning of large language models on Intel GPUs and NPUs. It provides a HuggingFace compatible model backend and a dedicated quantization toolkit for converting model weights into low-bit precision formats. The project facilitates distributed inference by splitting large model workloads across multiple accelerators using pipeline and tensor parallelism. It enables the deployment of models on Intel Arc, Flex, and Max GPUs to increase throughput and reduce latency. The library covers a broad range
WhisperLive is a real-time speech-to-text server that converts live audio streams into text using Whisper models. It functions as a backend service that receives microphone input via WebSockets and provides incremental transcriptions with word-level timestamps. The system utilizes a GPU-accelerated inference engine and a keyword-boosted transcription API to improve the recognition accuracy of domain-specific jargon, acronyms, and product names. It also includes a speaker diarization tool that clusters audio embeddings to identify and label different participants within a recording. Additiona
Casibase is an open-source platform that orchestrates multi-turn conversations with large language models and manages retrieval-augmented knowledge bases from a single interface. It provides a unified system for connecting to over 30 AI model providers, ingesting documents into vector embeddings for semantic search, and running autonomous agent loops that can drive a browser, search the web, execute commands, and integrate with external tools. The platform distinguishes itself by combining AI conversation management with infrastructure and application orchestration capabilities. It includes a
Evolution API is a collection of system components including a WhatsApp API gateway, a multi-channel messaging bridge, and a conversational AI orchestrator. It functions as an event-driven messaging middleware that links messaging platforms with large language models and external applications to automate text and audio responses. The project provides a self-hosted marketing automation platform for executing customer relationship workflows and outreach campaigns. It further distinguishes itself by routing chat conversations between different messaging services and customer support tools throug