30 open-source projects similar to modstart-lib/aigcpanel, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best Aigcpanel alternative.
cc-wf-studio is a suite of tools for visually designing, refining, and exporting AI agent workflows. It provides a visual automation orchestrator and an LLM agent workflow designer that allow users to create multi-agent sequences and tool integrations using a drag-and-drop canvas. The project features a converter that transforms these visual agent designs into markdown-formatted commands and skills for use with artificial intelligence coding assistants. It also includes an AI-driven workflow editor that enables the modification of agent logic through natural language conversations. The platf
TaskingAI is an AI agent orchestrator and application platform used to build, deploy, and scale AI-native applications. It functions as a multi-tenant backend as a service, providing the infrastructure to host and manage independent AI agent instances across multiple users or organizations on a shared architecture. The platform features a visual workflow builder and project management console, allowing users to configure agent logic and test conversation workflows through a graphical interface before moving them to a production environment. The system orchestrates large language models by st
This Python SDK provides a comprehensive toolkit for synthetic audio generation, voice cloning, and the development of conversational AI agents. It enables the creation of lifelike spoken audio from text, the replication of human voices through custom cloning, and the deployment of real-time voice agents capable of interacting with external large language models. The library distinguishes itself through deep integration of conversational AI capabilities, including the design of agent personas and the execution of real-time actions via APIs. It supports professional-grade audio production thro
LiveTalking is an interactive talking head engine and AI avatar management platform designed to synchronize synthetic speech with facial movements. It functions as a real-time orchestrator that connects large language models and text-to-speech services to neural-rendered digital humans. The project distinguishes itself through low-latency streaming capabilities and the ability to handle real-time conversational interruptions. It supports advanced audio-visual customization, including human voice cloning and the ability to drive avatar expressions using real-time webcam data. The platform cov
mini-swe-agent is an autonomous software engineering system designed to develop features and fix bugs by combining large language models with a bash interface. It operates as an agentic framework that executes coding tasks and documentation updates through a continuous cycle of model reasoning and tool execution. The project differentiates itself with a strong focus on safety and evaluation, utilizing container-based sandbox execution via Docker or Singularity to isolate command execution. It includes a batch-parallel evaluation harness to measure code-fixing accuracy against standardized sof
This project is a scalable, containerized pipeline designed to transform digital documents and image-based ebooks into narrated audiobooks. It functions as an end-to-end production platform that integrates text-to-speech synthesis, optical character recognition, and automated workflow management to convert various file formats into spoken audio. The system distinguishes itself through advanced linguistic analysis and voice synthesis capabilities, including the ability to identify characters within a text and assign them distinct voice profiles for multi-speaker narration. Users can further pe
Promptflow is a development framework and orchestrator for building applications powered by large language models. It functions as a suite of tools for designing, orchestrating, and deploying AI workflows by linking prompts, custom Python code, and language models into executable sequences. The project is distinguished by a visual AI workflow designer that allows for the creation of directed acyclic graphs of logic nodes. It provides a dedicated prompt engineering environment for versioning and comparing templates, alongside stateful execution tracing to record function calls and variable val
Eino is an AI agent development kit and LLM application framework designed for building autonomous agents and orchestrating complex language model workflows. It serves as a multi-agent orchestration engine and workflow orchestrator, providing a graph-based execution model to route data between models, tools, and retrievers. The framework distinguishes itself through a robust set of multi-agent coordination patterns, including supervisor-led management, sequential flows, and autonomous reasoning loops like ReAct. It features advanced agent execution controls such as active turn preemption, che
This project is a Java-based framework integration that provides an AI agent runtime, a graph-based AI workflow engine, and an LLM orchestration framework for Spring applications. It enables the development of stateful autonomous agents and the implementation of retrieval-augmented generation systems using document processing and vector databases. The framework distinguishes itself through a graph-based workflow runtime for designing complex AI pipelines with conditional routing and persistent state. It supports multi-agent orchestration via service-discovery coordination and provides human-i
BrowserOS is an AI agent browser orchestrator and automation framework designed to manage browser state and execute complex web workflows. It functions as a local AI browser assistant and a Model Context Protocol controller, enabling the control of browser tabs, windows, and navigation through programmable AI agents and standardized context protocols. The system distinguishes itself through a graph-based visual workflow builder for creating repeatable automation sequences and the use of markdown-based files to define agent personalities and task recipes. It supports multi-provider orchestrati
Llama-stack is a standardized orchestration stack and generative AI API gateway. It provides a unified communication layer and a consistent interface for deploying, managing, and interacting with various large language model providers and deployments. The system functions as an agent framework that manages tool execution and versioned skill bundles to automate complex tasks. It includes a batch processing system for handling large volumes of asynchronous requests through offline processing and a vector database interface for storing and searching documents to enable retrieval-augmented genera
KServe is an open platform for deploying and serving generative and predictive AI models on Kubernetes. It defines inference services as custom resources with declarative YAML specifications, enabling a Kubernetes-native approach to model deployment and lifecycle management. The platform leverages Knative-based serverless scaling for automatic scale-to-zero and revision management, and supports a pluggable serving runtime architecture that maps model formats to containerized execution environments. KServe distinguishes itself through model-aware autoscaling that scales replicas based on token
Botpress is a conversational AI builder and LLM agent platform used to design chatbot workflows and orchestrate agents powered by large language models. It provides a framework for managing the entire lifecycle of these agents, from initial creation through to deployment across various production environments. The platform includes a custom integration SDK for developing and publishing third-party connectors that extend agent capabilities. These tools allow for the creation of custom plugins that connect AI agents to external APIs and third-party services. The system supports both visual des
Duix-Avatar is an AI digital human toolkit used to create, clone, and animate realistic virtual personas. It functions as a digital persona cloning tool and a text-to-speech animation API that converts written text or audio into synthetic voice and facial motion markers. The framework provides an offline video generation engine that renders digital human animations and lip-synced videos on local hardware. It includes a specialized lip sync engine to synchronize mouth movements with audio waveforms and a pipeline for extracting facial and vocal features from source media to create synthetic re
Langflow is a low-code platform for designing and deploying multi-step AI agent pipelines and large language model sequences. It provides a visual environment to map logic and data flow between components, serving as an orchestrator for managing conversations and data retrieval across multiple autonomous agents. The platform distinguishes itself through a drag-and-drop interface that allows for the construction of complex AI pipelines without extensive boilerplate code. It enables the conversion of these internal workflows into standardized tools for external connectivity via the Model Contex
This project is a development platform for managing the lifecycle of generative artificial intelligence models. It provides a unified environment for accessing, fine-tuning, and deploying large language models, serving as an orchestrator that handles the integration of diverse models into custom applications. The platform distinguishes itself by offering a managed infrastructure for hosting and scaling models, which removes the requirement for manual server maintenance or configuration. It includes integrated tools for supervised fine-tuning and vector embedding optimization, allowing for the
This project is an LLM autonomous agent framework and orchestration tool designed to build goal-driven agents that automate complex workflows. It functions as a system for converting high-level objectives into a series of autonomous actions and managing the coordination of multiple specialized agents to solve multi-step problems. The framework features a tool integration layer that parses structured model outputs into executable functions and external API calls. It utilizes a non-blocking execution pipeline to manage task orchestration through recursive loops and asynchronous event handling.
EMO is an AI portrait animator and audio-to-video diffusion model designed to generate expressive talking head videos. It transforms a single static portrait image and an audio track into a synchronized video of a person speaking. The system focuses on digital human synthesis, producing high-fidelity facial movements and emotional cues. It synchronizes lip movements and facial gestures to match spoken voice recordings to create realistic portrait animations. The framework utilizes a diffusion process and a cross-modal alignment mechanism to ensure timing between audio signals and visual land
Plano is an AI agent orchestrator and LLM gateway proxy that unifies access to multiple AI providers through a single interoperable interface. It functions as a model routing engine that decouples applications from specific vendors using semantic aliases, allowing traffic to be shifted between providers without modifying application code. The system distinguishes itself with intent-based agent routing, which directs prompts to specialized agents based on semantic analysis. It features an interceptor-based filter chain system that acts as guardrail middleware to enforce safety policies, rewrit
Wenda is an LLM orchestration platform and custom workflow engine designed to manage multiple language model backends through a unified interface. It functions as a self-hosted AI gateway that enables the execution of complex task sequences and automated conversation flows. The system utilizes JavaScript plugins to orchestrate workflows and trigger external API calls. It supports retrieval-augmented generation by injecting relevant data from vector stores and offline files into prompts to increase response accuracy. The platform is built for private network deployments, featuring multi-user
F5-TTS is a text-to-speech system that utilizes a flow matching engine and diffusion transformers to generate fluent synthetic speech. It functions as a multilingual speech synthesizer and neural training framework, providing tools for voice cloning and high-performance inference serving. The project distinguishes itself through a voice cloning toolkit capable of mimicking specific speaker characteristics and tones from reference audio clips. It supports cross-lingual generation, allowing for the synthesis of audio across various global languages or the mixing of multiple languages within a s
Dia is a generative AI audio tool and text-to-speech synthesis engine designed for the production-ready deployment of machine learning models. It provides a framework for creating lifelike synthetic speech by conditioning generation on reference audio samples to replicate specific vocal characteristics, emotional tones, and delivery styles. The system distinguishes itself through its ability to perform custom voice cloning and precise control over audio output. Users can adjust generation parameters such as temperature and guidance scale to modify the pacing, creativity, and style of the synt
Neutts is a neural text-to-speech engine designed for real-time streaming output on edge devices such as phones and laptops. It supports voice cloning from short audio references, enabling zero-shot reproduction of a target speaker's voice, and can be fine-tuned or retrained from scratch for custom voices and styles. The system distinguishes itself through a decoder-only architecture that halves memory and accelerates generation on constrained hardware, combined with quantized model inference for reduced memory footprint. Its streaming decoder loop interleaves synthesis with playback, deliver
Summarize is a command line tool and multimodal content extractor designed to generate concise summaries from web pages, documents, and media files. It functions as an orchestrator that connects developer tools to various language model providers to process and condense information. The system provides specialized capabilities for audio and video processing, including transcription with speaker identification and the extraction of timestamped visual markers from video slides. It also includes a translation utility to convert generated summaries and extracted text into different target languag
This project is an agentic retrieval-augmented generation platform and orchestration framework designed to connect large language models to private enterprise data. It serves as a self-hosted AI gateway that integrates vector databases and external tools to automate complex information retrieval and generation tasks. The system differentiates itself through an AI agent workflow builder that orchestrates multiple specialized agents with distinct roles to solve multi-step problems. It includes a dedicated vector database integration interface for indexing private documents and a secure sandbox
Rasa is a chatbot development platform and conversational AI framework used to design, deploy, and integrate multi-turn conversational agents. It functions as an LLM orchestration engine and NLU dialogue manager, combining large language model fluency with structured business logic to control agent behavior. The framework enables the development of conversational assistants that automate text and voice interactions. It allows for the definition of conversational flows using flexible sequences and provides tools to inspect agent decisions to debug and validate the internal reasoning process.
RealtimeTTS is a real-time text-to-speech engine and stream processor designed to convert text or token streams into audio playback with minimal latency. It provides a programmatic interface for managing audio streams, synthesis progress, and the integration of local or cloud-based speech engines. The system includes a neural voice cloning tool that generates synthetic speech by extracting acoustic features from reference audio samples. It utilizes a provider-based abstraction to route synthesis requests across different neural models and cloud APIs. The project covers a range of functional
Open-Higgsfield-AI is a generative AI content studio and visual workflow orchestrator. It provides a unified interface for creating photorealistic images and videos, utilizing a node-based editor to chain multiple image, video, and audio models into automated content pipelines. The system functions as an AI video animation tool and local GPU inference engine, allowing users to run generative models on local hardware or remote servers. It includes specialized capabilities for audio-driven lip synchronization and cinematic camera controls to adjust virtual lens and focal settings. The platform
This project is an expressive text-to-speech foundation model and voice cloning system designed to synthesize human-like speech with emotional nuance and high fidelity. It functions as a finetunable speech model that can generate audio mimicking a specific person using a reference voice sample. The system distinguishes itself through a high-performance inference engine that utilizes memory caching and hardware compilation to reduce latency during the audio generation process. It further allows for synthesis quality improvements by training the language model on custom datasets consisting of a