30 open-source projects similar to intelligentnode/intelliserver, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best IntelliServer alternative.
Agenta is a Prompt Ops lifecycle manager and prompt management platform that decouples prompt engineering from application code. It serves as a centralized system for developing, versioning, and deploying prompt templates and model configurations across different environments. The platform functions as an AI agent orchestrator with a visual interface for building agent workflows and connecting models to external tools. It further acts as an evaluation framework and observability tool, utilizing OpenTelemetry to capture execution traces, monitor latency, and track token costs. The system cove
MNN is a high-performance inference engine and framework designed for on-device machine learning. It provides a comprehensive environment for executing, optimizing, and deploying neural network models directly on mobile and resource-constrained edge devices. The framework distinguishes itself through a robust model optimization toolkit that supports quantization, compression, and structural graph manipulation to minimize memory footprint and maximize execution speed. It features a modular architecture that abstracts hardware-specific backends, allowing models to run efficiently across diverse
Langchain-Chatchat is a system for building retrieval-augmented generation applications and autonomous AI agents. It integrates a knowledge base management system and an agent framework to enable language models to interact with private documents and execute multi-step tasks through external tools. The platform supports local deployment of language models on private infrastructure to operate without an internet connection. It includes a multimodal AI platform that combines vision models for image analysis with text-to-image generation capabilities. The system provides a web-based conversatio
Opik is an observability and evaluation platform designed for generative AI applications and agentic workflows. It provides a centralized environment for tracing execution flows, managing prompt templates, and monitoring production performance, allowing teams to gain visibility into complex model interactions and tool usage without requiring manual application code changes. The platform distinguishes itself through its integrated approach to the AI development lifecycle, combining distributed trace instrumentation with automated evaluation frameworks. It supports model-as-a-judge scoring, syn
Embedchain is an LLM memory management framework and RAG orchestration engine designed to provide AI agents with a persistent storage layer. It functions as a long-term memory pipeline that extracts facts from unstructured interactions and stores them as permanent knowledge base entries to retain user preferences and interaction history across sessions. The system employs a hybrid vector database interface that combines semantic embeddings with traditional keyword search. It utilizes an entity-linking knowledge graph to connect related information points and applies temporal ranking to distin
Evidently is an AI observability platform and evaluation framework designed to quantify the performance of machine learning models and large language models. It functions as a monitoring tool for detecting data drift and quality degradation in tabular datasets, while providing a specialized analyzer for the faithfulness and correctness of retrieval-augmented generation systems. The project distinguishes itself through an evaluation framework that utilizes judge models and custom rubrics to score language model outputs. It includes tools for iterative prompt optimization and the generation of
This project is a conversational AI bot that integrates large language models into WeChat accounts to provide automated responses in private and group chats. Built on the WeChaty bot framework, it functions as a bridge that enables real-time conversational interactions between a messaging account and an AI model. The system acts as an AI multimedia gateway and context manager, supporting the generation of images from text and the transcription of audio files within the chat interface. It tracks interaction histories to manage token limits and maintains coherent conversations through custom sy
GPUStack is a GPU cluster management platform and LLM inference orchestrator. It functions as a centralized system for pooling and orchestrating graphics processing units across local servers and cloud environments, serving as a heterogeneous compute manager for diverse hardware and software configurations. The system provides a secure AI model deployment gateway that serves models as scalable services using key-based authentication. It includes a GPU resource scheduler that balances workloads across accelerators and coordinates multiple inference engines to map specific AI models to compatib
Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
LangChain is a framework for building applications that chain large language models with external data sources and third-party tools. It serves as an orchestrator for autonomous agents that use language models to plan and execute multi-step tasks, while providing a toolkit for linking interoperable AI components into sequences to prototype complex model behaviors. The project provides a model-agnostic integration layer, allowing users to switch between different language model providers using a standardized interface. It also includes tools for observability and evaluation to track the perfor
Seamlessly integrate LLMs as Python functions
Langfuse is an open-source observability and evaluation platform designed for language model applications. It provides a centralized system for tracking execution traces, monitoring performance metrics, and managing prompt templates. By capturing hierarchical units of work and telemetry data, the platform enables developers to debug complex application lifecycles and analyze token usage, latency, and model interactions in production environments. The platform distinguishes itself through an integrated evaluation framework that allows for systematic benchmarking and automated scoring of model
Dify is an open-source platform for building, orchestrating, and deploying generative AI applications and autonomous agents. It provides a visual development environment that allows users to design complex, multi-step logic chains and conversational flows, which can then be published as APIs, web interfaces, or embedded widgets. The platform acts as a centralized infrastructure layer, managing model connections, prompt templates, and knowledge retrieval to support scalable AI-powered services. What distinguishes the platform is its focus on stateful application design and workflow orchestrati
Langroid is a multi-agent orchestration framework and tool integration suite designed for building complex AI applications. It serves as a multi-modal integration layer that connects diverse local and remote language models with an agentic retrieval-augmented generation system. The project distinguishes itself through a collaborative message-exchange paradigm, allowing specialized agents to delegate tasks hierarchically and coordinate via structured communication. It features an advanced state management system for conversational AI, including the ability to rewind and prune conversation hist
The platform for LLM evaluations and AI agent testing
The easiest and laziest way to build multi-agent LLM applications.
The TypeScript library for building AI applications.
Build your own LLM agent and chat with it, simply and quickly!
AutoRAG is an automation layer and optimization tool for retrieval-augmented generation. It provides a framework for measuring pipeline performance through an evaluation system and an automated search strategy that identifies the most effective combinations of retrieval and generation modules. The system distinguishes itself through AutoML-style optimization, using hyperparameter grid searches and automated trials to find the highest performing architectural configuration for a specific dataset. It includes a specialized dataset generator that creates synthetic question-answer pairs and groun
MemFree - Hybrid AI Search Engine & AI Page Generator
Guidance is a control framework and generation orchestrator for large language models. It provides a programming layer to steer model outputs through structured templates, schema enforcement, and logical flow management. The framework distinguishes itself by interleaving model generation with local code execution, enabling the use of loops and conditional branching within a single session. It employs grammar-based token constraints and regular expressions to force models to sample only from tokens that satisfy a specific structural format, ensuring strict adherence to predefined data models.
Semantic Kernel is an artificial intelligence orchestration framework designed to integrate large language models with existing codebases. It functions as an agentic workflow engine, providing a standardized interface that connects generative models to traditional application logic, data sources, and external tools to automate complex, multi-step business tasks. The platform distinguishes itself through a modular plugin architecture and a planner-based reasoning engine that decomposes high-level goals into executable sequences of functions. By utilizing a connector-based abstraction layer, it
MindSQL: A Python Text-to-SQL RAG Library simplifying database interactions. Seamlessly integrates with PostgreSQL, MySQL, SQLite, Snowflake, and BigQuery. Powered by GPT-4 and Llama 2, it enables natural language queries. Supports ChromaDB and Faiss for context-aware responses.
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
LLocalSearch is a privacy-focused search engine and agent framework that uses locally hosted large language models to search the internet and aggregate answers. It functions as a retrieval-augmented generation interface where all queries and processing remain on the user's own hardware to ensure data privacy and remove dependency on external cloud API providers. The system employs a chain of autonomous agents that perform recursive internet searches, calling search tools multiple times to gather and synthesize information. It coordinates these models to reason through complex queries, providi
Outlines is a guided generation framework designed to enforce structural constraints on large language model output in real time. It serves as a structured output generator that ensures model responses adhere to predefined JSON schemas, regular expressions, or fixed sets of choices to produce predictable and parsable results. The project provides an interface for tool calling by extracting structured function parameters from natural language prompts for programmatic execution. It also includes a prompt templating engine that decouples prompt logic from application code through reusable templa
Evals is a framework designed for automating, managing, and executing repeatable benchmarking suites to analyze the quality and performance of language models. It provides a platform for running standardized tests to measure model accuracy and track behavioral changes over time. The system distinguishes itself through a modular architecture that uses a standardized adapter layer to normalize inputs and outputs, allowing different models to be swapped and tested interchangeably. It supports the creation of custom benchmarks using proprietary data, enabling quality assurance on sensitive tasks
This project is an artificial intelligence gateway that functions as a centralized middleware layer for managing, securing, and observing interactions with language, vision, and audio models. It provides a unified interface that standardizes requests across multiple providers, enabling teams to integrate AI capabilities into their applications through a consistent set of tools and protocols. The gateway distinguishes itself through its comprehensive infrastructure governance and traffic management capabilities. It allows for policy-driven routing, automated failover, and load balancing across
Promptify is a suite of tools designed for model evaluation, prompt management, token cost tracking, structured extraction, and unified API gateway access. It provides a standardized interface to manage requests and responses across multiple large language model providers. The project features a prompt management platform for engineering and versioning prompts with structured output validation. It includes a dedicated evaluation framework to measure model performance using precision, recall, and F1 scores against labeled datasets, alongside a token cost tracker to monitor the financial expens