30 open-source projects similar to embeddedllm/jamaibase, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best JamAIBase alternative.
This project is a data processing engine and AI application platform designed for building production-grade machine learning workflows. It provides a unified programming model that handles both historical batch data and live stream ingestion, enabling the development of real-time ETL pipelines and scalable data transformation workflows. The framework distinguishes itself through differential dataflow execution, which propagates only changes through a pipeline rather than recomputing entire datasets. It supports distributed state management across worker nodes and utilizes incremental stream p
Vanna is a Python framework designed to build conversational interfaces that translate natural language into executable database queries. It functions as an enterprise-grade toolkit that connects language models to relational databases, allowing users to retrieve information through conversational prompts rather than manual code. The system maintains context across interactions by utilizing vector databases to store historical query patterns and schema metadata. The framework distinguishes itself through a focus on security and schema-aware generation. It incorporates granular access control,
This repository is a collection of guides, notebooks, and recipes for implementing advanced prompting techniques and workflow patterns with large language models. It serves as a prompt engineering guide, an evaluation suite for scoring prompt quality, and a framework for orchestrating agents and integrating external tools. The project provides implementation patterns for building applications with Claude, specifically focusing on coordinating multiple models to split complex tasks between high-reasoning and high-efficiency agents. It includes technical demonstrations for multimodal data proce
The open-source RAG platform: built-in citations, deep research, 22+ file formats, partitions, MCP server, and more.
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
RAGChecker: A Fine-grained Framework For Diagnosing RAG
Botpress is a conversational AI builder and LLM agent platform used to design chatbot workflows and orchestrate agents powered by large language models. It provides a framework for managing the entire lifecycle of these agents, from initial creation through to deployment across various production environments. The platform includes a custom integration SDK for developing and publishing third-party connectors that extend agent capabilities. These tools allow for the creation of custom plugins that connect AI agents to external APIs and third-party services. The system supports both visual des
bRAG-langchain is a framework for building retrieval augmented generation pipelines using LangChain to connect documents with language models. It functions as a vector store orchestrator that manages document indexing and retrieval strategies to improve context accuracy. The system implements an advanced retrieval pipeline featuring a semantic query router that directs natural language inputs to specific data sources or prompts. It includes a metadata filtering engine that translates natural language queries into structured schemas to narrow search results. The project covers hybrid search o
WrenAI is a platform designed to enable natural language interaction with relational and analytical databases. By combining a text-to-SQL engine with semantic data modeling, it allows users to explore structured data through plain language questions, removing the requirement for manual code generation. The system functions by grounding natural language requests in a predefined business logic layer rather than raw database schemas. This semantic approach, supported by context-aware prompt engineering, ensures that generated queries remain consistent and accurate across an organization. The pla
Langchain-Chatchat is a system for building retrieval-augmented generation applications and autonomous AI agents. It integrates a knowledge base management system and an agent framework to enable language models to interact with private documents and execute multi-step tasks through external tools. The platform supports local deployment of language models on private infrastructure to operate without an internet connection. It includes a multimodal AI platform that combines vision models for image analysis with text-to-image generation capabilities. The system provides a web-based conversatio
FlagEmbedding is a comprehensive toolkit designed for training, benchmarking, and deploying embedding models, retrieval systems, and augmented generation pipelines. It provides the necessary infrastructure to transform text into high-dimensional vector representations and organize them into searchable structures for semantic search applications. The framework distinguishes itself through specialized capabilities for fine-tuning pre-trained embedding and reranking models on domain-specific datasets. By allowing users to adapt models to unique vocabularies and specialized retrieval tasks, it en
Kotaemon is an orchestration framework designed for building modular, agentic workflows that integrate document processing, retrieval-augmented generation, and multi-step reasoning. It provides a comprehensive platform for developing document-based question answering systems, allowing users to chain language models, prompt templates, and external tools into complex, automated pipelines. The system distinguishes itself through a highly modular architecture that emphasizes component-based composition and schema-driven data exchange. It supports autonomous agents capable of decomposing complex q
Opik is an observability and evaluation platform designed for generative AI applications and agentic workflows. It provides a centralized environment for tracing execution flows, managing prompt templates, and monitoring production performance, allowing teams to gain visibility into complex model interactions and tool usage without requiring manual application code changes. The platform distinguishes itself through its integrated approach to the AI development lifecycle, combining distributed trace instrumentation with automated evaluation frameworks. It supports model-as-a-judge scoring, syn
Streamlined and promptable Fast GraphRAG framework designed for interpretable, high-precision, agent-driven retrieval workflows.
Deepeval is a framework for testing and evaluating large language model applications. It provides a suite of tools for executing automated regression tests, validating model output quality against defined standards, and tracing the execution of complex agent workflows. By integrating these capabilities into development pipelines, the platform ensures consistent performance and reliability throughout the software lifecycle. The platform distinguishes itself through its focus on programmatic validation and observability. It utilizes secondary language models to score output quality and employs
Haystack is an orchestration framework designed for building complex search and generative AI pipelines. It functions as an agentic workflow engine, enabling the construction of automated sequences that allow AI agents to perform multi-step reasoning and data analysis. The framework utilizes a modular, component-based architecture that connects processing steps into directed acyclic graphs. By employing a provider-agnostic integration layer, it decouples core logic from specific external AI services and vector databases, allowing for the flexible exchange of underlying technologies. This desi
Represent, send, store and search multimodal data
DB-GPT is an agentic data analysis platform and business intelligence AI that functions as a large language model data assistant. It provides a text-to-SQL interface and a sandboxed code execution environment to translate natural language into executable database queries and Python scripts. The platform utilizes iterative agentic reasoning to plan and execute multi-step data analysis workflows through tool calls. It features a modular skill-based extension system that allows domain knowledge and analysis workflows to be packaged into reusable functional components. The system integrates data
Ragas is an evaluation framework and performance benchmark designed to quantify the quality of retrieval augmented generation pipelines. It functions as an application optimizer to identify bottlenecks in language model workflows using automated metrics and model-based scoring. The framework includes a system for generating synthetic datasets that mimic production scenarios and edge cases to create realistic test cases. It enables reference-free assessment, allowing the evaluation of response quality by analyzing grounding in the provided context without requiring gold-standard labels. The s
A neurosymbolic perspective on LLMs
This project is a high-performance library designed for the similarity search and clustering of dense vectors across massive datasets. It functions as a vector similarity search engine, providing the necessary tools to organize complex numerical data into specialized structures that facilitate rapid retrieval and efficient querying of millions of records. The library distinguishes itself through a variety of advanced indexing and compression techniques, including hierarchical navigable small worlds for logarithmic time complexity and inverted file indexing to partition vector spaces into mana
sqlite-vec is a C-based vector library and SQLite extension that adds virtual tables for storing and querying high-dimensional embeddings. It functions as a database plugin for performing nearest neighbor searches using distance metrics such as L2, cosine, and Hamming distance. The project provides a portable embedding store that supports deployment across Android, iOS, desktop environments, and web browsers via WebAssembly. It distinguishes itself by converting numerical arrays into compact binary formats and utilizing quantization to reduce the memory footprint and storage size of vector in
Open-source tool to visualise your RAG 🔮
Graphiti is a backend framework and memory server designed to provide artificial intelligence agents with persistent, time-aware knowledge graph storage. It functions as a memory layer that enables agents to maintain context across long-term interactions by recording and evolving structured data over time. The system distinguishes itself through a specialized temporal graph database that tracks how entities and relationships change using validity windows. By combining semantic vector similarity, keyword matching, and graph topology traversal, the engine performs hybrid retrieval to locate rel
Build - Rapid Experiment - Evaluate - Observability