30 open-source projects similar to superlinked/superlinked, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Superlinked alternative.
ZenML is an extensible machine learning orchestration framework designed to manage the end-to-end lifecycle of data pipelines and AI agent workflows. It functions as a durable orchestrator that executes machine learning tasks as directed acyclic graphs, ensuring that every step is containerized for consistent performance across local, cloud, and hybrid infrastructure. By decoupling pipeline code from underlying compute and storage backends, the platform allows developers to define infrastructure-agnostic stacks that remain portable across diverse environments. The project distinguishes itself
This project is a feature-rich Go client library designed for interacting with Redis. It serves as a comprehensive interface for managing remote data stores, enabling developers to execute standard database commands, handle complex data structures, and perform asynchronous operations within Go applications. The library distinguishes itself through its support for advanced Redis capabilities, including connection pooling, pipelining, and transactional integrity. It provides specialized primitives for managing distributed clusters, including automated topology updates and request routing to sha
pgai is a PostgreSQL AI toolkit and framework designed to integrate large language models and vector embeddings directly into a database. It serves as a bridge for executing machine learning model requests and performing text-to-SQL translations within standard database queries. The project provides an automated vector embedding pipeline that handles the loading, parsing, and chunking of text from tables and unstructured documents. This system utilizes a background worker to synchronize embeddings automatically as source data changes and includes specialized tools for building retrieval-augme
SuperduperDB is an AI agent orchestrator and database-integrated machine learning platform. It serves as a framework for building stateful AI agents and retrieval-augmented generation applications by integrating large language models directly with database backends. The project enables the deployment of self-hosted AI infrastructure and the management of language models on private hardware using local checkpoints. It distinguishes itself by allowing users to attach AI components directly to data fields, triggering model execution and automated transformations based on database insertions and
This project is a high-performance BERT embedding service and inference server designed to map text sequences into fixed-length numerical vectors. It functions as a machine learning microservice and distributed model server that decouples request handling from heavy computation. The system utilizes a ZeroMQ messaging infrastructure to provide low-latency communication between distributed clients and the inference server. It incorporates server-side batch processing and GPU workload scaling to maximize hardware utilization and manage high request volumes. The platform supports semantic search
Zep is a long-term memory layer and persistent storage system for large language model applications. It functions as a memory service and vector database orchestrator that manages chat history, user preferences, and context retrieval to reduce hallucinations in AI agents. The system maintains a temporal knowledge graph that stores interaction data as dated facts to track how user preferences and environments evolve over time. It combines these knowledge graphs with a store for persisting unstructured message data at the user and session levels. The platform provides capabilities for AI conte
This project is a knowledge base plugin and RAG context manager that uses a local vector database interface to enable semantic search and relationship mapping. It transforms text into numerical vectors to find semantically related notes and excerpts based on conceptual meaning rather than keyword matches. The system differentiates itself through a semantic graph visualizer that maps notes into clusters to reveal conceptual connections. It also features a context manager capable of bundling local notes and excerpts into reusable packs to provide grounded factual bases for large language model
Searchkick is an integration library and wrapper that connects application models to search engines such as Elasticsearch and OpenSearch. It functions as a search index synchronizer, automatically mirroring database records to a search server to enable full-text and vector retrieval. The project provides a high-level interface for implementing keyword search, semantic vector search, and hybrid search. It distinguishes itself through the ability to combine traditional keyword matching with vector embeddings using reranking and fusion techniques to improve precision. The library covers the end
LangChain4j is a framework and library for building applications powered by large language models on the JVM. It provides a unified API for developing AI agents, implementing retrieval-augmented generation, and integrating generative AI capabilities into professional software built with frameworks like Spring Boot or Quarkus. The project enables the creation of autonomous agents that can reason through tasks, manage memory, and execute external tools to achieve specific goals. It differentiates itself through a unified model interface that allows developers to switch between multiple model pr
Quiver is a framework for integrating retrieval-augmented generation into applications. It provides a generative AI integration layer that connects large language models with vector stores to produce context-aware responses based on custom data. The project features a knowledge base pipeline that parses diverse file types into searchable embeddings and a vector database orchestrator to manage data across different storage implementations. It utilizes a provider-agnostic model interface, allowing users to switch between various external AI providers or local models through a single unified sys
MIRIX is an AI agent state orchestrator and long-term memory system designed to provide persistent context for large language models. It functions as a multi-modal AI memory pipeline that processes text, voice, and screen captures into structured knowledge stores, including a dedicated screen activity knowledge base. The project distinguishes itself by integrating a multi-modal observation pipeline that monitors desktop activity in real time to build a searchable history of user actions. It utilizes a multi-tiered memory hierarchy (separating episodic, semantic, procedural, and core stores) and
Metaflow is a Python machine learning framework and MLOps workflow orchestrator designed to manage the lifecycle of data pipelines from local prototyping to production. It serves as a distributed compute manager and an experiment tracking system, enabling the creation of reproducible pipelines that transition between development and high-availability production environments. The framework distinguishes itself through an integrated checkpointing system that automatically persists intermediate data artifacts to remote storage, allowing failed runs to be resumed from the last successful step. It
This project is a privacy-focused, self-hosted metasearch engine that aggregates results from a wide array of web, academic, and media sources into a single, unified interface. By acting as a proxy between the user and external search providers, it strips identifying headers and tracking parameters from requests, ensuring that search activity remains anonymous and protected from third-party profiling. The platform distinguishes itself through a modular, plugin-based architecture that allows for extensive customization of search behavior, result filtering, and interface branding. It supports a
This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to provide context-aware responses for chat and completion requests. The system distinguishes itself through a database-agnostic abstraction layer that supports various storage backends, ranging from local disk storage to enterprise-grade vector databases. It offers flexible deployment
Memgraph is an in-memory, distributed graph database designed for high-performance labeled property graph management. It utilizes a Cypher query engine for declarative data retrieval and manipulation, providing a scalable knowledge graph backend that integrates vector search and graph traversals. The system distinguishes itself as a real-time graph analytics platform, employing native C++ and CUDA implementations to execute complex network analysis and dynamic community detection on streaming data. It provides specialized support for AI integration, including GraphRAG capabilities, the constr
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
WooCommerce is a comprehensive eCommerce framework for WordPress that transforms websites into fully functional online stores for physical and digital goods. It serves as a digital storefront manager for product catalogs, inventory, and customer orders across retail and wholesale business models. The system functions as a payment gateway integrator, connecting shops to diverse processors for credit cards, digital wallets, and subscriptions. It also operates as an order fulfillment system for calculating shipping rates, generating labels, and coordinating delivery via third-party couriers, whi
This project is a retrieval-augmented generation pipeline designed for building custom ChatGPT plugins that allow language models to query private or professional documents. It implements a full retrieval workflow, from processing and indexing document chunks to retrieving relevant context for natural language queries. The system distinguishes itself through a hybrid retrieval approach that combines dense vector embeddings with sparse keyword matching, further refined by a two-stage semantic re-ranking process. It includes specialized data privacy tools for screening personally identifiable i
Argilla is a collaborative AI feedback tool and data curation management system. It serves as a human-in-the-loop dataset platform designed to coordinate workforce annotators and domain experts in labeling, rating, and refining data samples for machine learning projects. The platform focuses on large language model dataset curation and reinforcement learning from human feedback workflows. It provides a shared workspace for integrating human expertise into AI development to validate model outputs and correct data errors. The system manages the end-to-end machine learning data pipeline, includ
SynapseML is an Apache Spark machine learning library designed for building and scaling machine learning workflows and data pipelines across distributed clusters. It serves as a distributed machine learning pipeline framework and a distributed inference engine for executing hardware-accelerated predictions and deep learning tasks on large-scale datasets. The project functions as a cloud AI integration layer, allowing users to apply pretrained artificial intelligence services for text, vision, and speech within distributed pipelines. It also includes a dedicated suite of tools for distributed
Whoogle-search is a self-hosted, containerized metasearch engine designed to provide search results while stripping away advertisements, tracking scripts, and cookies. It functions as a privacy-focused proxy that fetches results from major search providers, ensuring that user activity remains isolated from the original service providers. The platform distinguishes itself through granular traffic management and request-level security. It masks user identity by rotating browser identification strings and routing queries through intermediate proxies. Users can further customize their experience
sqlite-vec is a C-based vector library and SQLite extension that adds virtual tables for storing and querying high-dimensional embeddings. It functions as a database plugin for performing nearest neighbor searches using distance metrics such as L2, cosine, and Hamming distance. The project provides a portable embedding store that supports deployment across Android, iOS, desktop environments, and web browsers via WebAssembly. It distinguishes itself by converting numerical arrays into compact binary formats and utilizing quantization to reduce the memory footprint and storage size of vector in
FastGPT is a comprehensive platform for building, deploying, and managing context-aware artificial intelligence applications. It provides a unified environment that integrates custom data sources with language models, utilizing a retrieval-augmented generation engine to ground responses in accurate, domain-specific information. The system is designed for enterprise-scale use, featuring multi-tenant architecture, administrative controls, and secure authentication protocols including OAuth 2.0 and custom single sign-on integration. The platform distinguishes itself through a visual, node-based
Searx is a privacy-respecting metasearch engine and search result aggregator. It functions as a self-hosted search proxy that queries diverse web services, databases, and local indices to present a single unified list of results. The project prevents user tracking and profiling by acting as an intermediary between the client and search services. It strips identifying information from queries, removes tracker URLs and HTTP referrers from outgoing links, and can route traffic through proxies or the Tor network to mask user identity. The system supports multilingual search and result filtering
This project is a reactive, offline-first NoSQL database engine designed for JavaScript applications. It provides a robust framework for managing application state by synchronizing data across browsers, mobile devices, and server-side runtimes. By treating local storage as the primary source of truth, it enables applications to remain functional without network connectivity, automatically reconciling changes with remote backends once a connection is restored. The database distinguishes itself through a modular architecture that supports cross-environment synchronization and high-performance d
Pagefind is a static site search engine that indexes HTML files to provide a browser-based search experience without the need for a backend server or API. It consists of a multilingual search indexer and a set of prebuilt, customizable user interface components for rendering search inputs and result lists. The system is designed for global content: the indexer detects page languages and creates independent index bundles to provide language-specific stemming and results. It further optimizes performance by using a compressed index and offloading query execution
Redis is a high-performance in-memory key-value store that functions as a distributed cache, message broker, and NoSQL database. It provides sub-millisecond read and write access to data stored in RAM and can operate as a vector database for indexing high-dimensional embeddings. The system supports a wide range of data storage and synchronization primitives, including the management of strings, hashes, lists, sets, and JSON documents. It enables real-time data operations through atomic transactions, hybrid persistence using snapshots and append-only logs, and high-availability configurations
This project provides a framework for managing multi-agent systems, designed to automate complex software development, infrastructure, and business workflows. It functions as a multi-agent workflow orchestrator that routes tasks to domain-specific workers while maintaining state persistence and infrastructure automation. By leveraging large language models, the system decomposes high-level objectives into actionable plans, ensuring that complex operations are executed with consistency and reliability. The framework distinguishes itself through its hierarchical agent registry and policy-driven
Kotaemon is an orchestration framework designed for building modular, agentic workflows that integrate document processing, retrieval-augmented generation, and multi-step reasoning. It provides a comprehensive platform for developing document-based question answering systems, allowing users to chain language models, prompt templates, and external tools into complex, automated pipelines. The system distinguishes itself through a highly modular architecture that emphasizes component-based composition and schema-driven data exchange. It supports autonomous agents capable of decomposing complex q
FoundationDB is an ACID-compliant distributed transactional key-value store. It functions as a scalable database engine that ensures strict serializability and data consistency across a cluster of servers using a shared-nothing architecture. The system is distinguished by its multi-region replication capabilities, allowing data to be synchronized across different datacenters for high availability and disaster recovery. It utilizes optimistic concurrency control to manage distributed transactions and employs a majority-based coordination system to maintain cluster state. The platform provides