Why is surrealdb/surrealdb a recommended Multi-Model Vector Storage repository?

Keeps vector embeddings alongside structured data and graph relationships within a single database to simplify data management.

Why is postgresml/postgresml a recommended Multi-Model Vector Storage repository?

Provides low-latency storage that combines vectors, text, and numeric data to serve as model inputs.

Why is cocoindex-io/cocoindex a recommended Multi-Model Vector Storage repository?

Natively handles typed multi-dimensional vectors from simple arrays to multi-vector embeddings for multimodal AI pipelines.

Why is chonkie-inc/chonkie a recommended Multi-Model Vector Storage repository?

Automatically selects and instantiates embedding providers based on model names through a registered handler system.

4 Repos

Awesome GitHub Repositories · Multi-Model Vector Storage

Storage systems that combine vector embeddings with structured and graph data.

Distinguishing note: Focuses on the integration of vectors into a multi-model database.

Explore 4 awesome GitHub repositories matching data & databases · Multi-Model Vector Storage.


surrealdb/surrealdb
32,397
SurrealDB is a multi-model database engine designed to store and query document, graph, relational, and vector data within a single ACID-compliant platform. It functions as an AI-native data store, integrating vector search, graph traversal, and machine learning model execution directly into its query layer. By providing a unified declarative query language, the platform eliminates the need for external middleware to synchronize data across different storage models. The platform distinguishes itself through its ability to manage agent memory and complex workflows natively. It allows developer
Keeps vector embeddings alongside structured data and graph relationships within a single database to simplify data management.
Rust · backend-as-a-service · cloud-database · database
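To make the idea of keeping embeddings next to structured fields and graph links concrete, here is a minimal in-memory sketch in plain Python. It is not SurrealDB's API or query language; the `Record` class, the `search` function, and the sample data are hypothetical and only illustrate one query that filters on a field, follows an edge, and ranks by vector similarity.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    """One record holding all three kinds of data (hypothetical layout)."""
    id: str
    fields: dict                                  # structured data
    embedding: list                               # vector data
    edges: set = field(default_factory=set)       # graph links to other ids


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(store, query, *, where, linked_to=None, k=3):
    """Filter by structured fields and graph link, then rank by similarity."""
    hits = []
    for rec in store.values():
        if not where(rec.fields):
            continue
        if linked_to is not None and linked_to not in rec.edges:
            continue
        hits.append((cosine(query, rec.embedding), rec.id))
    hits.sort(reverse=True)
    return [rec_id for _, rec_id in hits[:k]]


# Illustrative data only.
store = {
    "doc:1": Record("doc:1", {"lang": "en"}, [1.0, 0.0], {"author:ann"}),
    "doc:2": Record("doc:2", {"lang": "en"}, [0.0, 1.0], {"author:ann"}),
    "doc:3": Record("doc:3", {"lang": "de"}, [1.0, 0.1], {"author:bob"}),
}

result = search(
    store,
    [1.0, 0.0],
    where=lambda f: f["lang"] == "en",
    linked_to="author:ann",
    k=2,
)
```

Because all three kinds of data live on the same record, the filter, the edge check, and the ranking happen in one pass with no synchronization between separate stores.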
postgresml/postgresml
6,801
PostgresML is a machine learning database extension for PostgreSQL that integrates model training and inference directly into the database. It functions as an in-database AI platform and vector database, enabling the execution of large language models and natural language processing tasks on stored records without exporting data to external services. The system distinguishes itself by utilizing GPU acceleration to minimize latency during model predictions and employing a hybrid storage engine that maintains relational data alongside high-dimensional vectors. It allows for the building and fin
Provides low-latency storage that combines vectors, text, and numeric data to serve as model inputs.
Rust
cocoindex-io/cocoindex
6,117
Cocoindex is an incremental data processing engine that builds and maintains live indexes for AI agents, with a core focus on codebase indexing and knowledge graph extraction. The engine uses a function-graph execution model where user-defined Python functions are composed into a directed acyclic graph, and it processes data incrementally so only changed source records or code paths are re-computed, avoiding full recomputation at any scale. It supports automatic schema inference from transformation pipeline type annotations and provides full data lineage tracing, tagging every output record wi
Natively handles typed multi-dimensional vectors from simple arrays to multi-vector embeddings for multimodal AI pipelines.
Rust · agentic-data-framework · ai · ai-agents
chonkie-inc/chonkie
4,170
Chonkie is a text-chunking library designed for retrieval-augmented generation (RAG) pipelines. It serves as a semantic text splitter and RAG ingestion pipeline, transforming raw text into embedded segments for storage in vector databases. The project stands out for its specialized splitting strategies, including an AST-based code splitter that preserves logical boundaries in source code and a semantic text splitter that uses embedding models to determine boundaries based on meaning. It also provides a vector database ingestor that automates the generation of embeddings and their export to various stores. The library covers a broad range of features, including document parsing via OCR and Markdown extraction, a variety of splitting methods such as token count and hierarchical segmentation, and workflow orchestration through reusable pipelines. It supports a wide range of vector store integrations, including Qdrant, Milvus, Weaviate, and Elasticsearch, as well as data export to JSON and Hugging Face datasets. Users can run these operations through a command-line interface or deploy the system as a containerized API service.
Automatically selects and instantiates embedding providers based on model names through a registered handler system.
Python · ai · chonkie · chunker
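The "registered handler system" mentioned above can be pictured as a small registry that maps model-name patterns to provider classes. The sketch below is a generic illustration of that pattern, not Chonkie's actual code: `register`, `auto_embeddings`, both provider classes, and the matching rules are hypothetical names chosen for this example.

```python
# Registry of (matcher, provider class) pairs, checked in registration order.
_HANDLERS = []


def register(matches):
    """Class decorator: register a provider for model names accepted by `matches`."""
    def decorator(cls):
        _HANDLERS.append((matches, cls))
        return cls
    return decorator


@register(lambda name: name.startswith("text-embedding-"))
class HostedApiEmbeddings:
    """Hypothetical provider for models served behind a remote API."""
    def __init__(self, model):
        self.model = model


@register(lambda name: "/" in name)
class LocalModelEmbeddings:
    """Hypothetical provider for 'org/model' style names loaded locally."""
    def __init__(self, model):
        self.model = model


def auto_embeddings(model):
    """Instantiate the first registered provider whose matcher accepts `model`."""
    for matches, cls in _HANDLERS:
        if matches(model):
            return cls(model)
    raise ValueError(f"no embedding provider registered for {model!r}")


hosted = auto_embeddings("text-embedding-3-small")
local = auto_embeddings("some-org/some-model")
```

With this design, supporting a new provider means adding one decorated class; callers keep passing a model name and never reference provider classes directly.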
