awesome-repositories.com
Vector Databases · Awesome GitHub Repositories


Storage engines and infrastructure designed to index, store, and retrieve high-dimensional embeddings for semantic search.

Nine GitHub repositories tagged Data & Databases · Vector Databases.

  • nomic-ai/gpt4all

    77,146 · GitHub

    GPT4All is a cross-platform runtime environment designed to execute large language models directly on local consumer hardware. By leveraging an optimized C++ inference backend, it enables private, offline AI interactions without requiring an internet connection or external cloud services. The project provides a compreh

    C++ · ai-chat · llm-inference
  • mlabonne/llm-course

    75,340 · GitHub

    This project is a comprehensive educational curriculum and engineering handbook focused on the lifecycle of large language models. It serves as a structured knowledge base for machine learning practitioners, covering the fundamental mathematical and architectural principles of transformer-based sequence modeling, as we

    course · large-language-models · llm
  • redis/redis

    73,096 · GitHub

    Redis is an in-memory, key-value database designed to provide sub-millisecond latency for read and write operations. It functions as a versatile data platform, serving as a distributed cache, a message broker, a NoSQL document store, and a vector database. The system utilizes an event-driven, single-threaded loop to pr

    C · cache · caching · database
  • twitter/the-algorithm

    72,764 · GitHub

    The algorithm is a distributed recommendation engine pipeline designed to construct and serve personalized content timelines. It functions as a multi-stage orchestration layer that aggregates candidate content from diverse social graphs and high-dimensional embedding spaces, processing user interaction data to deliver

    Scala
  • pathwaycom/pathway

    59,684 · GitHub

    Pathway is a high-performance data processing framework designed for building unified batch and streaming pipelines. It functions as an orchestrator for complex data transformations, utilizing a differential dataflow engine to process updates incrementally. By treating static datasets and continuous event streams with

    Python · batch-processing · data-analytics · data-pipelines
  • zylon-ai/private-gpt

    57,116 · GitHub

    This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to prov

    Python
  • pathwaycom/llm-app

    56,311 · GitHub

    This project is a data processing engine and AI application platform designed for building production-grade machine learning workflows. It provides a unified programming model that handles both historical batch data and live stream ingestion, enabling the development of real-time ETL pipelines and scalable data transfo

    Jupyter Notebook · chatbot · hugging-face · llm
  • appwrite/appwrite

    54,884 · GitHub

    Appwrite is a backend-as-a-service platform that provides a unified development environment for building full-stack applications. It integrates essential infrastructure components—including authentication, databases, storage, and serverless functions—into a single, centralized interface to simplify application developm

    TypeScript · android · appwrite · backend
  • Mintplex-Labs/anything-llm

    54,751 · GitHub

    This platform serves as a comprehensive environment for managing private language models, document knowledge bases, and automated agent workflows within secure local infrastructure. It functions as a document-aware workspace that enables users to ingest diverse file formats into searchable repositories, ensuring that a

    JavaScript · ai-agents · custom-ai-agents · deepseek

Explore sub-tags

  • Chroma Integrations: Support for local disk-based vector storage using Chroma.
  • Local Embedding Providers: Services that generate vector embeddings from text locally on the host machine.
  • Milvus Integrations: Configuration and connectivity for Milvus vector stores.
  • PostgreSQL Vector Stores: Configurations for using PostgreSQL with vector extensions as a knowledge base.
  • Similarity Search Engines: Mechanisms for retrieving data based on geometric proximity in vector space.
  • Vector Database Integrations: Tools and configurations for connecting applications to vector stores to enable similarity search and data retrieval.
  • Vector Document Indexing: Automated workflows for indexing documents into vector databases to support real-time search and retrieval.
  • Vector Search Frameworks: Specialized tools for low-latency retrieval of vector data in AI and RAG applications.
  • Vector Storage Implementations: Educational resources or code patterns for building custom vector storage engines.
  • Vector-Database-Backed Retrievals: Systems that use vector indices to perform semantic similarity searches for context retrieval.
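
Several of the sub-tags above describe retrieval by geometric proximity in vector space. A minimal sketch of that idea follows, assuming cosine similarity as the proximity measure and a brute-force scan over an in-memory dictionary. The names `cosine_similarity`, `top_k`, and the three-dimensional toy vectors are hypothetical illustrations and are not taken from any listed repository; production vector databases typically use approximate indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Brute-force search: score every stored vector, return the k closest."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
index = {
    "doc-cache": [0.9, 0.1, 0.0],
    "doc-search": [0.1, 0.9, 0.2],
    "doc-auth": [0.0, 0.2, 0.9],
}

results = top_k([0.2, 0.8, 0.1], index, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The scan costs one similarity computation per stored vector, which is workable for small collections; the frameworks listed under the sub-tags exist largely to avoid that linear cost at scale.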