15 repositories
Systems for executing and optimizing data retrieval queries.
Distinguishing note: Focuses on server-side execution logic rather than database-level indexing.
Explore 15 awesome GitHub repositories matching data & databases · Query Processing.
Cheat.sh is a command line knowledge base that provides instant access to programming syntax, code snippets, and technical documentation. Designed to minimize context switching, it functions as a developer productivity tool that allows users to retrieve information directly within their terminal or code editor. The service distinguishes itself through a terminal-agnostic interface that relies on standard input and output streams, ensuring compatibility across various shell environments and operating systems. It supports persistent query sessions to maintain workflow continuity and offers a co
Executes search logic and content formatting on the host machine to minimize client-side requirements.
Apache Flink is a distributed processing engine designed for both high-throughput, low-latency data streams and finite batch workloads. It functions as a stateful stream processor and a SQL stream processing engine, providing a unified runtime to execute relational queries and event-based transformations. The system is distinguished by its ability to manage persistent operator state to ensure exactly-once processing guarantees and consistency during failures. It features specialized capabilities for complex event processing to detect temporal patterns and handles out-of-order events using eve
Processes data streams and batches using a language-integrated API for selections, filters, and joins.
Chat2DB is an AI-powered SQL client and multi-database GUI manager designed for managing various relational and NoSQL database systems. It serves as a visual database management tool and a natural language to SQL interface, allowing users to convert plain text descriptions into executable and optimized queries. The platform distinguishes itself through automated business intelligence capabilities, which include the generation of real-time data visualization dashboards and AI-driven data analysis from spreadsheets. To ensure data privacy, it supports secure local AI deployment, enabling large
Executes and optimizes complex queries against massive datasets for enterprise-scale environments.
localGPT is a private AI knowledge base and retrieval-augmented generation application. It provides a local document indexer, a hybrid search engine, and an inference interface to enable chatting with private documents and managing a self-hosted information repository without sending data to external servers. The system distinguishes itself through a dual-pass verification pipeline that ensures generated answers are grounded in retrieved sources, accompanied by explicit source attribution. It employs a hybrid retrieval approach combining semantic vector search with keyword matching and rerank
Breaks complex user requests into multiple sub-queries executed in parallel to synthesize a final comprehensive answer.
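The decompose-then-fan-out pattern described above can be sketched roughly as follows. This is not localGPT's code: `CORPUS`, `decompose`, `retrieve`, and `answer` are hypothetical names, the string split stands in for an LLM planner, and the dictionary lookup stands in for a real retriever.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory corpus standing in for a document index.
CORPUS = {
    "flink": "Flink processes streams and batches.",
    "vespa": "Vespa ranks results with ML models.",
}


def decompose(question: str) -> list[str]:
    """Naive stand-in for an LLM planner: split the question on ' and '."""
    parts = (part.strip(" ?") for part in question.split(" and "))
    return [part for part in parts if part]


def retrieve(sub_query: str) -> str:
    """Return the first corpus entry whose key appears in the sub-query."""
    lowered = sub_query.lower()
    for key, text in CORPUS.items():
        if key in lowered:
            return text
    return "no match"


def answer(question: str) -> str:
    """Run every sub-query in parallel, then join the partial answers."""
    sub_queries = decompose(question)
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves input order, so the synthesis step is stable.
        results = list(pool.map(retrieve, sub_queries))
    return " ".join(results)


print(answer("What is Flink and what is Vespa?"))
```

A real system would replace the final join with a synthesis step that reconciles overlapping or conflicting partial answers.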
Vercel is a cloud platform for building, deploying, and scaling web applications. It provides a unified infrastructure that automates the build process by detecting project frameworks and distributing static and dynamic content through a global content delivery network. The platform executes application logic using serverless functions that scale automatically based on real-time traffic demand. The platform distinguishes itself through a centralized AI gateway that proxies requests to multiple model providers, enabling standardized authentication, observability, and cost tracking. It supports
Provides advanced conversational capabilities for handling complex, multi-step user queries.
Entity Framework Core is an object-relational mapper that enables developers to interact with database systems using strongly-typed code. It serves as a comprehensive data access framework, providing a unified interface for mapping application objects to relational and non-relational database schemas while managing the lifecycle of data operations through a central context. The project distinguishes itself through a provider-based architecture that decouples core data access logic from specific database engines, allowing for consistent interaction across diverse storage systems. It features a
Processes query results in application memory when server-side translation is unavailable or explicitly requested.
EdgeDB is a graph-relational database that combines a PostgreSQL backend with a graph-based schema and query language. It functions as an object-relational mapper and graph query engine, allowing data to be modeled as objects and links to align storage with modern programming language structures. The system features a composable query language designed to retrieve deeply nested or interconnected data without the use of manual SQL joins. It includes an integrated AI-driven data retrieval solution with built-in support for vector embeddings. The platform provides a schema migration tool for tr
Enables the retrieval and manipulation of deeply nested or interconnected data without complex joins.
Local Deep Research is an autonomous research system consisting of an LLM research agent, a local model orchestrator, and a multi-engine search aggregator. It is designed to execute deep research by decomposing complex questions into atomic facts and synthesizing cited reports from academic, technical, and private document sources. The system features an encrypted research workspace that ensures zero-knowledge privacy through isolated, per-user encrypted databases. It utilizes a local RAG knowledge base to index research sources into searchable vector stores, allowing for retrieval-augmented
Decomposes complex research questions into smaller, atomic sub-queries to enable targeted multi-engine searches.
Albert is a keyboard launcher that opens files, applications, and runs commands by typing search queries into a search bar. It functions as a keyboard-driven workflow tool, enabling users to navigate their file system, launch installed applications, and execute shell commands without touching a mouse. The launcher processes user input through a plugin-based modular architecture, where functionality is extended by dynamically loaded C++ and Python plugins. Queries are dispatched to all enabled handlers in parallel, with results merged and ranked by a combination of match quality and historical
Processes user input through registered query handlers that return relevant results from various sources.
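As an illustration only, dispatching one query to several handlers in parallel and then merging and ranking the results might look like the sketch below. It does not use Albert's plugin API: the two handlers, the `usage` counts, and the `history_weight` factor are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A handler maps a query string to (item, match_score) pairs.
Handler = Callable[[str], list[tuple[str, float]]]


def app_handler(query: str) -> list[tuple[str, float]]:
    """Hypothetical handler: prefix-match against application names."""
    apps = ["firefox", "files", "terminal"]
    return [(a, len(query) / len(a)) for a in apps if a.startswith(query)]


def file_handler(query: str) -> list[tuple[str, float]]:
    """Hypothetical handler: substring-match against file names."""
    files = ["finance.ods", "notes.txt"]
    return [(f, len(query) / len(f)) for f in files if query in f]


def run_query(query: str, handlers: list[Handler],
              usage: dict[str, int], history_weight: float = 0.5) -> list[str]:
    """Dispatch the query to all handlers in parallel, then merge and rank."""
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda handler: handler(query), handlers))
    merged = [entry for batch in batches for entry in batch]

    def score(entry: tuple[str, float]) -> float:
        name, match = entry
        # Assumed scoring: match quality plus a weighted usage count.
        return match + history_weight * usage.get(name, 0)

    return [name for name, _ in sorted(merged, key=score, reverse=True)]


print(run_query("fi", [app_handler, file_handler], usage={"finance.ods": 3}))
```

With an empty usage history the ranking falls back to match quality alone, so shorter names that match the prefix sort first.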
This project is a reference implementation and application template for Retrieval-Augmented Generation (RAG). It integrates Azure OpenAI with Azure AI Search to enable conversational chat interfaces that provide grounded responses based on private enterprise data. The system is distinguished by its multimodal AI interface, allowing it to process and reason over combined text, image, and PDF content. It employs a hybrid search architecture that combines vector and keyword retrieval with semantic reranking to prioritize the most relevant documents for prompt augmentation. The project covers a
Decomposes complex user requests into targeted sub-queries to retrieve precise information from memory.
Vespa is a distributed search engine, vector database, and machine learning ranking engine. It serves as an AI search platform designed to handle large-scale document indexing and complex query processing across a cluster of nodes, combining keyword retrieval with high-dimensional embedding storage for semantic similarity search. The platform distinguishes itself by integrating machine learning models directly into the search pipeline to perform real-time inference and ranking. It converts these models into ranking expressions to score and order results based on relevance, while providing a s
Executes search query logic and dispatches results through a middleware layer managing the request-response cycle.
MindSearch is an LLM-based multi-agent search engine that decomposes complex user questions into targeted sub-queries and routes each to a specialized agent for parallel investigation. The system orchestrates multiple agents through a large language model, coordinating their tasks and interpreting search results to produce coherent answers from multiple sources. The project provides a configurable search backend interface that allows switching between Google, DuckDuckGo, Brave, and Bing search APIs by updating a configuration attribute. It includes a terminal-based debug interface for testing
Splits complex user questions into parallel sub-queries handled by specialized agents.
GraphQL-Ruby is a Ruby library for building GraphQL APIs with a strongly typed schema and a custom query execution engine. It provides a comprehensive framework for binding application objects to a formal type system, enabling structured data fetching through defined resolvers. The project is distinguished by advanced performance and delivery mechanisms, including a data loader for batching and caching that prevents N+1 query patterns. It supports high-performance data delivery through incremental response streaming, deferred query responses, and parallel data fetching using Fibers. It also provides native support for Relay conventions, including specialized helpers for connections and object identification. The library covers a broad surface of API management, featuring fine-grained access control, schema versioning to maintain backward compatibility, and real-time updates via subscriptions. It includes traffic management tools to protect server resources, such as query complexity limits and request rate limiting. Development and observability are supported through AST analysis tools, execution tracing, and specialized testing tools for verifying batch loading.
Processes fields across multiple objects in a single batch to reduce memory usage for large nested lists.
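The batching idea behind this entry can be illustrated with a minimal sketch in Python. It is not GraphQL-Ruby's API: `BatchLoader`, `fetch_authors`, and the sample data are hypothetical, and the sketch only shows how queuing keys first turns many per-object lookups into one batched call.

```python
class BatchLoader:
    """Collects keys first, then resolves all of them with one batch call."""

    def __init__(self, batch_fetch):
        self._batch_fetch = batch_fetch  # callable: list of keys -> dict
        self._pending = []
        self._cache = {}

    def load(self, key):
        """Queue a key and return a thunk that yields its value when called."""
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)
        return lambda: self._resolve(key)

    def _resolve(self, key):
        if self._pending:
            # One round trip for every key queued so far.
            self._cache.update(self._batch_fetch(list(self._pending)))
            self._pending.clear()
        return self._cache.get(key)


AUTHORS = {1: "Ada", 2: "Grace"}  # hypothetical backing store
calls = []  # records each batch so the number of round trips is visible


def fetch_authors(ids):
    calls.append(ids)
    return {i: AUTHORS.get(i) for i in ids}


loader = BatchLoader(fetch_authors)
posts = [
    {"title": "A", "author_id": 1},
    {"title": "B", "author_id": 2},
    {"title": "C", "author_id": 1},
]
thunks = [loader.load(post["author_id"]) for post in posts]
names = [thunk() for thunk in thunks]
print(names, calls)
```

Three posts trigger a single fetch for the two distinct author IDs, where a naive resolver would issue one lookup per post.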
Zeebe is a cloud-native workflow engine and distributed state machine designed to orchestrate business processes using the BPMN and DMN standards. It operates as a high-performance gRPC workflow engine that executes complex business processes through a partitioned event-streaming architecture. The system also acts as an orchestrator for large language model agents, coordinating AI reasoning and tool use within deterministic business processes. The engine is distinguished by a peer-to-peer broker network and a consensus-based data replication model that ensures high availability and fault tolerance. It uses a partitioned broker cluster to achieve horizontal scalability and applies adaptive request backpressure to regulate the incoming command flow and prevent system overload. The platform covers a broad surface of operational capabilities, including real-time execution monitoring with performance heatmaps, automated business decision-making via decision tables, and distributed task execution through a polling-based job worker model. It also provides tools for multi-tenant resource isolation, identity-based access control, and integration with external web APIs and serverless functions. The system can be deployed across different environments, including Kubernetes and Docker, and is managed through a combination of a command-line interface and a programmatic REST API.
Retrieves real-time process state and data via the cluster interface for monitoring and analytics.
Memary is a memory-augmented agent framework that stores and retrieves contextual information from a knowledge graph to personalize responses and maintain long-term memory across interactions. It automatically captures all agent interactions and stores them as structured memories without requiring explicit instrumentation, then injects top-ranked user entities and themes into the active context window to tailor agent responses dynamically. The framework distinguishes itself through a multi-retriever memory search that combines ColBERT reranking with recursive graph queries across databases, e
Splits user queries into sub-questions to retrieve more targeted information from memory stores.