8 repositories
Capabilities for querying and narrowing down document sets based on criteria.
Distinguishing note: Focuses on the filtering logic applied to database queries.
Explore 8 GitHub repositories matching Data & Databases · Document Filtering.
Payload is a headless content management system and application framework that uses a code-first approach to define data schemas and administrative interfaces. By utilizing a centralized, type-safe configuration object, it automatically generates database schemas, API endpoints, and a fully customizable admin panel. The system is built on a database-agnostic architecture, allowing it to interface with various storage engines while providing a unified, type-safe API for server-side operations, REST, and GraphQL. What distinguishes Payload is its deep extensibility and developer-centric design.
Filters returned document fields to optimize database performance and reduce payload size.
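As a rough illustration of field filtering, the sketch below keeps only the requested top-level fields of each document. It is not Payload's API: the `project` helper and the sample `posts` data are hypothetical, and a real system would push the selection down to the database rather than trim documents in application code.

```python
from typing import Any


def project(doc: dict[str, Any], fields: set[str]) -> dict[str, Any]:
    """Return a copy of doc containing only the requested top-level fields."""
    return {key: value for key, value in doc.items() if key in fields}


# Hypothetical sample documents, used only for illustration.
posts = [
    {"id": 1, "title": "Hello", "body": "long text ...", "author": "ana"},
    {"id": 2, "title": "World", "body": "more long text ...", "author": "ben"},
]

# Keep only the fields a listing view would need.
summaries = [project(post, {"id", "title"}) for post in posts]
```

Dropping the large `body` field is what shrinks the response; the performance benefit on the database side only appears when the storage engine itself skips the unselected fields.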
NeDB is a JavaScript embedded NoSQL document store designed for Node.js and the browser. It functions as an in-memory data store with the option to persist documents to a local file system, ensuring data survives application restarts. The project utilizes a MongoDB-compatible API to perform data operations, allowing it to serve as a lightweight document indexing system and a persistent file database without requiring a separate database server. Capabilities include querying, inserting, updating, and deleting documents, as well as the ability to create indexes on specific fields to accelerate
Retrieves documents using equality, comparison, and logical operators to filter records.
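The semantics of such a query can be sketched with a small matcher in the MongoDB query style. This is an illustrative Python model, not NeDB's implementation: the `matches` and `find` helpers and the sample `items` are assumptions, and only a handful of operators (`$lt`, `$lte`, `$gt`, `$gte`, `$ne`, `$in`, `$and`, `$or`) are covered.

```python
from typing import Any, Callable


def _ordered(compare: Callable[[Any, Any], bool]) -> Callable[[Any, Any], bool]:
    """Wrap an ordering comparison so a missing field never matches."""
    return lambda value, target: value is not None and compare(value, target)


# Comparison operators, keyed by their MongoDB-style names.
COMPARATORS: dict[str, Callable[[Any, Any], bool]] = {
    "$lt": _ordered(lambda v, t: v < t),
    "$lte": _ordered(lambda v, t: v <= t),
    "$gt": _ordered(lambda v, t: v > t),
    "$gte": _ordered(lambda v, t: v >= t),
    "$ne": lambda v, t: v != t,
    "$in": lambda v, t: v in t,
}


def matches(doc: dict[str, Any], query: dict[str, Any]) -> bool:
    """Return True if doc satisfies every condition in query."""
    for key, condition in query.items():
        if key == "$and":
            if not all(matches(doc, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(doc, sub) for sub in condition):
                return False
        elif isinstance(condition, dict) and condition and all(
            op in COMPARATORS for op in condition
        ):
            # Operator form, e.g. {"price": {"$gt": 10, "$lt": 100}}.
            value = doc.get(key)
            if not all(COMPARATORS[op](value, t) for op, t in condition.items()):
                return False
        elif doc.get(key) != condition:
            # Plain equality, e.g. {"in_stock": True}.
            return False
    return True


def find(docs: list[dict[str, Any]], query: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the documents that match the query, in their original order."""
    return [doc for doc in docs if matches(doc, query)]


# Hypothetical sample data.
items = [
    {"name": "pen", "price": 2, "in_stock": True},
    {"name": "lamp", "price": 30, "in_stock": False},
    {"name": "desk", "price": 120, "in_stock": True},
]

mid_priced = find(items, {"price": {"$gt": 10, "$lt": 100}})
cheap_or_desk = find(items, {"$or": [{"price": {"$lt": 5}}, {"name": "desk"}]})
```

Several conditions in one query object combine as an implicit AND, which is why the explicit `$and` form is only needed when the same field must appear more than once.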
TinaCMS is a headless content management framework that bridges local Git-based file storage with a visual, in-context editing interface. By treating your repository as the single source of truth, it enables developers to manage content as structured data files while providing editors with a browser-based dashboard to modify website content directly within a live preview. The framework distinguishes itself by transforming local files into a unified GraphQL API, which powers both the administrative interface and the application's data retrieval layer. This architecture allows for compile-time
Restricts selectable documents in reference fields based on property values to improve navigation in large datasets.
elasticsearch-dump is a command line tool for importing, exporting, and transferring data between Elasticsearch and OpenSearch instances. It functions as an index dump utility that saves documents, mappings, and analyzers to local files or standard output. The tool enables the movement of data between clusters using local files as an intermediary and can flatten nested JSON documents into CSV files for external analysis. It allows for the modification or anonymization of documents during the transfer process through the use of custom JavaScript functions. The utility covers data extraction a
Allows the use of search queries to filter and select specific subsets of documents for export.
AIOS is an LLM agent operating system and orchestration kernel designed to manage memory, resource scheduling, and tool execution for multiple autonomous AI agents. It serves as a comprehensive framework for developing and deploying agents, featuring a dedicated resource manager that coordinates model backends, GPU memory, and isolated kernel instances. The system distinguishes itself through a semantic memory engine that uses vector search and autonomous clustering for long-term knowledge management, and a semantic file system that allows users to control computer files and system operations
Searches file collections using text queries and keyword filters to retrieve relevant documents.
ExecuTorch is a lightweight C++ runtime for deploying PyTorch models on mobile, embedded, and edge hardware. It provides an ahead-of-time compilation pipeline that exports, quantizes, and lowers model graphs into compact serialized programs, then executes them through a minimal runtime with hardware acceleration and on-device large language model inference capabilities. The project distinguishes itself through a hardware accelerator delegate system that partitions model subgraphs and offloads computation to specialized backends including NPUs, GPUs, and DSPs from Apple, Arm, Intel, MediaTek,
Provides a utility to decode classification logits into top-1 labels for vision model outputs.
AdalFlow is an autonomous AI agent framework and LLM application library designed for building modular workflows. It serves as a model-agnostic interface and RAG pipeline orchestrator, allowing users to develop ReAct agents that use iterative reasoning and external tool execution to solve complex tasks. The project distinguishes itself through a prompt optimization system that uses textual gradient descent to automatically refine prompt templates and few-shot examples. It treats model feedback as a differentiable signal, enabling a form of LLM backpropagation that iteratively improves output quality based on evaluation metrics. The framework covers a broad surface of capabilities, including retrieval-augmented generation (RAG) with semantic vector search and reranking, span-based execution tracing for observability, and schema-driven structured parsing. It provides a unified communication layer for numerous proprietary and open-source model providers, and supports converting Python functions into standardized tool interfaces. The system is implemented in Python and integrates with MLflow for workflow tracking and analysis.
Restricts retrieved documents using SQL-like conditions or database-specific metadata filters.
Codesearch is an indexed code search engine and large-scale source indexer designed to execute regular expressions across extensive source code trees. It functions as a tool for finding specific text patterns in large codebases by analyzing and indexing massive volumes of source files for rapid retrieval. The system utilizes a specialized trigram-based search index to accelerate complex regular expression queries. This indexing approach filters candidate documents via three-character sequences before applying full regular expression scans to ensure high performance on large datasets. The eng
Identifies potential matches by executing regular expression queries against an optimized index to narrow document sets.
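The two-stage idea (trigram filter first, full regular expression second) can be sketched as follows. This is a simplified model, not Codesearch's implementation: the function names and sample `docs` are hypothetical, and the caller supplies a literal substring that every match must contain, whereas a real engine would derive the required trigrams from the regular expression itself.

```python
import re
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """Return every three-character sequence that occurs in text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each trigram to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in trigrams(text):
            index[gram].add(doc_id)
    return index


def candidates(docs: dict[str, str], index: dict[str, set[str]], literal: str) -> set[str]:
    """Keep only documents that contain every trigram of the required literal."""
    result = set(docs)
    for gram in trigrams(literal):
        result &= index.get(gram, set())
    return result


def search(docs: dict[str, str], index: dict[str, set[str]], pattern: str, literal: str) -> list[str]:
    """Run the full regex only on documents that survive the trigram filter."""
    regex = re.compile(pattern)
    return sorted(
        doc_id for doc_id in candidates(docs, index, literal)
        if regex.search(docs[doc_id])
    )


# Hypothetical sample corpus.
docs = {
    "a.go": "func main() { parseConfig() }",
    "b.go": "func helper() { return }",
    "c.go": "var parseConfigured = true",
}
index = build_index(docs)

survivors = candidates(docs, index, "parseConfig")
hits = search(docs, index, r"parseConfig\(", "parseConfig")
```

The trigram stage can return false positives (here `c.go` survives the filter but fails the regex), which is why the full scan is still needed; its job is only to shrink the set of files that must be scanned.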