19 repositories
Techniques for improving database query performance and data retrieval efficiency.
Distinguishing note: Focuses on performance tuning for list views and document linking.
Explore 19 GitHub repositories matching Data & Databases · Query Optimizations.
Agent-skills is a collection of structured instructions and behavioral personas designed to standardize how AI coding agents perform engineering tasks. It functions as a workflow orchestrator that maps natural language intent to repeatable technical sequences and verification checklists. The project distinguishes itself through the use of specialized markdown-defined roles, such as security auditors or test engineers, to apply targeted domain expertise. It employs an evidence-based verification model that requires runtime data or passing tests as mandatory exit criteria to ensure AI-generated
Provides instructions for improving database query efficiency by eliminating N+1 patterns and implementing indexing.
Payload is a headless content management system and application framework that uses a code-first approach to define data schemas and administrative interfaces. By utilizing a centralized, type-safe configuration object, it automatically generates database schemas, API endpoints, and a fully customizable admin panel. The system is built on a database-agnostic architecture, allowing it to interface with various storage engines while providing a unified, type-safe API for server-side operations, REST, and GraphQL. What distinguishes Payload is its deep extensibility and developer-centric design.
Optimizes list view performance by refining data retrieval queries.
Cube is a semantic data layer that provides a unified framework for defining business metrics, dimensions, and relationships across diverse data sources. By acting as a headless business intelligence engine, it transforms raw data into a governed model that can be queried via SQL, REST, and GraphQL interfaces. This architecture ensures consistent data definitions and logic across all downstream analytical applications and reporting tools. The platform distinguishes itself through its integrated conversational AI capabilities, which allow users to explore data using natural language. It orches
Implements pre-aggregation strategies and workload-aware settings to reduce total compute costs and improve response times.
Presto is a distributed SQL query engine designed for high-performance analytical processing across heterogeneous data sources. It functions as a data federation platform and massively parallel processing engine, allowing users to execute interactive queries against diverse storage systems without requiring data migration. By mapping remote metadata and structures to a unified relational namespace, it enables seamless cross-platform analysis through a standard SQL interface. The engine distinguishes itself through a pluggable connector architecture and a shared-nothing distributed processing
Arranges table data during write operations using specific sort orders to make filtering and searching more efficient.
VictoriaMetrics is a high-performance, scalable time series database and observability platform designed for long-term storage and analysis of metric, log, and trace data. It functions as a unified backend for monitoring ecosystems, offering full compatibility with industry-standard protocols and query languages. The system is built to handle massive data volumes through a distributed architecture that supports horizontal scaling and efficient data lifecycle management. The platform distinguishes itself through a storage engine that utilizes consistent hashing for data sharding and log-struct
Identifies execution bottlenecks by tracing query paths and formatting expressions to ensure efficient data retrieval.
Dask is a parallel computing framework and distributed task scheduler designed to scale Python data science workflows from single machines to large clusters. It functions as a cluster resource manager that orchestrates computational logic by representing tasks and their dependencies as directed acyclic graphs. This architecture lets the system automate the distribution of workloads across available hardware while managing complex execution requirements. The project distinguishes itself through a lazy evaluation engine that defers data operations until they are explicitly requested, enabling global graph optimization and efficient resource allocation. It incorporates memory-aware data spilling to prevent system failures when processing datasets that exceed available memory, and uses task graph fusion to combine sequences of operations into single execution steps, minimizing scheduling overhead and inter-node communication. The platform provides a comprehensive capability surface for large-scale data analysis, including support for distributed machine learning, high-performance computing integration, and parallel data processing. It offers extensive tools for cluster lifecycle management, performance profiling, and real-time monitoring of task execution. Users can deploy these environments on diverse infrastructure, including local hardware, cloud providers, containerized systems, and high-performance computing clusters.
Analyzes and transforms computation graphs to reduce data movement and minimize input-output operations.
Mybatis-PageHelper is a pagination plugin and persistence framework extension for MyBatis. It functions as a physical pagination engine that automatically appends limit and offset clauses to SQL queries to retrieve specific record subsets from a data source. The project optimizes data retrieval by modifying SQL statements at runtime to reduce memory overhead. It implements database pagination and data set windowing to manage the retrieval of paginated data within Java applications. The system utilizes a MyBatis interceptor chain for dynamic SQL rewriting and employs database dialects to ensu
Optimizes performance by limiting the amount of data fetched from the database during large record requests.
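As a rough illustration of the limit/offset pagination described above, here is a minimal Python sketch using the standard-library `sqlite3` module. It is not Mybatis-PageHelper's API (that plugin targets Java and MyBatis); the `items` table and the `fetch_page` helper are hypothetical names chosen for this example.

```python
import sqlite3


def fetch_page(conn, page, page_size):
    """Return one page of rows via LIMIT/OFFSET; `page` is 1-based (hypothetical helper)."""
    offset = (page - 1) * page_size
    cursor = conn.execute(
        # A stable ORDER BY is needed so that pages do not overlap or skip rows.
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cursor.fetchall()


# Hypothetical in-memory table with 25 rows, used only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 26)],
)

second_page = fetch_page(conn, page=2, page_size=10)  # rows 11..20
last_page = fetch_page(conn, page=3, page_size=10)    # the remaining 5 rows
```

Only the requested window of rows crosses the database boundary, which is the memory saving the entry refers to. For very deep pages, large offsets can still be slow because the database must walk past the skipped rows.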
This project is a comprehensive guide to architectural standards and coding patterns for developing maintainable applications within the Laravel framework. It focuses on clean code standards, applying the single responsibility and DRY principles to ensure codebase predictability and consistency. The guide emphasizes decoupling components by moving business logic into service layers and shifting input validation into dedicated request classes to keep controllers lean. It advocates for the use of a service container and dependency injection to reduce class coupling and improve testability. The
Provides techniques for improving database query performance and data retrieval efficiency.
This project is a comprehensive library for numerical linear algebra and scientific computing, designed to provide optimized routines for matrix decomposition, statistical modeling, and high-performance data analysis. It serves as both a toolkit for solving complex linear systems and an educational resource for understanding the fundamental algorithms behind matrix factorizations and numerical solvers. The library distinguishes itself through a focus on randomized numerical linear algebra, utilizing probabilistic algorithms and approximate methods to perform dimensionality reduction and matri
Structures computations to minimize data movement between memory hierarchies for quicker retrieval.
Boto3 is the AWS SDK for Python, providing a programmatic interface for managing and automating AWS cloud infrastructure and services. It serves as a cloud management API client and resource manager for provisioning, configuring, and scaling virtual servers, databases, and storage. The library enables the implementation of infrastructure-as-code through declarative templates and scripts, allowing for the deployment of identical resource stacks across multiple accounts and geographic regions. It also provides a framework for coordinating distributed workflows, serverless functions, and contain
Organizes transferred data into partitions to optimize search efficiency and query performance.
Odin is a compiled, statically typed systems programming language designed for high-performance software development. It focuses on pragmatic low-level memory control, providing a toolset for manual memory management and precise control over hardware utilization. The language is distinguished by its flexible memory model, which includes custom allocators and precise data layout capabilities to optimize resource usage. It features a comprehensive foreign function interface for importing assembly files and linking with external libraries using configurable calling conventions. The type system
Supports organizing records as arrays of structures or structures of arrays to maximize hardware acceleration.
This project is a software engineering style guide and a curated collection of architectural patterns and coding standards. It provides a multi-language coding standard to ensure maintainable software across Ruby, Python, JavaScript, and Swift. The project establishes a development workflow specification for version control, continuous integration, and peer review to maintain a linear project history. It also includes a web accessibility framework based on ARIA and WCAG standards, using design tokens and semantic HTML patterns to build inclusive interfaces. The guides cover a broad range of
Implements techniques like foreign key indexing and selective column retrieval to optimize query speed.
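The two techniques named above, foreign key indexing and selective column retrieval, can be sketched with Python's standard-library `sqlite3` module. This is an illustrative assumption, not code from the guide itself; the `customers` and `orders` tables, the index name, and the `plan` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL,
        notes TEXT
    );
    """
)

# Selective column retrieval: name only the columns needed instead of SELECT *.
query = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan(sql):
    """Return SQLite's query plan text for `sql` (hypothetical helper)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)


plan_before = plan(query)  # without an index, the table is scanned

# Foreign key indexing: index the column used to look up child rows.
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")

plan_after = plan(query)   # the plan should now mention the new index
```

Comparing `plan_before` and `plan_after` shows whether the database switched from a full scan to an index lookup. The exact plan wording differs between SQLite versions, so the sketch only checks for the index name.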
Delta is a lakehouse table format that brings ACID transactions and data warehouse consistency to large scale data lakes on cloud object storage. It serves as an ACID transaction manager, coordinating atomic commits and serializable isolation for concurrent reads and writes across distributed compute engines. The project provides a multi-engine interoperability layer that uses format translation to allow diverse SQL engines and processing frameworks to read and write the same tables. It functions as a data versioning system, utilizing a transaction log to enable time travel, historical snapsh
Applies advanced sorting and data skipping techniques to reduce the volume of scanned data.
Jeesite is a full-stack low-code development framework designed for building enterprise administrative portals using Spring Boot, MyBatis, and Vue. It functions as a comprehensive platform for creating administrative dashboards with integrated role-based access control and organizational data permission systems. The framework distinguishes itself through a combination of automated CRUD code generation and an integrated RAG platform that connects large language models to enterprise data via vector stores. It further incorporates a BPMN-based workflow engine to automate complex business process
Automatically optimizes database queries and filtered lists using class-level metadata.
Bullet is an Active Record performance monitor and query profiler for Ruby on Rails applications. It serves as a diagnostic utility to identify inefficient database access patterns, flag redundant requests, and suggest eager loading strategies to improve response times. The tool specifically detects N+1 queries, missing counter caches, and unused eager loading. It monitors these patterns across both standard web requests and background jobs, identifying records that are fetched but never accessed to reduce memory usage and query overhead. Analysis is supported by a system that intercepts dat
Identifies and fixes N+1 queries and missing counter caches to improve application response times and reduce database load.
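To make the N+1 pattern concrete, the following Python sketch contrasts one query per parent row with a single joined query. It uses the standard-library `sqlite3` module and does not reflect Bullet's Ruby API; the `authors` and `posts` tables and the `CountingConnection` wrapper are hypothetical.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that counts how many queries are executed."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def run(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


raw = sqlite3.connect(":memory:")
raw.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES authors(id),
        title TEXT
    );
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO posts VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'L1');
    """
)


def titles_n_plus_one(db):
    """One query for the authors, then one more query per author (the N+1 pattern)."""
    result = {}
    for author_id, name in db.run("SELECT id, name FROM authors ORDER BY id"):
        rows = db.run(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_eager(db):
    """A single JOIN fetches everything at once (authors without posts are omitted)."""
    result = {}
    rows = db.run(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


slow_db = CountingConnection(raw)
slow = titles_n_plus_one(slow_db)   # 1 query for authors + 1 per author

fast_db = CountingConnection(raw)
fast = titles_eager(fast_db)        # 1 query in total
```

Both functions return the same data here, but the first issues a number of queries that grows with the number of authors. Eager loading in an ORM is one way to collapse those into a fixed number of queries.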
This project is a MongoDB database driver and object-relational mapper that brings MongoDB support to the Laravel Eloquent model and query builder. It provides a NoSQL model mapper that allows MongoDB collections to be mapped to object-oriented models using the Active Record pattern. The integration enables the use of a fluent query builder for constructing queries and aggregation pipelines without writing raw database syntax. It supports schema-less model integration, allowing applications to manage unstructured data while maintaining compatibility with standard object-oriented patterns. Th
Facilitates the creation of efficient queries and index management to optimize document retrieval performance.
Thorium is a web browser built from the Chromium project, designed for high performance and expanded compatibility. It utilizes aggressive compiler optimizations and CPU-specific SIMD instruction sets, such as AVX2, to increase page rendering and JavaScript execution speeds. The project distinguishes itself by providing custom builds that enable modern web browsing on legacy versions of Windows and Linux. It further diverges from standard browser implementations by integrating Widevine DRM and native support for high-efficiency media formats, including HEVC and JPEG XL. Broad capabilitie
Implements strategies to reduce execution overhead by enhancing data locality within software loops.
Mooncake is a disaggregated large language model (LLM) serving platform and distributed key-value store designed for high-performance inference infrastructure. It functions as a GPU memory orchestrator and KV cache management system that pools and transfers key-value caches across clusters to accelerate inference. The system distinguishes itself by separating the prefill and decode phases of inference onto distinct hardware clusters to optimize resource utilization. It uses a high-performance RDMA distributed cache with zero-copy transfers to move data between compute nodes, bypassing the CPU to reduce latency and overhead. The platform covers broad capability areas, including distributed memory pooling, accelerator memory routing via CXL, and multi-tier storage offloading to SSDs. It manages cluster state through metadata coordination services and implements resource governance through lease-based object protection and watermark-based cache eviction. The software is packaged for container deployment with support for host networking and hardware device mapping.
Assigns preferred storage segments for object allocation to minimize network overhead and increase speed.
MongoEngine is an object-document mapper (ODM) for Python that translates database records into objects to provide an object-oriented interface for data persistence. It serves as a document manager and schema validator for MongoDB, mapping classes to documents to enforce data types and validation rules. The project provides a lazy-loaded queryset system for filtering, sorting, and aggregating collections using Pythonic syntax. It manages complex data structures through features such as document inheritance, recursive handling of embedded documents, and reference-based object linking. The library covers broad capabilities, including schema migration, full-text search, and management of large binary files through the GridFS file system. It also includes tools for database index optimization, query performance profiling, and signal-based lifecycle hooks for automating logic during document events.
Improves data retrieval speeds through the use of indexes, query profiling, and efficient filtering.