19 Repos
Techniques for improving database query performance and data retrieval efficiency.
Distinguishing note: Focuses on performance tuning for list views and document linking.
19 GitHub repositories matching Data & Databases · Query Optimizations.
Agent-skills is a collection of structured instructions and behavioral personas designed to standardize how AI coding agents perform engineering tasks. It functions as a workflow orchestrator that maps natural language intent to repeatable technical sequences and verification checklists. The project distinguishes itself through the use of specialized markdown-defined roles, such as security auditors or test engineers, to apply targeted domain expertise. It employs an evidence-based verification model that requires runtime data or passing tests as mandatory exit criteria to ensure AI-generated
Provides instructions for improving database query efficiency by eliminating N+1 patterns and implementing indexing.
Payload is a headless content management system and application framework that uses a code-first approach to define data schemas and administrative interfaces. By utilizing a centralized, type-safe configuration object, it automatically generates database schemas, API endpoints, and a fully customizable admin panel. The system is built on a database-agnostic architecture, allowing it to interface with various storage engines while providing a unified, type-safe API for server-side operations, REST, and GraphQL. What distinguishes Payload is its deep extensibility and developer-centric design.
Optimizes list view performance by refining data retrieval queries.
Cube is a semantic data layer that provides a unified framework for defining business metrics, dimensions, and relationships across diverse data sources. By acting as a headless business intelligence engine, it transforms raw data into a governed model that can be queried via SQL, REST, and GraphQL interfaces. This architecture ensures consistent data definitions and logic across all downstream analytical applications and reporting tools. The platform distinguishes itself through its integrated conversational AI capabilities, which allow users to explore data using natural language. It orches
Implements pre-aggregation strategies and workload-aware settings to reduce total compute costs and improve response times.
Presto is a distributed SQL query engine designed for high-performance analytical processing across heterogeneous data sources. It functions as a data federation platform and massively parallel processing engine, allowing users to execute interactive queries against diverse storage systems without requiring data migration. By mapping remote metadata and structures to a unified relational namespace, it enables seamless cross-platform analysis through a standard SQL interface. The engine distinguishes itself through a pluggable connector architecture and a shared-nothing distributed processing
Arranges table data during write operations using specific sort orders to make filtering and searching more efficient.
VictoriaMetrics is a high-performance, scalable time series database and observability platform designed for long-term storage and analysis of metric, log, and trace data. It functions as a unified backend for monitoring ecosystems, offering full compatibility with industry-standard protocols and query languages. The system is built to handle massive data volumes through a distributed architecture that supports horizontal scaling and efficient data lifecycle management. The platform distinguishes itself through a storage engine that utilizes consistent hashing for data sharding and log-struct
Identifies execution bottlenecks by tracing query paths and formatting expressions to ensure efficient data retrieval.
Dask is a parallel computing framework and distributed task scheduler designed to scale Python data science workflows from single machines to large clusters. It acts as a cluster resource manager that orchestrates computation logic by representing tasks and their dependencies as directed acyclic graphs. This architecture lets the system automate the distribution of workloads across available hardware while managing complex execution requirements. The project is distinguished by a lazy evaluation engine that defers data operations until they are explicitly requested, enabling global graph optimization and efficient resource allocation. It integrates memory-aware data spilling to prevent crashes when processing datasets that exceed available memory, and uses task graph fusion to combine sequences of operations into single execution steps, minimizing scheduling overhead and inter-node communication. The platform offers a comprehensive surface for large-scale data analysis, including support for distributed machine learning, integration with high-performance computing, and parallel data processing. It provides extensive tooling for cluster lifecycle management, performance profiling, and real-time monitoring of task execution. Users can deploy these environments across a range of infrastructure, including local hardware, cloud providers, containerized systems, and high-performance computing clusters.
Analyzes and transforms computation graphs to reduce data movement and minimize input-output operations.
Mybatis-PageHelper is a pagination plugin and persistence framework extension for MyBatis. It functions as a physical pagination engine that automatically appends limit and offset clauses to SQL queries to retrieve specific record subsets from a data source. The project optimizes data retrieval by modifying SQL statements at runtime to reduce memory overhead. It implements database pagination and data set windowing to manage the retrieval of paginated data within Java applications. The system utilizes a MyBatis interceptor chain for dynamic SQL rewriting and employs database dialects to ensu
Optimizes performance by limiting the amount of data fetched from the database during large record requests.
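The limit-and-offset idea described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than MyBatis, and the `items` table and `fetch_page` helper are hypothetical names chosen for the example; this is not how Mybatis-PageHelper itself is implemented.

```python
import sqlite3


def fetch_page(conn, page, page_size):
    """Return one page of rows (pages are 1-indexed) using LIMIT/OFFSET."""
    offset = (page - 1) * page_size
    cur = conn.execute(
        # A stable ORDER BY keeps page boundaries consistent between calls.
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cur.fetchall()


# Hypothetical in-memory table with 25 rows, ids 1..25.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1, 26)],
)

page_two = fetch_page(conn, page=2, page_size=10)
last_page = fetch_page(conn, page=3, page_size=10)
```

Only the requested window of rows is transferred to the application, so memory use stays bounded by the page size rather than the table size. Large offsets can still be slow because the database has to skip the preceding rows.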
This project is a comprehensive guide to architectural standards and coding patterns for developing maintainable applications within the Laravel framework. It focuses on clean code standards, applying the single responsibility and DRY principles to ensure codebase predictability and consistency. The guide emphasizes decoupling components by moving business logic into service layers and shifting input validation into dedicated request classes to keep controllers lean. It advocates for the use of a service container and dependency injection to reduce class coupling and improve testability. The
Provides techniques for improving database query performance and data retrieval efficiency.
This project is a comprehensive library for numerical linear algebra and scientific computing, designed to provide optimized routines for matrix decomposition, statistical modeling, and high-performance data analysis. It serves as both a toolkit for solving complex linear systems and an educational resource for understanding the fundamental algorithms behind matrix factorizations and numerical solvers. The library distinguishes itself through a focus on randomized numerical linear algebra, utilizing probabilistic algorithms and approximate methods to perform dimensionality reduction and matri
Structures computations to minimize data movement between memory hierarchies for quicker retrieval.
Boto3 is the AWS SDK for Python, providing a programmatic interface for managing and automating AWS cloud infrastructure and services. It serves as a cloud management API client and resource manager for provisioning, configuring, and scaling virtual servers, databases, and storage. The library enables the implementation of infrastructure-as-code through declarative templates and scripts, allowing for the deployment of identical resource stacks across multiple accounts and geographic regions. It also provides a framework for coordinating distributed workflows, serverless functions, and contain
Organizes transferred data into partitions to optimize search efficiency and query performance.
Odin is a compiled, statically typed systems programming language designed for high-performance software development. It focuses on pragmatic low-level memory control, providing a toolset for manual memory management and precise control over hardware utilization. The language is distinguished by its flexible memory model, which includes custom allocators and precise data layout capabilities to optimize resource usage. It features a comprehensive foreign function interface for importing assembly files and linking with external libraries using configurable calling conventions. The type system
Supports organizing records as arrays of structures or structures of arrays to maximize hardware acceleration.
This project is a software engineering style guide and a curated collection of architectural patterns and coding standards. It provides a multi-language coding standard to ensure maintainable software across Ruby, Python, JavaScript, and Swift. The project establishes a development workflow specification for version control, continuous integration, and peer review to maintain a linear project history. It also includes a web accessibility framework based on ARIA and WCAG standards, using design tokens and semantic HTML patterns to build inclusive interfaces. The guides cover a broad range of
Implements techniques like foreign key indexing and selective column retrieval to optimize query speed.
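A minimal sketch of the two techniques named above, foreign key indexing and selective column retrieval, assuming SQLite via Python's sqlite3 module. The `customers` and `orders` tables, the index name, and the `plan` helper are hypothetical; the guide itself targets other stacks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL,
    notes TEXT
);
""")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(i, f"c{i}") for i in range(1, 51)],
)
conn.executemany(
    "INSERT INTO orders (customer_id, total, notes) VALUES (?, ?, ?)",
    [((i % 50) + 1, float(i), "x" * 200) for i in range(1000)],
)

# Selective column retrieval: name only the columns that are needed
# instead of SELECT *, so the wide `notes` column is never returned.
QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan(sql, params):
    """Return SQLite's query plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(QUERY, (7,))

# Foreign key indexing: SQLite does not index foreign key columns
# automatically, so the index is created explicitly.
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")

plan_after = plan(QUERY, (7,))
rows = conn.execute(QUERY, (7,)).fetchall()
```

Comparing `plan_before` and `plan_after` shows whether the lookup switched from a table scan to the new index. The exact plan wording varies between SQLite versions.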
Delta is a lakehouse table format that brings ACID transactions and data warehouse consistency to large scale data lakes on cloud object storage. It serves as an ACID transaction manager, coordinating atomic commits and serializable isolation for concurrent reads and writes across distributed compute engines. The project provides a multi-engine interoperability layer that uses format translation to allow diverse SQL engines and processing frameworks to read and write the same tables. It functions as a data versioning system, utilizing a transaction log to enable time travel, historical snapsh
Applies advanced sorting and data skipping techniques to reduce the volume of scanned data.
Jeesite is a full-stack low-code development framework designed for building enterprise administrative portals using Spring Boot, MyBatis, and Vue. It functions as a comprehensive platform for creating administrative dashboards with integrated role-based access control and organizational data permission systems. The framework distinguishes itself through a combination of automated CRUD code generation and an integrated RAG platform that connects large language models to enterprise data via vector stores. It further incorporates a BPMN-based workflow engine to automate complex business process
Automatically optimizes database queries and filtered lists using class-level metadata.
Bullet is an Active Record performance monitor and query profiler for Ruby on Rails applications. It serves as a diagnostic utility to identify inefficient database access patterns, flag redundant requests, and suggest eager loading strategies to improve response times. The tool specifically detects N+1 queries, missing counter caches, and unused eager loading. It monitors these patterns across both standard web requests and background jobs, identifying records that are fetched but never accessed to reduce memory usage and query overhead. Analysis is supported by a system that intercepts dat
Identifies and fixes N+1 queries and missing counter caches to improve application response times and reduce database load.
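To make the N+1 pattern concrete, here is a minimal sketch in Python with sqlite3 rather than Ruby and Active Record. The `authors` and `posts` tables and the `run` query counter are hypothetical illustrations, not part of Bullet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    author_id INTEGER REFERENCES authors(id),
    title TEXT
);
INSERT INTO authors (id, name) VALUES (1, 'Ada'), (2, 'Linus'), (3, 'Grace');
INSERT INTO posts (author_id, title)
VALUES (1, 'a1'), (1, 'a2'), (2, 'l1'), (3, 'g1');
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so the two approaches can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    """One query for the authors, then one extra query per author."""
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        rows = run(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_single_query():
    """Fetch the same data in one round trip with a JOIN."""
    result = {}
    rows = run(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


query_count = 0
slow = titles_n_plus_one()
n_plus_one_queries = query_count

query_count = 0
fast = titles_single_query()
single_queries = query_count
```

Both functions return the same mapping here, but the first issues one query per author on top of the initial one, while the second issues a single query. An inner join drops authors without posts, so a LEFT JOIN would be needed to keep them.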
This project is a MongoDB database driver and object-relational mapper that brings MongoDB support to the Laravel Eloquent model and query builder. It provides a NoSQL model mapper that allows MongoDB collections to be mapped to object-oriented models using the Active Record pattern. The integration enables the use of a fluent query builder for constructing queries and aggregation pipelines without writing raw database syntax. It supports schema-less model integration, allowing applications to manage unstructured data while maintaining compatibility with standard object-oriented patterns. Th
Facilitates the creation of efficient queries and index management to optimize document retrieval performance.
Thorium is a web browser built from the Chromium project, designed for high performance and expanded compatibility. It utilizes aggressive compiler optimizations and CPU-specific instruction sets, such as AVX2 and SIMD, to increase page rendering and JavaScript execution speeds. The project distinguishes itself by providing custom builds that enable modern web browsing on legacy versions of Windows and Linux. It further diverges from standard browser implementations by integrating Widevine DRM and native support for high-efficiency media formats, including HEVC and JPEG XL. Broad capabilitie
Implements strategies to reduce execution overhead by enhancing data locality within software loops.
Mooncake is a disaggregated platform for serving large language models and a distributed key-value store designed for high-performance inference infrastructure. It functions as a GPU memory orchestrator and KV cache management system that pools and transfers key-value caches across clusters to accelerate inference. The system is distinguished by separating the prefill and decode phases of inference into separate hardware clusters to optimize resource utilization. It uses a high-performance distributed RDMA cache with zero-copy transfers to move data between compute nodes while bypassing the CPU, reducing latency and overhead. The platform covers broad functional areas, including distributed memory pooling, accelerator memory routing via CXL, and multi-tier storage offloading to SSDs. It manages cluster state through metadata coordination services and implements resource governance using lease-based object protection and watermark-based cache eviction. The software is packaged for containerized deployments, with support for host networking and hardware device mapping.
Assigns preferred storage segments for object allocation to minimize network overhead and increase speed.
MongoEngine is a Python object-document mapper that translates database records into objects to provide an object-oriented interface for data persistence. It serves as a document manager and schema validator for MongoDB, mapping classes to documents to enforce data types and validation rules. The project provides a lazily loaded queryset system for filtering, sorting, and aggregating collections using Python syntax. It manages complex data structures through features such as document inheritance, recursive handling of embedded documents, and reference-based object linking. The library covers a broad range of functionality, including schema migration, full-text search, and management of large binary files via the GridFS file system. It also includes tools for optimizing database indexes, profiling query performance, and signal-based lifecycle hooks for automating logic during document events.
Improves data retrieval speeds through the use of indexes, query profiling, and efficient filtering.