55 repositories
Mechanisms for filtering data based on field conditions.
Distinguishing note: Focuses on record-level data filtering for visualizations.
Explore 55 awesome GitHub repositories matching data & databases · Data Filtering. Refine with filters or upvote what's useful.
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Filters rows in datasets by applying boolean expressions, keeping only the rows that satisfy the specified conditions.
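As a rough illustration of row-level filtering with a boolean expression, the sketch below uses plain Python rather than the Polars API. The `rows` sample data and the `filter_rows` helper are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def filter_rows(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Keep only the rows for which the predicate evaluates to True."""
    return [row for row in rows if predicate(row)]


# Hypothetical sample data, one dict per row.
rows = [
    {"city": "Oslo", "temp_c": 4},
    {"city": "Cairo", "temp_c": 31},
    {"city": "Lima", "temp_c": 19},
]

# Two field conditions combined into a single boolean expression.
warm = filter_rows(rows, lambda r: r["temp_c"] > 10 and r["city"] != "Cairo")
print(warm)
```

A columnar engine would evaluate the same kind of predicate over whole columns at once instead of row by row; the list comprehension here only shows the selection logic.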
Umami is a self-hosted, privacy-focused web analytics platform designed to provide full control over infrastructure and user data. It captures website traffic and visitor behavior through anonymous tracking methods that avoid cookies, browser fingerprinting, and the storage of personally identifiable information. The platform distinguishes itself through a comprehensive suite of behavioral analysis tools, including session replays, heatmaps, and cohort-based retention reporting. It features a multi-tenant architecture that allows teams to manage multiple websites within a single, collaborativ
Applies filters to custom session properties to focus analysis on specific user groups.
Filament is a full-stack framework for building administrative panels and management interfaces within the Laravel ecosystem. It provides a declarative, component-based architecture that allows developers to construct complex, data-driven applications using server-side configuration objects rather than manual HTML. By inspecting database model structures and relationships, the framework automates the generation of CRUD interfaces, forms, and data tables, significantly reducing boilerplate code. The project distinguishes itself through a highly modular and extensible design that supports custo
Provides built-in filtering controls to manage the visibility of soft-deleted records.
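One way to picture a soft-delete visibility filter is sketched below in plain Python. It is not Filament's implementation: the `Record` class, the `deleted_at` field, and the three mode names are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Record:
    id: int
    deleted_at: Optional[datetime] = None  # None means the record is live


def apply_trashed_filter(records: List[Record], mode: str = "without") -> List[Record]:
    """Return records according to a hypothetical soft-delete visibility mode."""
    if mode == "with":      # live and soft-deleted records together
        return list(records)
    if mode == "only":      # soft-deleted records only
        return [r for r in records if r.deleted_at is not None]
    if mode == "without":   # live records only (the default)
        return [r for r in records if r.deleted_at is None]
    raise ValueError(f"unknown mode: {mode!r}")


records = [Record(1), Record(2, deleted_at=datetime(2024, 1, 1)), Record(3)]
print([r.id for r in apply_trashed_filter(records)])
print([r.id for r in apply_trashed_filter(records, "only")])
```

Soft-deleted rows stay in storage, so the filter only changes which rows are shown.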
This project is a human resources management system built using Spring Boot and Vue. It serves as a platform for managing employee records, professional titles, and organizational hierarchies. The system features a role-based access control framework that maps users to specific roles and resources to secure API endpoints and user interface elements. It includes a real-time communication hub utilizing WebSockets for internal corporate chat and system notifications, as well as a dedicated manager for defining and modifying nested organizational department structures. Additional capabilities co
Performs targeted employee record searches using multiple specific criteria to refine results.
Vector is a high-performance observability data pipeline designed to collect, transform, and route logs, metrics, and traces across distributed infrastructure. It functions as a modular engine that decouples data ingestion from processing and transmission, utilizing a component-based architecture to connect diverse sources to multiple destinations. The project distinguishes itself through a focus on reliability and flow control. It implements backpressure-aware data movement to prevent data loss during traffic spikes and utilizes disk-backed event buffering to ensure durability during network
Drops or retains logs, metrics, and traces based on user-defined conditions to reduce noise.
Teable is a self-hosted relational data management tool and no-code PostgreSQL database. It provides a spreadsheet-like interface for managing and querying structured data, allowing users to interact with a professional database backend without writing manual SQL for every operation. The platform is an extensible low-code system that allows for the integration of custom plugins and extensions through a dedicated application bridge and marketplace. It enables the creation of tailored internal tools by adding new features or modifying behavior via these external extensions. The system covers a
Provides precise data visibility control through record-level filtering, sorting, and grouping.
Backtrader is a Python framework designed for the development, backtesting, and live execution of algorithmic trading strategies. It provides a comprehensive environment for quantitative finance, allowing users to simulate trading logic against historical market data or connect directly to brokerage platforms for automated real-time trading. The project distinguishes itself through a unified event-driven architecture that treats backtesting and live trading with the same API. This consistency is supported by a flexible data-feed abstraction layer that normalizes diverse financial sources, ena
Enables filtering of market data streams to exclude out-of-session hours or specific data points.
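Excluding out-of-session data can be sketched as a time-of-day check on each bar. The example below is plain Python, not Backtrader's filter API; the `in_session` helper, the 09:30 to 16:00 session window, and the sample `bars` are assumptions for illustration.

```python
from datetime import datetime, time
from typing import List, Tuple

Bar = Tuple[datetime, float]  # (timestamp, closing price)


def in_session(ts: datetime, start: time = time(9, 30), end: time = time(16, 0)) -> bool:
    """True when the timestamp falls inside the assumed trading session [start, end)."""
    return start <= ts.time() < end


# Hypothetical intraday bars, two of them outside the session window.
bars: List[Bar] = [
    (datetime(2024, 1, 2, 8, 0), 100.0),
    (datetime(2024, 1, 2, 9, 30), 100.5),
    (datetime(2024, 1, 2, 12, 0), 101.0),
    (datetime(2024, 1, 2, 17, 45), 100.8),
]

session_bars = [bar for bar in bars if in_session(bar[0])]
print(session_bars)
```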
Eleventy is a JavaScript-based static site generator designed to transform templates, data files, and markdown into optimized HTML. It functions as a versatile template rendering engine and content management framework, allowing developers to aggregate data from diverse sources—including local files, databases, and external APIs—to populate structured web content. The project is distinguished by its template-engine-agnostic pipeline, which decouples the build process from specific rendering languages. This allows users to integrate multiple template formats, such as Liquid, Nunjucks, Handleba
Selects specific fields from the data cascade to include in programmatic exports for cleaner data integration.
FlameGraph is a performance profiling and visualization toolkit designed to identify bottlenecks in software execution. It functions as a processing engine that transforms raw stack trace samples into interactive, hierarchical diagrams. By representing aggregated execution frequency as nested rectangles, the tool allows developers to visualize hot code paths and analyze system behavior across both kernel and user-space environments. The project distinguishes itself through its ability to perform differential profile analysis, which highlights performance regressions or improvements by compari
Processes specific subsets of execution paths by applying text-based filtering to input data.
Faker is a Python library designed to generate realistic synthetic data for software testing, database prototyping, and privacy-preserving anonymization. It provides a comprehensive suite of tools to create diverse information types, including personal identities, financial records, geographic locations, and technical system metadata, allowing developers to populate environments with mock data that mimics real-world structures. The library is built on a modular provider architecture that supports dynamic method dispatch, enabling users to extend functionality by registering custom data genera
Replaces real user data with realistic synthetic alternatives to protect privacy during testing workflows.
This project is a library of source code implementations designed to solve algorithmic challenges and mathematical problems. It serves as a collection of solved LeetCode problems, providing a reference for data structure usage and efficient logic. The repository is a polyglot code collection, implementing the same algorithmic logic across various programming environments, including general-purpose languages, SQL for database queries, and Bash for shell scripting. The content covers a broad range of computational tasks, including data querying, text processing, and the implementation of compl
Implements multi-criteria record filtering using boolean flags and numeric thresholds.
chezmoi is a command-line utility designed to manage and synchronize system configuration files across multiple machines. It uses a local Git repository as the single source of truth, allowing users to track, version, and distribute dotfiles while maintaining a consistent state across diverse operating systems and hardware architectures. The project distinguishes itself through a declarative reconciliation model that computes the difference between the current filesystem and the desired state defined in the repository. It features a robust templating engine that processes configuration files
Restricts operations to specific types of files or actions during configuration execution.
WeClone is an end-to-end framework designed for the creation, training, and deployment of personalized conversational AI digital twins. By fine-tuning large language models on individual chat history, the platform enables the replication of unique communication styles, speech patterns, and conversational habits. The system manages the entire lifecycle of these digital avatars, from initial data preparation to final integration into messaging platforms for real-time interaction. The platform distinguishes itself through a comprehensive suite of data processing utilities that prepare raw messag
Evaluates chat record quality using inference models to automatically discard irrelevant data before training.
OpenObserve is a unified observability data platform designed to ingest, store, and analyze logs, metrics, and traces. It functions as a cloud-native monitoring tool that centralizes telemetry from diverse sources, including standard collectors and cloud service providers, into a single, scalable system. By utilizing a columnar storage engine backed by object storage, the platform enables efficient long-term data retention and high-performance analytical querying. The platform distinguishes itself through deep integration with artificial intelligence, allowing users to query data using natura
Directs data streams to specific destinations based on field values and business rules.
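Routing by field value can be sketched as an ordered list of rules, where the first matching condition decides the destination. This is a minimal plain-Python illustration, not OpenObserve's pipeline configuration; the `route` function, the rule list, and the destination names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
Rule = Tuple[str, Callable[[Event], bool]]  # (destination, condition)


def route(event: Event, rules: List[Rule], default: str = "default") -> str:
    """Return the destination of the first rule whose condition matches the event."""
    for destination, condition in rules:
        if condition(event):
            return destination
    return default


# Hypothetical routing rules, evaluated in order.
rules: List[Rule] = [
    ("errors", lambda e: e.get("level") == "error"),
    ("audit", lambda e: e.get("service") == "billing"),
]

print(route({"level": "error", "service": "billing"}, rules))
print(route({"level": "info", "service": "billing"}, rules))
print(route({"level": "info", "service": "web"}, rules))
```

Rule order matters in this sketch: an event that matches several rules goes only to the first matching destination.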
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva
Scores generated content using reward models and discards entries that fail to meet quality thresholds.
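Threshold-based quality filtering can be sketched as below. A real reward model is replaced here by a toy scoring function (the share of distinct words in an entry), so `toy_score`, `keep_high_quality`, and the 0.5 threshold are assumptions for illustration, not the framework's API.

```python
from typing import Callable, List


def toy_score(text: str) -> float:
    """Stand-in for a reward model: fraction of distinct words, 0.0 for empty text."""
    words = text.split()
    return len(set(words)) / len(words) if words else 0.0


def keep_high_quality(
    entries: List[str], score_fn: Callable[[str], float], threshold: float
) -> List[str]:
    """Discard entries whose score falls below the threshold."""
    return [entry for entry in entries if score_fn(entry) >= threshold]


# Hypothetical generated entries.
entries = ["the the the the", "filter rows by field value", ""]
kept = keep_high_quality(entries, toy_score, threshold=0.5)
print(kept)
```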
This is a mobile object database and NoSQL local data store that replaces relational tables with a schema-based model. It functions as a reactive data store, using live object observations and change notifications to trigger automatic user interface refreshes. The system provides built-in mobile cloud data synchronization to keep local datasets consistent with a remote server across multiple devices. It also includes security features for encrypted local storage, protecting sensitive on-disk data using at-rest encryption keys and fine-grained access control. Broad capabilities include object
Retrieves specific subsets of objects from the database based on type-safe search criteria.
VictoriaMetrics is a high-performance, scalable time series database and observability platform designed for long-term storage and analysis of metric, log, and trace data. It functions as a unified backend for monitoring ecosystems, offering full compatibility with industry-standard protocols and query languages. The system is built to handle massive data volumes through a distributed architecture that supports horizontal scaling and efficient data lifecycle management. The platform distinguishes itself through a storage engine that utilizes consistent hashing for data sharding and log-struct
Modifies, drops, or updates metric labels and filters data streams before storage to ensure data quality.
Kilocode is an autonomous engineering platform designed to orchestrate AI agents for complex software development tasks. It functions as a comprehensive system for automating coding, testing, and repository management by integrating directly with your codebase and terminal. The platform provides a unified gateway for model orchestration, allowing for the management of agentic workflows, event-driven automation, and persistent session state across distributed development environments. The platform distinguishes itself through its federated task management and policy-based access control, which
Executes soft-delete flows to remove personal information while maintaining necessary audit and financial records.
This tool is a command-line processor designed for querying, updating, and transforming structured data files. It functions as a versatile engine for manipulating YAML, JSON, TOML, and XML documents, allowing users to perform complex operations directly from the terminal. By utilizing a path-based expression language, it enables precise navigation and modification of data structures within configuration files and infrastructure-as-code workflows. What distinguishes this tool is its ability to perform in-place document mutations while preserving original formatting, comments, and metadata. It
Selects elements from collections by evaluating conditions against child nodes to return matching items.
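Selecting collection items by a condition on their child nodes can be sketched in plain Python as follows. It does not reproduce the tool's expression language; the `select` helper and the `doc` structure are hypothetical.

```python
from typing import Any, Callable, Dict, List

Node = Dict[str, Any]


def select(items: List[Node], condition: Callable[[Node], bool]) -> List[Node]:
    """Return the items whose child nodes satisfy the condition."""
    return [item for item in items if condition(item)]


# Hypothetical parsed document, as it might look after loading a YAML or JSON file.
doc = {
    "services": [
        {"name": "web", "enabled": True, "port": 80},
        {"name": "cache", "enabled": False, "port": 6379},
        {"name": "worker", "enabled": True},
    ]
}

# The condition inspects the "enabled" child of each item in the collection.
enabled = select(doc["services"], lambda node: node.get("enabled") is True)
print([node["name"] for node in enabled])
```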
This project is a comprehensive collection of web development reference guides and technical cheat sheets. It provides a curated set of markdown-based documentation designed to help developers quickly locate syntax patterns and API examples for common web technologies and programming languages. The repository serves as a specialized reference library covering several distinct technical domains. It includes extensive guides for CSS, focusing on selectors, Flexbox, Grid, and responsive layout properties, as well as a DevOps command reference for Docker, Kubernetes, AWS, Ansible, and general she
Provides techniques for filtering data views using search fields and predefined scopes.