24 repositories
Collections of unique elements optimized for membership testing and set-theoretic operations like unions and intersections.
Distinguishing note: this category specifically addresses set-based data management and operations.
Explore 24 GitHub repositories matching Data & Databases · Set Data Structures.
Dragonfly is a high-performance, multi-model in-memory data store designed to serve as a drop-in replacement for existing database infrastructures. By utilizing a multi-threaded, shared-nothing architecture and a fiber-based concurrency model, it maximizes CPU utilization and minimizes latency for read and write operations. The system supports a wide range of data structures, including strings, hashes, lists, sets, sorted sets, and JSON documents, while maintaining full compatibility with standard industry wire protocols and client libraries. What distinguishes Dragonfly is its focus on effic
Provides high-performance set operations including membership testing, unions, and intersections for unique data collections.
This project is a computer science educational resource and a library of common data structures and algorithms implemented in Swift. It serves as a practical reference for studying complexity and efficiency through solved algorithmic problems and conceptual guides. The collection includes implementations of linear and hierarchical data structures, such as stacks, queues, linked lists, and trees. It covers a wide range of computational patterns, including graph and pathfinding implementations, mathematical numerical methods, and data compression techniques. The project also provides implement
Implements hash tables and filters to store unique elements and key-value pairs with minimal latency.
This project is a comprehensive collection of common computer science algorithms and data structures implemented in Swift. It serves as an educational reference and library for studying computational complexity, algorithmic logic, and data structure engineering through practical code examples. The repository provides a wide suite of data structure implementations, including various types of linked lists, heaps, hash tables, and an extensive range of hierarchical trees such as Red-Black, B-Tree, and Splay trees. It also covers diverse sorting and searching techniques, from basic bubble sort to
A system for performing union, intersection, and difference operations to merge or isolate common elements.
Python is a high-level, interpreted programming language designed for readability and versatility. It operates via a bytecode-based virtual machine and manages memory automatically through reference-counting garbage collection. The language supports multiple programming paradigms, including object-oriented, imperative, and functional styles, and provides a comprehensive standard library for system operations, networking, and data handling. The language is distinguished by its dynamic nature, allowing for runtime object introspection and metaclass-driven class creation. It utilizes protocol-ba
Calculates unions, intersections, and differences between collections to analyze relationships between data groups.
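As a minimal sketch of the operations named above, Python's built-in `set` type exposes union, intersection, and difference as operators. The two groups of names are hypothetical sample data, not taken from any repository listed here.

```python
# Hypothetical sample data: two groups of user names.
signed_up = {"ana", "ben", "chloe", "dev"}
purchased = {"ben", "dev", "eli"}

# Union: members of either group.
either = signed_up | purchased

# Intersection: members of both groups.
both = signed_up & purchased

# Difference: members who signed up but never purchased.
only_signed_up = signed_up - purchased

# Symmetric difference: members of exactly one group.
exactly_one = signed_up ^ purchased
```

The named methods (`union`, `intersection`, `difference`, `symmetric_difference`) behave the same way and also accept arbitrary iterables, which the operators do not.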
This project is a comprehensive, community-maintained knowledge base and toolkit designed for competitive programming. It serves as a centralized repository for algorithmic theory, data structures, and mathematical techniques, providing a structured reference for informatics and collegiate programming competitions. The project distinguishes itself by integrating educational content with a robust suite of automation utilities. It provides a complete workflow for competitive programming, including tools for automated test case generation, solution verification, and direct interaction with onlin
Maintains unique ordered sets for efficient insertion and membership testing.
core-js is a comprehensive compatibility layer and standard library polyfill that implements ECMAScript proposals and stable language features across diverse JavaScript runtimes. It serves as a runtime environment shim to ensure consistent execution of global objects, iteration protocols, and standard library methods in older browsers or non-browser environments. The project is distinguished by its delivery models, offering both prototype-based global polyfilling and a pure-module implementation. This allows for the integration of modern functionality without modifying global prototypes to pr
Implements mathematical set operations including intersections, unions, and differences.
This project is a feature-rich Go client library designed for interacting with Redis. It serves as a comprehensive interface for managing remote data stores, enabling developers to execute standard database commands, handle complex data structures, and perform asynchronous operations within Go applications. The library distinguishes itself through its support for advanced Redis capabilities, including connection pooling, pipelining, and transactional integrity. It provides specialized primitives for managing distributed clusters, including automated topology updates and request routing to sha
Maintains collections of unique items and computes mathematical relationships like unions between multiple distinct sets.
This project is an educational resource designed for learning the Python programming language. It serves as a tutorial repository and programming guide, providing a collection of annotated scripts, code examples, and cheatsheets to help users master syntax and core fundamentals. The resource focuses on moving from basic language syntax to advanced implementation, with a particular emphasis on object-oriented programming, the use of the Python standard library, and scripting automation for business workflows. The content covers a broad range of programming capabilities, including control flow
Implements set operations such as union, intersection, and difference to find common or unique elements.
Cayley is a graph database engine designed for storing and querying interconnected data using a quad-based data model. It functions as an RDF quad store, managing information through subjects, predicates, objects, and labels. The system features a modular graph store architecture with pluggable backends, allowing it to swap between in-memory storage and various external persistent databases. It includes a GraphQL-inspired API and a dedicated data visualizer for the interactive exploration of nodes and edges. Query capabilities cover bidirectional path traversal and multi-syntax execution usi
Implements union, intersection, and except (difference) operations to combine or subtract graph paths.
Rete is a framework for building interactive, node-based visual interfaces and dataflow programming environments. It provides a core engine that processes directed graphs, allowing developers to define modular logic where nodes represent operations and connections represent the flow of data or control. By decoupling the graph logic from the user interface, the framework enables the creation of custom visual editors that can be integrated into various frontend component libraries. The project distinguishes itself through a highly extensible, signal-driven architecture that supports complex req
Combines or modifies graph structures using union, difference, and intersection logic to manage collections of nodes and connections.
Garnet is a multi-threaded in-memory database and distributed key-value store. It functions as a high-performance remote cache store that implements the RESP wire protocol to maintain compatibility with existing Redis clients and libraries. The project is distinguished by a shared-memory architecture that enables parallel request processing across multiple cores for sub-millisecond latency. It features a tiered storage system that automatically offloads colder data from system memory to SSD or cloud storage layers, and includes a specialized vector search database for high-dimensional similar
Supports collections of unique elements with mathematical set operations like intersections and unions.
phpredis is a C-based native extension that bridges PHP applications with Redis servers for high-performance data storage and retrieval. It serves as an interface for manipulating strings, hashes, lists, sets, and sorted sets while providing a direct path for executing Redis commands and server-side scripts. The extension provides comprehensive support for distributed environments and high availability. It interfaces with Redis Cluster to distribute data across multiple nodes using hash slots and manages Redis Sentinel for service discovery and automatic failover. It also enables shared state
Implements unordered set operations, including the addition/removal of members and mathematical intersections and unions.
pysheeet is a technical reference library providing an organized collection of code snippets and implementation patterns for advanced Python development, system integration, and high-performance computing. It serves as a comprehensive guide to implementing low-level network programming, native C extensions, and asynchronous and concurrent programming. The project provides specialized frameworks for developing and deploying large language models, including tools for distributed GPU inference and high-performance serving. It also includes detailed patterns for orchestrating high-performance computing clusters, covering GPU resource allocation and multi-node workload management. The library covers a broad range of capabilities, including secure network communication and cryptography, ORM and database management, and the implementation of complex data structures and algorithms. It also provides utilities for memory management, native interoperability through foreign function interfaces (FFI), and operating-system-level integration.
Demonstrates building sets of unique elements using literals, constructors, and comprehensions.
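The three construction styles mentioned above can be sketched in a few lines of standard Python. This is an illustration written for this listing, not an excerpt from pysheeet.

```python
# Literal: duplicates collapse to a single element.
from_literal = {1, 2, 3, 3}

# Constructor: builds a set from any iterable, here the characters of a string.
from_constructor = set("hello")

# Comprehension: squares of -3..3, so each positive square appears only once.
from_comprehension = {n * n for n in range(-3, 4)}
```

Sets are unordered, so the iteration order of the results should not be relied on.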
This project is a comprehensive library of practical Python code examples and patterns. It provides a collection of scripts and snippets designed to demonstrate a wide range of programming tasks, from basic syntax to advanced implementation patterns. The repository focuses on several core domains, including the implementation of concurrency and multithreading examples, data analysis snippets for cleaning and manipulating tabular data, and various data visualization examples. It also covers automation scripts for file system management and a variety of general programming patterns. Additional
Demonstrates how to create an empty set using the set constructor.
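The constructor matters for the empty case because empty braces create a dictionary in Python, not a set. A short sketch of the difference:

```python
# set() is the only way to write an empty set.
empty = set()

# Empty braces produce a dict, which is a common source of bugs.
not_a_set = {}

# Elements can be added to the empty set afterwards.
empty_then_filled = set()
empty_then_filled.add("first")
```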
JimuReport is an open-source reporting and dashboard engine designed to be embedded directly into Spring Boot applications. Its core identity centers on generating data reports and full-screen dashboards from natural language descriptions, eliminating the need for manual design. The platform also provides a conversational query interface that translates plain-language questions into database queries, returning results as tables and charts without requiring SQL knowledge. What distinguishes JimuReport is its integration of AI skills that can be installed with a single command, enabling report
Connects to SQL, API, JavaBean, JSON, and shared data sources to supply data for reports.
go-datastructures is a collection of thread-safe and lock-free data structures designed for high-performance concurrent applications in Go. It provides a modular library of specialized algorithmic toolsets, including a lock-free collection library and an immutable data structure library. The project distinguishes itself through a suite of persistent AVL trees and hash array mapped tries that use branch-copying to preserve previous versions. It also implements non-blocking hash maps, queues, and tries that enable linearizable snapshots and concurrent updates without the use of mutual exclusion
Provides dense and sparse bitmaps for fast bitwise comparisons and intersections between integer sets.
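To illustrate why bitmaps make integer-set intersection fast, the sketch below stores a set of non-negative integers as the bits of a single Python `int`, so intersection becomes one bitwise AND. The helper names `to_bitmap` and `from_bitmap` are hypothetical, and this is a dense-bitmap illustration only; it is not the go-datastructures API and does not cover sparse bitmaps.

```python
def to_bitmap(values):
    """Encode non-negative integers as bits of one int (bit i set means i is a member)."""
    bitmap = 0
    for value in values:
        bitmap |= 1 << value
    return bitmap


def from_bitmap(bitmap):
    """Decode a bitmap back into a sorted list of member integers."""
    members = []
    position = 0
    while bitmap:
        if bitmap & 1:
            members.append(position)
        bitmap >>= 1
        position += 1
    return members


a = to_bitmap([1, 3, 5, 64])
b = to_bitmap([3, 4, 5, 64, 100])

# Intersection and union are single bitwise operations on the encoded sets.
common = from_bitmap(a & b)
combined = from_bitmap(a | b)
```

A dense layout like this costs memory proportional to the largest member, which is the usual motivation for offering a sparse variant alongside it.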
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Creates a base64 encoded set of column values using optimized data structures like bitmaps or bloom filters for efficient membership testing.
This project provides a curated collection of high-frequency, non-informative Chinese words drawn from academic and industry standards. It serves as a reference dataset and stopword collection designed for use in natural language processing (NLP) tasks. The repository focuses on preprocessing Chinese text to reduce noise and improve the accuracy of machine learning models. It provides filtered datasets specifically for Chinese information retrieval, sentiment analysis preparation, and general data cleaning. The project uses precompiled lexicons and flat-file storage to enable efficient stopword filtering and vocabulary aggregation for Chinese corpora.
Uses hash set data structures to perform stopword membership checks with constant time complexity.
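A minimal sketch of hash-set stopword filtering follows. The four stopwords and the token list are illustrative assumptions rather than contents of the repository's files, and `remove_stopwords` is a hypothetical helper name. Lookups in a Python `frozenset` are constant time on average.

```python
# Illustrative stopwords only; a real list would be loaded from a lexicon file.
STOPWORDS = frozenset({"的", "了", "是", "在"})


def remove_stopwords(tokens):
    """Keep only the tokens that are not in the stopword set."""
    return [token for token in tokens if token not in STOPWORDS]


# Hypothetical pre-segmented sentence.
tokens = ["我", "在", "北京", "的", "大学"]
filtered = remove_stopwords(tokens)
```

Using a list instead of a set for `STOPWORDS` would give the same result but make each membership check linear in the size of the list.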
This project is a TensorFlow implementation of an image-to-image translation framework based on conditional generative adversarial networks (cGANs). It provides the tools to train models that map input images to output images based on learned visual patterns, as well as a server that processes image translation requests and serves trained model checkpoints to web clients. The framework includes a system for converting trained model weights into a portable format for browser-based inference. It also has a validation process that generates comparative reports by analyzing input, output, and target image sets using a trained checkpoint. The codebase covers the full pipeline from data engineering, including image dataset preparation and pair-based pipelining, through conditional adversarial training. It supports specific visual transformations such as colorizing grayscale images and generating synthetic imagery.
Generates comparative reports analyzing input, output, and target image sets using trained checkpoints.
dplyr is a data manipulation library for R that provides a grammar for transforming tabular data frames. It works as an in-memory data frame processor and relational algebra tool, using a consistent set of verbs to filter, select, and summarize data. The project includes a SQL translation engine that converts high-level data manipulation expressions into optimized queries. This lets users perform transformations directly on remote relational databases and cloud storage without pulling the data locally. The library covers a wide range of tabular operations, including column mutation, row subsetting, and joining relational data. It also offers capabilities for grouped data analysis, allowing datasets to be partitioned for independent aggregations and summaries.
Provides tools to reduce datasets to summary forms by computing statistics for defined groups.