9 repositories
Compression methods that use pre-trained dictionaries to improve ratios for small or similar data sets.
Distinct from Dictionary Generators: this category covers the algorithmic use of dictionaries for compression, not configuration or system-metadata dictionaries.
GitHub repositories in the Data & Databases · Dictionary-Based Compression category.
Zstandard is a lossless data compression library and archive format designed for high compression ratios and fast real-time processing. It functions as a real-time data compressor and multi-threaded compression engine capable of distributing workloads across multiple CPU cores to increase throughput. The system features a dictionary-based compressor that trains on sample data to improve the compression ratio and speed of small files. It also provides long distance pattern matching to identify repeated sequences across large files. The library covers a broad range of capabilities including st
Features a dictionary-based compressor that trains on sample data to improve compression ratios for small files.
Brotli is a lossless data compression library and engine that uses dictionary coding and frequency analysis to reduce file sizes. It provides tools for shrinking data streams and files while ensuring every bit of original information is preserved for perfect restoration. The project focuses on optimizing web content and network bandwidth by reducing the size of HTML, CSS, and JavaScript files. It is designed for integration into web servers and browsers to improve the efficiency of data transmission. The library includes capabilities for both compressing and decompressing data streams. It al
Employs a static dictionary of common strings to represent frequent patterns without encoding them in the stream.
This project provides a lossless compression algorithm and a byte-oriented compression library designed for high-speed data reduction and maximum decompression speed. It functions as a stream-oriented compression engine, a software library for encoding and decoding data blocks, and a command-line tool for managing interoperable compressed frames. The system distinguishes itself through the use of predefined pattern dictionaries to improve compression ratios for small data sets and small packets. It supports multiple processing modes, including high-speed block compression for minimal latency
Uses pre-trained pattern dictionaries to improve compression ratios for small data sets and packets.
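The effect of a predefined dictionary on small payloads can be illustrated with a minimal sketch. It uses the preset-dictionary support (`zdict`) in Python's standard `zlib` module rather than this project's own API, and the dictionary contents and sample message are hypothetical, chosen only for illustration.

```python
import zlib

# Hypothetical shared dictionary: byte strings expected to recur in messages.
# Both sides must hold the exact same bytes.
SHARED_DICT = b'{"user_id": , "status": "active", "region": "eu-west"}'

def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress with DEFLATE, priming the match window with zdict."""
    compressor = zlib.compressobj(level=9, zdict=zdict)
    return compressor.compress(data) + compressor.flush()

def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress a stream that was produced with the same zdict."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()

# A small message that shares most of its structure with the dictionary.
message = b'{"user_id": 4821, "status": "active", "region": "eu-west"}'

plain = zlib.compress(message, 9)                   # no dictionary
primed = compress_with_dict(message, SHARED_DICT)   # with dictionary

print(len(message), len(plain), len(primed))
```

A short message has little internal repetition, so a compressor without a dictionary has almost nothing to refer back to. Priming the window lets the first bytes of the message already be encoded as references.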
This project is a comprehensive collection of computer science implementations and an algorithm tutorial repository. It serves as a study guide and reference for competitive programming, providing executable code examples that demonstrate fundamental algorithmic problem solving and mathematical computation. The library covers a wide range of specialized domains, including cryptography and security primitives, lossless data compression techniques, and computational geometry for spatial analysis. It also features implementations of machine learning models, linear algebra operations, and formal
Implements Lempel-Ziv-Welch (LZW) compression using a dynamic dictionary to reduce data size.
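A minimal sketch of textbook LZW follows, to show how the dynamic dictionary grows during encoding and is rebuilt during decoding. It is not taken from the repository; it emits a plain list of integer codes and leaves out bit packing and dictionary size limits.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of dictionary codes."""
    # Start with one entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the match
        else:
            codes.append(table[current])   # emit the longest known prefix
            table[candidate] = next_code   # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same dictionary from the code stream and decode it."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)

sample = b"TOBEORNOTTOBEORTOBEORNOT"
encoded = lzw_compress(sample)
```

The decoder never receives the dictionary; it reconstructs each entry one step behind the encoder, which is why the `code == next_code` case needs special handling.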
Snappy is a high-performance lossless compression library implemented in C++. It provides data reduction methods that perfectly restore original information, focusing on system-level efficiency and processing velocity across different platforms. The library prioritizes high-speed data compression and decompression over achieving the maximum possible compression ratio. It is designed for real-time stream compression to reduce bandwidth usage without introducing significant processing latency. The implementation covers high-velocity data shrinking and rapid restoration. It includes resilient d
Implements speed-prioritized dictionary matching with limited-window searches to maintain high throughput.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Maps repeated column values to integer IDs to reduce storage footprint and accelerate query performance.
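The idea of mapping repeated column values to integer IDs can be sketched as below. This is a generic illustration of dictionary encoding, not Pinot's implementation; the sorted dictionary and the function names are assumptions made for the example.

```python
import bisect

def dictionary_encode(column: list[str]) -> tuple[list[str], list[int]]:
    """Return (sorted distinct values, per-row integer IDs)."""
    dictionary = sorted(set(column))
    id_of = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [id_of[value] for value in column]

def dictionary_decode(dictionary: list[str], ids: list[int]) -> list[str]:
    """Map integer IDs back to the original values."""
    return [dictionary[i] for i in ids]

def rows_equal_to(dictionary: list[str], ids: list[int], value: str) -> list[int]:
    """Find matching rows by comparing integers instead of strings."""
    pos = bisect.bisect_left(dictionary, value)  # one lookup in the dictionary
    if pos == len(dictionary) or dictionary[pos] != value:
        return []                                # value never occurs in the column
    return [row for row, i in enumerate(ids) if i == pos]

country = ["US", "DE", "US", "FR", "DE", "US"]
dictionary, ids = dictionary_encode(country)
print(dictionary)  # ['DE', 'FR', 'US']
print(ids)         # [2, 0, 2, 1, 0, 2]
```

Each distinct string is stored once, and a filter resolves its value to an ID a single time before scanning the compact integer column.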
This is a high-performance Go compression library providing implementations of Zstandard, Snappy, and Huffman coding. It includes a parallel compression framework for distributing gzip and stream workloads across multiple CPU cores and a specialized Huffman codec optimized for modern CPU architectures. The library features a Zstandard implementation that supports custom dictionaries and allocation-free decoding, alongside a Snappy compatible encoder for high-throughput data processing. It provides specific tools for dictionary generation and optimization to improve compression ratios for smal
Uses pre-trained data samples as lookup tables to improve compression ratios for small or similar data blocks.
lz-string is a JavaScript library and command-line tool for compressing and decompressing text data using Lempel-Ziv algorithms. It provides utilities for reducing the size of strings and files to optimize storage and network bandwidth. The project includes specialized encoders that keep compressed data compatible with different transport layers: Base64 encoding, UTF-16 string mapping for higher storage density, and URI-safe encoding for use in web addresses. The library supports client-side data compression for optimizing local storage and provides a command-line interface for compressing and decompressing files or standard input.
Implements compression by replacing repeated substrings with references to a dictionary of previously encountered patterns.
7-Zip is a data compression tool and file archiver designed for creating and extracting archives. It functions as a utility for reducing the storage requirements of files and folders through high-ratio lossless compression and managing the 7z open-standard format. The software provides capabilities for cross-platform archive management, allowing users to open and create various archive formats across different operating systems. It supports data backup and recovery, as well as the packaging of multiple files into single archives for distribution. The project implements a variety of compressi
Implements dictionary-based compression using a sliding window to replace repeated data sequences with short references.
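Sliding-window matching can be sketched with a simplified LZ77-style coder. This is an illustration of the general technique, not 7-Zip's LZMA implementation: it uses a brute-force search, emits Python tuples instead of a packed bitstream, and the `window`, `min_match`, and `max_match` parameters are hypothetical defaults.

```python
def lz77_compress(data: bytes, window: int = 4096,
                  min_match: int = 3, max_match: int = 255) -> list:
    """Return a token list: an int is a literal byte, a tuple is (offset, length)."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Only positions inside the sliding window are candidates.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append((best_off, best_len))  # short back-reference
            i += best_len
        else:
            tokens.append(data[i])               # no useful match: literal
            i += 1
    return tokens

def lz77_decompress(tokens: list) -> bytes:
    """Replay literals and back-references to rebuild the data."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                out.append(out[-offset])  # byte-wise copy allows overlapping matches
        else:
            out.append(token)
    return bytes(out)

sample = b"abcabcabcabcxyz"
tokens = lz77_compress(sample)
```

Here the "dictionary" is simply the most recent `window` bytes of output, so nothing has to be stored or transmitted besides the tokens themselves.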