13 repositories
Formats and methods for encoding and decoding data for storage or transmission.
Distinguishing note: Focuses on JSON processing.
This project is a structured educational resource designed to guide developers through the mastery of the JavaScript programming language. It utilizes a progressive curriculum that organizes technical concepts into a daily learning path, allowing students to build foundational knowledge before advancing to complex application development. The resource distinguishes itself through a hands-on training model that combines detailed explanations with practical code challenges. By focusing on an interactive learning experience, it reinforces core language principles—such as data types, functional p
Covers JSON data processing for web applications.
This project is a comprehensive platform for quantitative investment research, machine learning, and algorithmic trading. It provides an end-to-end environment for developing, testing, and executing financial strategies, supporting the entire lifecycle from data ingestion and feature engineering to model training and backtesting. The system is distinguished by its configuration-driven workflow orchestration, which allows researchers to automate complex pipelines and manage experiments through declarative files. It features a high-performance data infrastructure that utilizes custom binary for
Provides mechanisms to store and reload complex datasets and models to disk for persistent research workflows.
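As a rough illustration of storing and reloading objects on disk, the sketch below uses Python's standard `pickle` module. The helper names `save_object` and `load_object` are hypothetical and are not taken from the project; the project's actual persistence mechanism is not shown in this listing.

```python
import pickle
from pathlib import Path


def save_object(obj, path):
    """Write any picklable object (dataset, model, config) to `path`."""
    with Path(path).open("wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path):
    """Read back an object previously written by save_object."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)
```

Pickle files can execute arbitrary code when loaded, so this approach is only suitable for files from a trusted source.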
Avalonia is a cross-platform desktop framework that enables the creation of native-feeling applications for Windows, macOS, and Linux from a single codebase. It functions as a declarative UI toolkit, allowing developers to define complex visual hierarchies and interface structures using a markup-based syntax that maps directly to underlying object properties. By utilizing the Model-View-ViewModel architectural pattern, the framework facilitates a clean separation between application logic and user interface layout, which simplifies unit testing and component maintenance. The framework disting
Serializes and deserializes clipboard data using custom mechanisms to handle object data.
This project is a generative speech synthesis engine that converts text into high-fidelity human speech. It utilizes a two-stage autoregressive transformer architecture that separates semantic token prediction from acoustic detail reconstruction to balance linguistic accuracy with audio quality. The system is designed to support multilingual output and conversational AI development, enabling the generation of context-aware speech that maintains flow across multiple dialogue turns. The platform distinguishes itself through a production-ready inference server that employs continuous batching to
Provides utilities to pack audio and text data into structured formats for training.
Sanic is an asynchronous Python web framework designed for building high-performance APIs and services. It operates as a production-ready ASGI web server, utilizing a non-blocking event loop to handle concurrent requests and maximize throughput. The framework is built to support scalable architectures, offering built-in worker process management to distribute traffic across available CPU cores. What distinguishes Sanic is its focus on modularity and developer-centric tooling. It features a blueprint-based system for organizing complex applications into pluggable components, alongside a robust
Defines custom functions for serializing and deserializing data formats like JSON to meet specific requirements.
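A custom JSON serializer and deserializer pair can be sketched with the standard library alone. The function names `custom_dumps` and `custom_loads` are illustrative assumptions, and how such functions are registered with the framework is not shown here.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal


def _default(value):
    """Fallback for types the json module cannot encode natively."""
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def custom_dumps(obj):
    """Serialize to compact JSON, encoding datetime and Decimal as strings."""
    return json.dumps(obj, default=_default, separators=(",", ":"))


def custom_loads(text):
    """Deserialize JSON, parsing floats as Decimal to avoid rounding."""
    return json.loads(text, parse_float=Decimal)
```

For example, `custom_dumps({"price": Decimal("1.50")})` yields `{"price":"1.50"}`, while `custom_loads('{"price": 1.50}')` returns the price as `Decimal("1.50")`.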
fq is a command-line binary data processor used to decode, transform, and analyze raw byte streams and bit-level data in structured formats. It works as a functional binary query engine that can filter and map binary structures, and as a converter that translates complex binary blobs and proprietary file formats into standard JSON, YAML, or XML. The tool stands out as a low-level bit manipulator capable of bit-level slicing, bitwise operations, and cryptographic hashing on raw files. It also serves as a network protocol analyzer that can reassemble fragmented TCP streams and decrypt TLS traffic for application-level inspection. The project covers broad capabilities in binary analysis and data transformation, including support for custom decoder definitions and a wide range of specialized formats such as Mach-O, ASN1 BER, and Avro OCF. It provides utilities for binary tree searching, structured text decoding, and bidirectional serialization between binary and text formats. Users can interact with the system through a command-line interface and an interactive REPL for real-time query testing.
Decodes Avro Object Container Format files using compression codecs to inspect stored data.
RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats. The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independen
Parses Avro-serialized data using a schema registry to enable seamless data exchange between different languages.
Materialize is a streaming SQL database that continuously ingests live data from sources such as Kafka, Redpanda, PostgreSQL, and MySQL, and incrementally maintains materialized views. It provides a PostgreSQL-compatible query engine that accepts standard SQL over the PostgreSQL wire protocol, enabling any existing SQL client or BI tool to query real-time data. The system also includes a Model Context Protocol (MCP) server that exposes live materialized view data to AI agents, providing fresh context without polling. Materialize distinguishes itself through its ability to offer configurable c
Decodes Avro messages from Kafka topics using Confluent Schema Registry schemas for typed SQL columns.
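Messages produced with Confluent Schema Registry serializers carry a small header before the Avro payload: one magic byte (0) followed by a 4-byte big-endian schema ID. The sketch below splits that header off; the function name `split_confluent_frame` is hypothetical, and fetching the schema and decoding the payload are left out.

```python
import struct

MAGIC_BYTE = 0  # first byte of the Confluent wire format


def split_confluent_frame(message: bytes):
    """Return (schema_id, avro_payload) from a Confluent-framed message."""
    if len(message) < 5:
        raise ValueError("message too short for Confluent framing")
    # ">bI" = big-endian signed byte followed by unsigned 32-bit int (5 bytes)
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]
```

The returned schema ID would then be used to look up the writer schema in the registry before decoding the remaining bytes as Avro.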
Apache Hive is a SQL-on-Hadoop data warehouse that enables querying and managing petabytes of data stored in distributed storage such as HDFS and cloud storage services. It provides a familiar SQL interface for batch analytics and reporting, supported by a core set of components including the HiveServer2 Thrift service for remote query execution, the Hive Metastore Service for central metadata management, the Hive ACID Transaction Engine for concurrent read-write operations, and the Hive LLAP Interactive Engine for low-latency analytical processing. The WebHCat REST API offers an HTTP interfac
Reads and writes Avro-encoded data as Hive tables, inferring the table schema from the Avro schema and supporting nested structures.
CloudEvents is an open specification for describing event data in a common format across cloud platforms and services. It defines a standard structure and set of metadata attributes for events, enabling interoperability across different systems so producers and consumers can exchange events without custom translation. The specification provides a protocol-agnostic serialization framework that maps CloudEvents attributes and payloads to multiple serialization formats including JSON, Avro, and Protobuf, and defines transport bindings for mapping events onto protocols like HTTP, AMQP, Kafka, MQTT
Defines the type mapping table for serializing CloudEvents attributes into Avro primitives.
kcat is a command-line client for Apache Kafka used to produce, consume, and debug messages using the native protocol. It provides a set of tools for interacting with Kafka clusters, including a protocol debugger for inspecting cluster metadata and a transaction manager for handling atomic message batches. The project features a specialized Avro schema decoder that converts binary-encoded messages into human-readable JSON by integrating with remote schema registries or local files. It also includes an in-memory simulator that allows testing producer and consumer logic by simulating ephemeral broker behavior without requiring external infrastructure. The toolset covers a wide range of messaging operations, including support for balanced consumer groups, timestamp-based offset lookup, and streaming transactional data from standard input. It also provides utilities for configuring connection security and inspecting cluster metadata.
Transforms binary Avro message keys and values into human-readable JSON text.
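To give a sense of what "binary Avro" looks like before it becomes readable text, the sketch below implements the variable-length zigzag coding that the Avro specification uses for `int` and `long` values, plus the length-prefixed UTF-8 encoding of strings. The function names are illustrative and the code is not taken from kcat.

```python
def encode_long(n: int) -> bytes:
    """Encode a 64-bit signed integer as an Avro zigzag varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes map to small codes
    out = bytearray()
    while z & ~0x7F:          # more than 7 bits remain
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf: bytes, pos: int = 0):
    """Decode a zigzag varint at `pos`; return (value, next_position)."""
    shift = 0
    z = 0
    while True:
        byte = buf[pos]
        pos += 1
        z |= (byte & 0x7F) << shift
        if not byte & 0x80:   # continuation bit clear: last byte
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_string(text: str) -> bytes:
    """Encode a string as a long byte count followed by UTF-8 data."""
    data = text.encode("utf-8")
    return encode_long(len(data)) + data
```

Under this scheme the string `"foo"` becomes the bytes `06 66 6f 6f`: the length 3 zigzag-encodes to 6, followed by the three UTF-8 bytes.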
Racket is a general-purpose, multi-paradigm programming language in the Lisp family, designed for creating languages. It works as a language workbench, providing a platform for designing and implementing custom programming languages through a flexible system of macros and modules. The system stands out for offering a comprehensive suite for semantics engineering, allowing the construction of specialized language subsets and teaching layers. It includes tools for custom language design, such as lexer and parser generation, as well as the ability to define module expansion rules and dynamic language selection at read time. The project provides an integrated development environment with a built-in editor, visual debugger, and a software package manager. Its capability surface extends to a general-purpose standard library covering 2D graphics rendering, binary data processing, integration with SQL and deductive databases, and the construction of graphical user interfaces. The environment supports compiling source code into standalone executables for distribution.
Provides serialization and deserialization of data using the Apache Avro protocol based on JSON schemas.
Arroyo is a high-performance stream processing platform built in Rust. It executes continuous SQL queries on streaming data with event-time semantics, enabling accurate windowed aggregations, joins, and stateful computations on unbounded event streams. The platform uses native Rust execution for high throughput and low latency, with periodic checkpointing for exactly-once fault tolerance and horizontal scaling across distributed workers. The system integrates deeply with Kafka for reading and writing topics with exactly-once delivery and supports change data capture (CDC) from MySQL and Postg
Arroyo reads and writes Avro binary data, supporting Confluent Schema Registry and flexible serialization modes for schema distribution.