73 repositories
Encoding data into JSON formats validated against specific schemas for interoperability.
Explore 73 GitHub repositories matching data & databases · JSON-Schema.
MinerU is a document parsing pipeline designed to transform unstructured files into machine-readable, structured data. It utilizes deep learning models to perform layout analysis, identifying document regions and extracting complex content such as mathematical expressions. By combining these neural network inferences with geometric heuristics, the system reconstructs the reading order and structural hierarchy of documents to ensure accurate data representation. The project distinguishes itself through a multi-stage processing workflow that integrates layout detection, optical character recogn
Encodes extracted document features and spatial coordinates into a standardized schema for seamless interoperability.
This project is a comprehensive dataset and archive of classical Chinese poetry, prose, and Confucian classics. It serves as a digital humanities corpus, providing machine-readable access to hundreds of thousands of poems and detailed poet biographies, specifically spanning the Tang and Song dynasties. The collection is distinguished by its scholarly depth, incorporating textual variation annotations to track disputed characters across different source editions. It also includes tonal pattern mapping to describe the rhythmic and phonetic structures of the verse, alongside a popularity ranking
Validates the structural integrity of the poetry collection using JSON schemas to ensure data consistency.
This project is a Python framework for building autonomous, event-driven agent systems. It provides a unified runtime for orchestrating multi-agent workflows, managing persistent conversation state, and executing code within secure, isolated sandbox environments. The framework is designed to handle complex task delegation, allowing agents to invoke other agents as tools while maintaining context across multi-turn interactions. The framework distinguishes itself through its deep integration with the Model Context Protocol, enabling agents to connect to external data sources and remote services
Structures agent responses using strict JSON schemas and custom data models for predictable output.
Joi is a JavaScript data validation library used to define schemas that validate, cast, and sanitize data objects. It functions as an object schema validator and parser, ensuring that input data matches specific types and formats before it is processed by an application. The library features a conditional validation engine capable of dynamic schema enforcement, where validation logic and dependencies change based on the values of other keys within an object. It also serves as a data casting and sanitization tool, transforming input values into target types and removing sensitive keys from the
Serializes the internal validation rule tree into JSON for external analysis or automatic UI generation.
Tolaria is a markdown knowledge base manager and bidirectional note linking system. It functions as an integrated environment for organizing notes and structured data, utilizing YAML frontmatter and wikilinks to establish relational mappings between documents. The project distinguishes itself by integrating language model capabilities directly into the editor for content generation and analysis. It further combines prose with structured data through a markdown spreadsheet editor that renders CSV-formatted files as interactive grids with formula support and cross-sheet referencing. The platfo
Uses structured YAML frontmatter in headers to define document types, properties, and relational links.
GraphiQL is an interactive browser-based integrated development environment for writing, testing, and documenting GraphQL queries and mutations. It functions as a code editor, an API exploration tool, and a schema explorer, providing a visual interface for browsing GraphQL types and fields. The project provides a language server that delivers schema-driven autocompletion, linting, and validation via the Language Server Protocol. It allows for the embedding of high-performance editors into external projects and supports the injection of custom tools and components through a sidebar plugin API.
Checks JSON variables against a generated schema based on declared operations to ensure correct input types.
This project is a JSON Schema form generator and React UI component that automatically creates web forms based on JSON Schema definitions. It serves as a schema-driven form builder used to transform declarative data definitions into interactive user interfaces for data entry and validation. The tool provides capabilities for dynamic form generation and JSON Schema integration, allowing for the automatic creation of input fields and layouts to avoid manual coding. It implements schema-based validation to ensure user input adheres to standardized JSON Schema rules in real time. The system mana
Integrates JSON Schema definitions to drive the structure and validation of interactive user interfaces.
RapidJSON is a high-performance C++ library used for parsing and generating JSON data. It provides both document object model and stream-based interfaces to transform JSON strings into structured data and vice versa. The library includes a JSON schema validator to verify that documents conform to predefined rules and a Unicode transcoder for converting strings between UTF-8, UTF-16, and UTF-32 encodings. It also supports relaxed parsing for non-standard JSON containing comments or trailing commas. Additional capabilities cover JSON pointer navigation for locating specific values and string s
Includes a schema validator to verify that JSON documents conform to predefined structural rules.
RapidJSON is a header-only C++ library designed for high-performance parsing, generation, and manipulation of JSON data. It functions as a dual-mode engine, providing both an in-memory document object model for tree-based manipulation and a stream-based interface for event-driven processing. The library is built to minimize memory footprint and maximize execution speed, making it suitable for resource-constrained environments. The library distinguishes itself through advanced memory management and optimization techniques, including in-situ parsing that modifies input buffers directly to elimi
Ensures data integrity by verifying structured content against predefined JSON-Schema rules during parsing or generation.
Outlines is a guided text generation framework and structured output engine for large language models. It enforces precise structural constraints on model output during the sampling process to ensure the generation of valid data. The framework ensures that model outputs strictly adhere to predefined data models, including JSON schemas, regular expressions, and formal grammars. This enables the conversion of natural language inputs into structured arguments for function calling and the generation of valid JSON for downstream processing. The system manages model orchestration through prompt te
Ensures large language models produce valid JSON that adheres to specific schemas for reliable downstream processing.
🔮 Graphile's Crystal Monorepo; home to Grafast, PostGraphile, pg-introspection, pg-sql2 and much more!
Exports the in-memory GraphQL schema as standalone JavaScript code that runs without database introspection.
PostGraphile is an automated tool that converts a PostgreSQL database schema into a fully functional GraphQL API. It serves as a GraphQL execution engine and schema orchestrator, utilizing database schema introspection to retrieve strongly typed metadata directly from PostgreSQL. The project features a modular system for composing and standardizing GraphQL schemas through plugins, which manage naming conventions and connections. It includes a PostgreSQL query builder that constructs dynamic, SQL-injection-proof queries using tagged template literals. The system employs a declarative query pl
Converts in-memory schemas into raw JavaScript source code for execution in standalone environments.
This project is a machine learning research automation system designed to manage the full research lifecycle, from idea discovery to final paper submission. It utilizes markdown-based skill templates to execute autonomous research tasks and manage iterative loops of deep review and experimentation. The system distinguishes itself through integrated capabilities for academic communication and integrity auditing. It can automate the generation of LaTeX papers, conference slide decks, and evidence-grounded peer review rebuttals. To ensure rigor, it employs cross-model review routing and adversar
Translates free-form text into schema-constrained JSON programs to specify entities and spatial relationships.
This project provides a standardized RESTful API for accessing comprehensive aerospace mission records and space exploration data. It serves as a structured interface for retrieving historical and upcoming launch details, hardware specifications, and media assets, while also providing real-time tracking for satellite orbital paths. The service distinguishes itself through a robust architecture designed for high-performance data retrieval. It utilizes in-memory response caching to reduce latency and server load, alongside query-parameter-based filtering that allows users to precisely control t
Uses JSON-Schema to serialize and validate internal database records for consistent API responses.
This project is a comprehensive Node.js software development kit designed for integrating large language models into applications. It serves as a foundational client for interacting with REST and WebSocket services, enabling developers to implement chat functionality, multimodal content generation, and autonomous agent orchestration. The library provides a structured framework for defining executable tools and enforcing JSON schemas, ensuring that model outputs remain programmatically compatible with downstream systems. The SDK distinguishes itself through its robust request orchestration and
Validates model-generated function arguments against JSON schemas to ensure type safety and reliable downstream execution.
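As a rough illustration of what validating model-generated function arguments against a schema can involve, the sketch below checks a small subset of JSON Schema (`type`, `properties`, `required`) using only the Python standard library. It is not the SDK's code or API: `WEATHER_ARGS_SCHEMA` and `validate_args` are hypothetical names, and a real project would more likely rely on a full JSON Schema validator.

```python
import json

# Hypothetical tool schema; the names are illustrative, not taken from the SDK.
WEATHER_ARGS_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "days": {"type": "integer"},
    },
    "required": ["city"],
}

# Maps JSON Schema type names to the Python types json.loads produces.
_JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def validate_args(raw_json, schema):
    """Return a list of error strings; an empty list means the arguments passed."""
    try:
        args = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    errors = []
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, rule in schema.get("properties", {}).items():
        if name not in args:
            continue
        value = args[name]
        expected = _JSON_TYPES[rule["type"]]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and rule["type"] in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            errors.append(f"field {name} should be of type {rule['type']}")
    return errors
```

Returning a list of errors rather than raising on the first one lets a caller report every problem back to the model in a single retry prompt.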
This is a markdown-based blog engine and Next.js starter template that renders posts from files with frontmatter support for tags, authors, and metadata. It functions as a static site generator, building a complete blog into deployable HTML files for any hosting provider, while using Tailwind CSS utility classes for fully customizable typography, layout, and color schemes. The template generates RSS feeds, sitemaps, and structured metadata for search engine visibility, and supports connecting to external services like analytics tracking and comment sections through configurable plugin modules
Extracts post attributes like title, date, and tags from YAML frontmatter blocks for automated page generation.
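Frontmatter extraction of this kind can be sketched in a few lines. The example below is an assumption-laden illustration, not the template's code (which is a Next.js project): `split_frontmatter` is a hypothetical helper that handles only flat `key: value` pairs and inline `[a, b]` lists, whereas real YAML needs a proper parser.

```python
def split_frontmatter(text):
    """Split a markdown document into (metadata dict, body string)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # no closing delimiter: treat everything as body

    meta = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        value = value.strip()
        # Treat "[a, b]" as a flat list of strings.
        if value.startswith("[") and value.endswith("]"):
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value.strip("'\"")
    body = "\n".join(lines[end + 1:])
    return meta, body
```

The returned metadata dictionary is what a page generator would then use to fill in titles, dates, and tag indexes.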
Altair is a declarative data visualization library for Python that generates Vega-Lite specifications. It functions as a tool for mapping data to graphical marks using a high-level syntax, allowing users to describe the desired visual outcome instead of writing imperative drawing commands. The framework enables the creation of interactive charts and graphics, including linked views and filtered displays that respond to user input in real time. It supports the design of multi-view dashboards by combining visualizations into layered or faceted layouts. The library provides capabilities for sta
Translates Python class structures into standardized JSON specifications that describe a visualization's visual and data mappings.
Jackson is a Java data binding framework and multi-format data serializer used to translate data structures into native language objects. It functions as a JSON data binding library and a streaming parser that reads and writes data as discrete tokens to process large datasets with minimal memory. The project distinguishes itself through a bytecode serialization accelerator that replaces standard reflection with generated bytecode to increase data binding speed. It employs a module-based extensibility model to support a wide range of formats beyond JSON, including XML, YAML, CSV, TOML, and bin
Creates JSON schema definitions from data models to validate the structure of incoming documents.
This project is a curated directory and catalog of privacy-respecting software and security-focused services. It serves as a structured resource for finding alternatives to corporate services, focusing on tools that prioritize data sovereignty, end-to-end encryption, and user anonymity. The directory is maintained as a markdown-based resource list and rendered via a static site generator. It further extends its utility through a CORS-enabled public API and a JSON-based data schema, allowing the curated catalog of tools and providers to be retrieved programmatically. The collection covers a w
Utilizes a consistent JSON-based data schema to enable programmatic retrieval and categorization of privacy tools.
This project is a cross-editor theme library and mapper that provides a curated collection of syntax highlighting color schemes. It uses a JSON theme specification to define visual roles and color palettes as immutable data structures, ensuring these definitions remain consistent across different software output targets. The system functions as a declarative style generator, translating abstract color definitions into the specific configuration formats required by various programming environments. By using a template-based approach, it maps a single source of truth for color definitions to mu
Uses immutable JSON data structures to define visual themes consistently across all output targets.
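The idea of mapping one read-only theme definition to several output formats through templates might look roughly like the sketch below. Everything in it is assumed for illustration: the theme name, roles, colors, and the two simplified target templates are made up and do not reflect the project's actual specification or supported editors.

```python
import json
from string import Template
from types import MappingProxyType

# Hypothetical theme spec; roles and colors are made up for illustration.
THEME_JSON = '{"name": "dusk", "background": "#1e1e2e", "foreground": "#cdd6f4", "keyword": "#cba6f7"}'

# MappingProxyType gives a read-only view, so rendering code cannot mutate the theme.
THEME = MappingProxyType(json.loads(THEME_JSON))

# One template per output target; both formats are simplified stand-ins.
TEMPLATES = {
    "vim": Template("hi Normal guifg=$foreground guibg=$background\nhi Keyword guifg=$keyword"),
    "css": Template(".editor { color: $foreground; background: $background; }\n.kw { color: $keyword; }"),
}


def render(target):
    """Fill the target's template from the single shared theme definition."""
    return TEMPLATES[target].substitute(THEME)
```

Because every target reads from the same mapping, changing a color in the JSON source changes it in all generated configurations at once.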