Why is Netflix/falcor a recommended Data Source Unification repository?

Represents multiple remote data sources as a single virtual JSON model for consistent access.

Why is yusufkaraaslan/Skill_Seekers a recommended Data Source Unification repository?

Merges content from docs, code, and documents while detecting conflicts between documentation and implementation.

Why is cue-lang/cue a recommended Data Source Unification repository?

Combines data from different sources by merging constraints into a single consistent result.

Why is spaceandtimefdn/blitzar a recommended Data Source Unification repository?

Unifies real-time indexed blockchain data and off-chain datasets into a single verifiable source.

Why is useplunk/plunk a recommended Data Source Unification repository?

Merges transactional, campaign, and workflow interactions into a single comprehensive user record.

Why is apache/gravitino a recommended Data Source Unification repository?

Organizes metadata from diverse sources into a hierarchical structure of metalakes, catalogs, and schemas.

6 repositories

Awesome GitHub Repositories: Data Source Unification

Processes that merge disparate data sources while detecting contradictions between different representations of the same information.

Distinguishing note: None of the candidates cover the specific act of merging multi-format sources with conflict detection.

Explore 6 awesome GitHub repositories matching data & databases · Data Source Unification. Refine with filters or upvote what's useful.


Netflix/falcor
Falcor is a JavaScript library that models remote data as a single virtual JSON graph, providing a path-based query engine for efficient client-side data retrieval and updates. It represents multiple remote data sources as a unified document where entities are accessed via globally unique identity paths. The system distinguishes itself by treating the remote data model as a virtual JSON resource, allowing the client to query specific paths without managing individual endpoints. It uses a reference-aware graph model to handle many-to-many relationships and prevents data duplication. Network ef
Represents multiple remote data sources as a single virtual JSON model for consistent access.
JavaScript
10,572 · View on GitHub
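
The idea of a single virtual JSON graph queried by path, with references preventing duplication, can be illustrated with a small Python sketch. This is not Falcor's API: `ref`, `is_ref`, `get_value`, and the sample `graph` are hypothetical names chosen for illustration, and the `{"$type": "ref", "value": [...]}` node shape is only modeled on the reference convention Falcor describes.

```python
# Minimal sketch of a path-queried JSON graph with references.
# All names here are illustrative assumptions, not Falcor's actual API.

def ref(path):
    """Build a reference node pointing at another path in the graph."""
    return {"$type": "ref", "value": list(path)}


def is_ref(node):
    """Return True if `node` is a reference node."""
    return isinstance(node, dict) and node.get("$type") == "ref"


def get_value(graph, path, max_hops=32):
    """Walk `path` from the graph root, following references as they appear."""
    node = graph
    keys = list(path)
    hops = 0
    while keys or is_ref(node):
        if is_ref(node):
            hops += 1
            if hops > max_hops:
                raise ValueError("too many reference hops (possible cycle)")
            # Restart from the root at the referenced path, then continue
            # with whatever keys of the original path are still pending.
            keys = list(node["value"]) + keys
            node = graph
            continue
        node = node[keys.pop(0)]
    return node


# Each entity is stored once under its identity path; lists hold references.
graph = {
    "todosById": {"1": {"name": "buy milk", "done": False}},
    "todos": {0: ref(["todosById", "1"])},
    "urgent": {0: ref(["todosById", "1"])},
}

name_via_todos = get_value(graph, ["todos", 0, "name"])
name_via_urgent = get_value(graph, ["urgent", 0, "name"])
```

Both lookups resolve to the same entity under `todosById`, which is how a reference-aware graph can express many-to-many relationships without copying data.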

yusufkaraaslan/Skill_Seekers
Skill Seekers is a toolset for generating large language model knowledge bases, featuring a multi-source content scraper and a dedicated RAG data pipeline. It extracts technical data from documentation, code, and video to create structured assets and configuration files for AI-powered IDE extensions. The project distinguishes itself through the ability to transform raw data into polished tutorials and specialized skills for AI plugin marketplaces. It utilizes abstract syntax tree parsing and optical character recognition to analyze GitHub repositories, PDFs, and video frames, converting these
Merges content from docs, code, and documents while detecting conflicts between documentation and implementation.
Python · ai-tools · ast-parser · automation
9,641 · View on GitHub
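
Detecting conflicts between documentation and implementation can be sketched with Python's standard `ast` module. This is an assumption-laden illustration, not Skill Seekers' code: `signatures_from_source`, `find_conflicts`, and the sample `documented` and `source` values are hypothetical, and the comparison only looks at positional parameter names of top-level functions.

```python
import ast

# Hypothetical sketch: compare documented signatures with parsed source.
# None of these names come from the Skill Seekers project.

def signatures_from_source(source):
    """Map each top-level function name to its positional parameter names."""
    tree = ast.parse(source)
    return {
        node.name: [arg.arg for arg in node.args.args]
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def find_conflicts(documented, source):
    """Return (function name, reason) pairs where docs and code disagree."""
    implemented = signatures_from_source(source)
    conflicts = []
    for name, doc_params in sorted(documented.items()):
        if name not in implemented:
            conflicts.append((name, "documented but not implemented"))
        elif implemented[name] != doc_params:
            conflicts.append(
                (name, f"docs say {doc_params}, code has {implemented[name]}")
            )
    for name in sorted(set(implemented) - set(documented)):
        conflicts.append((name, "implemented but not documented"))
    return conflicts


# Example inputs (made up for illustration).
source = (
    "def load(path, encoding):\n"
    "    return path\n"
    "\n"
    "def save(path):\n"
    "    return path\n"
)
documented = {"load": ["path"], "export": ["path"]}

conflicts = find_conflicts(documented, source)
```

Here `export` is flagged as missing from the code, `load` as having a mismatched parameter list, and `save` as undocumented.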

cue-lang/cue
CUE is a constraint-based configuration language designed for data validation, schema definition, and code generation. At its core, it unifies types and values into a single concept, enabling compile-time validation that catches structural and value errors before runtime. The language treats data and constraints as the same thing, allowing a single definition to serve as both a schema and concrete configuration data. CUE distinguishes itself through its constraint-based unification engine, which combines multiple configuration sources into a single coherent result by merging their constraints
Combines data from different sources by merging constraints into a single consistent result.
Go · configuration · data · kubernetes
6,147 · View on GitHub
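
The notion of combining sources by merging constraints, where a schema and concrete data are treated alike, can be approximated in a few lines of Python. This is a deliberately simplified analogue, not CUE's unification algorithm: `unify` and `ConflictError` are hypothetical names, and Python types such as `str` and `int` stand in for constraints.

```python
# Simplified analogue of constraint unification; not CUE's implementation.

class ConflictError(ValueError):
    """Raised when two sources cannot be reconciled."""


def unify(a, b, path=()):
    """Merge two values; types act as constraints, concrete values must agree."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, b_val in b.items():
            if key in merged:
                merged[key] = unify(merged[key], b_val, path + (key,))
            else:
                merged[key] = b_val
        return merged
    if isinstance(a, type) and isinstance(b, type):
        if a is b:
            return a
        raise ConflictError(f"{'.'.join(path)}: {a.__name__} vs {b.__name__}")
    if isinstance(a, type):
        a, b = b, a  # normalise so that any type constraint sits in `b`
    if isinstance(b, type):
        if isinstance(a, b):
            return a  # the concrete value satisfies the constraint
        raise ConflictError(f"{'.'.join(path)}: {a!r} is not {b.__name__}")
    if a == b:
        return a
    raise ConflictError(f"{'.'.join(path)}: {a!r} vs {b!r}")


# Three made-up sources: a schema and two partial configurations.
schema = {"service": {"name": str, "replicas": int}}
base = {"service": {"name": "api"}}
override = {"service": {"replicas": 3}}

result = unify(unify(schema, base), override)

try:
    unify(result, {"service": {"replicas": 5}})
    conflict_message = None
except ConflictError as exc:
    conflict_message = str(exc)
```

Unlike a last-writer-wins merge, a second source that sets `replicas` to a different value is reported as a conflict rather than silently overriding the first.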

spaceandtimefdn/blitzar
Blitzar is a verifiable SQL proof engine and cryptographic library designed for verifiable SQL computation. It allows database queries to run off-chain while generating zero-knowledge proofs that certify the correctness of the results for on-chain verification. The project distinguishes itself with a GPU-based proof accelerator that offloads heavy cryptographic workloads to graphics processors, reducing the time required to generate succinct proofs. It provides high-performance cryptographic primitives for C++ and Rust applications, focusing on elliptic curve operations and multi-scalar multiplication. The system covers a broad data management and security surface, including trustless data integration that combines blockchain indexing with off-chain datasets in tamper-proof relational tables. It uses BFT consensus and threshold signatures to maintain state integrity, along with mechanisms for quorum-based data synchronization and delivery of verified results through smart contract callbacks. The codebase provides native bindings for C++ and Rust to expose its cryptographic toolkits and proof computation libraries.
Unifies real-time indexed blockchain data and off-chain datasets into a single verifiable source.
C++ · cpp20 · curve25519 · elliptic-curve-cryptography
4,884 · View on GitHub

useplunk/plunk
Plunk is an SMTP email marketing platform and contact relationship manager used for sending bulk broadcasts and transactional emails. It provides a transactional email API for delivering personalized messages using templates and variable substitution, supported by built-in analytics and custom domain authentication. The platform features an email automation workflow engine with a visual builder for creating multi-step sequences triggered by user events and conditional logic. It includes a dynamic audience segmentation tool that groups contacts based on real-time data attributes and behavioral
Merges transactional, campaign, and workflow interactions into a single comprehensive user record.
TypeScript
4,875 · View on GitHub

apache/gravitino
Gravitino is a federated metadata lake and unified data catalog designed to manage tables, files, and AI models across diverse data sources and cloud storage. It serves as a centralized interface for governing schemas, access controls, and tagging across relational databases, messaging queues, and object stores. The project distinguishes itself by unifying the management of AI assets, such as machine learning models and their version lineages, alongside traditional tabular data. It also implements the Iceberg REST specification to provide a standardized metadata server and proxy for lakehouse
Organizes metadata from diverse sources into a hierarchical structure of metalakes, catalogs, and schemas.
Java · ai-catalog · data-catalog · datalake
2,866 · View on GitHub
