29 repositories
Processes for mapping disparate data structures into a unified internal schema.
Distinguishing note: Focuses on schema unification across multiple external sources.
Explore 29 awesome GitHub repositories matching data & databases · Data Normalization.
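As a rough illustration of what "mapping disparate data structures into a unified internal schema" can look like, here is a minimal Python sketch. The two sources, their field names, and the `UnifiedRecord` type are all hypothetical and are not taken from any repository listed here.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class UnifiedRecord:
    """Hypothetical internal schema shared by every source."""
    source: str
    record_id: str
    title: str
    amount_cents: int


def from_source_a(raw: Dict[str, Any]) -> UnifiedRecord:
    # Assumed shape: flat record, price as a decimal string in major units.
    return UnifiedRecord(
        source="a",
        record_id=str(raw["id"]),
        title=raw["name"].strip(),
        amount_cents=int(Decimal(raw["price"]) * 100),
    )


def from_source_b(raw: Dict[str, Any]) -> UnifiedRecord:
    # Assumed shape: nested record, amount already in minor units.
    return UnifiedRecord(
        source="b",
        record_id=raw["meta"]["uuid"],
        title=raw["meta"]["label"].strip(),
        amount_cents=int(raw["cost"]["cents"]),
    )


# One adapter per external source; everything downstream sees UnifiedRecord only.
ADAPTERS: Dict[str, Callable[[Dict[str, Any]], UnifiedRecord]] = {
    "a": from_source_a,
    "b": from_source_b,
}


def normalize(source: str, raw: Dict[str, Any]) -> UnifiedRecord:
    """Dispatch a raw record to the adapter registered for its source."""
    try:
        adapter = ADAPTERS[source]
    except KeyError:
        raise ValueError(f"no adapter registered for source {source!r}") from None
    return adapter(raw)
```

Keeping one small adapter per source means a new provider can be added without touching the code that consumes the unified records.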
VeighNa is an event-driven, modular platform designed for the development, backtesting, and execution of automated financial trading strategies. It provides a comprehensive suite of tools that includes a centralized trading terminal for monitoring portfolios and market conditions, alongside a robust algorithmic trading engine that manages real-time data processing and order execution. The platform distinguishes itself through a highly decoupled architecture that isolates algorithmic logic from market connectivity, allowing for independent strategy development and testing. It utilizes a dynami
Normalizes heterogeneous market data and order streams into a consistent internal format for cross-platform analysis.
Notifire is a multi-channel notification infrastructure designed to route and dispatch alerts across email, SMS, push, and chat providers through a unified interface. It functions as an agent communication gateway that normalizes inbound and outbound messages between chat platforms and AI agents for consistent data processing. The system includes a notification workflow engine that uses branching conditions and batching capabilities to design delivery sequences and reduce user fatigue. It also provides a pre-built notification center component, allowing web applications to embed a real-time i
Translates disparate third-party communication formats into a single consistent data model for downstream processing.
NewPipe is a privacy-focused media client that aggregates content from multiple streaming platforms into a single, unified interface. By utilizing a specialized parsing engine, the application extracts structured metadata directly from raw web content, allowing users to browse and play media without requiring individual service accounts or proprietary tracking. The application distinguishes itself through a decoupled playback engine that separates core streaming logic from the user interface, enabling persistent background audio and floating window playback. To ensure consistent access, the s
Maps disparate platform data structures into a unified internal schema to provide a consistent browsing experience across multiple sources.
Zeroclaw is a modular framework for building and deploying autonomous agents that integrate AI models, messaging platforms, and hardware interfaces. It functions as a multi-agent orchestrator and embedded systems controller, providing a unified runtime for managing agent lifecycles, memory, and security policies across diverse environments. The system distinguishes itself through its focus on secure, verifiable hardware and software orchestration. It enforces strict security boundaries, including command allowlisting, resource throttling, and interactive human-in-the-loop approval for sensiti
Aggregates inbound data from multiple communication platforms into a unified format for consistent processing.
Antigravity-Manager is an artificial intelligence model orchestration platform that functions as a unified gateway for interacting with multiple external service providers. It standardizes heterogeneous vendor data structures into a consistent internal schema, allowing third-party tools to interface with various models through a single, normalized API. The system distinguishes itself through automated infrastructure management, including the lifecycle tracking of service accounts and the secure rotation of authentication credentials. By acting as a middleware layer, it intercepts traffic to p
Normalizes heterogeneous vendor-specific data structures into a consistent internal schema.
Firefly III is a self-hosted personal finance management system built on a double-entry bookkeeping engine. It provides a comprehensive platform for tracking income, expenses, and account balances while maintaining financial integrity through structured accounting principles. Designed for private use, the system supports multi-user access, allowing independent financial administrations to coexist within a single installation. The platform distinguishes itself through extensive automation and integration capabilities. It features a robust REST JSON API and webhook system that enables programma
Maps disparate transaction structures into a unified internal schema for financial tracking.
Vector is a high-performance observability data pipeline designed to collect, transform, and route logs, metrics, and traces across distributed infrastructure. It functions as a modular engine that decouples data ingestion from processing and transmission, utilizing a component-based architecture to connect diverse sources to multiple destinations. The project distinguishes itself through a focus on reliability and flow control. It implements backpressure-aware data movement to prevent data loss during traffic spikes and utilizes disk-backed event buffering to ensure durability during network
Normalizes non-ASCII log messages into UTF-8 format during ingestion.
This project is a comprehensive research platform designed for the end-to-end lifecycle of robotic learning. It provides a modular framework for training neural network policies—specifically through imitation and reinforcement learning—and deploying them onto physical robotic hardware. By offering a unified interface for hardware abstraction, the platform decouples high-level control logic from the specific sensors and actuators of diverse robotic systems. The framework distinguishes itself through a standardized approach to data and policy management. It utilizes a consistent schema for reco
Scales raw action values into a normalized range for model training and inference.
OpenObserve is a unified observability data platform designed to ingest, store, and analyze logs, metrics, and traces. It functions as a cloud-native monitoring tool that centralizes telemetry from diverse sources, including standard collectors and cloud service providers, into a single, scalable system. By utilizing a columnar storage engine backed by object storage, the platform enables efficient long-term data retention and high-performance analytical querying. The platform distinguishes itself through deep integration with artificial intelligence, allowing users to query data using natura
Standardizes incoming data from diverse frameworks into a unified format for consistent analysis.
Telegraf is a modular, cross-platform telemetry pipeline designed to collect, process, and route metrics from diverse infrastructure, applications, and hardware. It functions as a server-side middleware that normalizes heterogeneous data into a unified format, enabling consistent monitoring across complex environments. By utilizing a plugin-driven architecture, the agent manages the entire lifecycle of telemetry data from initial ingestion to final transmission. The project distinguishes itself through a declarative, configuration-driven execution model that allows users to define complex dat
Converts diverse data formats from heterogeneous sources into a unified internal representation for consistent processing and storage.
FossFLOW is an open source metadata search engine and data platform designed to aggregate and normalize repository information from multiple code hosting services. It functions as a developer productivity utility, enabling users to discover software projects and analyze contributor networks through a unified, searchable index. The platform distinguishes itself by utilizing vector-based semantic search, which converts project descriptions and code metadata into numerical embeddings to facilitate discovery based on conceptual relevance. To maintain a consistent view of disparate data, the syste
Maps disparate repository structures and contributor metrics into a consistent internal format for uniform querying.
This project is a comprehensive framework for engineering financial data pipelines, designed to automate the collection, cleaning, and synchronization of large-scale market datasets. It functions as a quantitative trading data engine, providing the infrastructure necessary to manage historical and real-time asset pricing information for research and machine learning workflows. The system distinguishes itself through a configuration-driven approach to orchestration, allowing users to manage complex data acquisition tasks across multiple financial providers. It features resilient middleware tha
Normalizes historical price series by applying adjustments for stock splits and dividends.
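A minimal sketch of split back-adjustment in Python, for readers unfamiliar with the idea. It is not this project's implementation: the function name and input shapes are assumptions, dividend adjustment is left out, and splits whose ex-date is missing from the price series are ignored.

```python
from typing import Dict, List, Tuple


def back_adjust_for_splits(
    prices: List[Tuple[str, float]],
    splits: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rescale pre-split closes so the whole series is in post-split terms.

    prices: (ISO date, close) pairs sorted by ascending date.
    splits: ex-date -> ratio, e.g. 2.0 for a 2-for-1 split (assumed format).
    """
    adjusted: List[Tuple[str, float]] = []
    factor = 1.0
    # Walk from newest to oldest so each close is divided by every later split.
    for date, close in reversed(prices):
        adjusted.append((date, close / factor))
        if date in splits:
            # Prices strictly before this ex-date are still in pre-split units.
            factor *= splits[date]
    adjusted.reverse()
    return adjusted


series = [("2024-01-01", 100.0), ("2024-01-02", 102.0), ("2024-01-03", 51.5)]
adjusted_series = back_adjust_for_splits(series, {"2024-01-03": 2.0})
```

With the sample data, the two closes before the 2-for-1 ex-date are halved and the ex-date close is left unchanged, which removes the artificial 50% drop from the series.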
LibreTV is a self-hosted media aggregator and streaming client designed to consolidate video content from multiple external providers into a single, unified library. By standardizing metadata and media formats, the platform provides a centralized interface for browsing and managing personal media collections. The application distinguishes itself through its focus on uninterrupted playback and efficient navigation. It features automated manifest parsing to detect and strip commercial segments from video streams, ensuring an ad-free viewing experience. Additionally, the interface supports direc
Maps heterogeneous metadata from disparate external video providers into a unified internal schema.
Redux Toolkit is a state management toolkit and store configurator designed to simplify the development of Redux applications by reducing boilerplate code. It functions as an immutable state manager and a centralized store configuration system that provides a streamlined workflow for managing global application state. The project distinguishes itself through an automated async action orchestrator that manages the lifecycle of promises by automatically dispatching pending, fulfilled, and rejected actions. It also acts as a normalized state organizer, providing tools to structure complex relati
Provides utilities to organize and maintain complex relational data in a flattened, normalized format.
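The "flattened, normalized format" can be pictured as lookup tables keyed by ID, with nested objects replaced by ID references. The sketch below shows that shape in plain Python; it is not Redux Toolkit's API, and the `posts`/`users`/`comments` entities are hypothetical.

```python
from typing import Any, Dict, List


def normalize_posts(posts: List[Dict[str, Any]]) -> Dict[str, Dict[Any, Dict[str, Any]]]:
    """Flatten nested posts into per-entity tables keyed by ID."""
    users: Dict[Any, Dict[str, Any]] = {}
    posts_by_id: Dict[Any, Dict[str, Any]] = {}
    comments: Dict[Any, Dict[str, Any]] = {}

    for post in posts:
        author = post["author"]
        users[author["id"]] = author  # each user is stored once

        comment_ids = []
        for comment in post.get("comments", []):
            commenter = comment["author"]
            users[commenter["id"]] = commenter
            comments[comment["id"]] = {
                "id": comment["id"],
                "text": comment["text"],
                "author": commenter["id"],  # reference by ID, not a nested copy
            }
            comment_ids.append(comment["id"])

        posts_by_id[post["id"]] = {
            "id": post["id"],
            "title": post["title"],
            "author": author["id"],
            "comments": comment_ids,
        }

    return {"users": users, "posts": posts_by_id, "comments": comments}
```

Because each entity lives in exactly one place, updating a user's name touches one table entry rather than every post and comment that embeds it.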
Vue Storefront is a composable commerce platform designed to decouple the presentation layer from backend systems. By providing a headless frontend framework, it enables developers to build high-performance, mobile-first digital storefronts that remain independent of specific commerce engines, payment providers, or content management systems. The platform distinguishes itself through a modular architecture that uses standardized integration adapters to aggregate data from disparate services into a unified layer. This approach allows businesses to modernize legacy infrastructure or manage comp
Standardizes data structures across disparate commerce backends to ensure frontend consistency.
CS-Xmind-Note is a collection of structured mind maps and conceptual diagrams serving as a comprehensive knowledge base for computer science fundamentals. It functions as an academic reference and study guide, organizing core subjects into a visual mapping of interdependent technical concepts. The project utilizes an XMind-compatible schema to model complex domains through hierarchical nodes and relational concept mapping. This approach allows for the visual representation of technical layers, linking hardware specifications to software abstractions. The knowledge base covers several primary
Explains the process of decomposing relational schemas into normal forms to eliminate data redundancy.
Om is a frontend state management library and reactive user interface framework that integrates ClojureScript functional programming with the React virtual DOM rendering engine. It provides a bridge to build responsive web interfaces where visual elements automatically update when underlying application data changes. The project centers on a normalized state store that flattens complex data structures into a relational format. This data is accessed through a reader-based querying system, which decouples the user interface from the state by allowing components to declare specific data requirem
Flattens complex nested data structures into a relational format for efficient storage and retrieval within the application state.
Faraday is a vulnerability management platform and security tool aggregator designed to centralize security findings from multiple scanners into a single dashboard. It utilizes a relational security database to catalog hosts, services, and security flaws, enabling users to track remediation and analyze organizational risk. The platform distinguishes itself through a plugin-based system that normalizes diverse security tool outputs into a unified data model. It supports deep integration with a wide array of scanners and CLI tools, intercepting shell command output or parsing report files to ag
Normalizes diverse security tool outputs into a unified data model with standardized severity levels.
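One way to picture severity standardization is a per-tool lookup into a shared vocabulary. The following Python sketch assumes two made-up scanners (`scanner_x`, `scanner_y`) and a five-level scale; neither the tool names nor the mappings come from Faraday.

```python
from typing import Dict

# Assumed target vocabulary for the unified data model.
STANDARD_LEVELS = ("info", "low", "medium", "high", "critical")

# Hypothetical per-tool vocabularies mapped onto the standard levels.
LABEL_MAPS: Dict[str, Dict[str, str]] = {
    "scanner_x": {"note": "info", "warning": "medium", "error": "high"},
    "scanner_y": {"1": "low", "2": "medium", "3": "high", "4": "critical"},
}


def normalize_severity(tool: str, raw_label: object) -> str:
    """Translate a tool-specific severity label into a standard level."""
    key = str(raw_label).strip().lower()
    if key in STANDARD_LEVELS:
        return key  # already in the shared vocabulary
    # Unknown tools or labels are flagged rather than silently guessed.
    return LABEL_MAPS.get(tool, {}).get(key, "unclassified")
```

Returning an explicit `"unclassified"` marker for unrecognized input is a design choice in this sketch; a real pipeline might instead reject the finding or log it for review.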
re-frame is a functional framework for building single-page applications (SPAs) in ClojureScript. It provides a centralized, immutable database that serves as the single source of truth for the entire application state, enforcing a strict unidirectional data flow in which events lead to state changes and subsequent view updates. The framework features a reactive signal graph and an interceptor-based middleware pipeline. By treating application logic as a series of data-driven events and declarative side effects, it separates business logic from the view layer. This architecture lets developers manage complex state changes and external operations through pure functions, ensuring that side effects are executed by a separate interpreter rather than by imperative calls. The system includes a comprehensive set of capabilities for managing application architecture, including reactive data derivation, subscription-based view reconciliation, and event-driven state management. It supports advanced development workflows such as event tracing, state checkpointing, and the ability to mock side effects for isolated testing. The project is designed to integrate with React, leveraging virtual DOM reconciliation to update user interfaces efficiently. It provides a robust set of tools for handling cross-cutting concerns, managing complex data-flow graphs, and coordinating asynchronous operations within a sequential, predictable event pipeline.
Structures complex data into flat, relational maps within the central store to mirror server-side database schemas and simplify data updates.
This project is a comprehensive collection of Python programming educational materials, including tutorials, exercises, and curated code samples. It serves as a learning curriculum and software engineering toolkit, using Jupyter Notebooks to combine executable code with descriptive instructional text. The repository provides practical implementation guides for building large language model applications, such as retrieval-augmented generation systems, stateful AI agents, and machine learning workflows. It distinguishes itself by offering a structured approach to agentic coding workflows, covering context window distillation, provider-agnostic model routing, and schema-enforced structured outputs. The materials cover a wide range of software engineering capabilities, including asynchronous programming with distributed task queues, web application development with REST APIs, and data analysis workflows. It also includes resources for mastering object-oriented design, implementing CI/CD pipelines, and applying professional formatting and linting standards.
Demonstrates mathematical transformation of numerical data using techniques like min-max or Z-score normalization.
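For reference, the two techniques named above can be sketched with the Python standard library alone. This is a generic illustration, not code from the repository; the function names are assumptions, and constant input is mapped to zeros to avoid division by zero.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values: Sequence[float]) -> List[float]:
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


scaled = min_max_normalize([2.0, 4.0, 6.0])
standardized = z_score_normalize([2.0, 4.0, 6.0])
```

Min-max scaling preserves relative spacing within a fixed range but is sensitive to outliers, while z-scores are unbounded and express each value as a distance from the mean in standard deviations.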