10 Repos
Tools that fetch repository metadata and contributor statistics directly from remote service endpoints.
Explore 10 awesome GitHub repositories matching data & databases · GitHub API Aggregators.
This project functions as a curated software directory and developer resource index, providing a centralized platform for discovering and evaluating high-quality open-source repositories. It serves as an aggregator that monitors trending software and educational resources, organizing them by technical domain and programming language to assist developers in identifying tools for their specific technical challenges. The directory distinguishes itself through a community-driven curation workflow, where repository lists are validated and updated based on collective developer consensus. This infor
Retrieves real-time repository metadata and contributor statistics directly from remote service endpoints.
FossFLOW is an open source metadata search engine and data platform designed to aggregate and normalize repository information from multiple code hosting services. It functions as a developer productivity utility, enabling users to discover software projects and analyze contributor networks through a unified, searchable index. The platform distinguishes itself by utilizing vector-based semantic search, which converts project descriptions and code metadata into numerical embeddings to facilitate discovery based on conceptual relevance. To maintain a consistent view of disparate data, the syste
Aggregates and filters project metadata directly from remote code hosting service endpoints.
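The vector-based semantic search mentioned in the description above can be illustrated with a minimal sketch. It assumes each project description has already been converted to an embedding; the three-dimensional vectors, the project names, and the `search` helper are all hypothetical and only show how cosine similarity ranks entries by conceptual closeness. This is not the project's actual implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, index, top_k=2):
    """Return the names of the top_k index entries most similar to query_vec."""
    ranked = sorted(index.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Hypothetical toy embeddings; real ones would come from an embedding model.
index = {
    "stream-processor": [0.9, 0.1, 0.0],
    "star-tracker": [0.1, 0.9, 0.2],
    "geo-catalog": [0.0, 0.2, 0.9],
}
results = search([0.8, 0.2, 0.1], index)
print(results)
```

A production index would typically swap the full sort for an approximate nearest-neighbor structure, but the ranking principle is the same.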
This project is a collection of educational resources and reference implementations for the Apache Flink stream processing framework. It provides a learning resource focused on mastering distributed stream processing through implementation guides, performance tuning tutorials, and practical examples. The repository features detailed walkthroughs for building real-time data pipelines using the DataStream and Table APIs. It includes specific integration examples for connecting Apache Flink with Kafka brokers and Elasticsearch indices, as well as reference implementations for real-time deduplica
Groups continuous data flows into temporal or count-based windows to perform periodic aggregations.
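As a rough illustration of the windowing idea in the tag above, the sketch below groups events into fixed, non-overlapping (tumbling) time windows and into count-based windows, then sums each group. It is plain Python rather than the Flink DataStream or Table API; the function names and the sample events are assumptions made for the example.

```python
from collections import defaultdict

def tumbling_time_windows(events, size_s):
    """Sum values per fixed, non-overlapping time window, keyed by window start."""
    totals = defaultdict(float)
    for ts, value in events:
        window_start = int(ts // size_s) * size_s  # floor the timestamp to its window
        totals[window_start] += value
    return dict(sorted(totals.items()))

def count_windows(values, size):
    """Sum every `size` consecutive values; a trailing partial window is dropped."""
    return [sum(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

# Hypothetical (timestamp_seconds, value) events.
events = [(0.5, 1.0), (3.2, 2.0), (5.0, 4.0), (9.9, 1.0), (10.0, 3.0)]
time_totals = tumbling_time_windows(events, 5)
count_totals = count_windows([1, 2, 3, 4, 5, 6, 7], 3)
print(time_totals)
print(count_totals)
```

A real stream processor also has to handle late or out-of-order events and decide when a window is complete; this sketch assumes the full input is available up front.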
Devhub is a cross-platform developer tool and event aggregator designed to monitor GitHub activities. It provides a unified interface for tracking issues, notifications, and user actions across multiple repositories, consolidating these updates into a single view to reduce notification clutter. The application utilizes a multi-column dashboard for organizing data streams via customizable filters and saved searches. This interface allows for the management of review queues, the monitoring of specific user actions, and the display of notification context without requiring navigation to the sour
Collects repository updates and user actions into a single view to reduce notification clutter.
Star History is a suite of utilities for visualizing the growth of GitHub repositories over time. It functions as a star growth visualizer, a repository comparison tool, a metric embedder for external websites, and a trending analytics dashboard. The project enables the analysis of star acquisition rates for multiple repositories on a single chart to determine relative growth. It also provides the ability to rank repositories by growth windows to identify rising projects. The system covers project analytics and open source benchmarking by generating time-series charts and growth reports. It
Aggregates star event data by polling the GitHub API for historical repository metrics.
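The aggregation step behind a star-growth chart can be sketched without any network access. The example assumes a list of ISO 8601 star timestamps has already been collected (for instance from the `starred_at` field that the GitHub stargazers endpoint can return); the sample timestamps and the helper name are made up for illustration.

```python
from collections import Counter
from datetime import datetime
from itertools import accumulate

def cumulative_stars_by_month(starred_at):
    """Turn star timestamps into (month, running total) pairs in chronological order."""
    per_month = Counter(
        datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").strftime("%Y-%m")
        for ts in starred_at
    )
    months = sorted(per_month)
    running = accumulate(per_month[m] for m in months)
    return list(zip(months, running))

# Hypothetical timestamps, deliberately out of order.
sample = [
    "2024-01-03T10:00:00Z",
    "2024-01-20T08:30:00Z",
    "2024-03-01T00:00:00Z",
    "2024-02-11T12:00:00Z",
]
history = cumulative_stars_by_month(sample)
print(history)
```

Plotting several repositories on one chart then amounts to computing one such series per repository and drawing them on a shared time axis.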
RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats. The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independen
Groups continuous event flows into time intervals to calculate periodic metrics.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Consumes events from streaming sources to create a unified, queryable SQL view across microservice architectures.
Starred is a utility that automates the management and documentation of starred repositories. It functions by fetching repository metadata through the GitHub API and organizing these projects into structured, categorized lists based on programming language or topic. The tool distinguishes itself by maintaining these lists through automated, scheduled workflows that synchronize data directly to a dedicated repository. It supports the inclusion of private repositories in the generated output, ensuring that a user's complete collection is documented and backed up. The project provides a configu
Fetches repository metadata and contributor statistics directly from remote service endpoints.
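Organizing fetched repositories into categorized lists can be sketched as a group-and-render step. The example assumes the metadata is already available as dictionaries with `full_name` and `language` keys, as in GitHub API repository objects; the sample repositories and the Markdown-style output layout are hypothetical, not the tool's actual format.

```python
from collections import defaultdict

def group_by_language(repos):
    """Group repository names by language; repos without one go under 'Other'."""
    groups = defaultdict(list)
    for repo in repos:
        groups[repo.get("language") or "Other"].append(repo["full_name"])
    return {lang: sorted(names) for lang, names in sorted(groups.items())}

def render(groups):
    """Render the groups as a simple heading-plus-bullets text list."""
    lines = []
    for lang, names in groups.items():
        lines.append(f"## {lang}")
        lines.extend(f"- {name}" for name in names)
    return "\n".join(lines)

# Hypothetical starred repositories.
repos = [
    {"full_name": "example/zeta", "language": "Python"},
    {"full_name": "example/alpha", "language": "Python"},
    {"full_name": "example/relay", "language": "Go"},
    {"full_name": "example/notes", "language": None},
]
groups = group_by_language(repos)
print(render(groups))
```

Running this on a schedule and committing the rendered text to a repository would approximate the synchronization workflow described above.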
Indigo is a Go-based framework and toolkit for building, hosting, and scaling services within the AT Protocol ecosystem. It provides the foundational infrastructure for decentralized social networking, enabling developers to implement relay services, manage cryptographically signed user repositories, and handle identity resolution across federated environments. The project is distinguished by a robust architecture that decouples content hosting from discovery, which enables independent moderation and algorithmic feed generation. It uses content-addressed storage and Merkle-tree-based repository structures to ensure data integrity, while Lexicon-driven schema generation automatically produces type-safe structures for cross-service communication. By mapping human-readable handles to decentralized identifiers, the system preserves verifiable user ownership and account portability across independent hosting providers. Beyond its core identity, the project covers a broad range of distributed state management, including real-time event streaming, synchronization, and automated moderation. It offers extensive tooling for simulating network activity, operational telemetry, and indexing global data streams. The framework is designed for production environments and provides containerized deployment options as well as diagnostic endpoints for monitoring synchronization health and system performance.
Collects and streams data records from multiple personal data servers into a unified feed for real-time network monitoring and indexing.
This project provides a curated catalog of community-contributed geospatial datasets designed for environmental analysis and mapping workflows. It functions as a centralized repository for discovering and retrieving geographic information, facilitating access to earth observation data without the need for manual preprocessing. Beyond its role as a data catalog, the project includes automation utilities for maintaining project documentation and monitoring repository health. It uses marker-based text injection to dynamically update documentation files and aggregates public engagement metrics, s
Fetches repository metadata and community engagement statistics directly from remote service endpoints.
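The marker-based text injection mentioned in the description above can be sketched as replacing whatever sits between a pair of marker comments. The marker syntax (`<!-- name:start -->` and `<!-- name:end -->`), the `inject` helper, and the sample document are assumptions for illustration; the project's actual markers may differ.

```python
import re

def inject(text, name, body):
    """Replace the content between the start and end markers for `name` with `body`."""
    start = f"<!-- {name}:start -->"
    end = f"<!-- {name}:end -->"
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    replacement = f"{start}\n{body}\n{end}"
    # A function replacement keeps backslashes in `body` from being read as escapes.
    new_text, count = pattern.subn(lambda _match: replacement, text, count=1)
    if count == 0:
        raise ValueError(f"markers for {name!r} not found")
    return new_text

# Hypothetical documentation file content.
readme = "# Catalog\n<!-- stats:start -->\nold\n<!-- stats:end -->\nFooter\n"
updated = inject(readme, "stats", "Stars: 42")
print(updated)
```

Because the markers are preserved, the update is repeatable: running it again with the same body leaves the file unchanged, which suits scheduled documentation refreshes.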