Why is alibaba/DataX a recommended SQL Data Retrieval repository?

Implements techniques for filtering and extracting specific data from relational tables using SQL WHERE clauses.

Why is risingwavelabs/risingwave a recommended SQL Data Retrieval repository?

Queries real-time data directly using a built-in serving layer and standard SQL.

Why is mouredev/hello-sql a recommended SQL Data Retrieval repository?

Provides guides on writing queries to extract and aggregate specific information from relational tables.

Why is redpanda-data/connect a recommended SQL Data Retrieval repository?

Extracts, filters, and aggregates data from relational tables using standard SQL query language.

Why is NLPchina/elasticsearch-sql a recommended SQL Data Retrieval repository?

Provides the ability to retrieve, filter, sort, and group data from indices using standard SQL syntax.

Why is apache/pinot a recommended SQL Data Retrieval repository?

Exposes a tabular data model for retrieving and analyzing information using standard SQL syntax.

Why is trailbaseio/trailbase a recommended SQL Data Retrieval repository?

Allows direct execution of SQL queries for complex data modeling and retrieval.

Why is biopython/biopython a recommended SQL Data Retrieval repository?

Extracts biological records from relational databases on demand as sequence record objects.

8 repositories

Awesome GitHub Repositories: SQL Data Retrieval

Techniques for extracting, filtering, and aggregating data from relational tables using SQL.
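As a minimal sketch of what extracting, filtering, and aggregating look like in practice, the snippet below uses Python's built-in `sqlite3` module against an in-memory database. The `orders` table, its columns, the sample rows, and the `top_customers` helper are all hypothetical and chosen only for illustration; they do not come from any repository listed here.

```python
import sqlite3


def top_customers(conn, min_amount):
    """Return (customer, total) rows for orders at or above min_amount."""
    query = """
        SELECT customer, SUM(amount) AS total   -- extract and aggregate
        FROM orders
        WHERE amount >= ?                       -- filter rows before grouping
        GROUP BY customer
        ORDER BY total DESC
    """
    # A bound parameter keeps the threshold out of the SQL text itself.
    return conn.execute(query, (min_amount,)).fetchall()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ada", 20.0), ("ada", 5.0), ("bob", 12.5), ("bob", 30.0), ("cy", 3.0)],
)

rows = top_customers(conn, 10.0)
print(rows)
```

The `WHERE` clause drops small orders before `GROUP BY` runs, so a customer whose orders all fall below the threshold does not appear in the result at all.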

Distinguishing note: None of the candidates focus on the general educational practice of writing retrieval queries; they focus on loaders or distributed engines.

Explore 8 GitHub repositories matching Data & Databases · SQL Data Retrieval.


alibaba/DataX
17,241 · View on GitHub
DataX is a distributed data integration framework and plugin-based ETL tool designed for synchronizing large datasets between heterogeneous sources and destinations. It functions as a JDBC data migration engine and offline synchronization tool, enabling the movement of data between relational databases, NoSQL stores, and object storage. The system utilizes a plugin-based connector architecture that decouples reader and writer logic, allowing it to map and transform data types across different storage engines using a standardized internal representation. This design supports heterogeneous data
Implements techniques for filtering and extracting specific data from relational tables using SQL WHERE clauses.
Java
risingwavelabs/risingwave
9,093 · View on GitHub
RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats. The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independen
Queries real-time data directly using a built-in serving layer and standard SQL.
Rust · apache-iceberg · data-engineering · database
mouredev/hello-sql
8,826 · View on GitHub
hello-sql is a collection of educational resources and practical guides designed for mastering relational database design, SQL query writing, and schema mapping. It provides a set of lessons and exercises for practicing the creation and manipulation of data within relational databases. The project includes a database schema workbook for designing tables and mapping relationships, alongside a dedicated SQL query guide for writing selection, filtering, and aggregation statements. These resources are delivered through a relational database tutorial and a broader SQL learning resource. The mater
Provides guides on writing queries to extract and aggregate specific information from relational tables.
Python · basesdedatos · curso · database
redpanda-data/connect
8,681 · View on GitHub
Connect is a Kafka data integration platform and stream processing engine used to build declarative pipelines that move and transform messages between Kafka topics and external sources. It functions as a Kafka Connect framework and a change data capture tool, streaming real-time database modifications to synchronize data across distributed environments. The project differentiates itself through a dedicated mapping language for mutating and reshaping message payloads and the ability to execute custom processing logic within a sandboxed WebAssembly runtime. It also provides an observability pip
Extracts, filters, and aggregates data from relational tables using standard SQL query language.
Go · amqp · cqrs · data-engineering
NLPchina/elasticsearch-sql
7,012 · View on GitHub
This project provides a SQL interface for Elasticsearch, serving as a translator and database layer that allows users to retrieve, filter, and manipulate indices using structured query language. It functions by converting standard SQL statements into the native JSON query language used by the search engine. The system includes a geospatial SQL engine for executing location-based searches and distance calculations. It also features a query debugger used to visualize the translation process from SQL to search engine request bodies to verify the logic and accuracy of data retrieval. The capabil
Provides the ability to retrieve, filter, sort, and group data from indices using standard SQL syntax.
Java
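To make the idea of translating SQL into a JSON search request concrete, here is a toy sketch. It is not the elasticsearch-sql translator and does not reflect that project's code: the `sql_to_search_body` function and its regular expression are hypothetical, and it handles only a single quoted equality condition plus an optional `LIMIT`. The emitted keys (`query`, `term`, `match_all`, `_source`, `size`) are standard parts of an Elasticsearch search request body.

```python
import json
import re

# Hypothetical, deliberately tiny grammar:
#   SELECT <fields> FROM <index> [WHERE <field> = '<value>'] [LIMIT <n>]
_PATTERN = re.compile(
    r"^\s*SELECT\s+(?P<fields>.+?)\s+FROM\s+(?P<index>\w+)"
    r"(?:\s+WHERE\s+(?P<field>\w+)\s*=\s*'(?P<value>[^']*)')?"
    r"(?:\s+LIMIT\s+(?P<limit>\d+))?\s*$",
    re.IGNORECASE,
)


def sql_to_search_body(sql):
    """Map a very small SQL subset to (index name, search request body)."""
    match = _PATTERN.match(sql)
    if match is None:
        raise ValueError("statement is outside the supported toy subset")

    body = {}
    if match.group("field"):
        # An equality predicate becomes an exact-match term query.
        body["query"] = {"term": {match.group("field"): match.group("value")}}
    else:
        body["query"] = {"match_all": {}}

    fields = [name.strip() for name in match.group("fields").split(",")]
    if fields != ["*"]:
        body["_source"] = fields  # restrict the returned fields
    if match.group("limit"):
        body["size"] = int(match.group("limit"))
    return match.group("index"), body


index, body = sql_to_search_body(
    "SELECT name, city FROM users WHERE city = 'Oslo' LIMIT 5"
)
print(index)
print(json.dumps(body, indent=2))
```

A real translator has to parse full SQL (joins, boolean logic, aggregations, sorting) rather than match one pattern; the sketch only shows the shape of the mapping from a `WHERE` condition to a query clause.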
apache/pinot
6,098 · View on GitHub
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Exposes a tabular data model for retrieving and analyzing information using standard SQL syntax.
Java
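trailbaseio/trailbase
5,324 · View on GitHub
Trailbase is a backend-as-a-service (BaaS) platform delivered as a single executable that integrates a real-time database engine, an identity and access manager, and a type-safe API generator. It provides a comprehensive backend environment, including a SQLite-based storage engine and a WebAssembly runtime server for executing custom logic. The platform stands out by automatically converting database schemas into JSON APIs with cross-language client bindings, and by allowing the execution of portable components for server-side rendering and custom HTTP routes. It also integrates vector database functionality to support the storage of embeddings and similarity-based vector search. The system covers a wide range of operational features, including user authentication with social login support, access control lists for data visibility, and publish-subscribe (pub-sub) synchronization for real-time data updates. It also provides tools for managing database schemas through SQL migrations and for handling geospatial data.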
Allows direct execution of SQL queries for complex data modeling and retrieval.
Rust · authentication · database · rest-api
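biopython/biopython
5,078 · View on GitHub
Biopython is a Python bioinformatics library that provides tools for parsing, manipulating, and analyzing biological sequences, molecular structures, and phylogenetic trees. It serves as a biological sequence parser for genomic and proteomic data, supports a range of industry-standard file formats, and acts as an interface for querying biological data and citations from the NCBI Entrez repositories. The project is distinguished by its specialized toolkits for protein structure analysis and phylogenetic tree construction. It includes a protein structure analyzer that processes PDB and mmCIF files to compute molecular geometry, and a phylogenetic tree toolkit for analyzing evolutionary relationships between species. The library covers a broad range of bioinformatics capabilities, including genomic sequence analysis for transcription and translation, sequence alignment management, and population genetics calculations. It also provides structural analysis tools for manipulating 3D atomic coordinates, and utilities for visualizing genomic features and modeling biogeographic data. The system integrates with external bioinformatics binaries through tool wrappers and supports persistent storage of biological records through a SQL backend.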
Extracts biological records from relational databases on demand as sequence record objects.
Python · bioinformatics · biopython · dna
