What are the best open-source alternatives to Hive?

30 open-source projects similar to apache/hive, ranked by shared features. Top picks: apache/spark, apache/pinot, lancedb/lancedb, hazelcast/hazelcast, prestodb/presto, quickwit-oss/quickwit, apache/druid, apache/flink, cwida/duckdb, line/armeria.

Is apache/spark a good alternative to Hive?

Apache Spark is a unified distributed data processing engine designed for large-scale data analysis and computation graphs. It functions as a distributed machine learning framework, a graph processing system, a real-time stream processor, and a SQL analytics engine. The system enables the executio…

Is apache/pinot a good alternative to Hive?

Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system arch…

Is lancedb/lancedb a good alternative to Hive?

LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The syst…

Is hazelcast/hazelcast a good alternative to Hive?

Hazelcast is a distributed data platform that combines an in-memory data grid with a stream processing engine to support real-time analytics and event-driven applications. It functions as a partitioned, distributed key-value store that replicates data across cluster nodes to provide low-latency acc…

Is prestodb/presto a good alternative to Hive?

Presto is a distributed SQL query engine designed for high-performance analytical processing across heterogeneous data sources. It functions as a data federation platform and massively parallel processing engine, allowing users to execute interactive queries against diverse storage systems without…

Is quickwit-oss/quickwit a good alternative to Hive?

Quickwit is a cloud-native, distributed search engine designed for observability data such as logs, traces, and metrics. It functions as an observability backend that decouples compute from storage by persisting indices directly in S3-compatible cloud object stores. The system is distinguished by…

Is apache/druid a good alternative to Hive?

Apache Druid is a real-time analytics database and distributed columnar time-series store designed for sub-second analytical queries. It functions as a data platform featuring a distributed SQL query engine and a real-time data ingestion system for moving historical and streaming data from external…

Is apache/flink a good alternative to Hive?

Apache Flink is a distributed processing engine designed for both high-throughput, low-latency data streams and finite batch workloads. It functions as a stateful stream processor and a SQL stream processing engine, providing a unified runtime to execute relational queries and event-based transform…

Is cwida/duckdb a good alternative to Hive?

DuckDB is an embedded, in-process analytical SQL database and OLAP database management system. It functions as a data engine for Parquet and CSV files, allowing users to execute complex SQL queries on large datasets without requiring a separate server process. The system is designed for local anal…

Is line/armeria a good alternative to Hive?

Armeria is a Netty-based microservice framework used for building high-performance asynchronous services. It functions as a multi-protocol RPC server capable of exposing gRPC, Thrift, and REST services on a single unified port. The project is distinguished by its ability to run diverse communicati…


Open-source alternatives to Hive

30 open-source projects similar to apache/hive, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Hive alternative.
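
As a rough illustration of what "ranked by how many features they have in common" could mean, the sketch below scores each candidate by the number of feature tags it shares with the target project and sorts by that count. The feature tags, the project names, and the function `rank_by_shared_features` are all hypothetical; the page does not document how its ranking is actually computed.

```python
from typing import Dict, List, Set, Tuple


def rank_by_shared_features(
    target: Set[str], candidates: Dict[str, Set[str]]
) -> List[Tuple[str, int]]:
    """Return (name, shared_count) pairs, highest overlap first.

    Ties are broken alphabetically by name so the order is deterministic.
    """
    scored = [(name, len(target & features)) for name, features in candidates.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical feature tags, for illustration only.
target_features = {"sql", "distributed", "batch", "warehouse"}
candidates = {
    "project-a": {"sql", "distributed", "batch", "streaming"},
    "project-b": {"sql", "embedded"},
    "project-c": {"rpc"},
}

ranking = rank_by_shared_features(target_features, candidates)
for name, shared in ranking:
    print(f"{name}: {shared} shared features")
```

A plain overlap count favors projects with many tags; a real ranking might normalize by the size of each feature set, but that is a design choice this page does not specify.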

apache/spark (43,467 GitHub stars)
Apache Spark is a unified distributed data processing engine designed for large-scale data analysis and computation graphs. It functions as a distributed machine learning framework, a graph processing system, a real-time stream processor, and a SQL analytics engine. The system enables the execution of distributed SQL querying, large-scale graph analysis, and real-time stream analytics across clusters of machines. It also provides a scalable environment for implementing machine learning algorithms and predictive model development on massive datasets. The engine incorporates relational query e
Scala, big-data, java, jdbc
apache/pinot (6,098 GitHub stars)
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Java
lancedb/lancedb (9,031 GitHub stars)
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
HTML, approximate-nearest-neighbor-search, image-search, nearest-neighbor-search

Open-source alternatives to Hive

apache/spark

apache/pinot

lancedb/lancedb

hazelcast/hazelcast

prestodb/presto

quickwit-oss/quickwit

apache/druid

apache/flink

cwida/duckdb

line/armeria

GreptimeTeam/greptimedb

slatedb/slatedb

apache/hbase

druid-io/druid

delta-io/delta

pawelsalawa/sqlitestudio

zhisheng17/flink-learning

Velocidex/velociraptor

memgraph/memgraph

linkedin/school-of-sre

youtube/vitess

comfyanonymous/ComfyUI_examples

apache/shardingsphere

ibis-project/ibis

apache/incubator-druid

ron-rs/ron

apache/ignite

hatchet-dev/hatchet

openobserve/openobserve

Comfy-Org/ComfyUI