What are the best open-source alternatives to LakeFS?

30 open-source projects similar to treeverse/lakefs, ranked by shared features. Top picks: lancedb/lancedb, prefecthq/prefect, aws/aws-cdk, delta-io/delta, magit/magit, attic-labs/noms, xirong/my-git, clearml/clearml, treeverse/dvc, insforge/insforge.

Is lancedb/lancedb a good alternative to LakeFS?

LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The syst…

Is prefecthq/prefect a good alternative to LakeFS?

Prefect is a workflow orchestration platform designed to define, schedule, and monitor complex data pipelines as Python code. It functions as a container-native engine that wraps individual tasks in isolated environments, ensuring consistent dependencies and resource allocation across diverse infra…

Is aws/aws-cdk a good alternative to LakeFS?

The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for…

Is delta-io/delta a good alternative to LakeFS?

Delta is a lakehouse table format that brings ACID transactions and data warehouse consistency to large-scale data lakes on cloud object storage. It serves as an ACID transaction manager, coordinating atomic commits and serializable isolation for concurrent reads and writes across distributed compu…

Is magit/magit a good alternative to LakeFS?

Magit is a complete Git interface that runs inside Emacs, providing a full-featured porcelain for version control operations without leaving the editor. It renders repository state as structured, collapsible sections within Emacs buffers, and manages Git command execution through a transactional pr…

Is attic-labs/noms a good alternative to LakeFS?

Noms is a distributed version control database and content-addressable data store. It identifies data by cryptographic hashes to ensure integrity and deduplication, while tracking dataset state changes through a sequence of immutable commits to enable branching, forking, and historical recovery. T…

Is xirong/my-git a good alternative to LakeFS?

my-git is a comprehensive framework and reference guide for Git version control administration, repository governance, and software release management. It provides a structured approach to managing the software development lifecycle, from initial feature branching to final production deployment. T…

Is clearml/clearml a good alternative to LakeFS?

ClearML is a comprehensive MLOps platform designed to manage the end-to-end machine learning lifecycle, from initial experimentation to production deployment. It provides a suite of integrated tools including a pipeline orchestrator for automating workflows, an experiment tracking tool for logging…

Is treeverse/dvc a good alternative to LakeFS?

DVC is a data versioning tool and pipeline orchestrator designed to track large datasets and machine learning models using external storage and metadata pointers. It integrates with Git by utilizing placeholders to keep heavy artifacts out of the repository while maintaining a versioned link betwee…

Is insforge/insforge a good alternative to LakeFS?

InsForge is a backend-as-a-service platform that provides an integrated suite of tools for managing relational databases, identity provision, object storage, and serverless compute. It functions as an open-source identity provider and a PostgreSQL database manager featuring integrated vector storag…

Open-source alternatives to LakeFS

30 open-source projects similar to treeverse/lakefs, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best LakeFS alternative.

lancedb/lancedb
9,031 stars
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
HTML, approximate-nearest-neighbor-search, image-search, nearest-neighbor-search
PrefectHQ/prefect
21,640 stars
Prefect is a workflow orchestration platform designed to define, schedule, and monitor complex data pipelines as Python code. It functions as a container-native engine that wraps individual tasks in isolated environments, ensuring consistent dependencies and resource allocation across diverse infrastructure. By utilizing a state-machine-based orchestration model, the system tracks execution progress through discrete transitions and persistent event logs to maintain reliable and observable task processing. The platform distinguishes itself through a decoupled worker-API architecture, which sep
Python, automation, data, data-engineering
aws/aws-cdk
12,817 stars
The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for the automated management of complex cloud environments through a centralized, code-driven control plane. The framework distinguishes itself through its ability to model infrastructure as a dependency-aware resource graph, ensuring that components are provisioned and updated in the correct order. It
TypeScript, aws, cloud-infrastructure, hacktoberfest

Open-source alternatives to LakeFS

lancedb/lancedb

PrefectHQ/prefect

aws/aws-cdk

delta-io/delta

magit/magit

attic-labs/noms

xirong/my-git

clearml/clearml

treeverse/dvc

InsForge/InsForge

wandb/wandb

wekan/wekan

dagster-io/dagster

datahub-project/datahub

firebase/quickstart-js

agis/git-style-guide

bup/bup

zimfw/zimfw

huggingface/smollm

minio/minio-go

rustfs/rustfs

zfile-dev/zfile

backup/backup

ling-drag0n/CloudPaste

juicedata/juicefs

great-expectations/great_expectations

ydataai/ydata-profiling

huggingface/datasets

CapSoftware/Cap

SimplifyJobs/Summer2026-Internships