What are the best open-source alternatives to Neosync?

30 open-source projects similar to nucleuscloud/neosync, ranked by shared features. Top picks: lk-geimfari/mimesis, qovery/replibyte, microsoft/presidio, wisser/jailer, wiseodd/generative-models, dagster-io/dagster, apache/nifi, quantumblacklabs/kedro, treeverse/dvc, matz/streem.

Is lk-geimfari/mimesis a good alternative to Neosync?

Mimesis is a Python synthetic data generator used to create realistic fake datasets and mock data for software testing and development. It functions as a schema-based dataset generator capable of producing structured records and relational datasets, while also serving as a production data anonymize…

Is qovery/replibyte a good alternative to Neosync?

Replibyte is a tool that automates the lifecycle of database snapshots for non-production environments, handling the export, anonymization, subsetting, and restoration of data. It is designed to support privacy-compliant development workflows by replacing sensitive production data with synthetic va…

Is microsoft/presidio a good alternative to Neosync?

Presidio is a PII detection and anonymization framework designed to identify and mask personally identifiable information in text. It functions as a PII recognition pipeline and a data masking engine, using a combination of machine learning, regular expressions, and rule-based logic to locate sensi…

Is wisser/jailer a good alternative to Neosync?

Jailer is a suite of specialized tools for AI-assisted SQL management, referential integrity preservation, and relational data browsing. It provides a system for generating referentially intact database subsets, allowing users to extract consistent slices of relational data while preserving foreign…

Is wiseodd/generative-models a good alternative to Neosync?

This is a generative AI model library containing a collection of PyTorch and TensorFlow implementations for creating synthetic data and modeling complex probability distributions. It serves as a multi-framework repository of deep learning models designed for learning and replicating data patterns.…

Is dagster-io/dagster a good alternative to Neosync?

Dagster is a data orchestration platform designed to manage the entire lifecycle of data assets through declarative modeling and version-controlled code. It functions as a workflow engine that treats data assets as first-class primitives, allowing teams to define, schedule, and monitor complex pipe…

Is apache/nifi a good alternative to Neosync?

Apache NiFi is a flow-based programming platform that enables the visual design, monitoring, and management of data pipelines. At its core, it provides a web-based visual dataflow designer where users build directed graphs of processors to route, transform, and mediate data movement between any sou…

Is quantumblacklabs/kedro a good alternative to Neosync?

Kedro is a data science pipeline framework and production toolbox designed to build reproducible, modular workflows using software engineering best practices. It functions as a data engineering orchestrator and catalog manager, bridging the gap between interactive analysis and maintainable producti…

Is treeverse/dvc a good alternative to Neosync?

DVC is a data versioning tool and pipeline orchestrator designed to track large datasets and machine learning models using external storage and metadata pointers. It integrates with Git by utilizing placeholders to keep heavy artifacts out of the repository while maintaining a versioned link betwee…

Is matz/streem a good alternative to Neosync?

Streem is a stream-based programming language and data pipeline orchestrator. It provides a domain-specific language for defining concurrent data flows, allowing users to link data sources to destinations through a sequence of operations that transform and filter individual stream elements. The sy…


Open-source alternatives to Neosync

30 open-source projects similar to nucleuscloud/neosync, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Neosync alternative.

lk-geimfari/mimesis (4,818 stars)
Mimesis is a Python synthetic data generator used to create realistic fake datasets and mock data for software testing and development. It functions as a schema-based dataset generator capable of producing structured records and relational datasets, while also serving as a production data anonymizer to replace sensitive information with synthetic values. The library distinguishes itself through comprehensive multilingual support, allowing for the generation of locale-specific information to simulate regional user profiles. It ensures reproducibility through deterministic data generation using
Tags: Python, data, dataframe, datascience
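
The seeded, deterministic generation mentioned in the description can be illustrated with a minimal sketch. This is not Mimesis's API: it uses only Python's standard library, and the `FIRST_NAMES` pools and the `make_records` helper are hypothetical names chosen for illustration.

```python
import random

# Hypothetical name pools keyed by locale; a real generator would ship
# far larger, locale-aware datasets.
FIRST_NAMES = {
    "en": ["Alice", "Bob", "Carol"],
    "de": ["Anna", "Jonas", "Lena"],
}


def make_records(seed: int, locale: str, count: int) -> list[dict]:
    """Generate fake user records; the same seed always yields the same rows."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    names = FIRST_NAMES[locale]
    return [
        {"id": i, "name": rng.choice(names), "age": rng.randint(18, 90)}
        for i in range(1, count + 1)
    ]


# Two runs with the same seed produce identical datasets.
first_run = make_records(seed=42, locale="en", count=3)
second_run = make_records(seed=42, locale="en", count=3)
print(first_run == second_run)
```

Fixing the seed is what makes a generated test fixture reproducible across machines and CI runs.
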
qovery/replibyte (4,381 stars)
Replibyte is a tool that automates the lifecycle of database snapshots for non-production environments, handling the export, anonymization, subsetting, and restoration of data. It is designed to support privacy-compliant development workflows by replacing sensitive production data with synthetic values and extracting consistent subsets of rows while preserving referential integrity. The tool operates through a configurable pipeline defined in a YAML file, orchestrating stages such as dump, anonymize, subset, and restore. Each operation runs as an isolated, ephemeral container job, and snapsho
Tags: Rust, aws, backup, cloud
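
To make "subsetting while preserving referential integrity" concrete, here is a minimal sketch in Python. It is not how Replibyte is implemented or configured; the table layout, the `customer_id` foreign key, and the `subset_with_parents` helper are all assumptions made for illustration.

```python
def subset_with_parents(orders, customers, keep_order_ids):
    """Keep the chosen orders plus every customer row they reference."""
    kept_orders = [o for o in orders if o["id"] in keep_order_ids]
    # Collect the foreign keys used by the kept child rows.
    needed_customer_ids = {o["customer_id"] for o in kept_orders}
    # Pull in exactly the parent rows those keys point at.
    kept_customers = [c for c in customers if c["id"] in needed_customer_ids]
    return kept_orders, kept_customers


# Hypothetical sample tables: orders reference customers via customer_id.
customers = [
    {"id": 1, "name": "A"},
    {"id": 2, "name": "B"},
    {"id": 3, "name": "C"},
]
orders = [
    {"id": 10, "customer_id": 1},
    {"id": 11, "customer_id": 3},
    {"id": 12, "customer_id": 3},
]

kept_orders, kept_customers = subset_with_parents(orders, customers, {11, 12})

# Every kept order still points at a customer that exists in the subset.
customer_ids = {c["id"] for c in kept_customers}
print(all(o["customer_id"] in customer_ids for o in kept_orders))
```

A real tool has to follow foreign keys across many tables and in both directions; this sketch covers only a single parent-child relationship.
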
microsoft/presidio (6,995 stars)
Presidio is a PII detection and anonymization framework designed to identify and mask personally identifiable information in text. It functions as a PII recognition pipeline and a data masking engine, using a combination of machine learning, regular expressions, and rule-based logic to locate sensitive entities. The system acts as an NER model orchestrator, allowing for the integration of external named entity recognition models and PII detectors to support multi-language privacy scrubbing. It employs a plugin-based recognizer architecture that can be extended with custom recognizers, deny-li
Tags: Python, anonymization, data-anonymization, data-masking
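
The regular-expression part of PII masking described above can be sketched in a few lines. This is not Presidio's API and covers none of its machine-learning or NER components; the `PATTERNS` table and the `mask_pii` helper are hypothetical, and the patterns are deliberately simplistic.

```python
import re

# Hypothetical, deliberately simple patterns; production detectors need
# far more careful rules plus context and validation.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def mask_pii(text: str) -> str:
    """Replace each detected entity with a placeholder naming its type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


print(mask_pii("Mail jane.doe@example.com or call 555-123-4567."))
# prints: Mail <EMAIL> or call <PHONE>.
```

Replacing a match with its entity type, rather than deleting it, keeps the masked text readable for debugging and testing.
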

Open-source alternatives to Neosync

lk-geimfari/mimesis

Qovery/Replibyte

microsoft/presidio

Wisser/Jailer

wiseodd/generative-models

dagster-io/dagster

apache/nifi

quantumblacklabs/kedro

treeverse/dvc

matz/streem

apache/incubator-airflow

fzaninotto/Faker

Data-Centric-AI-Community/fg-data-synthetic

mage-ai/mage-ai

orchest/orchest

dbt-labs/dbt-core

PrefectHQ/prefect

spotify/luigi

jd-opensource/joyagent-jdgenie

airbytehq/airbyte

apache/dolphinscheduler

Unstructured-IO/unstructured

azkaban/azkaban

airbnb/airflow

microsoft/TinyTroupe

DiUS/java-faker

maiot-io/zenml

apache/flink-cdc

lerocha/chinook-database

Microsoft/Recommenders