30 open-source projects similar to dius/java-faker, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Java Faker alternative.
Bogus is a fake data generator for .NET applications, including C#, F#, and VB.NET. It provides a deterministic mock data engine and an object configuration mapper to produce realistic profiles, addresses, and financial records. The library differentiates itself through a localization data provider that generates region-specific identifiers across various international languages and locales. It ensures reproducibility across executions by using seed values to control the sequence of generated data. The project covers wide-ranging data synthesis capabilities, including the generation of netwo
Faker is a PHP library for creating realistic synthetic data used for testing, prototyping, and populating database entities. It serves as a test data generator and localized mocking tool capable of producing synthetic names, addresses, and identifiers specific to various countries and languages. The library provides mechanisms to ensure data consistency and quality, including deterministic seeding to produce identical data sequences across executions and stateful uniqueness tracking to prevent duplicate values. It also supports probability-weighted optionality to simulate missing data and cu
Faker is a Ruby library used to generate randomized, realistic placeholder information for testing and development. It produces synthetic data to populate databases and test application logic without the use of real user information. The library provides localized data generation, using region-specific formats and strings for names, addresses, and phone numbers. It supports deterministic output through seedable random number generation, ensuring that sequences of fake data can be repeated across different test runs. The generator covers a wide range of domains, including personal identity, f
Faker is a PHP fake data generator and testing utility used to produce realistic randomized values for populating databases and test applications. It serves as a localization library that generates data tailored to specific languages and regional formats, providing a framework for extending data generation through custom classes and domain-specific formatters. The library ensures repeatability in testing environments through deterministic random seeding. It includes mechanisms to control output quality, such as enforcing value uniqueness and simulating missing data by occasionally producing n
Mimesis is a Python synthetic data generator used to create realistic fake datasets and mock data for software testing and development. It functions as a schema-based dataset generator capable of producing structured records and relational datasets, while also serving as a production data anonymizer to replace sensitive information with synthetic values. The library distinguishes itself through comprehensive multilingual support, allowing for the generation of locale-specific information to simulate regional user profiles. It ensures reproducibility through deterministic data generation using
gofakeit is a Go library for creating realistic synthetic datasets and populating Go structs with mock information. It functions as a deterministic data generator, allowing for seedable random number generation to ensure reproducible datasets for software testing. The project distinguishes itself by providing a mock data API server that exposes generation functions as HTTP endpoints and a synthetic dataset exporter for producing files in CSV, JSON, and XML formats. It also includes a command-line interface for generating mock data directly from the terminal. The library covers a wide array o
whodb is a multi-database management interface and notebook client designed for exploring and managing data across various engines, including Postgres, MySQL, MongoDB, and Redis. It functions as a graphical interface for managing database connections, records, and schemas through a unified layer. The project features a natural language query interface that uses large language models to translate plain English into executable SQL or NoSQL queries. This is supported by schema-aware prompting that injects database metadata into the model context to ensure generated queries match actual table def
Mock is a JavaScript API mocking tool and network request interceptor designed to decouple front-end development from back-end progress. It functions as an API simulation tool and mock data generator, allowing developers to build user interfaces and high-fidelity prototypes by mimicking the request and response cycle without a live server. The system provides a mechanism for intercepting outgoing HTTP calls and returning simulated data. It enables front-end prototyping by generating synthetic datasets to validate application behavior during automated testing cycles and development. Capabilit
SQLiteStudio is an open-source graphical tool for browsing, editing, and managing SQLite database files. It combines a full-featured SQL editor with syntax highlighting, a visual database schema designer for creating entity-relationship diagrams, and a plugin-based extensibility platform that allows adding custom functionality through C/C++, JavaScript, Tcl, or Python. The application distinguishes itself through its multi-language scripting engine, which embeds JavaScript, Tcl, and Python interpreters to enable user-defined functions and scripts within SQL queries. It supports encrypted data
Goravel is a full-featured development scaffold and framework for building web applications, REST APIs, and gRPC services using the Go programming language. It implements a model-view-controller architecture and provides a comprehensive toolkit for high-performance remote procedure call servers and clients. The framework is distinguished by its extensive integrated ecosystem, which includes a fluent object-relational mapper for database management and a dedicated command-line interface toolkit for administrative automation and project scaffolding. It features a driver-based service abstractio
This project is a synthetic data generator designed to create realistic tabular and time-series datasets for machine learning and testing workflows. It functions as a privacy-preserving platform that models the underlying statistical distributions of source data to produce new records that maintain the original statistical properties and structural integrity. The tool distinguishes itself by utilizing CPU-optimized statistical sampling, allowing for high-performance data generation on standard hardware without the need for specialized graphics processing units. It employs a configuration-driv
This is a generative AI model library containing a collection of PyTorch and TensorFlow implementations for creating synthetic data and modeling complex probability distributions. It serves as a multi-framework repository of deep learning models designed for learning and replicating data patterns. The project provides specialized implementation suites for several generative architectures. This includes Generative Adversarial Networks using competing generator and discriminator models, Variational Autoencoder frameworks that map data to a latent space, and Restricted Boltzmann Machine and Deep
Chance is a JavaScript library for generating random data, designed to produce realistic test data for automated tests and prototypes. It uses a Mersenne Twister pseudo-random number generator that accepts an optional seed value, enabling reproducible sequences of random values across multiple runs. The library provides a wide range of generators for common data types, including random integers, floats, booleans, characters, strings, and dates, all with configurable ranges and character pools. It can generate realistic geographic data like addresses, as well as financial data such as credit c
Alice is a PHP test data generator and fixture library used to automate the creation of large sets of fake objects and entities. It functions as an object hydrator and random data provider, allowing users to define the structure and attributes of dummy test data in markup or arrays to simulate specific application states. The library distinguishes itself through a template-based system that supports fixture inheritance to reduce data duplication. It utilizes a flexible instantiation model that allows for custom factory integration, method invocation, and property hydration via reflection or c
This project is a framework for generating synthetic tabular data that preserves the statistical properties and relational integrity of original source datasets. It functions as a metadata-driven engine, utilizing language models to synthesize information even when original training samples are restricted. The system is designed to maintain logical consistency across complex, multi-table structures while ensuring that generated outputs adhere to defined schema requirements. The platform distinguishes itself through a focus on privacy-preserving synthesis, integrating tools to quantify and mit
TypeORM Seeding is a development utility designed to automate database population and schema management within TypeORM-based projects. It provides a framework for resetting database structures and injecting consistent data, facilitating predictable states for testing and local development environments. The tool distinguishes itself through a factory-based approach to data generation, allowing developers to define reusable templates that produce randomized entity records. By integrating directly with the existing database abstraction layer, it ensures that generated objects are persisted into
TinyTroupe is a multi-agent simulation framework designed to create populations of persona-based agents that interact to generate synthetic behavioral data and business insights. It serves as a persona-based agent orchestrator and synthetic data generator, allowing for the definition of agents with specific personality traits and goals to coordinate their interactions through structured workflows. The project features an extensible plugin system for connecting simulated agents to external tools and servers to execute code and access remote data. It includes an agentic simulation dashboard tha
Faker is a synthetic data generation library used to create realistic but fake information, such as names, addresses, and phone numbers, for software testing and database population. It functions as a tool for producing synthetic test data to fill development databases with records that simulate production environments. The library provides localized data generation, allowing synthetic information to be customized for specific geographic regions and language formats. It also includes a mechanism for unique value enforcement to prevent the repetition of generated data by tracking and rejecting
NeoSync is a database synchronization tool and data pipeline orchestrator designed to move and transform datasets across different environments. It functions as a PII data security platform and a synthetic data generator, allowing for the synchronization of production data while ensuring privacy compliance. The system utilizes an event-sourced coordinator to manage asynchronous data movements, providing automated retry and failure handling. It differentiates itself by combining rule-based PII anonymization and detection with schema-based synthetic data generation to create artificial datasets
JSONPlaceholder is a REST API mock server and JSON mocking service that provides a hosted frontend development sandbox. It functions as a fake backend that returns predefined JSON responses to simulate a REST API for development and testing. The service supports cross-origin resource sharing, allowing API integrations to be tested from different browser domains. It enables the simulation of CRUD operations and the retrieval of mock data without requiring a live database. The system maps URL patterns to a static JSON-based data store and handles requests statelessly. It includes capabilities
This project is a Git commit standardization tool and semantic commit generator. It serves as an assistant to align code changes with semantic versioning by enforcing a consistent commit structure and formatting rules. The utility uses interactive prompts to gather user input, which it then validates against semantic categories and interpolates into predefined templates. This process automates the generation of standardized messages, ensuring that each commit follows a specific format to improve project history and traceability. The system also handles commit metadata structuring, including
mlfinlab is a Python machine learning library for finance designed for building and validating models used in quantitative trading and portfolio management. It provides a financial data engineering toolkit and a quantitative strategy backtesting framework to transform raw market data into predictive signals and target classes. The library includes a synthetic financial data generator to create artificial datasets that mimic the statistical properties of real assets for stress testing. It also provides specialized tools for financial time series labeling and sampling to prevent data leakage in
Ebean is a Java object-relational mapping framework designed to simplify database persistence through automated query generation, schema migration, and transaction management. It uses metadata-driven mapping and bytecode enhancement to bridge the gap between application objects and relational database tables, providing a persistence layer that handles complex data interactions while maintaining consistency across unit-of-work boundaries. The framework distinguishes itself through its focus on developer productivity and performance optimization. It provides type-safe query builders that generat
AngularJS Internationalization Library is a localization framework for AngularJS 1.x applications. It functions as an i18n translation tool used to swap static user interface text with localized versions based on a selected locale. The framework manages multi-language content through asynchronous loading of translation files to minimize bundle size. It includes systems for handling pluralization rules and interpolating dynamic variables into translation strings. The project also provides capabilities for language switching and fallback-chain resolution to ensure a readable string is displaye
Evidently is an AI observability platform and evaluation framework designed to quantify the performance of machine learning models and large language models. It functions as a monitoring tool for detecting data drift and quality degradation in tabular datasets, while providing a specialized analyzer for the faithfulness and correctness of retrieval-augmented generation systems. The project distinguishes itself through an evaluation framework that utilizes judge models and custom rubrics to score language model outputs. It includes tools for iterative prompt optimization and the generation of
Giskard is an AI quality assurance suite and evaluation framework designed to measure the performance, bias, and security risks of large language models and AI agents. It functions as a vulnerability scanner to detect security flaws and performance regressions. The project provides automated red-teaming and adversarial testing workflows. These tools generate prompt-injection probes and adversarial attacks based on system descriptions to identify security gaps and vulnerabilities. The platform covers AI agent auditing and RAG quality validation, using knowledge-base grounding and synthetic da
Orval is an OpenAPI-to-TypeScript code generator that produces fully typed API clients, data-fetching hooks, mock data, validation schemas, and server handlers from OpenAPI or Swagger specifications. It reads any YAML or JSON API specification and generates TypeScript interfaces, HTTP request functions, and framework-specific integration code that ensures compile-time correctness for all API calls. The project distinguishes itself by generating production-ready data-fetching hooks for React Query, Vue Query, Svelte Query, Solid Query, Angular, and SWR, complete with automatic cache invalidati
This project is a Python machine learning education kit that provides curated datasets and visualization scripts to teach fundamental machine learning concepts. It functions as both a machine learning visualization library and a collection of educational datasets designed for demonstrating and testing common models and patterns. The toolkit focuses on illustrating the internal logic and operational patterns of machine learning algorithms. It generates figures and datasets that visualize how different models behave and operate on data to aid in the learning process. The implementation utilize
test_db is a collection of tools for validating database integrity, benchmarking system throughput, and generating synthetic schemas and datasets. It includes a sample corporate employee database for MySQL, a SQL dataset generator for creating representative records, and an integrity validator that uses checksums and record counts to verify data consistency across different database engines. The project provides a database performance benchmark consisting of complex queries and stored procedures designed to measure system response times and throughput. These tools simulate real-world workload