This project is a synthetic data generator that creates realistic tabular and time-series datasets for machine learning and testing workflows. It is a privacy-preserving platform: it models the statistical distributions of the source data and produces new records that preserve the original statistical properties and structural integrity. Its statistical sampling is CPU-optimized, so high-performance data generation runs on standard hardware without specialized graphics processing units. It employs a configuration-driv
This is a generative AI model library: a collection of PyTorch and TensorFlow implementations for creating synthetic data and modeling complex probability distributions. It serves as a multi-framework repository of deep learning models that learn and replicate data patterns. The project provides specialized implementation suites for several generative architectures. These include Generative Adversarial Networks, which train competing generator and discriminator models, Variational Autoencoder frameworks that map data to a latent space, and Restricted Boltzmann Machine and Deep
Easy-dataset is a platform for end-to-end management of machine learning datasets, tailored to fine-tuning language and vision models. It is a centralized environment for the entire data lifecycle: automated generation of synthetic training data, structural organization of document collections, and systematic annotation of individual data points. The platform distinguishes itself through its integrated evaluation and orchestration capabilities. It provides a dedicated suite for benchmarking models, featuring blind side