awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Data Ingestion and Preparation · Awesome GitHub Repositories

3 repos


Tools focused on the initial stages of the pipeline, including loading, formatting, and augmenting raw data for model consumption.

Explore 3 awesome GitHub repositories matching Artificial Intelligence & ML · Data Ingestion and Preparation. Refine with filters or upvote what's useful.


Awesome Data Ingestion and Preparation GitHub Repositories

  • ultralytics/yolov5

    56,830 · GitHub

    YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning

    Python · coreml · deep-learning · ios
  • deepfakes/faceswap

    54,974 · GitHub

    Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users

    Python · deep-face-swap · deep-learning · deep-neural-networks
  • karpathy/nanoGPT

    53,461 · GitHub

    nanoGPT is a lightweight engine for training and fine-tuning transformer-based language models from scratch. It provides a minimalist codebase designed for educational exploration and rapid experimentation with neural network architectures, utilizing self-attention and feed-forward layers to process sequences and predi

    Python

Explore sub-tags

  • Data Augmentation (1 sub-tag): Techniques and pipelines used to artificially expand training datasets by creating modified versions of existing data.
  • Data Preparation Tools: Utilities designed to clean, format, and transform raw data into a structure suitable for machine learning ingestion.
  • Dataset Loaders: Software components that automate the retrieval and loading of datasets into machine learning training pipelines.