awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Dataset Preparation Tutorials · Awesome GitHub Repositories

1 repo


Guides on formatting and processing raw text data for machine learning tasks.

Distinguishing note: Focuses on the data ingestion and preprocessing phase of the machine learning pipeline.

Explore 1 awesome GitHub repository matching education & learning resources · Dataset Preparation Tutorials. Refine with filters or upvote what's useful.


Awesome Dataset Preparation Tutorials GitHub Repositories

  • datawhalechina/self-llm


    28,285 · View on GitHub

    This project is an open-source educational resource providing structured, step-by-step guides for fine-tuning large language models. It focuses on adapting pre-trained transformer-based causal models to custom datasets, enabling users to transfer specific writing styles or domain knowledge into generative AI models. The repository distinguishes itself by emphasizing parameter-efficient training techniques, specifically low-rank adaptation. By providing practical implementations for updating only a small subset of model weights, it allows for the customization of massive neural networks on con

    First, we need to prepare the script data from《甄嬛传》(Empresses in the Palace). Let's look at the format of the raw data.

    ```text
    第2幕
    (退朝,百官散去)
    官员甲:咱们皇上可真是器重年将军和隆科多大人。
    官员乙:隆科多大人,恭喜恭喜啊!您可是国家的大功臣啊!
    官员丙:年大将军,皇上对你可是垂青有加呀!
    官员丁:年大人,您可是皇上的股肱之臣哪!
    苏培盛(追上年羹尧):年大将军请留步。大将军——
    年羹尧:苏公公,有何指教?
    苏培盛:不敢。皇上惦记
    ```
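The excerpt above uses a `speaker:utterance` layout, with scene headers and stage directions on their own lines. As a minimal sketch of one way to turn that layout into training records, the snippet below parses each dialogue line and pairs every line spoken by a chosen character with the line just before it. The names `parse_script` and `to_pairs`, the `instruction`/`input`/`output` record shape, and the pairing rule are illustrative assumptions, not taken from the repository.

```python
import json
import re

# Hypothetical sample that follows the layout of the excerpt above.
RAW = """第2幕
(退朝,百官散去)
官员甲:咱们皇上可真是器重年将军和隆科多大人。
苏培盛(追上年羹尧):年大将军请留步。大将军——
年羹尧:苏公公,有何指教?
"""

# speaker, optional parenthesised stage direction, colon, utterance.
# Accepting both full-width and ASCII punctuation is an assumption.
LINE_RE = re.compile(
    r"^(?P<speaker>[^::((]+)(?:[((][^))]*[))])?[::](?P<text>.+)$"
)


def parse_script(raw):
    """Return a list of {"speaker", "text"} dicts, one per dialogue line."""
    records = []
    for line in raw.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # scene headers and stage directions do not match
            records.append({
                "speaker": match.group("speaker").strip(),
                "text": match.group("text").strip(),
            })
    return records


def to_pairs(records, role):
    """Pair each line spoken by `role` with the preceding line.

    Assumed record shape: instruction / input / output. Scene boundaries
    are ignored here, which a real pipeline would likely want to handle.
    """
    pairs = []
    for prev, cur in zip(records, records[1:]):
        if cur["speaker"] == role and prev["speaker"] != role:
            pairs.append({
                "instruction": prev["text"],
                "input": "",
                "output": cur["text"],
            })
    return pairs


if __name__ == "__main__":
    for pair in to_pairs(parse_script(RAW), "年羹尧"):
        # ensure_ascii=False keeps the Chinese text readable in the output
        print(json.dumps(pair, ensure_ascii=False))
```

Writing one JSON object per line keeps the output easy to stream and inspect. Taking only the immediately preceding line as context is the simplest possible choice, and longer context windows may suit real scripts better.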

    Jupyter Notebook · chatglm · chatglm3 · gemma-2b-it