huggingface/datatrove

3,092 stars · 273 forks · Python · Apache-2.0

Datatrove

Features

Data Curation and Filtering - Library for building scalable, platform-agnostic text processing pipelines.
Data Pipelines - Processes, filters, and deduplicates large-scale text data.
Data Processing - Platform-agnostic pipeline blocks for data processing.
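
To make the idea of composable pipeline blocks that filter and deduplicate text more concrete, here is a minimal sketch in plain Python. It does not use Datatrove's API: the names `min_length_filter`, `exact_dedup`, and `run_pipeline` are hypothetical, and the hash-based exact-match deduplication is an assumption chosen for illustration, not necessarily the method the library uses.

```python
import hashlib
from typing import Callable, Iterable, Iterator, List

# Illustrative only: a "block" is any callable that consumes a stream of
# documents and yields a (possibly smaller) stream of documents.
Block = Callable[[Iterable[str]], Iterator[str]]


def min_length_filter(min_chars: int) -> Block:
    """Hypothetical filter block: drop documents shorter than min_chars."""
    def block(docs: Iterable[str]) -> Iterator[str]:
        for doc in docs:
            if len(doc.strip()) >= min_chars:
                yield doc
    return block


def exact_dedup() -> Block:
    """Hypothetical dedup block: drop documents whose normalized text was already seen."""
    def block(docs: Iterable[str]) -> Iterator[str]:
        seen = set()
        for doc in docs:
            # Normalize case and whitespace before hashing.
            normalized = " ".join(doc.lower().split())
            digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
            if digest not in seen:
                seen.add(digest)
                yield doc
    return block


def run_pipeline(docs: Iterable[str], blocks: List[Block]) -> List[str]:
    """Chain the blocks lazily, then materialize the final stream."""
    stream: Iterable[str] = iter(docs)
    for block in blocks:
        stream = block(stream)
    return list(stream)


sample = [
    "Hello world, this is a document.",
    "hello   world, this is a document.",  # duplicate after normalization
    "short",                               # removed by the length filter
    "Another useful piece of text.",
]
result = run_pipeline(sample, [min_length_filter(10), exact_dedup()])
print(result)
```

Because each block only depends on the stream it receives, blocks can be reordered or swapped without changing the others, which is the property the "platform-agnostic pipeline blocks" description points at.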