CLUECorpus2020
Repository: CLUEbenchmark/CLUECorpus2020
License: MIT
Paper: arxiv.org/abs/2003.01355

Features

Datasets and Corpora - High-quality Chinese pre-training corpus for NLP tasks.
Pre-training Datasets - Cleaned 100 GB Chinese corpus for pre-training and NLP tasks.