1 repo
Collections of audio recordings and transcriptions used to train and evaluate speech-based machine learning models.
This project is a speech recognition and translation engine that uses a sequence-to-sequence transformer architecture to convert audio into text. It is built on a weakly supervised learning framework, which leverages large-scale, weakly labelled audio-transcript data to create generalized speech representations capabl
Offers access to diverse multilingual speech corpora to ensure broad language coverage during model training and evaluation.