ASR

Search our off-the-shelf datasets.

Ten thousand people speech corpus
This dataset covers news, SMS, vehicle control, digit strings, music, general, maps, family, health, travel, work, social, celebrities, weather, and other common topics.
Chinese Mandarin Speech Recognition Corpus for the Elderly and Children
This dataset is tailored to capture the nuances of speech from the elderly and children, two demographic groups with distinct vocal characteristics. It is recorded on desktop equipment to ensure high audio quality, and all recordings take place in a quiet environment to minimize background noise. The corpus consists of read speech, which is useful for training speech recognition models on clear, deliberate pronunciation. The speakers are gender-balanced, which helps the recognition system handle both male and female voices, and they are drawn from the seven major Chinese dialect regions, providing a diverse and balanced distribution of accents and speech patterns. The children's recordings cover interactive car control systems, children's audiobooks, children's video content, and music, including children's songs and popular tunes from platforms like TikTok. The elderly recordings cover similar domains, with a focus on applications and content suited to their preferences: car control, map navigation, audiobooks with programs selected for an older audience, and music favored by the elderly. This coverage helps a speech recognition system adapt to the speech traits of these age groups across various contexts and dialects.
Chinese Mandarin Speech Recognition Corpus
This dataset is a specialized collection of bilingual Chinese-English speech recordings, tailored to cater to the needs of speech recognition technology development. It is characterized by its unique blend of languages, high-fidelity audio quality, and the diversity of its contributors.
Chinese-English Mixed Speech Recognition Corpus (Desktop)
The dataset comprises speech samples that include a mixture of Mandarin Chinese and English, reflecting the linguistic diversity found in various global contexts. Each recording in the dataset is of superior quality, with clear and distinct sound that is essential for training robust speech recognition algorithms.
Chinese-English Mixed Speech Recognition Corpus
This dataset is designed to support the development and refinement of bilingual Chinese-English mixed speech recognition technologies. It contains a diverse set of speech samples recorded in various scenarios for training and testing speech recognition systems. Recordings were primarily made on desktop devices to mimic everyday usage environments, using high-fidelity equipment to ensure clarity. They were conducted in noise-free or low-noise environments to minimize the impact of background noise on recognition performance. Speakers were selected from the seven major Chinese dialect regions to achieve a balanced representation of regional accents.
Singapore English Speech Recognition Corpus
This dataset contains Singaporean English dialogue in dual-channel form, from mobile and online calls, with sentence segmentation data. It covers telemarketing customer service, financial consumption, common daily life language, social hot topics, travel and shopping, sports and entertainment, education and learning, and technology, digital products, and games, with telemarketing customer service and financial consumption accounting for no less than 30%.
Multilingual Intelligent Speech Dataset
This dataset covers over 30 scenarios, including sports, entertainment, health, shopping, pets, education, food, and travel.
Cantonese Speech Recognition Corpus (Mobile)
This dataset was recorded in noisy environments such as shopping malls, streets, and cars, with a total of 149 speakers participating, including 72 males and 77 females. All speakers involved in the recording were professionally selected to ensure standard pronunciation and clear articulation. The recorded text covers news, daily conversations, Twitter, and other information.
Uyghur Speech Recognition Corpus (Mobile)
This dataset was recorded in a quiet office/home environment, with a total of 718 speakers participating, including 327 males and 391 females. All speakers involved in the recording were professionally selected to ensure standard pronunciation and clear articulation. The recorded text covers news, daily conversations, Twitter, and other information.
