ASR

Search our off-the-shelf datasets.

Chinese Wu Accented Mandarin Speech Recognition Corpus (Mobile)
This dataset was recorded in a quiet office/home environment, with a total of 505 speakers participating, including 258 males and 247 females. All speakers involved in the recording were professionally selected to ensure standard pronunciation and clear articulation. The recorded text covers news, daily conversations, Twitter content, and other material.
Chinese Yunnan Accented Mandarin Speech Recognition Corpus (Mobile)
This dataset was recorded in both quiet and noisy environments, with a total of 509 speakers participating, including 266 males and 243 females. All speakers involved in the recording were professionally selected to ensure standard pronunciation and clear articulation. The recorded text covers news, daily conversations, Twitter content, and other material.
Chinese-English Mixed Speech Recognition Corpus
This dataset is designed to support the development and refinement of bilingual Chinese-English mixed speech recognition technologies. It contains a diverse set of speech samples recorded in various scenarios to train and test speech recognition systems. Recordings were primarily made on desktop devices to mimic everyday usage environments, using high-fidelity equipment to ensure clarity. They were conducted in noise-free or low-noise environments to minimize the impact of background noise on speech recognition performance. Speakers were selected from the seven major Chinese dialect regions to achieve a balanced representation of regional accents.
Chinese-English Mixed Speech Recognition Corpus
Recorded on desktop devices with high audio quality in a quiet environment, with gender balance and a balanced distribution of speakers across the seven major Chinese dialect areas. Corpus types: media-specific, including radio, audiobooks (with a focus on Ximalaya), and video scene data.
Chinese-English Mixed Speech Recognition Corpus (Desktop)
Product features: high sampling rate (44.1/48 kHz); in-vehicle corpus collected in a quiet indoor environment; multiple scenarios (vehicle control, music, general, maps, casual conversation, English interaction, audiobooks, etc.). Applicable to in-car and other common voice recognition scenarios.
Chinese-German-US English-Korean Parallel Speech Corpus (Desktop)
This dataset was recorded in a recording studio environment, with a total of 32 speakers participating, including 16 males and 16 females. All speakers involved in the recording were professionally selected to ensure standard pronunciation and clear articulation. The recorded text covers daily conversations.
Chinese-Russian Parallel Speech Recognition Corpus (Mobile)
This dataset was recorded in a quiet office environment, with the participation of 403 speakers, including 169 males and 234 females. All speakers involved in the recording were professionally selected to ensure standardized pronunciation and clear enunciation. The recorded texts cover military news and national defense white papers.
Colombian American English Speech Recognition Corpus (Desktop+Mobile)
This dataset was recorded in a quiet office/home environment, with a total of 100 speakers participating, including 57 males and 43 females. All speakers who took part in the recording were professionally screened to ensure standardized pronunciation and clear enunciation. The recorded texts cover news and everyday conversations, among other topics.
Colombian Spanish Conversational Speech Recognition Corpus (Mobile)
This dataset was recorded in a quiet office/home environment, with a total of 100 speakers participating, including 54 males and 46 females. All speakers involved in the recording were professionally selected to ensure standardized pronunciation and clear enunciation. The recorded texts cover family, work, food, and other topics.
