ASR

Chinese Mandarin Wake-up Words Speech Recognition Corpus (TicHome Mini)

This dataset was recorded in both quiet and noisy environments by a total of 200 speakers, comprising 101 males and 99 females. All participants were selected through a professional screening process to ensure standard pronunciation and clear speech. The recorded texts cover wake words and other related content.

Education and learning, Smart search, Speech recognition

Chinese Mandarin Whisper Speech Recognition Corpus (Mobile)

This dataset was recorded in a quiet office/home environment, with a total of 21 speakers participating, including 11 males and 10 females. All participants involved in the recording were professionally screened to ensure standard pronunciation and clear enunciation. The recorded texts cover information on everyday conversations and other relevant topics.

Education and learning, Smart search, Speech recognition

Chinese SiChuan Accented Mandarin Speech Recognition Corpus (Mobile)

This dataset was recorded in both quiet and noisy environments, with a total of 555 speakers participating, including 263 males and 292 females. All speakers involved in the recording were professionally selected to ensure standard pronunciation and clear articulation. The recorded text covers SMS, news, and other information.

Education and learning, Smart search, Speech recognition

Chinese Songs Corpus (Mobile)

This dataset was recorded in quiet office and home environments, with 20 speakers participating, including 11 males and 9 females. All speakers involved in the recording were professionally selected to ensure standardized pronunciation and clear articulation. The recorded texts cover songs and other information.

Education and learning, Smart search, Speech recognition

Chinese Speech Corpus-Incabin

【Product Type】 Chinese, Reading, Desktop Collection (16K) 【Product Features】 Collected in-vehicle; covers various corpus types (vehicle control, music, general, maps, and casual conversation scenarios) across more than 100 recording scenarios. Applicable to the automotive field. 【Speaker Information】 Gender: male 49%, female 51%. Age: speakers range from 15 to 60 years old, with approximately 10% over the age of 45. Accent: evenly distributed across the seven major Chinese accent regions.

Vehicle Control, Maps, Voice Control

Chinese Wu Accented Mandarin Speech Recognition Corpus (Mobile)

This dataset was recorded in a quiet office/home environment, with a total of 505 speakers participating, including 258 males and 247 females. All speakers involved in the recording were professionally selected to ensure standard pronunciation and clear articulation. The recorded text covers news, daily conversations, Twitter, and other information.

Education and learning, Smart search, Speech recognition

Chinese Yunnan Accented Mandarin Speech Recognition Corpus (Mobile)

This dataset was recorded in both quiet and noisy environments, with a total of 509 speakers participating, including 266 males and 243 females. All speakers involved in the recording were professionally selected to ensure standard pronunciation and clear articulation. The recorded text covers news, daily conversations, Twitter, and other information.

Education and learning, Smart search, Speech recognition

Chinese-English Mixed Speech Recognition Corpus

This dataset is designed to support the development and refinement of bilingual Chinese-English mixed speech recognition technologies. It contains a diverse set of speech samples recorded in various scenarios for training and testing speech recognition systems. Recordings were primarily made using desktop devices to mimic everyday usage environments. All recordings were made using high-fidelity equipment to ensure clarity, and were conducted in noise-free or low-noise environments to minimize the impact of background noise on recognition performance. Speakers were selected from the seven major Chinese dialect regions to achieve a balanced representation of regional accents.

Education and learning, Smart search, Speech recognition

Chinese-English Mixed Speech Recognition Corpus

This dataset was recorded on desktop devices with high audio quality in a quiet environment, with gender balance and a balanced distribution of speakers across the seven major Chinese dialect areas. Corpus types: media-specific, including radio, audiobooks (with a focus on Ximalaya), and video scene data.

Audiobooks, Radio, High-quality
