German Speech Recognition Corpus (Mobile)

This dataset was recorded in a quiet office environment with 1,682 speakers (692 male and 990 female). All speakers were professionally selected to ensure standard pronunciation and clear enunciation. The recorded texts cover everyday spoken language, education, healthcare, sports and entertainment, and society and the economy.
Duration: 2130.1 hours
Sample rate & bit depth: 16 kHz, 16-bit
Recording environment: Quiet (office)
Speakers: 692 males and 990 females
Accuracy rate: Sentence Accuracy Rate (SAR) 95%
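
As a rough illustration of the 16 kHz, 16-bit format listed above, the following sketch checks whether an audio file matches that specification. It assumes the recordings are delivered as uncompressed WAV files, which this page does not state; the function name `matches_corpus_format` is hypothetical and not part of any vendor tooling.

```python
import wave

# Values taken from the specification above.
EXPECTED_SAMPLE_RATE = 16_000  # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2      # 16-bit = 2 bytes per sample


def matches_corpus_format(path):
    """Return True if the WAV file at `path` is 16 kHz with 16-bit samples.

    Assumption: the file is an uncompressed PCM WAV file readable by the
    standard-library `wave` module.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_SAMPLE_RATE
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
        )
```

A check like this can be run over a delivered batch before training, so that files with a different sample rate or bit depth are resampled or flagged rather than silently mixed in.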

People also searched for

Chinese-English Mixed Speech Recognition Corpus
Desktop devices, high audio quality, quiet environment, gender balance, and balanced distribution of speakers across the seven major Chinese dialect areas. Corpus types: Media-specific, including radio, audiobooks (with a focus on Ximalaya), and video scene data.
Ten Thousand People Corpus
Reading and conversation data. Topics: news, text messages, car control, number sequences, music, general, maps. Daily colloquial speech: family, health, travel, work, socializing, celebrities, weather, and other common life topics. Read text: 10,051 people, 3,953 hours (no less than 1 minute per person, no less than 4 characters per sentence). Free conversation: 3,844 people, 1,914 hours (long audio).
Hong Kong Cantonese-English Mixed Corpus
Daily conversation scenarios, including commonly used English words, abbreviations, names, software, trademarks, shop names, etc. used within Cantonese speech.
Chinese Mandarin Speech Recognition Corpus for the Elderly and Children
This dataset is specifically tailored to capture the nuances of speech from the elderly and children, two demographic groups with distinct vocal characteristics. This dataset is recorded using desktop equipment to ensure high audio quality, and all recordings take place in a quiet environment to minimize background noise. The corpus includes read speech, which is beneficial for training speech recognition models on clear and deliberate pronunciations. The gender balance in the dataset ensures that the recognition system can accurately interpret both male and female voices. Furthermore, the speakers are drawn from the seven major Chinese dialect regions, providing a diverse and balanced distribution of accents and speech patterns. For the children's recordings, the dataset includes speech samples from interactive car control systems, children's audiobooks, children's video content, and music featuring children's songs and popular tunes from platforms like TikTok. The elderly recordings cover similar domains with a focus on applications and content that cater to their preferences, such as car control, map navigation, audiobooks with programs selected for an older audience, and music that includes selections favored by the elderly. This comprehensive approach ensures that the speech recognition system can effectively adapt to the unique speech traits of these age groups across various contexts and dialects.
