This dataset was recorded in a quiet office environment by 10 speakers (5 male, 5 female). All speakers were professionally selected to ensure standard pronunciation and clear articulation. The recorded texts cover voicemail, paragraph dictation, and other content.
【Product Type】Odia language from India, free conversation, mobile 16K
【Corpus Type】
Home, health, travel, education, work, gourmet food, marriage, movies, music, socializing, celebrities, weather, sports, and other common topics in daily life
Natural context, applicable to the entire industry
【Speaker Information】
Gender: Male 44%, Female 56%
Age: Speakers are mainly aged 16-45
Accent: Speakers are from Odisha state.
Ten Thousand People Dialect with High-Quality Labeling Speech Corpus
This dataset covers 29,954 dialect speakers from 26 provinces in China, aged 12 to 75, with a balanced gender ratio. The total recording time is 34,073 hours, with an average recording duration of nearly 60 minutes. The topics are broad, including news, text messages, vehicle control, music, general, maps, daily colloquial speech, family, health, travel, work, socializing, celebrities, weather, and other common life topics.
【Speaker Information】
Gender: The ratio of male to female speakers is approximately 1:1.
Age: Speakers range from 16 to 60 years old.
Accents:
Fujian, Guangdong, Hunan, Jiangxi, Wu (Suzhou), Yunnan, Guizhou, Wu (Shanghai), Tianjin, Anhui, Shandong, Henan, Liaoning (Shenyang/Anshan), Shaanxi, Shanxi, Hubei, Gansu, Wenzhou, Hebei, Liaoning (Dalian/Dandong), Wu (Zhejiang), Sichuan.