【Product Features】High sampling rate; in-vehicle corpus; recorded indoors in a quiet environment; multiple scenarios (vehicle control, music, general, map, and casual chat)
Can be applied to in-vehicle and other common speech recognition scenarios.
【Audio Parameters】
16 kHz: 1 person, 0.5 hours
44.1 kHz: 148 people, 74.9 hours
48 kHz: 2,463 people, 1,354.8 hours
This corpus contains 5,763 speakers with a balanced gender ratio. The speakers are from Spain, Mexico, America, Argentina, and Colombia, and range in age from 16 to 80 years old.
This corpus covers 12 languages of India with 13,150 speakers. The languages are Assamese, English, Gujarati, Hindi, Kashmiri, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu, and Urdu.
This corpus comprises recordings from 35,628 speakers, each contributing between 10 and 60 minutes of speech. The gender distribution is approximately equal, and the speakers range in age from 7 to 80 years old. The corpus includes a diverse array of accents, representing 64 countries including China, the United States, the United Kingdom, Canada, Australia, Japan, South Korea, and many others.
Morocco Arabic Speech Recognition Corpus (Phone)
This dataset covers free dialogue content. Topics include news, text messages, car control, music, general, maps, everyday spoken language, family, health, travel, work, socializing, celebrities, weather, and other common topics in daily life.