All Datasets

Search our off-the-shelf datasets.

Free dialogue in Odia Speech Corpus
【Product Type】Odia language from India, free conversation, mobile recording, 16 kHz 【Corpus Type】 Home, health, travel, education, work, food, marriage, movies, music, socializing, celebrities, weather, sports, and other common everyday topics in natural context; applicable across industries 【Speaker Information】 Gender: 44% male, 56% female Age: Speakers are mainly aged 16-45 Accent: Speakers are from Odisha state.
Ten Thousand People Dialect with High-Quality Labeling Speech Corpus
This dataset covers 29,954 dialect speakers from 26 provinces in China, aged 12 to 75, with a total recording time of 34,073 hours, an average recording duration of nearly 60 minutes, and a balanced gender ratio. The topics are wide-ranging, including news, text messages, vehicle control, music, general, maps, daily colloquial speech, family, health, travel, work, socializing, celebrities, weather, and other common everyday topics.
High-Definition Dance Video Corpus
Product Features: This dataset contains 100,000 dance videos at 4K resolution, each averaging 30 seconds in length. Performers are adults and teenagers with dance training, with a balanced gender ratio. It includes both solo and group dances, with rich coverage of angles such as front, side, back, and turning. Dance types include folk dance, jazz, street dance, and more. Application Fields: This dataset can be applied to virtual humans, VR, dance education, video production, and other fields, supporting the application and development of multimodal technology in these areas.
Telephoto Landscape Corpus
【Product Features】 High-quality images of architecture and plants, with no blurring anywhere in the full-size image, so that both foreground and background show clear textures even when enlarged; no more than 5 images of the same subject from different angles, to ensure diversity of content. 【Image Specifications】 Resolution above 4K (shot in the camera's highest quality mode); focal length in the range of 185mm to 235mm.
DMS Diverse Drivers Corpus
This dataset is an in-cabin DMS (Driver Monitoring System) collection of foreign adults, consisting solely of IR (infrared) videos and images. It covers 700 foreign adults, 20% Black and 80% White. Participants were recorded individually, with 25 fixed cameras arranged inside the cabin for synchronized recording and an additional camera for supporting shots. Props include hats (20%), regular glasses (25%), sunglasses (25%), and masks (20%), combined at random. The vehicles are three 5-seater passenger cars (smart, BYD Dolphin, and BYD Song), which were stationary during recording. Lighting conditions include frontal lighting, back lighting, side lighting, interior car lighting, streetlights, shade under trees, oncoming headlights, cloudy days, and rainy days. The captured content covers both action and gaze, with videos for action capture and images for gaze capture, along with calibration data. The video action capture scenarios are divided into 18 basic scenes and 8 additional scenes, 26 categories in total. All data include the 18 basic scenes, and some data also include the 8 additional scenes. During action scene capture, head movements are recorded at the same time; these are divided into two actions, and each participant randomly performs one of the two, so that each accounts for 50% overall. The dataset contains approximately 730,000 video segments of around 30 seconds each (700 * 18 * 25 * [lighting conditions]), approximately 204,554,000 gaze and action images (700 * (11 * 25 * 19 + 19 * 26 * 19) * [lighting conditions]), 700,000 original calibration images (700 * 40 * 25), and another 700,000 calibration output images (700 * 40 * 25), which are stored in the images, detection, and reprojection folders respectively.
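The counts quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the description does not give a numeric value for [lighting conditions], so the factor computed here is merely what the quoted image total implies, not a documented parameter.

```python
# Figures taken from the description above; names are illustrative.
participants = 700
cameras = 25

# Calibration images: 700 * 40 * 25, as stated.
calibration_images = participants * 40 * cameras

# Gaze and action images before the lighting factor is applied:
# 700 * (11 * 25 * 19 + 19 * 26 * 19).
images_base = participants * (11 * 25 * 19 + 19 * 26 * 19)

# Video segments before the lighting factor is applied: 700 * 18 * 25.
videos_base = participants * 18 * cameras

# Assumption: derive the lighting factor from the quoted image total,
# since the description leaves [lighting conditions] unspecified.
quoted_image_total = 204_554_000
implied_lighting_factor = quoted_image_total / images_base

print(calibration_images)       # 700000
print(images_base)              # 10227700
print(videos_base)              # 315000
print(implied_lighting_factor)  # 20.0
```

The calibration count matches the stated 700,000 exactly, and the quoted image total corresponds to a lighting factor of 20 under this reading of the formula.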
Chinese dialect Speech Corpus
【Speaker Information】 Gender: The ratio of male to female speakers is approximately 1:1. Age: Speakers range from 16 to 60 years old. Accents: Fujian, Guangdong, Hunan, Jiangxi, Wu (Suzhou), Yunnan, Guizhou, Wu (Shanghai), Tianjin, Anhui, Shandong, Henan, Liaoning (Shenyang/Anshan), Shaanxi, Shanxi, Hubei, Gansu, Wenzhou, Hebei, Liaoning (Dalian/Dandong), Wu (Zhejiang), Sichuan.
In-Vehicle Noise Corpus
Chinese Mandarin Speech Recognition Corpus
【Product Features】 High sampling rate, in-vehicle corpus, recorded in a quiet indoor environment, multiple scenarios (vehicle control, music, general, map, casual chat); applicable to in-vehicle and other common speech recognition scenarios. 【Audio Parameters】 16 kHz: 1 speaker, 0.5 hours; 44.1 kHz: 148 speakers, 74.9 hours; 48 kHz: 2,463 speakers, 1,354.8 hours
Chinese Speech Corpus-Incabin
【Product Type】 Chinese, read speech, desktop collection (16 kHz) 【Product Features】 Collected in-vehicle, with various corpus types (vehicle control, music, general, maps, casual conversation) and over 100 recording scenarios. Applicable to the automotive field. 【Speaker Information】 Gender: 49% male, 51% female Age: Speakers range from 15 to 60 years old, with approximately 10% over the age of 45. Accent: Evenly distributed across the seven major Chinese accent regions.
