DMS of Foreign Adult Product Corpus

This corpus is an in-cabin DMS (Driver Monitoring System) dataset of foreign adults, consisting solely of IR (infrared) videos and images. It covers 700 foreign adults, 20% Black and 80% White. Participants were recorded one at a time, with 25 fixed cameras arranged inside the cabin for synchronized recording and one additional camera for supporting shots. Props include hats (20%), regular glasses (25%), sunglasses (25%), and masks (20%), combined at random. The vehicles are three 5-seater passenger cars (a smart, a BYD Dolphin, and a BYD Song), all stationary during recording. Lighting conditions include frontal lighting, back lighting, side lighting, interior car lighting, streetlights, shade under trees, oncoming headlights, cloudy days, and rainy days.
The captured content covers both action and gaze, with videos for action capture and images for gaze capture, along with calibration data. The video action scenarios are divided into 18 basic scenes and 8 additional scenes, 26 categories in total. All data include the 18 basic actions, and some data also include the 8 additional scenes. During action capture, participants also perform a head movement. There are two head movements; each participant performs one of them at random, so each accounts for 50% of the data overall.
The corpus contains approximately 730,000 video segments of around 30 seconds each (700 * 18 * 25 * [lighting conditions]). It also contains approximately 204,554,000 gaze and action images (700 * (11 * 25 * 19 + 19 * 26 * 19) * [lighting conditions]), 700,000 original calibration images (700 * 40 * 25), and another 700,000 calibration output images (700 * 40 * 25), which are stored in the images, detection, and reprojection folders respectively.
The product library size is approximately 45TB, containing over 23,950,000 files.
The data comprises 70% males and 30% females.
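The counts quoted above are products of a few fixed factors, so they can be checked with a short script. The sketch below is illustrative only: the function names are hypothetical, the lighting multiplier is left as a parameter because the description does not give it as a number, and the factor of 19 is copied from the quoted formula without interpretation.

```python
# Fixed factors copied from the formulas in the description.
SUBJECTS = 700
FIXED_CAMERAS = 25
BASIC_ACTIONS = 18
CALIBRATION_FACTOR = 40  # the "40" in 700 * 40 * 25; its meaning is not stated


def video_segments(lighting_factor=1):
    """700 * 18 * 25 * [lighting conditions]."""
    return SUBJECTS * BASIC_ACTIONS * FIXED_CAMERAS * lighting_factor


def gaze_action_images(lighting_factor=1):
    """700 * (11 * 25 * 19 + 19 * 26 * 19) * [lighting conditions]."""
    per_subject = 11 * 25 * 19 + 19 * 26 * 19
    return SUBJECTS * per_subject * lighting_factor


def calibration_images():
    """700 * 40 * 25, quoted once for originals and once for outputs."""
    return SUBJECTS * CALIBRATION_FACTOR * FIXED_CAMERAS


def implied_factor(quoted_total, base):
    """Multiplier needed to reach a quoted total from the fixed factors.

    This is derived arithmetic, not a figure stated in the description.
    """
    return quoted_total / base


if __name__ == "__main__":
    print(calibration_images())                                   # 700000
    print(video_segments())                                       # 315000
    print(gaze_action_images())                                   # 10227700
    print(implied_factor(204_554_000, gaze_action_images()))      # 20.0
```

The calibration figure matches the quoted 700,000 exactly. For the video and image totals, the script only recovers the part of the product that the description spells out; the remaining multiplier is whatever the [lighting conditions] term stands for.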
Collection environment
Image capture for action and gaze covers 30 gaze positions inside the car: 11 fixed camera positions and 19 mobile camera positions. When the gaze is directed towards a fixed camera position, the 25 fixed cameras capture images simultaneously; when the gaze is directed towards a mobile camera position, the 25 fixed cameras and the 1 mobile camera capture images simultaneously.
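The capture rule above can be written as a small calculation. This is a minimal sketch with hypothetical names, and it assumes one capture per gaze position per pass, which the description does not state explicitly.

```python
FIXED_CAMERAS = 25
MOBILE_CAMERAS = 1
FIXED_GAZE_POSITIONS = 11
MOBILE_GAZE_POSITIONS = 19


def images_per_gaze(target_is_mobile):
    """Images captured at once when the subject looks at one gaze position."""
    # The mobile camera only contributes when it is the gaze target.
    return FIXED_CAMERAS + (MOBILE_CAMERAS if target_is_mobile else 0)


def images_per_pass():
    """Images from one pass over all 30 gaze positions (assumed: one capture each)."""
    fixed = FIXED_GAZE_POSITIONS * images_per_gaze(False)
    mobile = MOBILE_GAZE_POSITIONS * images_per_gaze(True)
    return fixed + mobile


if __name__ == "__main__":
    print(images_per_gaze(False))  # 25
    print(images_per_gaze(True))   # 26
    print(images_per_pass())       # 769
```

The per-position counts of 25 and 26 correspond to the 11 * 25 and 19 * 26 terms in the image-count formula quoted in the description.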

People also searched for

High-Definition Dance Video Corpus
Product Features: This dataset contains 100,000 dance videos, each averaging 30 seconds in length, at 4K resolution, featuring adults and teenagers with a foundation in dance, with a balanced gender ratio. It includes both solo and group dances, with a wide variety of viewing angles such as front, side, back, and turning. Dance types include folk dance, jazz, street dance, and more. Application Fields: This dataset can be applied to virtual humans, VR, dance education, video production, and other fields, promoting the application and development of multimodal technology in the corresponding areas.
Telephoto Landscape Corpus
Product Features: High-quality images of architecture and plants, with no blurring within the full size of the image, ensuring that both the foreground and background show clear textures even when enlarged; no more than 5 images of the same subject from different angles to ensure diversity in the content captured. Image Specifications: Resolution above 4K (shot in the camera's highest quality mode); focal length within the range of 185mm to 235mm.
Lip-movement Video Corpus
Product Features: This dataset was captured using high-definition cameras to record 208 people's lip and speech videos. It was collected in a quiet indoor environment, simulating various lighting conditions, including normal light, strong light, backlight, and dim light, with shooting distances of 0.5m and 1m, primarily 0.5m, accounting for about 90%. It includes both solo and group recordings. The subjects are mainly Mandarin speakers, with ages ranging from 7 to over 60 years old, mainly children and young to middle-aged individuals, with a balanced gender ratio. Audio was collected simultaneously with the video recording. Application Fields: This dataset can be applied to virtual humans, VR, and other fields, promoting the application and development of lip-reading technology in the corresponding areas.
Lip-reading Speech Video Corpus
This dataset covers 250 individuals, with each person recording no less than 600 short sentences, and the effective video duration for each individual is half an hour, which can be used for tasks such as face recognition and object detection.
