CV – Page 7 – DataoceanAI

Lip-reading Speech Video Corpus

The corpus is recorded with six cameras and two microphone arrays that simultaneously capture lip-reading speech video of each speaker. The recording setup simulates the interior of a cockpit, with varied shooting angles and lighting conditions. Data is collected from 250 speakers, all adults, primarily young and middle-aged. The target effective recording time is approximately 0.5 hours per speaker, with an average of about 600 short sentences per speaker. For each speaker ID, the audio track from one of the six video channels is also extracted and saved as a separate audio file. The six camera streams are aligned with an error of less than 30 milliseconds, and the recordings from the two microphone arrays are synchronized with the camera streams.
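As a rough illustration of what these figures imply, the sketch below derives the nominal corpus size from the per-speaker targets and shows one possible way to check the 30-millisecond alignment bound. The function names and the interpretation of "alignment error" as the spread between the earliest and latest camera start time are assumptions made for this example; they are not part of the corpus documentation.

```python
# Nominal figures taken from the corpus description above.
SPEAKERS = 250
HOURS_PER_SPEAKER = 0.5        # target effective recording time
SENTENCES_PER_SPEAKER = 600    # approximate average
MAX_CAMERA_SKEW_MS = 30.0      # stated alignment bound


def corpus_scale():
    """Return (total hours, total sentences, seconds per sentence).

    These are nominal values derived from the per-speaker targets,
    not measured statistics of the delivered corpus.
    """
    total_hours = SPEAKERS * HOURS_PER_SPEAKER
    total_sentences = SPEAKERS * SENTENCES_PER_SPEAKER
    seconds_per_sentence = HOURS_PER_SPEAKER * 3600 / SENTENCES_PER_SPEAKER
    return total_hours, total_sentences, seconds_per_sentence


def cameras_aligned(start_times_ms, tolerance_ms=MAX_CAMERA_SKEW_MS):
    """Hypothetical check: True if the spread between the earliest and
    latest camera start time is strictly below the tolerance."""
    return max(start_times_ms) - min(start_times_ms) < tolerance_ms


if __name__ == "__main__":
    hours, sentences, secs = corpus_scale()
    print(f"about {hours} hours, {sentences} sentences, {secs} s per sentence")
    # Six made-up start offsets, one per camera, in milliseconds.
    print(cameras_aligned([0.0, 5.0, 12.0, 18.0, 22.0, 29.0]))
```

Under these assumptions the targets work out to roughly 125 hours of video and about 150,000 short sentences, or around 3 seconds per sentence on average.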

Facial recognition Object Detection

CV

Lip-reading Speech Video Corpus

Masked Facial Image Corpus

Multi Angle Facial Corpus

Multi-age Life Photos of 90000 People

Multi-class Instance Segmentation Corpus

Multiple Faces Detection Corpus

Object Segmentation

Objects Detection and Segmentation Corpus

Online Meeting Facial Corpus
