The Foundation Powering Advanced AI

We support the R&D of more than 1,100 AI enterprises and academic institutes with over 1,800 high-quality off-the-shelf (OTS) datasets and customized services covering Generative AI, Ethical AI, and Machine Learning, helping clients' AI models stay ahead in the market.

We empower more than 900 AI enterprises with our high-quality off-the-shelf datasets and our data collection and labeling services. There is more for you to discover.

Trusted by industry leaders

Datasets

Smart Cockpit-DMS diverse drivers Dataset

This dataset is an overseas collection for the cockpit driver monitoring system (DMS). Only IR videos and images were captured. Approximately 20% of the subjects were Black and 80% were white; approximately 70% were male and 30% were female. Each subject was filmed alone, with 25 fixed cameras shooting simultaneously and 1 additional camera used for cooperative shooting. Props included hats, ordinary glasses, sunglasses, masks, and similar items, assigned randomly and sometimes worn in combination. The vehicle was a 5-seater passenger car, parked and stationary during shooting. Lighting conditions included frontal light, backlight, side light, interior light sources, street lamps, shade, opposing headlights, and others. The collected content has two parts: actions and gaze. Actions were recorded as video, and gaze was recorded as images with calibration data. The gaze collection covered 30 gaze positions inside the vehicle. The video action collection averaged approximately 18 basic action scenes per participant, and some participants recorded 1 to 8 additional scenes, for 26 scene types in total.

Autonomous Driving Incabin

TTS

Hundreds of Sound Effects Corpus

The database includes 107 categories, with a total of 69,154 sets of audio files and corresponding labels. The total audio duration is about 254.81 hours, including the zeroed or original silence at the beginning and end, which must be retained. The labeled duration is about 248.59 hours, which covers only the sound events themselves. The sound effects are organized into 4 first-level categories (Human Sounds, Animal Sounds, Environmental Sounds, Mechanical Sounds), 26 second-level categories, and 107 third-level categories (see Appendix). The corpus covers most sound elements found in common real-life scenarios. During labeling, the onset and duration of each sound event in the audio are annotated with timestamps, and detailed natural-language descriptions are provided in Chinese.

TTS Human Sound Human Voice

CV Multimodal

High-Definition Dance Video Corpus

Product Features: This dataset contains 100,000 dance videos at 4K resolution, each averaging 30 seconds in length. The performers are adults and teenagers with a foundation in dance, with a balanced gender ratio. It includes both solo and group dances, with rich coverage of angles such as front, side, back, and turning. Dance types include folk dance, jazz, street dance, and more.

Application Fields: This dataset can be applied to virtual humans, VR, dance education, video production, and other fields, supporting the application and development of multimodal technology in these areas.

Dance Video Virtual human Dance Education

ASR

Mandarin Speech Recognition Corpus (Desktop)

This database was collected with desktop microphones in quiet environments (office, home, studio) from 13,895 speakers, including 6,811 male and 7,084 female speakers. The total recording time is about 5,867.9 hours, including reasonable leading and trailing silence.

Education and learning Smart search Chinese

CV Multimodal

Lip-movement dataset

This dataset uses cameras to collect audio and video data of lip movements. Recording takes place in a quiet indoor environment under various simulated light conditions, including normal light, strong light, backlight, and weak light. The shooting distance ranges from 0.5m to 1m, with 0.5m accounting for approximately 90% of the data. The shooting angle is frontal, and the frame mainly covers the upper body. In addition to single-person recording, the collection also simulates queuing scenarios: approximately 30% of each participant's video data is recorded with multiple people in frame, and in those multi-person scenes the number of people on screen is mostly 2. Participants mainly speak Mandarin at a normal speed, and some may have slight local accents. Each person records 10 sentences of text, averaging 10 to 15 words per sentence. The participants span multiple age groups, with a majority being children and middle-aged people, and the gender ratio is balanced. A front-facing interface microphone records synchronously during video capture, and the audio file is taken from the collected video.

Virtual human VR Video Collection

Data Collection

We provide support for data collection in all languages and dialects, multi-scene images and video, and text corpus in multiple industries worldwide.

Data Labeling

We empower businesses with high-quality test and labeled data, accelerating AI R&D and deployment and improving overall model performance. Our in-house platform and global network ensure data quality and support enterprises in building core AI competitiveness.

Autonomous Driving

We fully support 2D/3D/4D point cloud or image data annotation across all dimensions of the autonomous driving field.

This includes attribute recognition; object detection and tracking for people, vehicles, road signs, traffic lights, and lane lines; keypoint labeling with structural information; semantic segmentation; 3D point cloud annotation; 2D and 3D fusion annotation; point cloud lane line and road sign annotation; and 4D BEV large point cloud data annotation.

Domain-Specific Expert Network

Our global network of experts spans a wide array of fields, ensuring that we can provide the specialized knowledge and skills your project requires. Whether you're working on language-specific applications, complex coding challenges, or domain-specific AI solutions, our experts are ready to assist.

DOTS Platform

Our platform offers flexible project management, advanced algorithms, and support for over 200 annotation tasks, optimizing autonomous driving and other applications. With 400+ models, multilingual capabilities, and scalable deployment options, it caters to diverse needs across industries.

Industrial solutions

LLMs

Smart Healthcare

Internet

Retail

Smart Finance

Agentic AI

Smart Home

Autonomous Driving

Let's shape your AI future together

Data Quality and Diversity

Dataocean AI provides over 1,600 high-quality and diverse off-the-shelf datasets, which are fundamental to the success of machine learning and artificial intelligence projects. Our meticulous data acquisition, processing, and labeling processes ensure accuracy and variety, and the breadth and depth of our data coverage underline our commitment to excellence in this area.

Advanced Technologies and Platform

Our comprehensive data platform is designed for AI applications and includes a data engine for collecting, curating, and annotating data, as well as for training and evaluating models. Combining AI-based techniques with human-in-the-loop review, Dataocean AI delivers labeled data with unprecedented quality, scalability, and efficiency. This approach not only ensures the development of high-performing models but also facilitates sustainable and successful AI programs tailored to specific business needs.

Industry Expertise and Experience

With almost 20 years of professional experience in AI data projects, we have a deep understanding of each customer's needs and challenges. This allows us to provide tailored solutions that help clients tackle complex issues and achieve their business objectives effectively.

Strong Security and Compliance

We place a high priority on data security and privacy, adhering to stringent security protocols and compliance standards while handling sensitive information. We are proud to offer our customers the highest level of security in accordance with the following standards: ISO 9001/27001/27701.

Customer Success and Support

Dedicated to client success, we offer comprehensive support and services from the initial planning stages of a project to its final implementation and beyond. We foster long-term relationships through expert consultations, regular progress updates, and continuous technical support, reflecting our commitment to customer satisfaction.

Resources

Case Study

Case Study | Multilingual Emotional TTS Data Development Practice: Enabling More Natural Speech Synthesis

Blog

"Can You Interrupt AI Mid-Response?" Discover the Full-Duplex Power Behind GPT Realtime × Gemini, All Thanks to Full-Duplex Datasets!

9,000-Hour Chinese Full-Duplex Speech Recognition Corpus

Blog

The IEEE International Conference on Multimedia & Expo (ICME) 2025 Audio Encoder Capability Challenge

Datasets

Data Collection

Large Language Models (LLM)

Multimodal

Automatic Speech Recognition (ASR)

Text To Speech (TTS)

Natural Language Processing (NLP)

Computer Vision (CV)

Autonomous Driving

Data Labeling

Large Language Models (LLM)

Multimodal

Automatic Speech Recognition (ASR)

Text To Speech (TTS)

Natural Language Processing (NLP)

Computer Vision (CV)

Autonomous Driving

Domain-Specific Expert Network

· STEM · Coding · Language · Finance/Medicine/Law/Other 30+ domains

· 200+ Global Languages · 40+ Coding Languages

· 50,000+ Experts with Advanced Qualifications

DOTS Platform

Flexibility in Project Workflow and Tool Configuration

Enhanced by Large-Scale Production Initiatives

Boosting Efficiency with Human-Machine Interaction

Scalable Performance and Versatile Deployment

Industrial solutions

LLMs

Smart Healthcare

Internet

Retail

Smart Finance

Agentic AI

Smart Home

Autonomous Driving

Resources
