All Datasets

Search our off-the-shelf datasets.

Filter by

ASR

TTS

Model Evaluation Report

NLP

Lexicon

Machine Translation

OCR

Multimodal

840 Person Image Collection by Front-facing Camera and Face 21Points Labeling

Image collection, Yellow skin, Camera

Accented English Pronunciation Evaluation Corpus (Word Level)

This database was recorded on mobile phones in quiet (office/home) environments from 22 speakers (11 male, 11 female). The total recording time is about 11.38 hours, including reasonable leading and trailing silence.

Education and learning, Smart search, Speech recognition

Aesthetic Composition Training Corpus

Images were captured by professional photographers. Composition types include rule-of-thirds, horizontal, diagonal, triangular, and central composition. All images were evaluated and annotated by personnel with high aesthetic standards. Each image is labeled with at least one and at most three composition types.

Aesthetics Video Corpus

Video Collection

Afghani Dari Conversational Speech Recognition Corpus (Telephone)

This database was recorded over the telephone in quiet (office/home) environments from 40 speakers (20 male, 20 female). The total recording time is about 40 hours, including reasonable leading and trailing silence.

Daily conversation, Multiple scenarios

Afghanistan Dari Pronunciation Lexicon

This Afghanistan Dari Pronunciation Lexicon, curated by DataoceanAI Inc., provides linguistic resources for the Dari language as spoken in Afghanistan. It contains 30,075 entries with a 95.00% entry accuracy rate and gives pronunciation transcriptions in the X-SAMPA phonemic notation. It is intended as training data for speech recognition, speech synthesis, and other language processing applications.

Speech recognition, Speech synthesis, Dari

Afghanistan Pashto Pronunciation Lexicon

This Afghanistan Pashto Pronunciation Lexicon, curated by DataoceanAI Inc., provides linguistic resources for the Pashto language as spoken in Afghanistan. It contains 50,170 entries with a 95.00% entry accuracy rate and gives pronunciation transcriptions in the X-SAMPA phonemic notation. It is intended as training data for speech recognition, speech synthesis, and other language processing applications.

Speech recognition, Speech synthesis, Pashto

African American English Speech Recognition Corpus (Mobile)

This database was recorded on mobile phones in a quiet (office) environment from 200 speakers (100 male, 100 female). The total recording time is about 120.5 hours, including reasonable leading and trailing silence.

Education and learning, Smart search, Speech recognition

Afrikaans Speech Recognition Corpus (Mobile)

This database was recorded on mobile phones in quiet (office/home) environments from 402 speakers (88 male, 314 female). The total recording time is about 235 hours, including reasonable leading and trailing silence.

Education and learning, Smart search, Speech recognition
