Open Datasets: GigaSpeech 2 – 30,000 Hours of Southeast Asian Multilingual Speech Recognition Open Source Dataset Released

The term “Giga” originates from “gigantic,” reflecting the vast audio resources available on the internet. However, the quality of these audio resources varies significantly, and high-quality audio-text pairs are particularly scarce and expensive to annotate, especially for low-resource languages. GigaSpeech, a highly successful open-source English dataset, addresses this issue by providing thousands of hours of […]

Dataocean AI New Datasets – July

Dataocean AI has launched new high-quality datasets, including a minor-language smart voice dataset, a telephoto landscape image dataset, and a multi-skin-tone cabin video dataset. These resources aim to help enterprises develop more extensive and higher-quality large models and AI applications to meet the diverse needs of global users. Arabic Speech Recognition Dataset Product Features: Arabic, […]

The era of “Movie-Her” has arrived: Unlocking the Emotional Data Behind GPT-4o

GPT-4o can already be considered an emotionally rich and human-like intelligent voice assistant, or more accurately, a “new species” that comes increasingly close to human-like interaction. This powerful model can also understand and synthesize text, images, video, and voice, and can even be seen as an unfinished version of GPT-5. Watch how […]

The era of “Movie-Her” has arrived: Key Data for Humanlike Speech Synthesis Systems

As numerous tech companies race to enhance the multimodal capabilities of large models and strive to integrate functions like text summarization and image editing into mobile devices, OpenAI has launched a new product! CEO Samuel Harris Altman summed up his mood in three letters: her (just like the movie “Her”). In the early morning of May […]

Dataocean AI New Datasets – May

In the field of artificial intelligence, large-model technology is continuously driving innovation and development across various industries. Dataocean AI has introduced new multilingual, multi-emotional, and multi-scenario intelligent voice data, as well as image data with Chinese element styles, to help companies develop more diverse and high-quality models and products to meet the […]

Chinese Continuous Visual Speech Recognition Challenge 2024

Initiated by the NCMMSC 2024 Organizing Committee and jointly hosted by Tsinghua University, Beijing University of Posts and Telecommunications, Speech Home and Dataocean AI, the second Chinese Continuous Visual Speech Recognition Challenge (CNVSRC 2024) kicks off today. We sincerely invite your participation and registration. Event Introduction: Visual speech recognition, also known as lip reading, is […]