TTS

Search our off-the-shelf datasets.

Hong Kong Cantonese Female Speech Synthesis Corpus
This dataset was recorded by a 28-year-old voice actor with a neutral timbre.
Hong Kong Cantonese Female Speech Synthesis Corpus (Multi-field)
This dataset was recorded by a 31-year-old female speaker with authentic pronunciation and a friendly, natural vocal quality in a professional recording studio. The recorded texts cover all phonemes, and the annotators have a professional linguistic background, ensuring the data meets the research and development needs for voice synthesis.
Hong Kong Cantonese Male Speech Synthesis Corpus
This dataset was recorded by a 31-year-old voice actor with a soft and steady timbre.
Hong Kong Cantonese Speech Synthesis Corpus – Spontaneous Monologues (Male & Female Voices)
Hungarian Female Speech Synthesis Corpus
This dataset was recorded by a single 33-year-old female speaker with authentic pronunciation and an elegant, mature vocal quality in a professional recording studio. The recorded texts encompass the full range of phonemes, and the annotators have a professional linguistic background, ensuring the data meets the research and development needs for voice synthesis.
Hungarian Male Speech Synthesis Corpus
This dataset was recorded by a single 30-year-old male speaker with authentic pronunciation and a calm, gentle vocal quality in a professional recording studio. The recorded texts encompass the full range of phonemes, and the annotators have a professional linguistic background, ensuring the data meets the research and development needs for voice synthesis.
Indian English Male and Female Multi-Emotion Speech Synthesis Corpus (Free Talk)
[Emotion] Five emotions: Excited, Sad, Angry, Fearful, and Empathetic. [Duration] Approximately 2 hours per speaker.
Indian English Natural Speech Synthesis Corpus (Read Speech)
[Style] Voice Assistant style, Podcast style, Audiobook style, and Online Learning style (four styles in total). [Duration] Approximately 2 hours per speaker, with about 0.5 hours for each style. [Content] Voice assistant data is recorded as single utterances, while the other styles are recorded in paragraph-level segments.
Indonesian Female Speech Synthesis Corpus (Customer Service Style)
This dataset was recorded by a 32-year-old female speaker with authentic pronunciation and a warm, gentle vocal quality in a professional recording studio. The recorded texts span the full range of phonemes, and the annotators have a professional linguistic background, ensuring the data meets the research and development needs for voice synthesis.

