TTS

Search our off-the-shelf datasets.

American English Male Speech Synthesis Corpus
This dataset was recorded by a 58-year-old male speaker with authentic pronunciation and a mature, steady vocal quality in a professional recording studio. The recorded texts cover all phonemes, and the annotators have a professional linguistic background, ensuring the data meets the research and development needs for voice synthesis.
American English Male Speech Synthesis Corpus
This dataset was recorded by a 35-year-old voice actor with a mature and steady timbre.
American English Male Speech Synthesis Corpus
This dataset was recorded by a 33-year-old voice actor with a mature and steady timbre.
American English Male Speech Synthesis Corpus – Gentle and Mature (Aged 30-40)
Multi-emotion - Neutral, Happy, Angry, Sad, Shocked, Hateful, Scared, Shouting, Crying, Laughing, Weak
American English Male Speech Synthesis Corpus – Gentle and Warm Man (Aged 20-30)
Multi-emotion - Neutral, Happy, Angry, Sad, Shocked, Hateful, Scared, Shouting, Crying, Laughing, Weak
American English Male Speech Synthesis Corpus (Narrative Style)
This dataset was recorded by a 75-year-old male speaker with authentic pronunciation and a mature, steady vocal quality in a professional recording studio. The recorded texts cover all phonemes, and the annotators have a professional linguistic background, ensuring the data meets the research and development needs for voice synthesis.
American English Speech Synthesis Corpus (Recorded by Phone, 620 Speakers)
This dataset was recorded by 620 speakers (334 males and 286 females) with authentic pronunciation and diverse vocal qualities in a quiet indoor environment. The recorded texts cover all phonemes, and the annotators have a professional linguistic background, ensuring the data meets the research and development needs for voice synthesis.
American English Synthesis Corpus
50 speakers, gender balanced, with pronunciation and prosody annotations
American Pop Songs Speech Synthesis Corpus (150 songs)
This dataset was recorded by 5 speakers (one male and four females) with authentic pronunciation and diverse vocal qualities in a professional recording studio. The recorded texts cover all phonemes, and the annotators have a professional linguistic background, ensuring the data meets the research and development needs for voice synthesis.
