The dataset includes 142 distinct speakers expressing a range of emotions, including happiness, sadness, anger, surprise, calmness, dislike, and fear. It is intended to improve the naturalness and expressiveness of models trained on it.
English Average Voice Synthesis Corpus – Conversation
Participants are recorded in pairs in the same studio, with each speaker's voice captured in a separate audio file. Text transcriptions are not currently available.