This dataset contains 27 hours of recordings from 6 male and 21 female speakers. Pronunciation and intonation are precisely annotated to ensure high quality and usability. The speakers are non-professional, which makes the speech sound more natural, though some accents or hoarseness may be present. The dataset covers expanded conversational topics such as daily life, hobbies, and special skills.
English Average Voice Synthesis Corpus – Conversation
Participants are recorded in pairs in the same studio, with each speaker's voice captured in a separate audio file. No text transcriptions are currently available.