This dataset was recorded in a professional recording studio by 10 speakers (5 male and 5 female) with authentic pronunciation and diverse vocal qualities. The recorded texts cover all phonemes, and the annotators have professional linguistic backgrounds, so the data meets the research and development needs of voice synthesis.
The recorded texts span daily language, finance, customer service, news, and novels.
Labeling Process
Text, audio, quality inspection, proofreading
Accuracy Rate
The accuracy rate of phonetic labeling is 99.5%.
Samples
Ayah bahkan tidak menangis ketika ditinggal dengan istri yang ia cintai
Kemudian menyambung kehidupan dengan cara menjadi pembantu rumah tangga
Mereka ditahan di sekolah pertanian yang dijadikan interneringkamp
Para staf ini cenderung harus menunggu perintah untuk menjalankan tugasnya