This dataset was recorded in noisy environments such as cafes, restaurants, and streets. A total of 54 speakers participated: 27 male and 27 female. All speakers were professionally selected for standard pronunciation and clear articulation. The recorded text is drawn from sources such as news and Twitter.