Hong Kong Cantonese Text Corpus - Word Segmentation and POS - DataoceanAI

Cantonese Data Labeling Content

Text collected from news and daily chat corpora, annotated with word segmentation and part-of-speech (POS) tags.

Specifications:

ID: King-NLP-172

Language: Cantonese

Size: 300,000 entries

Accuracy Rate: 95% (accuracy of the labeling results)
