The Chinese dataset consists of 26 categories, with a total of 4,408 printed images and 510 handwritten images, covering the most commonly encountered scenarios in daily life. All data is labeled.
The images include PPT slides, documents, natural-light photographs, screenshots, and handwriting.
Labeling Content
Line-level bounding-box labeling and transcription of the text.
Accuracy Rate
The labeling accuracy is 97%.
People also searched for
Korean Natural Scene OCR Image Corpus
This dataset consists of 8 categories and a total of 6,788 printed images, covering most commonly encountered scenarios in daily life. The data was collected in Korea, and all the images in the dataset include labeling results.
This dataset consists of 8 categories and a total of 21,218 printed images, covering most commonly encountered scenarios in daily life. The data was collected in India, and all the images in the dataset include labeling results.
This dataset consists of 9 categories and a total of 13,882 printed images, covering most commonly encountered scenarios in daily life. The data was collected in Thailand, and all the images in the dataset include labeling results.
This dataset consists of 9 categories and a total of 14,015 printed images, covering most commonly encountered scenarios in daily life. The data was collected in Vietnam, and all the images in the dataset include labeling results.