A Multimodal Network on Handwritten Chinese Character Error Correction


Haizhao Sun (Beijing University of Posts and Telecommunications), Yu Ning (Beijing University of Posts and Telecommunications), jixv (Beijing University of Posts and Telecommunications), Chuang Zhang (Beijing University of Posts and Telecommunications), Ming Wu (Beijing University of Posts and Telecommunications)
The 35th British Machine Vision Conference

Abstract

Handwritten Chinese characters have complex internal structures and span a vast number of categories, so the errors made when writing them are highly diverse. Handwritten Chinese Character Error Correction (HCCEC) therefore cannot be framed simply as a classification problem; it should instead be treated as an open-vocabulary problem without predefined categories. Beyond visual information, Chinese characters also carry semantic information, such as structure and components, which can be represented as Ideographic Description Sequences (IDS). To exploit both modalities effectively, we adopt a human-inspired approach to error identification: discerning differences in components and structure between erroneous and correct characters. Accordingly, we propose a multimodal encoder-decoder network that incorporates the CLIP training methodology. Through CLIP-style pre-training, the model aligns handwritten characters with their corresponding IDS. The multimodal decoder then decodes features that combine image and semantic information and outputs an IDS. Given the output IDS, identifying and correcting errors becomes straightforward. Experimental results indicate that our method, which does not rely on predefined categories, achieves performance comparable to that of closed-set classification methods with predefined categories on the HCCEC task.
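To illustrate why an IDS output makes errors easy to locate, the following is a minimal sketch, not the authors' implementation. It assumes the model has already produced an IDS string for the handwritten character and that the IDS of the intended character is known. The function name `ids_differences` and the example character (明, decomposed as ⿰日月) are hypothetical choices for illustration; the abstract does not specify how the comparison is actually performed.

```python
from difflib import SequenceMatcher


def ids_differences(predicted_ids: str, reference_ids: str):
    """Return (operation, predicted_part, reference_part) for each mismatch.

    Hypothetical helper: compares two IDS strings symbol by symbol using the
    standard library's SequenceMatcher and keeps only the non-matching spans.
    """
    matcher = SequenceMatcher(a=predicted_ids, b=reference_ids, autojunk=False)
    return [
        (tag, predicted_ids[i1:i2], reference_ids[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Assumed example: the writer intended 明 (⿰日月) but wrote 目 in place of 日.
reference = "⿰日月"
predicted = "⿰目月"
print(ids_differences(predicted, reference))
# → [('replace', '目', '日')]
```

Under these assumptions, each reported tuple names the erroneous component and the component that should replace it, so the same comparison works for characters outside any fixed category list.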

Citation

@inproceedings{Sun_2024_BMVC,
author    = {Haizhao Sun and Yu Ning and jixv and Chuang Zhang and Ming Wu},
title     = {A Multimodal Network on Handwritten Chinese Character Error Correction},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0882.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
