Synthetic Sample Extension in Implementation of Tangut Character Databases
- Authors: Meng Y.1,2, Yuan X.1, Wei X.1, Yang W.2, Chen Y.2
- Affiliations:
  1. School of Electronic and Information Engineering, Beijing Jiaotong University
  2. School of Physics and Electronic-Electrical Engineering, Ningxia University
- Issue: Vol 52, No 4 (2018)
- Pages: 334-343
- Section: Article
- URL: https://journal-vniispk.ru/0146-4116/article/view/175528
- DOI: https://doi.org/10.3103/S0146411618040089
- ID: 175528
Abstract
The Tangut script was a logographic writing system used for the extinct Tangut language of the Western Xia Dynasty, which lasted from 1038 to 1227. Optical character recognition, machine learning, and computer vision techniques can greatly help in deciphering the characters in these ancient scripts. All of these techniques, however, depend on a character database that provides training samples and test standards. While building the Tangut Character Databases from ancient Tangut scripts, the authors found that imbalanced class distribution significantly compromises the performance of learning algorithms. This paper proposes a method of synthetic sample generation to improve the learning and recognition of Tangut characters. Recognition accuracy obtained by training on the original data set is compared with accuracy obtained by training on the synthetically extended data set, and the proposed method shows a marked advantage. The organization of the Tangut character databases is also described.
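The abstract does not describe how the synthetic samples are generated, so the following is only a generic illustration of the idea of extending under-represented classes, not the authors' method. It assumes each sample is a small binary bitmap stored as a list of rows, and it pads every minority class with randomly shifted copies until all classes match the largest one. The names `shift_glyph` and `balance_by_synthesis`, the shift range, and the toy glyphs are all hypothetical.

```python
import random
from collections import Counter


def shift_glyph(glyph, dx, dy):
    """Return a copy of a 2D bitmap shifted by (dx, dy); vacated cells become 0."""
    h, w = len(glyph), len(glyph[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = glyph[sy][sx]
    return out


def balance_by_synthesis(samples, max_shift=1, seed=0):
    """Pad each class with shifted copies until it matches the largest class.

    samples: list of (label, glyph) pairs. The originals are kept unchanged;
    only synthetic copies are appended. Small translations are an assumed
    stand-in for whatever distortion model a real system would use.
    """
    rng = random.Random(seed)
    by_label = {}
    for label, glyph in samples:
        by_label.setdefault(label, []).append(glyph)
    target = max(len(glyphs) for glyphs in by_label.values())

    balanced = list(samples)
    for label, glyphs in by_label.items():
        for _ in range(target - len(glyphs)):
            src = rng.choice(glyphs)
            dx = rng.randint(-max_shift, max_shift)
            dy = rng.randint(-max_shift, max_shift)
            balanced.append((label, shift_glyph(src, dx, dy)))
    return balanced


# Toy data: class "a" has four samples, class "b" has only one.
glyph_a = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
glyph_b = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
data = [("a", glyph_a)] * 4 + [("b", glyph_b)]

balanced = balance_by_synthesis(data)
counts = Counter(label for label, _ in balanced)
print(counts)
```

After balancing, both toy classes contain the same number of samples, so a classifier trained on `balanced` no longer sees one class four times as often as the other. A real pipeline would likely use richer distortions (rotation, stroke-width changes, noise) in place of plain translation.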
About the authors
Yifei Meng
School of Electronic and Information Engineering, Beijing Jiaotong University; School of Physics and Electronic-Electrical Engineering, Ningxia University
Author for correspondence.
Email: river_dance@163.com
China, Beijing; Yinchuan
Xue Yuan
School of Electronic and Information Engineering, Beijing Jiaotong University
Email: river_dance@163.com
China, Beijing
Xueye Wei
School of Electronic and Information Engineering, Beijing Jiaotong University
Email: river_dance@163.com
China, Beijing
Wenhui Yang
School of Physics and Electronic-Electrical Engineering, Ningxia University
Email: river_dance@163.com
China, Yinchuan
Yan Chen
School of Physics and Electronic-Electrical Engineering, Ningxia University
Email: river_dance@163.com
China, Yinchuan