AILAW Journal 2025 Journal Article
Adversarial training flat-lattice transformer for named entity recognition of Chinese legal texts
- Jiabao Wang
- Kaixuan Wang
- Yang Weng
- Xin Li
Abstract Judgment documents are the legally binding written conclusions that a court reaches based on the facts of a case and the applicable law. Because they rely on professional terms and nested combinations of words, much of the information they contain has not been fully exploited. Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP) and has been applied to Chinese text processing for many years. However, professional terms and nested words blur the boundaries between entities, so entities cannot be delimited accurately. In this paper, we propose a new NER model, the Adversarial Training Flat-Lattice Transformer (AT-Flat), which combines adversarial training with the Flat-Lattice Transformer (Flat) to mitigate these problems. In AT-Flat, Flat combines character and word information to encode the sequence, and a Conditional Random Field (CRF) layer outputs the final entity predictions. The key component is an adversarial training framework that integrates task-shared word boundary information from the Chinese Word Segmentation (CWS) task into the Chinese NER task. The framework filters out the noise introduced by the CWS task and further improves Chinese NER performance. Experiments on NER for Chinese traffic accident and financial lending judgment documents show that our method outperforms other state-of-the-art methods. These results indicate that our method effectively alleviates the poor NER performance caused by professional terms and nested words. In addition, the method is evaluated on three public Chinese NER datasets.
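To make the phrase "combines character and word information" more concrete, the following is a minimal sketch of how a flat lattice is commonly built: every character and every lexicon-matched word becomes one token tagged with its head and tail character positions, so nested words sit side by side in a single flat sequence. The abstract does not give implementation details, so the function name `build_flat_lattice`, the toy lexicon, and the `max_word_len` parameter are all illustrative assumptions rather than the authors' code.

```python
from typing import List, Set, Tuple

# (text, head index, tail index); indices are character positions in the sentence.
Token = Tuple[str, int, int]


def build_flat_lattice(sentence: str, lexicon: Set[str], max_word_len: int = 4) -> List[Token]:
    """Hypothetical helper: flatten characters and matched words into one token list."""
    # Each character is a token whose head and tail coincide.
    tokens: List[Token] = [(ch, i, i) for i, ch in enumerate(sentence)]
    # Append every lexicon word of length >= 2 that occurs in the sentence.
    for start in range(len(sentence)):
        longest_end = min(start + max_word_len, len(sentence))
        for end in range(start + 2, longest_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                tokens.append((word, start, end - 1))
    return tokens


# Toy lexicon (an assumption for illustration, not taken from the paper).
toy_lexicon = {"交通", "事故", "交通事故", "责任"}
lattice = build_flat_lattice("交通事故责任", toy_lexicon)
for text, head, tail in lattice:
    print(text, head, tail)
```

In this toy sentence the nested words "交通" and "事故" appear alongside the longer word "交通事故" that contains them. This is the kind of overlapping boundary information the abstract says makes entity boundaries hard to determine; in a full model, the head and tail indices would feed a relative position encoding, which is omitted here.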