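Spelling correction series

NLP: implementation ideas for Chinese spelling detection

NLP: a roundup of Chinese spelling detection and correction algorithms

NLP English spelling algorithm: how to improve performance by a factor of 1,000,000?

NLP: papers on Chinese spelling detection and correction

Implementing Chinese and English spell checking and error correction in Java? But all I can write is CRUD!

An algorithm that improves English word spell-checking performance by 1000x?

Word spelling correction-03-leetcode edit-distance: LeetCode 72. Edit Distance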

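NLP open-source projects

nlp-hanzi-similar: Chinese character similarity

word-checker: spell checking

pinyin: Chinese character to pinyin conversion

opencc4j: Traditional/Simplified Chinese conversion

sensitive-word: sensitive-word detection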

Paper links

See: https://paperswithcode.com/task/chinese-spell-checking

Paper overview

Chinese Text Correction Papers

This repo aims to keep tracking related work in Chinese text correction, including Chinese Spell Checking (CSC) and Chinese Grammatical Error Correction (CGEC).


2024

|paper|conference|resource|citation|labels|
|:---:|:---:|:---:|:---:|:---:|
|Chinese Spelling Correction as Rephrasing Language Model|AAAI2024|[pdf] [code]|citation||

2023

|paper|conference|resource|citation|labels|
|:---:|:---:|:---:|:---:|:---:|
|A Frustratingly Easy Plug-and-Play Detection-and-Reasoning Module for Chinese Spelling Check|EMNLP2023|[pdf] [code]|citation||
|Disentangled Phonetic Representation for Chinese Spelling Correction|ACL2023|[pdf] [code]|citation||
|Rethinking Masked Language Modeling for Chinese Spelling Correction|ACL2023|[pdf] [code]|citation||
|PTCSpell: Pre-trained Corrector Based on Character Shape and Pinyin for Chinese Spelling Correction|ACL2023|[pdf]|citation||
|Investigating Glyph-Phonetic Information for Chinese Spell Checking: What Works and What’s Next?|ACL2023|[pdf] [code]|-||
|Are Pre-trained Language Models Useful for Model Ensemble in Chinese Grammatical Error Correction?|ACL2023|[pdf] [code]|-||
|NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts|ACL2023|[pdf] [code]|citation||

2022

|paper|conference|resource|citation|labels|
|:---:|:---:|:---:|:---:|:---:|
|Linguistic Rules-Based Corpus Generation for Native Chinese Grammatical Error Correction|EMNLP2022|[pdf] [code]|citation||
|FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction|EMNLP2022|[pdf] [code]|citation||
|SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser|EMNLP2022|[pdf] [code]|citation||
|Revisiting Grammatical Error Correction Evaluation and Beyond|EMNLP2022|[pdf] [code]|citation||
|From Spelling to Grammar: A New Framework for Chinese Grammatical Error Correction|EMNLP2022|[pdf]|citation||
|Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity|EMNLP2022|[pdf] [code]|citation||
|Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation|AAAI2022|[pdf]|citation||
|Non-Autoregressive Chinese ASR Error Correction with Phonological Training|NAACL2022|[pdf]|citation||
|MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction|NAACL2022|[pdf] [code]|citation||
|Improving Chinese Grammatical Error Detection via Data augmentation by Conditional Error Generation|ACL2022|[pdf]|citation||
|CRASpell: A Contextual Typo Robust Approach to Improve Chinese Spelling Correction|ACL2022|[pdf] [code]|citation||
|MDCSpell: A Multi-task Detector-Corrector Framework for Chinese Spelling Correction|ACL2022|[pdf]|citation||
|The Past Mistake is the Future Wisdom: Error-driven Contrastive Probability Optimization for Chinese Spell Checking|ACL2022|[pdf]|citation||

2021

|paper|conference|resource|citation|labels|
|:---:|:---:|:---:|:---:|:---:|
|Correcting Chinese Spelling Errors with Phonetic Pre-training|ACL2021|[pdf] [code]|citation||
|Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking|ACL2021|[pdf] [code]|citation||
|PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction|ACL2021|[pdf] [code]|citation||
|Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models|ACL2021|[pdf] [code]|citation||
|PHMOSpell: Phonological and Morphological Knowledge Guided Chinese Spelling Check|ACL2021|[pdf]|citation||
|Global Attention Decoder for Chinese Spelling Error Correction|ACL2021|[pdf]|citation||
|Dynamic Connected Networks for Chinese Spelling Check|ACL2021|[pdf]|citation||
|Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction|ACL2021|[pdf] [code]|citation||
|SpellBERT: A Lightweight Pretrained Model for Chinese Spelling Check|EMNLP2021|[pdf] [code]|citation||
|DCSpell: A Detector-Corrector Framework for Chinese Spelling Error Correction|SIGIR2021|[pdf]|citation||
|Think Twice: A Post-Processing Approach for the Chinese Spelling Error Correction|AppliedScience|[pdf]|citation||

2020

|paper|conference|resource|citation|labels|
|:---:|:---:|:---:|:---:|:---:|
|Spelling Error Correction with Soft-Masked BERT|ACL2020|[pdf]|citation||
|SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check|ACL2020|[pdf] [code]|citation||
|Chunk-based Chinese Spelling Check with Global Optimization|EMNLP2020|[pdf]|citation||
|MaskGEC: Improving Neural Grammatical Error Correction via Dynamic Masking|AAAI2020|[pdf]|citation||
|Combining ResNet and Transformer for Chinese Grammatical Error Diagnosis|AACL2020|[pdf]|citation||
|Overview of NLPTEA-2020 Shared Task for Chinese Grammatical Error Diagnosis|AACL2020|[pdf]|citation||

Before 2020

|paper|conference|resource|citation|labels|
|:---:|:---:|:---:|:---:|:---:|
|FASPell: A Fast, Adaptable, Simple, Powerful Chinese Spell Checker Based On DAE-Decoder Paradigm|EMNLP2019|[pdf] [code]|citation||
|A Hybrid Approach to Automatic Corpus Generation for Chinese Spelling Checking|EMNLP2018|[pdf] [code]|citation||

References

https://github.com/nghuyong/Chinese-text-correction-papers/blob/main/Readme.md