Arrow Research

Author name cluster

Zhiming Lin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
1 author row

Possible papers (2)

AAAI 2026 Conference Paper

CEC-Zero: Zero-Supervision Character Error Correction with Self-Generated Rewards

  • Zhiming Lin
  • Kai Zhao
  • Sophie Zhang
  • Peilai Yu
  • Canran Xiao

Large-scale Chinese spelling correction (CSC) remains critical for real-world text processing, yet existing LLMs and supervised methods lack robustness to novel errors and rely on costly annotations. We introduce CEC-Zero, a zero-supervision reinforcement learning framework that addresses this by enabling LLMs to correct their own mistakes. CEC-Zero synthesizes errorful inputs from clean text, computes cluster-consensus rewards via semantic similarity and candidate agreement, and optimizes the policy with PPO. It outperforms supervised baselines by 10–13 F1 points and strong fine-tuned LLMs by 5–8 points across 9 benchmarks, with theoretical guarantees of unbiased rewards and convergence. CEC-Zero establishes a label-free paradigm for robust, scalable CSC, unlocking LLM potential in noisy text pipelines.
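A rough illustration of the cluster-consensus reward described in the abstract: the Python sketch below scores each sampled correction by its average agreement with the other samples. The function name and the character-overlap similarity are stand-ins chosen for this sketch, not the paper's implementation, which computes rewards from semantic similarity and candidate agreement and then optimizes the policy with PPO.

from difflib import SequenceMatcher

def consensus_reward(candidates):
    """Score each sampled correction by its average agreement with the
    other samples: corrections the cluster agrees on earn high reward,
    outlier corrections earn low reward."""
    rewards = []
    for i, cand in enumerate(candidates):
        sims = [SequenceMatcher(None, cand, other).ratio()
                for j, other in enumerate(candidates) if j != i]
        rewards.append(sum(sims) / max(len(sims), 1))
    return rewards

# Toy usage: several sampled corrections of one synthesized errorful input.
# The two agreeing candidates outrank the outlier; in the paper, such
# rewards would drive a PPO update of the correcting LLM.
samples = ["今天天气很好", "今天天气很好", "今天天汽很好"]
print(consensus_reward(samples))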

AAAI 2026 Conference Paper

From Points to Coalitions: Hierarchical Contrastive Shapley Values for Prioritizing Data Samples

  • Canran Xiao
  • Jiabao Dou
  • Zhiming Lin
  • Zong Ke
  • Liwei Hou

How should we quantify the value of each training example when datasets are large, heterogeneous, and geometrically structured? Classical Data-Shapley answers in principle, but its O(n!) complexity and point-wise perspective are ill-suited to modern scales. We propose Hierarchical Contrastive Data Valuation (HCDV), a three-stage framework that (i) learns a contrastive, geometry-preserving representation, (ii) organizes the data into a balanced coarse-to-fine hierarchy of clusters, and (iii) assigns Shapley-style pay-offs to coalitions via local Monte-Carlo games whose budgets are propagated downward. HCDV collapses the factorial burden to O(T·Σ_ℓ K_ℓ) = O(T·K_max·log n), rewards examples that sharpen decision boundaries, and regularizes outliers through curvature-based smoothness. We prove that HCDV approximately satisfies the four Shapley axioms with surplus loss O(η log n), enjoys sub-Gaussian coalition deviation Õ(1/√T), and incurs at most k·ε_∞ regret for top-k selection. Experiments on four benchmarks (tabular, vision, streaming, and a 45M-sample CTR task) plus the OpenDataVal suite show that HCDV lifts accuracy by up to +5 pp, slashes valuation time by up to 100×, and directly supports tasks such as augmentation filtering, low-latency streaming updates, and fair marketplace payouts.
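A rough illustration of stage (iii): the Python sketch below runs a two-level Monte-Carlo Shapley game in which whole clusters are valued as players first, and each cluster's payoff is then split among its points by a local game. The helper names (mc_shapley, hcdv_sketch) and the additive toy utility are assumptions for this sketch; a real run would score coalitions by training or evaluating a model, and HCDV additionally uses contrastive representations and budget schedules not shown here.

import random

def mc_shapley(players, utility, T):
    """Estimate Shapley values by averaging marginal contributions
    over T random player orderings (a local Monte-Carlo game)."""
    phi = [0.0] * len(players)
    order = list(range(len(players)))
    for _ in range(T):
        random.shuffle(order)
        coalition, prev = [], utility([])
        for i in order:
            coalition.append(players[i])
            cur = utility(coalition)
            phi[i] += (cur - prev) / T
            prev = cur
    return phi

def hcdv_sketch(clusters, T):
    """Coarse-to-fine valuation: value whole clusters first, then propagate
    each cluster's payoff down to its points via a game inside the cluster."""
    # Toy utility: sum of the points in a coalition. A real system would
    # train or evaluate a model on the coalition instead.
    coarse = mc_shapley(clusters, lambda cs: sum(p for c in cs for p in c), T)
    per_point = []
    for cluster, payoff in zip(clusters, coarse):
        local = mc_shapley(cluster, sum, T)   # local game among the points
        total = sum(local) or 1.0             # guard against division by zero
        per_point.append([payoff * v / total for v in local])
    return per_point

# Toy run on two clusters of scalar "examples". Only K clusters play at the
# top level, which is where the factorial-to-O(T·K) collapse comes from.
print(hcdv_sketch([[1.0, 2.0], [0.5, 0.5, 3.0]], T=500))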