Arrow Research search

Author name cluster

Ke Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

73 papers
2 author rows

Possible papers

73

AAAI Conference 2026 Conference Paper

Activations as Features: Probing LLMs for Generalizable Essay Scoring Representations

  • Jinwei Chi
  • Ke Wang
  • Yu Chen
  • Xuanye Lin
  • Qiang Xu

Automated essay scoring (AES) is a challenging task in cross-prompt settings due to the diversity of scoring criteria. While previous studies have focused on the output of large language models (LLMs) to improve scoring accuracy, we believe activations from intermediate layers may also provide valuable information. To explore this possibility, we evaluated the discriminative power of LLMs’ activations on the cross-prompt essay scoring task. Specifically, we used activations to fit probes and further analyzed the effects of different models and input content of LLMs on this discriminative power. By computing the directions of essays across various trait dimensions under different prompts, we analyzed the variation in the evaluation perspectives of large language models concerning essay types and traits. Results show that the activations possess strong discriminative power in evaluating essay quality and that LLMs can adapt their evaluation perspectives to different traits and essay types, effectively handling the diversity of scoring criteria in cross-prompt settings.
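The probing setup the abstract describes can be sketched as fitting a linear probe on hidden-state features. The snippet below is a minimal illustration only: the random "activations", the labels, and the dimensions are stand-in assumptions, not the paper's actual extraction pipeline.

```python
# Illustrative sketch: fit a linear probe on (stand-in) intermediate-layer
# activations to discriminate essay quality. Real LLM feature extraction
# is replaced here with synthetic vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_essays, hidden_dim = 200, 64

# Stand-in for mean-pooled layer-k activations of each essay.
activations = rng.normal(size=(n_essays, hidden_dim))
# Binary quality labels correlated with one direction in activation space.
direction = rng.normal(size=hidden_dim)
labels = (activations @ direction + rng.normal(scale=0.5, size=n_essays) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(activations, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")
```

If the activations carry a discriminative direction, a simple linear probe of this form recovers it with high accuracy, which is the kind of signal the paper measures.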

AAAI Conference 2026 Conference Paper

From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench

  • Weikang Shi
  • Houxing Ren
  • Junting Pan
  • Aojun Zhou
  • Ke Wang
  • Zimu Lu
  • Yunqiao Yang
  • Yuxuan Hu

Large Language Models (LLMs) show significant potential in AI mathematical tutoring, yet current evaluations often rely on simplistic metrics or narrow pedagogical scenarios, failing to assess comprehensive, multi-turn teaching effectiveness. In this paper, we introduce KMP-Bench, a comprehensive K-8 Mathematical Pedagogical Benchmark designed to assess LLMs from two complementary perspectives. The first module, KMP-Dialogue, evaluates holistic pedagogical capabilities against six core principles (e.g., Challenge, Explanation, Feedback), leveraging a novel multi-turn dialogue dataset constructed by weaving together diverse pedagogical components. The second module, KMP-Skills, provides a granular assessment of foundational tutoring abilities, including multi-turn problem-solving, error detection and correction, and problem generation. Our evaluations on KMP-Bench reveal a key disparity: while leading LLMs excel at tasks with verifiable solutions, they struggle with the nuanced application of pedagogical principles. Additionally, we present KMP-Pile, a large-scale (150K) dialogue dataset. Models fine-tuned on KMP-Pile show substantial improvement on KMP-Bench, underscoring the value of pedagogically-rich training data for developing more effective AI math tutors.

AAAI Conference 2026 Conference Paper

HDGS: Hierarchical Dynamic Gaussian Splatting for Urban Driving Scenes

  • Fudong Ge
  • Jin Gao
  • Hanshi Wang
  • Yiwei Zhang
  • Ke Wang
  • Weiming Hu
  • Zhipeng Zhang

This paper tackles the challenging task of achieving storage-efficient yet high-fidelity motion representation in large-scale dynamic 3D Gaussian Splatting. Our motivation stems from the fact that existing urban-scale methods, which rely on massive and unstructured individual Gaussians for scene modeling, face a critical scalability bottleneck. Inspired by recent advances in 3DGS-based compression beyond autonomous driving, we address this challenge by leveraging the compression capability of anchor-driven methods. However, this is non-trivial as our exploratory experiments reveal that the direct application of this paradigm to dynamic, large-scale urban scenes results in performance degradation. We attribute this phenomenon to the hierarchical anchor design that severely loses dynamic information. To this end, we propose Hierarchical Dynamic Gaussian Splatting (HDGS), a novel framework designed to adapt the anchor-based Gaussian paradigm to 4D urban environments. We first establish a local support network to reinforce inter-anchor consistency, mitigating geometric and appearance fractures caused by supervision attenuation in deep hierarchies. Then, we handle heterogeneous object motion via coarse-to-fine decomposition, where high-level anchors model coarse dynamics and low-level anchors refine them with residual deformations. Third, we introduce a hybrid supervision scheme that fuses global geometric constraints and local pixel-level cues to alleviate geometrically inconsistent reconstruction under sparse LiDAR. Extensive experiments show that HDGS reduces storage by 69.0% while maintaining or even improving rendering fidelity compared to state-of-the-art methods.

AAAI Conference 2026 Conference Paper

Negative Entity Suppression for Zero-Shot Captioning with Synthetic Images

  • Zimao Lu
  • Hui Xu
  • Bing Liu
  • Ke Wang

Text-only training provides an attractive approach to address data scarcity challenges in zero-shot image captioning (ZIC), avoiding the expense of collecting paired image-text annotations. However, although these approaches perform well within training domains, they suffer from poor cross-domain generalization, often producing hallucinated content when encountering novel visual environments. Retrieval-based methods attempt to mitigate this limitation by leveraging external knowledge, but they can paradoxically exacerbate hallucination when retrieved captions contain entities irrelevant to the inputs. We introduce the concept of negative entities—objects that appear in a generated caption but are absent from the input—and propose Negative Entity Suppression (NES) to tackle this challenge. NES seamlessly integrates three stages: (1) it employs synthetic images to ensure consistent image-to-text retrieval across both training and inference; (2) it filters negative entities from retrieved content to enhance accuracy; and (3) it applies attention-level suppression using identified negative entities to further minimize the impact of hallucination-prone features. Evaluation across multiple benchmarks demonstrates that NES maintains competitive in-domain performance while improving cross-domain transfer and reducing hallucination rates, achieving new state-of-the-art results in ZIC.

AAAI Conference 2026 Conference Paper

Semantic Document Derendering: SVG Reconstruction via Vision-Language Modeling

  • Adam Hazimeh
  • Ke Wang
  • Mark Collier
  • Gilles Baechler
  • Efi Kokiopoulou
  • Pascal Frossard

Multimedia documents such as slide presentations and posters are designed to be interactive and easy to modify. Yet, they are often distributed in a static raster format, which limits editing and customization. Restoring their editability requires converting these raster images back into structured vector formats. However, existing geometric raster vectorization methods, which rely on low-level primitives like curves and polygons, fall short at this task. Specifically, when applied to complex documents like slides, they fail to preserve the high-level structure, resulting in a flat collection of shapes where the semantic distinction between image and text elements is lost. To overcome this limitation, we address the problem of semantic document derendering by introducing SliDer, a novel framework that uses Vision-Language Models (VLMs) to derender slide images as compact and editable Scalable Vector Graphic (SVG) representations. SliDer detects and extracts the attributes from individual image and text elements in a raster input and organizes them into a coherent SVG format. Crucially, the model iteratively refines its predictions during inference in a process analogous to human design, generating SVG code that more faithfully reconstructs the original raster upon rendering. Furthermore, we introduce Slide2SVG, a novel dataset comprising raster-SVG pairs of slide documents curated from real-world scientific presentations, to facilitate future research in this domain. Our results demonstrate that SliDer achieves a reconstruction LPIPS of 0.069, and is favored by human evaluators in 82.9% of cases compared to the strongest zero-shot VLM baseline.

IJCAI Conference 2025 Conference Paper

Accelerating Adversarial Training on Under-Utilized GPU

  • Zhuoxin Zhan
  • Ke Wang
  • Pulei Xiong

Deep neural networks are vulnerable to adversarial attacks, and adversarial training has been proposed to defend against such attacks by adaptively generating attacks, i.e., adversarial examples, during training. However, adversarial training is significantly slower than traditional training due to the search for worst-case attacks for each minibatch. To speed up adversarial training, existing work has considered a subset of a minibatch for generating attacks and reduced the steps in the search for attacks. We propose a novel adversarial training acceleration method, called AttackRider, that explores under-utilized GPU hardware to reduce the number of calls to attack generation without increasing the time of each call. We characterize the extent of GPU under-utilization for a given GPU and model size, hence the potential for speedup, and present the application scenarios where this opportunity exists. The results on various machine learning tasks and datasets show that AttackRider can speed up state-of-the-art adversarial training algorithms with comparable robust accuracy. The source code of AttackRider is available at https://github.com/zxzhan/AttackRider.

ECAI Conference 2025 Conference Paper

ASMA-Tune: Unlocking LLMs' Assembly Code Comprehension via Structural-Semantic Instruction Tuning

  • Xinyi Wang
  • Jiashui Wang
  • Jinbo Su
  • Ke Wang
  • Peng Chen
  • Yanming Liu
  • Long Liu
  • Xiang Li

Assembly code analysis and comprehension play critical roles in applications like reverse engineering, yet they face substantial challenges due to low information density and a lack of explicit syntactic structures. While traditional masked language modeling (MLM) approaches do not explicitly focus on natural language interaction, emerging decoder-focused large language models (LLMs) demonstrate partial success in binary analysis yet remain underexplored for holistic comprehension. We present Assembly Augmented Tuning (ASMA-Tune), an end-to-end structural-semantic instruction tuning framework that synergizes encoder architecture with decoder-based LLMs through a projector module, where the assembly encoder extracts hardware-level structural features, the projector bridges representations with the semantic space, and the instruction-tuned LLM preserves natural language capabilities. Experimental results demonstrate three key advantages: (1) State-of-the-art performance in assembly comprehension with +39.7% Recall@1 and +17.8% MRR improvements over GPT-4-Turbo, (2) Consistent enhancements across base models (24.6–107.4% Recall@1 and 15.2–106.3% MRR on Qwen2.5-Coder, Deepseek-Coder and CodeLlama variants), and (3) Superior instruction-following capabilities (41.5%–118% improvements) with controlled code generation degradation (–8.9% to –35% across architectures).

AAAI Conference 2025 Conference Paper

Beyond Spatial Domain: Cross-domain Promoted Fourier Convolution Helps Single Image Dehazing

  • Xiaozhe Zhang
  • Haidong Ding
  • Fengying Xie
  • Linpeng Pan
  • Yue Zi
  • Ke Wang
  • Haopeng Zhang

Vanilla convolution and window-based self-attention have shown significant success in image dehazing. However, they are constrained by limited receptive fields and ignore frequency gaps between dehazed and clear images. The former hampers the modeling of global dependencies, while the latter impedes the learning of high-frequency features, leading to suboptimal performance. In this paper, we propose the Joint Spatial and Fourier Convolutional Network (JSFC-Net), which leverages Fourier transformation to simultaneously address the two aforementioned problems with low computational overhead. We introduce the Frequency-Spatial Promoted and Physical Learning Block, which extracts high-level features from the spatial domain and frequency domain in parallel. We design a simple yet effective solution that uses spatial features to promote and modulate frequency features in a multi-scale manner, achieving refinement of frequency features and addressing the robustness issue caused by global sensitivity. Additionally, we present the Receptive Field Selection Module to facilitate improved fusion of spatial and frequency domain features. Finally, we introduce a frequency loss to further narrow frequency gaps. Comprehensive experiments on multiple datasets demonstrate that JSFC-Net is significantly superior to SOTA dehazing methods.

NeurIPS Conference 2025 Conference Paper

COME: Adding Scene-Centric Forecasting Control to Occupancy World Model

  • Yining Shi
  • Kun Jiang
  • Qiang Meng
  • Ke Wang
  • Jiabao Wang
  • Wenchao Sun
  • Tuopu Wen
  • MengMeng Yang

World models are critical for autonomous driving to simulate environmental dynamics and generate synthetic data. Existing methods struggle to disentangle ego-vehicle motion (perspective shifts) from scene evolution (agent interactions), leading to suboptimal predictions. Instead, we propose to separate environmental changes from ego-motion by leveraging scene-centric coordinate systems. In this paper, we introduce COME: a framework that integrates scene-centric forecasting Control into the Occupancy world ModEl. Specifically, COME first generates ego-irrelevant, spatially consistent future features through a scene-centric prediction branch, which are then converted into a scene condition using a tailored ControlNet. These condition features are subsequently injected into the occupancy world model, enabling more accurate and controllable future occupancy predictions. Experimental results on the nuScenes-Occ3D dataset show that COME achieves consistent and significant improvements over state-of-the-art (SOTA) methods across diverse configurations, including different input sources (ground-truth, camera-based, fusion-based occupancy) and prediction horizons (3s and 8s). For example, under the same settings, COME achieves 26.3% better mIoU than DOME and 23.7% better mIoU than UniScene. These results highlight the efficacy of disentangled representation learning in enhancing spatio-temporal prediction fidelity for world models. Code is available at https://github.com/synsin0/COME.

ICLR Conference 2025 Conference Paper

DEPfold: RNA Secondary Structure Prediction as Dependency Parsing

  • Ke Wang
  • Shay B. Cohen

RNA secondary structure prediction is critical for understanding RNA function but remains challenging due to complex structural elements like pseudoknots and limited training data. We introduce DEPfold, a novel deep learning approach that re-frames RNA secondary structure prediction as a dependency parsing problem. DEPfold presents three key innovations: (1) a biologically motivated transformation of RNA structures into labeled dependency trees, (2) a biaffine attention mechanism for joint prediction of base pairings and their types, and (3) an optimal tree decoding algorithm that enforces valid RNA structural constraints. Unlike traditional energy-based methods, DEPfold learns directly from annotated data and leverages pretrained language models to predict RNA structure. We evaluate DEPfold on both within-family and cross-family RNA datasets, demonstrating significant performance improvements over existing methods. DEPfold shows strong performance in cross-family generalization when trained on data augmented by traditional energy-based models, outperforming existing methods on the bpRNAnew dataset. This demonstrates DEPfold’s ability to effectively learn structural information beyond what traditional methods capture. Our approach bridges natural language processing (NLP) with RNA biology, providing a computationally efficient and adaptable tool for advancing RNA structure prediction and analysis.
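The core reframing in the abstract, turning base pairings into dependency arcs, can be illustrated with a toy conversion from dot-bracket notation. The arc scheme and label below are hypothetical simplifications; DEPfold's actual labeled-tree transformation is the paper's own.

```python
# Illustrative sketch (not DEPfold's exact scheme): convert an RNA
# secondary structure in dot-bracket notation into dependency-style
# (head, label) arcs, one arc per base pair.
def dotbracket_to_arcs(structure):
    stack, arcs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()          # matching opening base
            arcs[i] = (j, "pair")    # closing base depends on its partner
    return arcs

print(dotbracket_to_arcs("((..))"))  # {4: (1, 'pair'), 5: (0, 'pair')}
```

Once pairings are expressed as head-dependent arcs like this, standard dependency-parsing machinery (biaffine scoring, tree decoding) applies directly, which is the bridge to NLP the abstract describes.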

NeurIPS Conference 2025 Conference Paper

Each Complexity Deserves a Pruning Policy

  • Hanshi Wang
  • Yuhao Xu
  • Zekun Xu
  • Jin Gao
  • Yufan Liu
  • Weiming Hu
  • Ke Wang
  • Zhipeng Zhang

The established redundancy in visual tokens within large vision–language models (LVLMs) allows pruning to effectively reduce their substantial computational demands. Empirical evidence from previous works indicates that visual tokens in later decoder stages receive less attention than in shallow layers. Accordingly, previous methods typically employ heuristic layer-specific pruning strategies in which, although the number of tokens removed may differ across decoder layers, the overall pruning schedule is fixed and applied uniformly to all input samples and tasks, failing to align token elimination with the model’s holistic reasoning trajectory. Cognitive science indicates that human visual processing often begins with broad exploration to accumulate evidence before narrowing focus as the target becomes distinct. Our experiments reveal an analogous pattern in LVLMs. This observation strongly suggests that neither a fixed pruning schedule nor a heuristic layer-wise strategy can optimally accommodate the diverse complexities inherent in different inputs. To overcome this limitation, we introduce Complexity-Adaptive Pruning (AutoPrune), a training-free, plug-and-play framework that tailors pruning policies to varying sample and task complexities. Specifically, AutoPrune quantifies the mutual information between visual and textual tokens, and then projects this signal onto a budget-constrained logistic retention curve. Each such logistic curve, defined by its unique shape, is shown to correspond effectively with the specific complexity of different tasks, and can easily guarantee adherence to pre-defined computational constraints. We evaluate AutoPrune not only on standard vision-language tasks but also on Vision-Language-Action (VLA) models for autonomous driving. Notably, when applied to LLaVA-1.5-7B, our method prunes 89% of visual tokens and reduces inference FLOPs by 76.8%, while still retaining 96.7% of the original accuracy averaged over all tasks. This corresponds to a 9.1% improvement over the recent work PDrop (CVPR'2025), demonstrating its effectiveness. Code is available at https://github.com/AutoLab-SAI-SJTU/AutoPrune.
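The budget-constrained logistic retention curve described above can be sketched numerically. The complexity-to-midpoint mapping, the steepness, and the budget-rescaling step below are illustrative assumptions, not the paper's actual parameterization.

```python
# Illustrative sketch: map a sample-complexity score to a per-layer
# visual-token retention schedule via a logistic curve, then rescale the
# curve so its mean meets a global token budget.
import numpy as np

def retention_schedule(complexity, n_layers=32, budget=0.25, steepness=8.0):
    """Retention fraction per decoder layer.

    complexity in [0, 1]: higher complexity delays pruning (later midpoint).
    budget: target mean retention across all layers.
    """
    depth = np.linspace(0.0, 1.0, n_layers)
    midpoint = 0.2 + 0.6 * complexity          # harder inputs prune later
    curve = 1.0 / (1.0 + np.exp(steepness * (depth - midpoint)))
    curve *= budget / curve.mean()             # enforce the mean budget
    return np.clip(curve, 0.0, 1.0)

easy = retention_schedule(0.1)
hard = retention_schedule(0.9)
print(easy.mean(), hard.mean())  # both meet the 0.25 budget
```

The key property this captures is that different complexities yield differently shaped schedules (early versus late pruning) while every schedule satisfies the same pre-defined compute constraint.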

ICLR Conference 2025 Conference Paper

LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging

  • Ke Wang
  • Nikolaos Dimitriadis
  • Alessandro Favero
  • Guillermo Ortiz-Jiménez
  • François Fleuret
  • Pascal Frossard

Fine-tuning pre-trained models has become the standard approach to endow them with specialized knowledge, but it poses fundamental challenges. In particular, (i) fine-tuning often leads to catastrophic forgetting, where improvements on a target domain degrade generalization on other tasks, and (ii) merging fine-tuned checkpoints from disparate tasks can lead to significant performance loss. To address these challenges, we introduce LiNeS, Layer-increasing Network Scaling, a post-training editing technique designed to preserve pre-trained generalization while enhancing fine-tuned task performance. LiNeS scales parameter updates linearly based on their layer depth within the network, maintaining shallow layers close to their pre-trained values to preserve general features while allowing deeper layers to retain task-specific representations. In multi-task model merging scenarios, layer-wise scaling of merged parameters reduces negative task interference. LiNeS demonstrates significant improvements in both single-task and multi-task settings across various benchmarks in vision and natural language processing. It mitigates forgetting, enhances out-of-distribution generalization, integrates seamlessly with existing multi-task model merging baselines, improving their performance across benchmarks and model sizes, and can boost generalization when merging LLM policies aligned with different rewards via RLHF. Our method is simple to implement, computationally efficient, and complementary to many existing techniques. Our source code is available at github.com/wang-kee/LiNeS.
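The depth-dependent scaling described in the abstract can be sketched as blending per-layer fine-tuning residuals with linearly increasing coefficients. The coefficient form and the `alpha` floor below are illustrative assumptions in the spirit of LiNeS, not the paper's exact formula, and the checkpoints are stubbed with toy tensors.

```python
# Illustrative sketch: scale each layer's fine-tuning update (task vector)
# by a coefficient that grows linearly with depth, so shallow layers stay
# near the pre-trained weights and deep layers keep task-specific changes.
import numpy as np

def lines_scale(pretrained, finetuned, alpha=0.2):
    """Blend per-layer updates with depth-increasing coefficients."""
    n = len(pretrained)
    edited = []
    for layer_idx, (w0, w1) in enumerate(zip(pretrained, finetuned)):
        coeff = alpha + (1.0 - alpha) * layer_idx / (n - 1)
        edited.append(w0 + coeff * (w1 - w0))  # w0 + coeff * task vector
    return edited

# Toy 4-layer "checkpoints": pre-trained at 0, fine-tuned at 1.
pre = [np.zeros(4) for _ in range(4)]
fin = [np.ones(4) for _ in range(4)]
out = lines_scale(pre, fin, alpha=0.2)
print([w[0] for w in out])  # depth-increasing: shallow layers stay near 0
```

The same per-layer scaling applies unchanged to a merged multi-task parameter delta, which is how the post-training edit carries over to model merging.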

NeurIPS Conference 2025 Conference Paper

MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs

  • Ke Wang
  • Yiming Qin
  • Nikolaos Dimitriadis
  • Alessandro Favero
  • Pascal Frossard

Language models deployed in real-world systems often require post-hoc updates to incorporate new or corrected knowledge. However, editing such models efficiently and reliably—without retraining or forgetting previous information—remains a major challenge. Existing methods for lifelong model editing either compromise generalization, interfere with past edits, or fail to scale to long editing sequences. We propose MEMOIR, a novel scalable framework that injects knowledge through a residual memory, i.e., a dedicated parameter module, while preserving the core capabilities of the pre-trained model. By sparsifying input activations through data-dependent masks, MEMOIR confines each edit to a distinct subset of the memory parameters, minimizing interference among edits. At inference, it identifies relevant edits by comparing the sparse activation patterns of new queries to those stored during editing. This enables generalization to rephrased queries by activating only the relevant knowledge while suppressing unnecessary memory activation for unrelated prompts. Experiments on question answering, hallucination correction, and out-of-distribution generalization benchmarks across LLaMA-3 and Mistral demonstrate that MEMOIR achieves state-of-the-art performance across reliability, generalization, and locality metrics, scaling to thousands of sequential edits with minimal forgetting.
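The sparse-masking idea described above can be sketched with a toy residual memory: a data-dependent top-k mask routes each edit to a small subset of memory slots, and retrieval compares a query's mask against the stored ones. The vector memory, top-k selection, and overlap-based lookup are illustrative simplifications, not MEMOIR's actual architecture.

```python
# Toy sketch: data-dependent top-k masks confine each edit to a distinct
# subset of a residual memory; mask overlap identifies relevant edits.
import numpy as np

def topk_mask(activation, k):
    """Boolean mask selecting the k largest-magnitude coordinates."""
    mask = np.zeros_like(activation, dtype=bool)
    mask[np.argsort(np.abs(activation))[-k:]] = True
    return mask

rng = np.random.default_rng(1)
dim, k = 32, 4
memory = np.zeros(dim)
stored = []

for _ in range(3):                       # three sequential "edits"
    act = rng.normal(size=dim)
    m = topk_mask(act, k)
    memory[m] += 0.1                     # each edit touches only k slots
    stored.append(m)

query = rng.normal(size=dim)
qm = topk_mask(query, k)
overlaps = [int((qm & m).sum()) for m in stored]
print(overlaps)  # overlap with each stored edit; the max picks the match
```

Because each edit writes to at most k slots, interference between edits is bounded by mask overlap, which is the property the abstract's locality and minimal-forgetting claims rest on.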

JBHI Journal 2025 Journal Article

Multi-dimensional Feature-Guided Cross-Population Human Activity Recognition and Prediction

  • Renbo Liu
  • Yangfei Zhao
  • Pei Lv
  • Ke Wang
  • Weifeng Zhang
  • Zhaoyang Ge
  • Mingliang Xu

With the rapid development of wearable devices and intelligent sensing technologies, the demand for human behavior recognition in rehabilitation medicine and human-machine collaboration has been increasing. To address the issue of high variability in gait features caused by individual differences in cross-population gait analysis, and to tackle the insufficient generalization ability of models due to the coupling of pathological features with normal gait, we propose a multi-dimensional spatiotemporal feature-guided SG-LSTM framework, based on a dual-branch architecture comprising symmetric LSTM (S-LSTM) and grouped LSTM (G-LSTM) networks, for cross-population lower-limb activity recognition and prediction. On the one hand, the S-LSTM module with a symmetric input structure is used to explicitly model the spatiotemporal symmetry of lower-limb joints in normal gait. On the other hand, the G-LSTM module with a joint functional grouping strategy and local motion decoupling is employed to explicitly model the abnormal motion coupling of lower-limb joints in pathological gait. Furthermore, a dynamically weighted multi-task loss function is designed to jointly optimize gait trajectory prediction and classification tasks, allowing the framework to simultaneously produce both outputs and enhance the adaptability of the model. Extensive experiments on our self-constructed gait dataset as well as the HuGaDB and WearGait-PD datasets demonstrate that the proposed method not only outperforms several existing approaches in cross-population human behavior prediction and gait recognition, but also holds potential clinical application value, achieving state-of-the-art (SOTA) performance.

NeurIPS Conference 2025 Conference Paper

Online Segment Any 3D Thing as Instance Tracking

  • Hanshi Wang
  • Cai Zijian
  • Jin Gao
  • Yiwei Zhang
  • Weiming Hu
  • Ke Wang
  • Zhipeng Zhang

Online, real-time, and fine-grained 3D segmentation constitutes a fundamental capability for embodied intelligent agents to perceive and comprehend their operational environments. Recent advancements employ predefined object queries to aggregate semantic information from Vision Foundation Models (VFMs) outputs that are lifted into 3D point clouds, facilitating spatial information propagation through inter-query interactions. Nevertheless, perception, whether human or robotic, is an inherently dynamic process, rendering temporal understanding a critical yet overlooked dimension within these prevailing query-based pipelines. This deficiency in temporal reasoning can exacerbate issues such as the over-segmentation commonly produced by VFMs, necessitating more handcrafted post-processing. Therefore, to further unlock the temporal environmental perception capabilities of embodied agents, our work reconceptualizes online 3D segmentation as an instance tracking problem (AutoSeg3D). Our core strategy involves utilizing object queries for temporal information propagation, where long-term instance association promotes the coherence of features and object identities, while short-term instance update enriches instant observations. Given that viewpoint variations in embodied robotics often lead to partial object visibility across frames, this mechanism aids the model in developing a holistic object understanding beyond incomplete instantaneous views. Furthermore, we introduce spatial consistency learning to mitigate the fragmentation problem inherent in VFMs, yielding more comprehensive instance information for enhancing the efficacy of both long-term and short-term temporal learning. The temporal information exchange and consistency learning facilitated by these sparse object queries not only enhance spatial comprehension but also circumvent the computational burden associated with dense temporal point cloud interactions. 
Our method establishes a new state-of-the-art, surpassing ESAM by 2.8 AP on ScanNet200 and delivering consistent gains on ScanNet, SceneNN, and 3RScan datasets, corroborating that identity-aware temporal reasoning is a crucial, previously underemphasized component for robust 3D segmentation in real-time embodied intelligence. Code is at https://github.com/AutoLab-SAI-SJTU/AutoSeg3D.

JBHI Journal 2025 Journal Article

SeqNovo: De Novo Peptide Sequencing Prediction in IoMT via Seq2Seq

  • Ke Wang
  • Mingjia Zhu
  • Wadii Boulila
  • Maha Driss
  • Thippa Reddy Gadekallu
  • Chien-Ming Chen
  • Lei Wang
  • Saru Kumari

In the Internet of Medical Things (IoMT), de novo peptide sequencing prediction is one of the most important techniques for the fields of disease prediction, diagnosis, and treatment. Recently, deep-learning-based peptide sequencing prediction has been a new trend. However, most popular deep learning models for peptide sequencing prediction suffer from poor interpretability and poor ability to capture long-range dependencies. To solve these issues, we propose a model named SeqNovo, which has the encoding-decoding structure of sequence to sequence (Seq2Seq), the highly nonlinear properties of the multilayer perceptron (MLP), and the ability of the attention mechanism to capture long-range dependencies. SeqNovo uses an MLP to improve feature extraction and utilizes the attention mechanism to discover key information. A series of experiments have been conducted to show that SeqNovo is superior to the Seq2Seq benchmark model, DeepNovo. SeqNovo improves both the accuracy and interpretability of the predictions, which is expected to support more related research.

IROS Conference 2025 Conference Paper

SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting

  • Linqi Yang
  • Xiongwei Zhao
  • Qihao Sun
  • Ke Wang
  • Ao Chen
  • Peng Kang

6-DoF pose estimation is a fundamental task in computer vision with wide-ranging applications in augmented reality and robotics. Existing single RGB-based methods often compromise accuracy due to their reliance on initial pose estimates and susceptibility to rotational ambiguity, while approaches requiring depth sensors or multi-view setups incur significant deployment costs. To address these limitations, we introduce SplatPose, a novel framework that synergizes 3D Gaussian Splatting (3DGS) with a dual-branch neural architecture to achieve high-precision pose estimation using only a single RGB image. Central to our approach is the Dual-Attention Ray Scoring Network (DARS-Net), which innovatively decouples positional and angular alignment through geometry-domain attention mechanisms, explicitly modeling directional dependencies to mitigate rotational ambiguity. Additionally, a coarse-to-fine optimization pipeline progressively refines pose estimates by aligning dense 2D features between query images and 3DGS-synthesized views, effectively correcting feature misalignment and depth errors from sparse ray sampling. Experiments on three benchmark datasets demonstrate that SplatPose achieves state-of-the-art 6-DoF pose estimation accuracy in single RGB settings, rivaling approaches that depend on depth or multi-view images.

TMLR Journal 2025 Journal Article

Step-Controlled DPO: Leveraging Stepwise Errors for Enhancing Mathematical Reasoning of Language Models

  • Zimu Lu
  • Aojun Zhou
  • Ke Wang
  • Houxing Ren
  • Weikang Shi
  • Yunqiao Yang
  • Junting Pan
  • Mingjie Zhan

Direct Preference Optimization (DPO) has proven effective at improving the performance of large language models (LLMs) on downstream tasks such as reasoning and alignment. In this work, we propose Step-Controlled DPO (SCDPO), a method for automatically providing stepwise error supervision by creating negative samples of mathematical reasoning rationales that start making errors at a specified step. By applying these samples in DPO training, SCDPO can better align the model to avoid reasoning errors and output accurate reasoning steps. Qualitative analysis of the credit assignment of SCDPO and DPO demonstrates the effectiveness of SCDPO at identifying errors in mathematical solutions. We then apply SCDPO to an InternLM2-20B model, resulting in a 20B model that achieves competitive scores of 88.5% on GSM8K and 58.1% on MATH, rivaling all other open-source LLMs, showing the great potential of our method. The code, models and data are released to inspire future work.

ICRA Conference 2025 Conference Paper

The Devil is in the Quality: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection

  • Zhipeng Zhang
  • Zhenyu Li 0007
  • Hanshi Wang
  • Yuan He
  • Ke Wang
  • Heng Fan 0001

This paper tackles the challenging problem of semi-supervised monocular 3D object detection with a general framework. Specifically, having observed that the bottleneck of this task lies in lacking reliable and informative samples from unlabeled data for detector learning, we introduce a novel, simple yet effective ‘Augment and Criticize’ pipeline that mines abundant informative samples for robust detection. In the ‘Augment’ stage, we present the Augmentation-based Prediction aGgregation (APG), which applies automatically learned transformations to unlabeled images and aggregates detections from various augmented views as pseudo labels. Since not all the pseudo labels from APG are beneficially informative, the subsequent ‘Criticize’ phase is introduced. Particularly, we present the Critical Retraining Strategy (CRS) that, unlike simply filtering pseudo labels using a fixed threshold, employs a learnable network to evaluate the contribution of unlabeled images at different training timestamps. This way, the noisy samples detrimental to model evolution can be effectively suppressed. In order to validate ‘Augment-Criticize’, we apply it to MonoDLE [1] and MonoFlex [2], and the two new detectors, dubbed 3DSeMo DLE and 3DSeMo FLEX, achieve state-of-the-art results with consistent improvements, evidencing its effectiveness and generality.

AAAI Conference 2025 Conference Paper

Vision Transformers Beat WideResNets on Small Scale Datasets Adversarial Robustness

  • Juntao Wu
  • Ziyu Song
  • Xiaoyu Zhang
  • Shujun Xie
  • Longxin Lin
  • Ke Wang

For an extensive period, Vision Transformers (ViTs) have been deemed unsuitable for attaining robust performance on small-scale datasets, with WideResNet models maintaining dominance in this domain. While WideResNet models have persistently set the state-of-the-art (SOTA) benchmarks for robust accuracy on datasets such as CIFAR-10 and CIFAR-100, this paper challenges the prevailing belief that only WideResNet can excel in this context. We pose the critical question of whether ViTs can surpass the robust accuracy of WideResNet models. Our results provide a resounding affirmative answer. By employing ViT, enhanced with data generated by a diffusion model for adversarial training, we demonstrate that ViTs can indeed outshine WideResNet in terms of robust accuracy. Specifically, under the ℓ∞-norm threat model with ε = 8/255, our approach achieves robust accuracies of 74.97% on CIFAR-10 and 44.07% on CIFAR-100, representing improvements of +3.9% and +1.4%, respectively, over the previous SOTA models. Notably, our ViT-B/2 model, with 3 times fewer parameters, surpasses the previously best-performing WRN-70-16. Our achievement opens a new avenue, suggesting that future models employing ViTs or other novel efficient architectures could eventually replace the long-dominant WRN models.

IROS Conference 2025 Conference Paper

VLIN-RL: A Unified Vision-Language Interpreter and Reinforcement Learning Motion Planner Framework for Robot Dynamic Tasks

  • Zewu Jiang
  • Junnan Zhang
  • Ke Wang
  • Chenyi Si

Recently, with the development of Large Language Models (LLMs), Embodied AI represented by Vision-Language-Action Models (VLAs) has played a significant role in realizing natural language interaction between humans and robots. Current VLA models can process and understand visual information and language instructions, guiding robots to complete interactive tasks with the environment based on human language instructions. However, when tackling real-time and dynamic tasks, VLAs show poor robustness and limited real-time planning and adjustment ability against changes in target objects, instructions, and environments. To address these limitations, we propose VLIN-RL, a unified framework that consists of a Vision-Language Interpreter (VLIN), with strong vision-language understanding and high-level task planning abilities, and a reinforcement learning (RL)-based motion planner with enhanced flexibility and broader applicability. If the environmental state changes during task execution, the RL planning module in VLIN-RL directly makes dynamic adjustments at the subtask level based on visual feedback to achieve the task goals, without time-consuming reprocessing by VLIN. Experiments demonstrate that our model can complete multi-robot manipulation tasks more efficiently and stably. Finally, our work is verified through pick-grasp tasks and experiments on real manipulators. The test video is available at https://github.com/jzwsoulferryman/VLIN-RL.git.

NeurIPS Conference 2025 Conference Paper

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

  • Zimu Lu
  • Yunqiao Yang
  • Houxing Ren
  • Haotian Hou
  • Han Xiao
  • Ke Wang
  • Weikang Shi
  • Aojun Zhou

LLM-based agents have demonstrated great potential in generating and managing code within complex codebases. In this paper, we introduce WebGen-Bench, a novel benchmark designed to measure an LLM-based agent's ability to create multi-file website codebases from scratch. It contains diverse instructions for website generation, created through the combined efforts of human annotators and GPT-4o. These instructions span three major categories and thirteen minor categories, encompassing nearly all important types of web applications. To assess the quality of the generated websites, we generate test cases targeting each functionality described in the instructions. These test cases are then manually filtered, refined, and organized to ensure accuracy, resulting in a total of 647 test cases. Each test case specifies an operation to be performed on the website and the expected outcome of the operation. To automate testing and improve reproducibility, we employ a powerful web-navigation agent to execute test cases on the generated websites and determine whether the observed responses align with the expected results. We evaluate three high-performance code-agent frameworks (Bolt.diy, OpenHands, and Aider) using multiple proprietary and open-source LLMs as engines. The best-performing combination, Bolt.diy powered by DeepSeek-R1, achieves only 27.8% accuracy on the test cases, highlighting the challenging nature of our benchmark. Additionally, we construct WebGen-Instruct, a training set consisting of 6,667 website-generation instructions. Training Qwen2.5-Coder-32B-Instruct on Bolt.diy trajectories generated from a subset of the training set achieves an accuracy of 38.2%, surpassing the performance of the best proprietary model. We release our data-generation, training, and testing code, along with both the datasets and model weights, at https://github.com/mnluzimu/WebGen-Bench.

AAAI Conference 2024 Conference Paper

Learning from Failure: Improving Meeting Summarization without Good Samples

  • Ke Wang
  • Xiutian Zhao
  • Wei Peng

Existing methods for aligning language models with various human needs rely heavily on high-quality, task-specific data. However, industrial deployment of task-specific language models often encounters challenges in the availability of appropriate training samples. Taking meeting summarization as an instance, public datasets are scarce, and private corpora are also hard to obtain due to privacy issues or resource-demanding annotation. To improve meeting summarization in the absence of positively-rated (i.e., ``good'') samples, we propose Score Tuning, a cold-start tuning framework that leverages bad samples of distinguishable degrees to incrementally enhance the performance of summary generation without an initial presence of good samples. Our method utilizes asynchronous, numerical human feedback that measures the quality of generated summaries. Formulating data into triplets of (transcript, summary, score), our approach instructs a pre-trained model to learn the association between summary quality and human-rated scores, and hence to generate better summaries corresponding to higher scores. The experiment results show that our method is effective in improving meeting summarization on both English and Chinese corpora while requiring less annotated data and fewer training resources than existing alignment methods. Additionally, we preliminarily explore the transferability of our approach to machine translation tasks and demonstrate its potential for future development and use in other domains.

ICML Conference 2024 Conference Paper

Localizing Task Information for Improved Model Merging and Compression

  • Ke Wang
  • Nikolaos Dimitriadis
  • Guillermo Ortiz-Jiménez
  • François Fleuret
  • Pascal Frossard

Model merging and task arithmetic have emerged as promising scalable approaches to merge multiple single-task checkpoints into one multi-task model, but their applicability is reduced by significant performance loss. Previous works have linked these drops to interference in the weight space and erasure of important task-specific features. Instead, in this work we show that the information required to solve each task is still preserved after merging, as different tasks mostly use non-overlapping sets of weights. We propose TALL-masks, a method to identify these task supports given a collection of task vectors, and show that one can retrieve $>$99% of the single-task accuracy by applying our masks to the multi-task vector, effectively compressing the individual checkpoints. We study the statistics of intersections among the constructed masks and reveal the existence of selfish and catastrophic weights, i.e., parameters that are important exclusively to one task, and parameters that are irrelevant to all tasks but detrimental to multi-task fusion, respectively. For this reason, we propose Consensus Merging, an algorithm that eliminates such weights and improves the general performance of existing model merging approaches. Our experiments on vision and NLP benchmarks with up to 20 tasks show that Consensus Merging consistently improves existing approaches. Furthermore, our proposed compression scheme reduces storage from 57GB to 8.2GB while retaining 99.7% of the original performance.
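The mask-then-reconstruct idea above can be sketched in a few lines of NumPy. The selection rule below (keep a weight when the task's own update dominates its disagreement with the merged vector) is only an illustrative stand-in for the paper's calibrated criterion:

```python
import numpy as np

def tall_masks(task_vectors, lam=0.5):
    """Build one binary mask per task over the merged multi-task vector.
    Toy criterion: keep weights where |task update| >= lam * |disagreement|."""
    mtv = np.sum(task_vectors, axis=0)  # multi-task ("merged") vector
    masks = [np.abs(tv) >= lam * np.abs(mtv - tv) for tv in task_vectors]
    return mtv, masks

# When tasks touch disjoint sets of weights, masking the merged vector
# recovers each single-task checkpoint exactly.
base = np.zeros(2)
tvs = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
mtv, masks = tall_masks(tvs)
recovered = [base + m * mtv for m in masks]
```

Storing only the merged vector plus one bitmask per task is what yields the compression the abstract reports.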

NeurIPS Conference 2024 Conference Paper

Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset

  • Ke Wang
  • Junting Pan
  • Weikang Shi
  • Zimu Lu
  • Houxing Ren
  • Aojun Zhou
  • Mingjie Zhan
  • Hongsheng Li

Recent advancements in Large Multimodal Models (LMMs) have shown promising results in mathematical reasoning within visual contexts, with models exceeding human-level performance on existing benchmarks such as MathVista. However, we observe significant limitations in the diversity of questions and breadth of subjects covered by these benchmarks. To address this issue, we present the MATH-Vision (MATH-V) dataset, a meticulously curated collection of 3,040 high-quality mathematical problems with visual contexts sourced from real math competitions. Spanning 16 distinct mathematical disciplines and graded across 5 levels of difficulty, our dataset provides a comprehensive and diverse set of challenges for evaluating the mathematical reasoning abilities of LMMs. Through extensive experimentation, we unveil a notable performance gap between current LMMs and human performance on MATH-V, underscoring the imperative for further advancements in LMMs. Moreover, our detailed categorization allows for a thorough error analysis of LMMs, offering valuable insights to guide future research and development. The dataset is released at MathLLMs/MathVision

NeurIPS Conference 2024 Conference Paper

OPUS: Occupancy Prediction Using a Sparse Set

  • Jiabao Wang
  • Zhaojiang Liu
  • Qiang Meng
  • Liujiang Yan
  • Ke Wang
  • Jie Yang
  • Wei Liu
  • Qibin Hou

Occupancy prediction, aiming at predicting the occupancy status within a voxelized 3D environment, is quickly gaining momentum within the autonomous driving community. Mainstream occupancy prediction works first discretize the 3D environment into voxels, then perform classification on such dense grids. However, inspection of sample data reveals that the vast majority of voxels are unoccupied. Performing classification on these empty voxels leads to suboptimal allocation of computation, and reducing such empty voxels necessitates complex algorithm designs. To this end, we present a novel perspective on the occupancy prediction task: formulating it as a streamlined set prediction paradigm without the need for explicit space modeling or complex sparsification procedures. Our proposed framework, called OPUS, utilizes a transformer encoder-decoder architecture to simultaneously predict occupied locations and classes using a set of learnable queries. First, we employ the Chamfer distance loss to scale the set-to-set comparison problem to unprecedented magnitudes, making end-to-end training of such a model a reality. Subsequently, semantic classes are adaptively assigned using nearest-neighbor search based on the learned locations. In addition, OPUS incorporates a suite of non-trivial strategies to enhance model performance, including coarse-to-fine learning, consistent point sampling, and adaptive re-weighting. Finally, compared with current state-of-the-art methods, our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at nearly 2× the FPS, while our heaviest model surpasses the previous best results by 6.1 RayIoU.
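The set-to-set loss named above is the symmetric Chamfer distance. A minimal NumPy version, omitting OPUS's class assignment and re-weighting, might look like:

```python
import numpy as np

def chamfer(pred, gt):
    """Symmetric Chamfer distance between two 3D point sets of shapes
    (N, 3) and (M, 3): mean nearest-neighbour distance in both directions."""
    # Pairwise distances via broadcasting: (N, 1, 3) - (1, M, 3) -> (N, M)
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

Unlike Hungarian matching, this loss needs no one-to-one assignment, which is what lets the set comparison scale to very large query sets.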

ICML Conference 2024 Conference Paper

Pi-DUAL: Using privileged information to distinguish clean from noisy labels

  • Ke Wang
  • Guillermo Ortiz-Jiménez
  • Rodolphe Jenatton
  • Mark Collier
  • Efi Kokiopoulou
  • Pascal Frossard

Label noise is a pervasive problem in deep learning that often compromises the generalization performance of trained models. Recently, leveraging privileged information (PI) – information available only during training but not at test time – has emerged as an effective approach to mitigate this issue. Yet, existing PI-based methods have failed to consistently outperform their no-PI counterparts in terms of preventing overfitting to label noise. To address this deficiency, we introduce Pi-DUAL, an architecture designed to harness PI to distinguish clean from wrong labels. Pi-DUAL decomposes the output logits into a prediction term, based on conventional input features, and a noise-fitting term influenced solely by PI. A gating mechanism steered by PI adaptively shifts focus between these terms, allowing the model to implicitly separate the learning paths of clean and wrong labels. Empirically, Pi-DUAL achieves significant performance improvements on key PI benchmarks (e.g., +6.8% on ImageNet-PI), establishing a new state-of-the-art test set accuracy. Additionally, Pi-DUAL is a potent method for identifying noisy samples post-training, outperforming other strong methods at this task. Overall, Pi-DUAL is a simple, scalable and practical approach for mitigating the effects of label noise in a variety of real-world scenarios with PI.

AAAI Conference 2024 Conference Paper

Review-Enhanced Hierarchical Contrastive Learning for Recommendation

  • Ke Wang
  • Yanmin Zhu
  • Tianzi Zang
  • Chunyang Wang
  • Mengyuan Jing

Designed to establish potential relations and distill high-order representations, graph-based recommendation systems continue to show promising results by jointly modeling ratings and reviews. However, existing studies capture only simple review relations, failing to (1) fully explore hidden connections between users (or items), (2) filter out redundant information derived from reviews, and (3) model the behavioral association between rating and review interactions. To address these challenges, we propose a review-enhanced hierarchical contrastive learning framework, ReHCL. First, ReHCL constructs topic and semantic graphs to fully mine review relations from different views. Moreover, cross-view graph contrastive learning is used to enhance node representations and extract useful review knowledge. Meanwhile, we design neighbor-based positive sampling to capture the graph-structured similarity between the topic and semantic views, further performing efficient contrast and reducing redundant noise. Next, we propose cross-modal contrastive learning to match rating and review representations by exploring the association between ratings and reviews. Lastly, these two contrastive learning modes form a hierarchical contrastive learning task, which is applied to enhance the final recommendation task. Extensive experiments verify the superiority of ReHCL over state-of-the-art methods.

AAAI Conference 2023 Conference Paper

Disentangled Representation for Causal Mediation Analysis

  • Ziqi Xu
  • Debo Cheng
  • Jiuyong Li
  • Jixue Liu
  • Lin Liu
  • Ke Wang

Estimating direct and indirect causal effects from observational data is crucial to understanding the causal mechanisms and predicting the behaviour under different interventions. Causal mediation analysis is a method that is often used to reveal direct and indirect effects. Deep learning shows promise in mediation analysis, but the current methods only assume latent confounders that affect treatment, mediator and outcome simultaneously, and fail to identify different types of latent confounders (e.g., confounders that only affect the mediator or outcome). Furthermore, current methods are based on the sequential ignorability assumption, which is not feasible for dealing with multiple types of latent confounders. This work aims to circumvent the sequential ignorability assumption and applies the piecemeal deconfounding assumption as an alternative. We propose the Disentangled Mediation Analysis Variational AutoEncoder (DMAVAE), which disentangles the representations of latent confounders into three types to accurately estimate the natural direct effect, natural indirect effect and total effect. Experimental results show that the proposed method outperforms existing methods and has strong generalisation ability. We further apply the method to a real-world dataset to show its potential application.

JBHI Journal 2023 Journal Article

Dual-Channel Neural Network for Atrial Fibrillation Detection From a Single Lead ECG Wave

  • Bo Fang
  • Junxin Chen
  • Yu Liu
  • Wei Wang
  • Ke Wang
  • Amit Kumar Singh
  • Zhihan Lv

With the dramatic progress of wearable devices, continuous collection of a single-lead ECG wave can be implemented in a comfortable fashion. Data mining on single-lead ECG waves is therefore attracting increasing attention, and atrial fibrillation (AF) detection is a hot topic. In this paper, we propose a dual-channel neural network for AF detection from a single-lead ECG wave. Two primary phases are included: a data preprocessing part followed by a dual-channel neural network. A two-stage denoising procedure is developed for data preprocessing, so as to tackle the high noise and disturbance that generally reside in ECG waves collected by wearable devices. Then the time-frequency spectrum and Poincaré plot of the denoised ECG signal are fed into the dual-channel neural network for feature extraction and AF detection. On the 2017 PhysioNet/CinC Challenge database, the F1 scores were 0.83, 0.90, and 0.75 for AF rhythm, normal rhythm, and other rhythms, respectively. The results validate the effectiveness of the proposed method for AF detection from a single-lead ECG wave and indicate its performance advantages over some state-of-the-art counterparts.

ICRA Conference 2023 Conference Paper

On Human Grasping and Manipulation in Kitchens: Automated Annotation, Insights, and Metrics for Effective Data Collection

  • Nathan Elangovan
  • Ricardo V. Godoy
  • Felipe Sanches
  • Ke Wang
  • Tom White
  • Patrick Jarvis
  • Minas Liarokapis

The advancement in robotic grasping and manipulation has elicited an increased research interest in the development of household robots capable of performing a plethora of complex tasks. These advancements require the shift of robotics research from a laboratory setting to dynamic and unstructured home environments. In this work, we focus on a comprehensive data collection and analysis of key attributes involved in the selection of grasping and manipulation strategies for the successful execution of kitchen tasks. We created an unprecedented dataset comprising over 7 hours of high-definition videos, analyzed to classify more than 10,000 kitchen activities, each annotated with 24 attributes. Machine learning techniques were employed to partially automate the annotation process by extracting grasp types, hand, and object information from the videos. The annotated dataset was analyzed using clustering algorithms to identify underlying patterns. This study also identifies key attributes and specific data that require focus during data collection based on inter-subject variability. The insights from this study can be used to improve the speed, quality, and effectiveness of data collection. They also help identify the strategies employed by humans for the execution of kitchen tasks and transfer the necessary skills to a robotic end-effector, enabling it to complete the tasks autonomously or collaborate with humans.

NeurIPS Conference 2023 Conference Paper

ResoNet: Noise-Trained Physics-Informed MRI Off-Resonance Correction

  • Alfredo De Goyeneche Macaya
  • Shreya Ramachandran
  • Ke Wang
  • Ekin Karasan
  • Joseph Y. Cheng
  • Stella X. Yu
  • Michael Lustig

Magnetic Resonance Imaging (MRI) is a powerful medical imaging modality that offers diagnostic information without harmful ionizing radiation. Unlike optical imaging, MRI sequentially samples the spatial Fourier domain (k-space) of the image. Measurements are collected in multiple shots, or readouts, and in each shot, data along a smooth trajectory is sampled. Conventional MRI data acquisition relies on sampling k-space row-by-row in short intervals, which is slow and inefficient. More efficient, non-Cartesian sampling trajectories (e.g., spirals) use longer data readout intervals, but are more susceptible to magnetic field inhomogeneities, leading to off-resonance artifacts. Spiral trajectories cause off-resonance blurring in the image, and the mathematics of this blurring resembles that of optical blurring, where magnetic field variation corresponds to depth and readout duration to aperture size. Off-resonance blurring is a system issue with a physics-based, accurate forward model. We present a physics-informed deep learning framework for off-resonance correction in MRI, which is trained exclusively on synthetic, noise-like data with representative marginal statistics. Our approach allows for fat/water separation and is compatible with parallel imaging acceleration. Through end-to-end training using synthetic randomized data (i.e., noise-like images, coil sensitivities, field maps), we train the network to reverse off-resonance effects across diverse anatomies and contrasts without retraining. We demonstrate the effectiveness of our approach through results on phantom and in-vivo data. This work has the potential to facilitate the clinical adoption of non-Cartesian sampling trajectories, enabling efficient, rapid, and motion-robust MRI scans. Code is publicly available at: https://github.com/mikgroup/ResoNet.

IROS Conference 2023 Conference Paper

Scalable, Intuitive Human to Robot Skill Transfer with Wearable Human Machine Interfaces: On Complex, Dexterous Tasks

  • Felipe Sanches
  • Geng Gao
  • Nathan Elangovan
  • Ricardo V. Godoy
  • Jayden Chapman
  • Ke Wang
  • Patrick Jarvis
  • Minas Liarokapis

The advent of collaborative industrial and household robotics has blurred the demarcation between the human and robot workspace. For robots to function efficiently alongside humans, new research must be conducted in dynamic environments as opposed to the traditional well-structured laboratory. In this work, we propose an efficient skill transfer methodology comprising intuitive interfaces, efficient optical tracking systems, and compliant control of robotic arm-hand systems. The lightweight wearable interfaces, mounted with robotic grippers and hands, allow the execution of dexterous activities in dynamic environments without restricting human dexterity. The fiducial and reflective markers mounted on the interfaces facilitate the extraction of positional and rotational information, allowing efficient trajectory tracking. As the tasks are performed using the mounted grippers and hands, gripper state information can be transferred directly. The hardware-agnostic nature and efficiency of the proposed interfaces and skill transfer methodology are demonstrated through the execution of complex tasks that require increased dexterity, such as writing and drawing.

AAAI Conference 2022 Conference Paper

Incorporating Item Frequency for Differentially Private Set Union

  • Ricardo Silva Carvalho
  • Ke Wang
  • Lovedeep Singh Gondara

We study the problem of releasing the set union of users’ items subject to differential privacy. Previous approaches consider only the set of items for each user as the input. We propose incorporating the item frequency, which is typically available in set union problems, to boost the utility of private mechanisms. However, using the global item frequency over all users would largely increase privacy loss. We propose to use the local item frequency of each user to approximate the global item frequency without incurring additional privacy loss. Local item frequency allows us to design greedy set union mechanisms that are differentially private, which is impossible for previous greedy proposals. Moreover, while all previous works have to use uniform sampling to limit the number of items each user would contribute to, our construction eliminates the sampling step completely and allows our mechanisms to consider all of the users’ items. Finally, we propose to transfer the knowledge of the global item frequency from a public dataset into our mechanism, which further boosts utility even when the public and private datasets are from different domains. We evaluate the proposed methods on multiple real-life datasets.

ICRA Conference 2022 Conference Paper

On Wearable, Lightweight, Low-Cost Human Machine Interfaces for the Intuitive Collection of Robot Grasping and Manipulation Data

  • Che-Ming Chang
  • Jayden Chapman
  • Ke Wang
  • Patrick Jarvis
  • Minas Liarokapis

Robot grasping and manipulation allow robots to interact with their environments and execute a plethora of complex tasks that require increased dexterity (e.g., open a door, push buttons, collect and transpose objects, etc.). Collecting data of such activities is of paramount importance as it allows roboticists to create new methods and models that will facilitate the execution of sophisticated tasks. In this paper, we propose new wearable, lightweight, low-cost human machine interfaces that improve the efficiency of the data collection process for both robotic grasping and manipulation by offering intuitive and simplified control of the employed robotic grippers and hands. In particular, two different types of interfaces are proposed: i) a handle-based forearm stabilized interface that uses a waist-linkage system to provide weight support for bulky and heavy robotic end-effectors and ii) a palm-mounted interface that can accommodate smaller and lightweight grippers and hands, offering more agility in the control and positioning of these devices. Both interfaces are equipped with appropriate sliders, joysticks, and buttons that facilitate the control of the multiple degrees of freedom of the employed end-effectors and appropriate cameras that allow for object detection, identification, and object pose estimation.

NeurIPS Conference 2022 Conference Paper

Robust Learning against Relational Adversaries

  • Yizhen Wang
  • Mohannad Alhanahnah
  • Xiaozhu Meng
  • Ke Wang
  • Mihai Christodorescu
  • Somesh Jha

Test-time adversarial attacks have posed serious challenges to the robustness of machine-learning models, and in many settings the adversarial perturbation need not be bounded by small $\ell_p$-norms. Motivated by attacks in program analysis and security tasks, we investigate $\textit{relational adversaries}$, a broad class of attackers who create adversarial examples in a reflexive-transitive closure of a logical relation. We analyze the conditions for robustness against relational adversaries and investigate different levels of robustness-accuracy trade-off due to various patterns in a relation. Inspired by the insights, we propose $\textit{normalize-and-predict}$, a learning framework that leverages input normalization to achieve provable robustness. The framework solves the pain points of adversarial training against relational adversaries and can be combined with adversarial training for the benefits of both approaches. Guided by our theoretical findings, we apply our framework to source code authorship attribution and malware detection. Results of both tasks show our learning framework significantly improves the robustness of models against relational adversaries. In the process, it outperforms adversarial training, the most noteworthy defense mechanism, by a wide margin.
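A minimal sketch of the normalize-and-predict idea, with a hypothetical whitespace-rewriting relation standing in for the paper's program transformations: every input in an equivalence class is mapped to one canonical representative before classification, so the model's prediction is invariant under the relation by construction.

```python
def normalize(program: str) -> str:
    """Map all programs that differ only in whitespace (a toy relational
    adversary) to a single canonical representative."""
    return " ".join(program.split())

def predict(model, program: str):
    # Classify the canonical form, so every adversarial variant in the
    # equivalence class receives the same prediction.
    return model(normalize(program))

# Any classifier run behind normalize() is invariant to this relation:
toy_model = lambda s: len(s) % 2
same = predict(toy_model, "int  main( ){}") == predict(toy_model, "int\tmain( ){}")
```

The design trade-off the paper analyzes is that normalization collapses inputs the relation connects, which buys provable robustness at the cost of any accuracy that depended on distinguishing them.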

NeurIPS Conference 2021 Conference Paper

Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation

  • Ke Wang
  • Vidya Muthukumar
  • Christos Thrampoulidis

The growing literature on "benign overfitting" in overparameterized models has been mostly restricted to regression or binary classification settings; however, most success stories of modern machine learning have been recorded in multiclass settings. Motivated by this discrepancy, we study benign overfitting in multiclass linear classification. Specifically, we consider the following popular training algorithms on separable data: (i) empirical risk minimization (ERM) with cross-entropy loss, which converges to the multiclass support vector machine (SVM) solution; (ii) ERM with least-squares loss, which converges to the min-norm interpolating (MNI) solution; and, (iii) the one-vs-all SVM classifier. Our first key finding is that under a simple sufficient condition, all three algorithms lead to classifiers that interpolate the training data and have equal accuracy. When the data is generated from Gaussian mixtures or a multinomial logistic model, this condition holds under high enough effective overparameterization. Second, we derive novel error bounds on the accuracy of the MNI classifier, thereby showing that all three training algorithms lead to benign overfitting under sufficient overparameterization. Ultimately, our analysis shows that good generalization is possible for SVM solutions beyond the realm in which typical margin-based bounds apply.
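The min-norm interpolating (MNI) least-squares solution referenced above has a closed form in the overparameterized regime. A quick numerical check (illustrative, with binary labels rather than the paper's multiclass setup):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                      # d > n: overparameterized
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)

# Min-norm solution of X w = y:  w = X^T (X X^T)^{-1} y
w = X.T @ np.linalg.solve(X @ X.T, y)
```

Because the system is underdetermined, infinitely many weight vectors fit the labels; this is the one with smallest Euclidean norm, and it interpolates every training point exactly.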

AAAI Conference 2021 Conference Paper

Bridging the Domain Gap: Improve Informal Language Translation via Counterfactual Domain Adaptation

  • Ke Wang
  • Guandan Chen
  • Zhongqiang Huang
  • Xiaojun Wan
  • Fei Huang

Despite the near-human performances already achieved on formal texts such as news articles, neural machine translation still has difficulty dealing with "user-generated" texts that exhibit diverse linguistic phenomena but lack large-scale, high-quality parallel corpora. To address this problem, we propose a counterfactual domain adaptation method to better leverage both large-scale source-domain data (formal texts) and small-scale target-domain data (informal texts). Specifically, by considering effective counterfactual conditions (the concatenations of source-domain texts and the target-domain tag), we construct counterfactual representations to fill the sparse latent space of the target domain caused by the small amount of data, that is, bridging the gap between the source-domain data and the target-domain data. Experiments on English-to-Chinese and Chinese-to-English translation tasks show that our method outperforms the base model trained only on the informal corpus by a large margin, and consistently surpasses different baseline methods by +1.12 to +4.34 BLEU points on different datasets. Furthermore, we show that our method achieves competitive performance on cross-domain language translation on four language pairs.

IJCAI Conference 2020 Conference Paper

Automatic Emergency Diagnosis with Knowledge-Based Tree Decoding

  • Ke Wang
  • Xuyan Chen
  • Ning Chen
  • Ting Chen

Automatic diagnosis based on clinical notes is critical especially in the emergency department, where a fast and professional result is vital in assuring proper and timely treatment. Previous works formalize this task as plain text classification and fail to utilize the medically significant tree structure of the International Classification of Diseases (ICD) coding system. Moreover, external medical knowledge has rarely been used before; we explore it by extracting relevant materials from Wikipedia or Baidupedia. In this paper, we propose a knowledge-based tree decoding model (K-BTD), whose inference procedure is a top-down decoding process from the root node to leaf nodes. The stepwise inference procedure enables the model to provide support for the decision at each step, which visualizes the diagnosis procedure and adds to the interpretability of the final predictions. Experiments on real-world data from the emergency department of a large-scale hospital indicate that the proposed model outperforms all baselines in both micro-F1 and macro-F1, and reduces the semantic distance dramatically.

ICLR Conference 2020 Conference Paper

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference

  • Lasse Espeholt
  • Raphaël Marinier
  • Piotr Stanczyk
  • Ke Wang
  • Marcin Michalski

We present a modern scalable reinforcement learning agent called SEED (Scalable, Efficient Deep-RL). By effectively utilizing modern accelerators, we show that it is not only possible to train on millions of frames per second but also to lower the cost of experiments compared to current methods. We achieve this with a simple architecture that features centralized inference and an optimized communication layer. SEED adopts two state-of-the-art distributed algorithms, IMPALA/V-trace (policy gradients) and R2D2 (Q-learning), and is evaluated on Atari-57, DeepMind Lab and Google Research Football. We improve the state of the art on Football and are able to reach state of the art on Atari-57 twice as fast in wall-time. For the scenarios we consider, a 40% to 80% cost reduction for running experiments is achieved. The implementation along with experiments is open-sourced so results can be reproduced and novel ideas tried out.
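
The centralized-inference idea — actors send raw observations to a central server, which runs one batched forward pass and sends actions back — can be pictured with a toy example. This is a minimal sketch, not the open-sourced SEED implementation; the linear "policy" and all names are hypothetical:

```python
import numpy as np

def central_inference(policy_weights, observations):
    """SEED-style sketch: the server batches observations from many
    actors into one forward pass instead of each actor running its
    own copy of the policy."""
    logits = observations @ policy_weights   # one batched matmul on the accelerator
    return logits.argmax(axis=1)             # greedy action per actor

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))                  # toy linear policy: 4 obs dims, 3 actions
obs_batch = rng.normal(size=(8, 4))          # observations collected from 8 actors
actions = central_inference(W, obs_batch)    # one action per actor
```

The batching is what lets an accelerator amortize inference cost across actors; the real system adds an optimized communication layer around this loop.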

NeurIPS Conference 2019 Conference Paper

Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation

  • Ke Wang
  • Hang Hua
  • Xiaojun Wan

Unsupervised text attribute transfer automatically transforms a text to alter a specific attribute (e.g., sentiment) without using any parallel data, while simultaneously preserving its attribute-independent content. The dominant approaches try to model the content-independent attribute separately, e.g., learning different attributes' representations or using multiple attribute-specific decoders. However, this may lead to inflexibility from the perspective of controlling the degree of transfer or transferring over multiple aspects at the same time. To address the above problems, we propose a more flexible unsupervised text attribute transfer framework which replaces the process of modeling the attribute with minimal editing of latent representations based on an attribute classifier. Specifically, we first propose a Transformer-based autoencoder to learn an entangled latent representation for a discrete text, then we transform the attribute transfer task into an optimization problem and propose the Fast-Gradient-Iterative-Modification algorithm to edit the latent representation until it conforms to the target attribute. Extensive experimental results demonstrate that our model achieves very competitive performance on three public data sets. Furthermore, we also show that our model can not only control the degree of transfer freely but also transfer over multiple aspects at the same time.
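
The gradient-iterative editing step can be illustrated with a toy latent editor: ascend the gradient of an attribute classifier's target probability with respect to the latent vector until it crosses a confidence threshold. A minimal sketch under strong simplifications — a hypothetical logistic classifier over the latent space, not the paper's Transformer autoencoder:

```python
import numpy as np

def edit_latent(z, w, b, target=1, step=0.5, max_iters=100, threshold=0.9):
    """FGIM-style sketch: nudge latent vector z along the gradient of
    log P(target attribute | z) under a logistic classifier (w, b)
    until the target probability exceeds `threshold`."""
    z = z.astype(float).copy()
    for _ in range(max_iters):
        p = 1.0 / (1.0 + np.exp(-(z @ w + b)))     # P(attribute = 1 | z)
        prob = p if target == 1 else 1.0 - p
        if prob >= threshold:
            break
        # gradient of log P(target | z) for the logistic model
        sign = 1.0 if target == 1 else -1.0
        z += step * sign * (1.0 - prob) * w
    return z

rng = np.random.default_rng(0)
w = rng.normal(size=8)          # hypothetical attribute-classifier weights
b = 0.0
z0 = -w                          # start clearly on the negative side
z1 = edit_latent(z0, w, b, target=1)
p1 = 1.0 / (1.0 + np.exp(-(z1 @ w + b)))
```

Because the edit is iterative, stopping earlier (a lower threshold) yields a weaker transfer — which is the mechanism behind the controllable degree of transfer mentioned above.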

AAAI Conference 2019 Conference Paper

Cost-Sensitive Learning to Rank

  • Ryan McBride
  • Ke Wang
  • Zhouyang Ren
  • Wenyuan Li

We formulate the Cost-Sensitive Learning to Rank problem of learning to prioritize limited resources to mitigate the most costly outcomes. We develop improved ranking models to solve this problem, as verified by experiments in diverse domains such as forest fire prevention, crime prevention, and preventing storm caused outages in electrical networks.

IJCAI Conference 2019 Conference Paper

Discrete Binary Coding based Label Distribution Learning

  • Ke Wang
  • Xin Geng

Label Distribution Learning (LDL) is a general learning paradigm in machine learning, which includes both single-label learning (SLL) and multi-label learning (MLL) as its special cases. Recently, many LDL algorithms have been proposed to handle different application tasks such as facial age estimation, head pose estimation, and visual sentiment distribution prediction. However, the training time complexity of most existing LDL algorithms is too high, which makes them inapplicable to large-scale LDL. In this paper, we propose a novel LDL method to address this issue, termed Discrete Binary Coding based Label Distribution Learning (DBC-LDL). Specifically, we design an efficient discrete coding framework to learn binary codes for instances. Furthermore, both the pair-wise semantic similarities and the original label distributions are integrated into this framework to learn highly discriminative binary codes. In addition, a fast approximate nearest neighbor (ANN) search strategy is utilized to predict label distributions for testing instances. Experimental results on five real-world datasets demonstrate its superior performance over several state-of-the-art LDL methods at a lower time cost.

NeurIPS Conference 2019 Conference Paper

Exact Gaussian Processes on a Million Data Points

  • Ke Wang
  • Geoff Pleiss
  • Jacob Gardner
  • Stephen Tyree
  • Kilian Weinberger
  • Andrew Gordon Wilson

Gaussian processes (GPs) are flexible non-parametric models, with a capacity that grows with the available data. However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets. In this paper, we develop a scalable approach for exact GPs that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication. By partitioning and distributing kernel matrix multiplies, we demonstrate that an exact GP can be trained on over a million points, a task previously thought to be impossible with current computing hardware. Moreover, our approach is generally applicable, without constraints to grid data or specific kernel classes. Enabled by this scalability, we perform the first-ever comparison of exact GPs against scalable GP approximations on datasets with $10^4 \! -\! 10^6$ data points, showing dramatic performance improvements.
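
The core trick — accessing the kernel matrix only through matrix-vector products — can be sketched with a plain conjugate-gradient solver for the GP linear system K x = y. This toy version still forms the kernel inside the matvec for simplicity; in the paper's setting that product is partitioned and distributed across GPUs. Function names and the tiny RBF setup are illustrative only:

```python
import numpy as np

def rbf_matvec(X, v, lengthscale=1.0, noise=0.1):
    """Toy matvec with the noise-added RBF kernel. For illustration it
    forms K densely; the distributed version would compute K @ v in
    partitioned blocks without materializing K."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * sq / lengthscale**2) + noise * np.eye(len(X))
    return K @ v

def conjugate_gradients(matvec, y, tol=1e-8, max_iters=200):
    """Solve K x = y using only matrix-vector products with K."""
    x = np.zeros_like(y)
    r = y - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iters):
        Kp = matvec(p)
        alpha = rs / (p @ Kp)
        x += alpha * p
        r -= alpha * Kp
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)
x = conjugate_gradients(lambda v: rbf_matvec(X, v), y)
residual = np.linalg.norm(rbf_matvec(X, x) - y)
```

Since CG touches K only through `matvec`, the memory bottleneck moves from storing an n × n matrix to streaming its blocks — the property that lets exact GPs scale past a million points.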

IJCAI Conference 2018 Conference Paper

Binary Coding based Label Distribution Learning

  • Ke Wang
  • Xin Geng

Label Distribution Learning (LDL) is a novel learning paradigm in machine learning, which assumes that an instance is labeled by a distribution over all labels, rather than labeled by a logic label or some logic labels. Thus, LDL can model the description degree of all possible labels to an instance. Although many LDL methods have been put forward to deal with different application tasks, most existing methods suffer from the scalability issue. In this paper, a scalable LDL framework named Binary Coding based Label Distribution Learning (BC-LDL) is proposed for large-scale LDL. The proposed framework includes two parts, i.e., binary coding and label distribution generation. In the binary coding part, the learning objective is to generate the optimal binary codes for the instances. We integrate the label distribution information of the instances into a binary coding procedure, leading to high-quality binary codes. In the label distribution generation part, given an instance, the k nearest training instances in the Hamming space are searched and the mean of the label distributions of all the neighboring instances is calculated as the predicted label distribution. Experiments on five benchmark datasets validate the superiority of BC-LDL over several state-of-the-art LDL methods.
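
The prediction step described above — averaging the label distributions of the k Hamming-nearest training codes — is simple enough to sketch directly (toy data; the function name is hypothetical):

```python
import numpy as np

def predict_distribution(query_code, train_codes, train_dists, k=3):
    """BC-LDL-style prediction sketch: find the k training binary codes
    nearest to the query in Hamming distance and return the mean of
    their label distributions."""
    hamming = (train_codes != query_code).sum(axis=1)   # per-row Hamming distance
    nearest = np.argsort(hamming, kind="stable")[:k]
    return train_dists[nearest].mean(axis=0)

train_codes = np.array([[0, 0, 1, 1],
                        [0, 1, 1, 1],
                        [1, 1, 0, 0],
                        [1, 0, 0, 0]])
train_dists = np.array([[0.7, 0.2, 0.1],    # one distribution per instance
                        [0.6, 0.3, 0.1],
                        [0.1, 0.2, 0.7],
                        [0.2, 0.2, 0.6]])
pred = predict_distribution(np.array([0, 0, 1, 1]), train_codes, train_dists, k=2)
# averaging valid distributions keeps the prediction a valid distribution
```

Because averaging probability vectors preserves normalization, no extra renormalization step is needed at prediction time.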

IJCAI Conference 2018 Conference Paper

SentiGAN: Generating Sentimental Texts via Mixture Adversarial Networks

  • Ke Wang
  • Xiaojun Wan

Generating texts of different sentiment labels is getting more and more attention in the area of natural language generation. Recently, Generative Adversarial Net (GAN) has shown promising results in text generation. However, the texts generated by GAN usually suffer from the problems of poor quality, lack of diversity and mode collapse. In this paper, we propose a novel framework - SentiGAN, which has multiple generators and one multi-class discriminator, to address the above problems. In our framework, multiple generators are trained simultaneously, aiming at generating texts of different sentiment labels without supervision. We propose a penalty based objective in the generators to force each of them to generate diversified examples of a specific sentiment label. Moreover, the use of multiple generators and one multi-class discriminator can make each generator focus on generating its own examples of a specific sentiment label accurately. Experimental results on four datasets demonstrate that our model consistently outperforms several state-of-the-art text generation methods in the sentiment accuracy and quality of generated texts.

IROS Conference 2017 Conference Paper

An underwater electrosensor for identifying objects of similar volume and aspect ratio using convolutional neural network

  • Ke Wang
  • Khac Duc Do 0001
  • Lei Cui 0005

Underwater electrosense is bio-inspired by weakly electric fishes that use an electric field to see objects in the water. Current studies on engineering electrosense focus on designing sophisticated sensors and algorithms for emulating biological functions including localization and identification. This work aimed to develop a planar sensor equipped with a dense electrode array that is capable of providing accurate and dense data for identifying objects of similar volume and aspect ratio, which has been a challenge in underwater sensing. After the sensor design and implementation are presented, a convolutional neural network (CNN), which is widely used in digital image recognition, is trained using both simulation and experimental data. In the simulation, the overall success rate on identifying the sphere, cube, and rod is 92.6% with a 28 × 28 electrode array. In preliminary experimental tests, a sensor with a 16 × 16 electrode array achieved an overall success rate of 90.4% on identifying a sphere and a rod.

IROS Conference 2016 Conference Paper

A discrete dipole approximation approach to underwater active electrosense problems

  • Ke Wang
  • Lei Cui 0005
  • Khac Duc Do 0001

Weakly electric fish use a self-established electric field to sense underwater environments that may be cluttered and turbid. Previous works on building artificial counterparts are limited to the simplest cases, as no analytical solutions exist under complex boundary conditions. Universal numerical approaches like the Finite Element Method (FEM) and Boundary Element Method (BEM) suffer from a lengthy meshing process and heavy computational burden. In this paper, discrete dipole approximation (DDA), which is widely used in light scattering and absorption problems, is for the first time applied to underwater electrosense. This approach is lightweight, flexible, and computationally efficient compared with FEM. It was simulated in electric fields excited by parallel-plate electrodes and spherical electrodes of a simplified robotic model. A constrained unscented Kalman filter (CUKF) was further utilized to localize the position and identify the size of an invading cube. Comparison with FEM indicates that the differences for a cuboidal object in two orthogonal positions were 7.10% and 10.46% respectively, and the difference in size was 11.82%. These results were achieved at a cost of less than 1% of the computational effort of the FEM. The simulation results proved the proposed approach effective and laid a solid foundation for real-time underwater active electrosense in more general environments.

IROS Conference 2016 Conference Paper

An underwater electrosensory membrane bio-inspired by weakly electric fish

  • Ke Wang
  • Lei Cui 0005
  • Khac Duc Do 0001

An artificial sensory system is promising for navigating cluttered and turbid underwater environments by achieving functions similar to the biological electrosense of weakly electric fish. In this paper we designed an electrosensory membrane that can operate in a full 3-dimensional mode. Algorithms for object localization were also designed and tested based on numerical methods of electric field forward simulation. We combined a statistical learning method, training a multi-layer neural network, with a probabilistic approach, applying a constrained unscented Kalman filter (CUKF). This exploits the merits of fast estimation and a precise signal matching process. Experimental results showed that detection and localization with the reported sensor were quick and accurate, with errors of around 10 mm using one-step neural network mapping and about 5 mm at close range using the CUKF. This work demonstrated the effectiveness of the proposed electrosensory membrane and algorithms.

IJCAI Conference 2016 Conference Paper

Dimensionally Guided Synthesis of Mathematical Word Problems

  • Ke Wang
  • Zhendong Su

Mathematical Word Problems (MWPs) are important for training students' literacy and numeracy skills. Traditionally MWPs have been manually designed; an effective automated MWP generator can significantly benefit education and research. The goal of this work is to efficiently synthesize MWPs that are authentic (i.e., similar to manually written problems), diverse (i.e., covering a wide range of mathematical tasks), and configurable (i.e., varying difficulty levels and solution characteristics). This is challenging because a generated problem needs to exhibit both a well-founded mathematical structure and an easily understood natural language story. Our key insight is to leverage the important role that dimensional units play in MWPs, both textually and symbolically. We first synthesize a dimensionally consistent equation and then compose the natural language story via a bottom-up traversal of the equation tree. We have realized our technique and extensively evaluated its efficiency and effectiveness. Results show that the system can generate hundreds of valid problems per second with varying levels of difficulty. More importantly, we show, via a user study with 30 students from a local middle school, that the generated problems are statistically indistinguishable from actual textbook problems for practice and examination.
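
The two-stage pipeline — first fix a dimensionally consistent equation, then render the story — can be pictured with a single hard-coded template (km/h × h → km). This is an illustration only; the real system synthesizes and traverses an equation tree rather than filling one template:

```python
def synthesize_mwp(speed_kmh, hours):
    """Toy dimensionally guided synthesis: the equation is built from
    quantities whose units compose consistently (km/h * h -> km), and
    the story is rendered from the same quantities."""
    distance_km = speed_kmh * hours   # units: km/h * h = km
    story = (f"A train travels at {speed_kmh} km/h for {hours} hours. "
             f"How far does it go?")
    return story, distance_km

story, answer = synthesize_mwp(60, 3)   # answer carries the unit km
```

The point of the unit discipline is that any story rendered from a unit-consistent equation is guaranteed to have a mathematically well-founded solution.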

AAAI Conference 2015 Conference Paper

Are Features Equally Representative? A Feature-Centric Recommendation

  • Chenyi Zhang
  • Ke Wang
  • Ee-Peng Lim
  • Qinneng Xu
  • Jianling Sun
  • Hongkun Yu

Typically a user prefers an item (e.g., a movie) because she likes certain features of the item (e.g., director, genre, producer). This observation motivates us to consider a feature-centric approach to item recommendation: instead of directly predicting the rating on items, we predict the rating on the features of items, and use such ratings to derive the rating on an item. This approach offers several advantages over the traditional item-centric approach: it incorporates more information about why a user chooses an item, it generalizes better due to the denser feature rating data, and it explains the prediction of item ratings through the predicted feature ratings. Another contribution is turning a principled item-centric solution into a feature-centric solution, instead of inventing a new algorithm that is feature-centric. This approach maximally leverages previous research. We demonstrate this approach by turning the traditional item-centric latent factor model into a feature-centric solution and demonstrate its superiority over item-centric approaches.

IJCAI Conference 2015 Conference Paper

Automated Geometry Theorem Proving for Human-Readable Proofs

  • Ke Wang
  • Zhendong Su

Geometry reasoning and proof form a major and challenging component in the K-12 mathematics curriculum. Although several computerized systems exist that help students learn and practice general geometry concepts, they do not target geometry proof problems, which are more advanced and difficult. Powerful geometry theorem provers also exist; however, they typically employ advanced algebraic methods and generate complex, difficult-to-understand proofs, and thus do not meet general K-12 students’ educational needs. This paper tackles these weaknesses of prior systems by introducing a geometry proof system, iGeoTutor, capable of generating human-readable elementary proofs, i.e., proofs using standard Euclidean axioms. We have gathered 77 problems in total from various sources, including ones unsolvable by other systems and from math competitions. iGeoTutor solves all but two problems in under two minutes each and, more importantly, demonstrates a much more effective and intelligent proof search than prior systems. We have also conducted a pilot study with 12 high school students, and the results show that iGeoTutor provides a clear benefit in helping students learn geometry proofs. We are in active discussions with Khan Academy and local high schools for possible adoption of iGeoTutor in real learning environments. Video demo: https://www.youtube.com/watch?v=KL0dUb6hKxU

IJCAI Conference 2015 Conference Paper

Automatic Generation of Raven's Progressive Matrices

  • Ke Wang
  • Zhendong Su

Raven’s Progressive Matrices (RPMs) are a popular family of general intelligence tests, and provide a non-verbal measure of a test subject’s reasoning abilities. Traditionally RPMs have been manually designed. To make them readily available for both practice and examination, we tackle the problem of automatically synthesizing RPMs. Our goal is to efficiently generate a large number of RPMs that are authentic (i.e., similar to manually written problems), interesting (i.e., diverse in terms of difficulty), and well-formed (i.e., unambiguous). The main technical challenges are: How to formalize RPMs to accommodate their seemingly enormous diversity, and how to define and enforce their validity? To this end, we (1) introduce an abstract representation of RPMs using first-order logic, and (2) restrict instantiations to only valid RPMs. We have realized our approach and evaluated its efficiency and effectiveness. We show that our system can generate hundreds of valid problems per second with varying levels of difficulty. More importantly, we show, via a user study with 24 participants, that the generated problems are statistically indistinguishable from actual problems. This work is an exciting instance of how logic and reasoning may aid general learning.

AAAI Conference 2011 Conference Paper

CCRank: Parallel Learning to Rank with Cooperative Coevolution

  • Shuaiqiang Wang
  • Byron Gao
  • Ke Wang
  • Hady Lauw

We propose CCRank, the first parallel algorithm for learning to rank, targeting simultaneous improvement in learning accuracy and efficiency. CCRank is based on cooperative coevolution (CC), a divide-and-conquer framework that has demonstrated high promise in function optimization for problems with large search space and complex structures. Moreover, CC naturally allows parallelization of sub-solutions to the decomposed subproblems, which can substantially boost learning efficiency. With CCRank, we investigate parallel CC in the context of learning to rank. Extensive experiments on benchmarks in comparison with the state-of-the-art algorithms show that CCRank gains in both accuracy and efficiency.

AAAI Conference 2006 Conference Paper

Classification Spanning Private Databases

  • Ke Wang
  • Rong She

In this paper, we study the classification problem involving information spanning multiple private databases. The privacy challenges lie in the facts that data cannot be collected in one place and the classifier itself may disclose private information. We present a novel solution that builds the same decision tree classifier as if data are collected in a central place, but preserves the privacy of participating sites.

IJCAI Conference 1997 Conference Paper

Minimum Splits Based Discretization for Continuous Features

  • Ke Wang
  • Han Chong Goh

Discretization refers to splitting the range of continuous values into intervals so as to provide useful information about classes. This is usually done by minimizing a goodness measure, subject to constraints such as the maximal number of intervals, the minimal number of examples per interval, or some stopping criterion for splitting. We take a different approach by searching for minimum splits that minimize the number of intervals with respect to a threshold of impurity (i.e., badness). We propose a "total entropy" motivated selection of the "best" split from minimum splits, without requiring additional constraints. Experiments show that the proposed method produces better decision trees.
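
One way to picture impurity-thresholded interval formation is a greedy sweep over the sorted values that closes the current interval whenever adding the next example would push its entropy past the threshold. This is an illustrative sketch only, not the paper's minimum-split search, which globally minimizes the number of intervals:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def greedy_split(values, labels, impurity_threshold=0.5):
    """Greedy sketch: sweep values in sorted order; start a new interval
    when the next example would raise the current interval's entropy
    above `impurity_threshold`."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    intervals, current = [], []
    for i in order:
        if current and entropy([labels[j] for j in current] + [labels[i]]) > impurity_threshold:
            intervals.append([values[j] for j in current])
            current = []
        current.append(i)
    intervals.append([values[j] for j in current])
    return intervals

values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
splits = greedy_split(values, labels)
```

On this toy data the sweep recovers the two pure intervals; the paper's contribution is choosing among all splits meeting the impurity threshold the one judged best by total entropy.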