Arrow Research

Author name cluster

Ziwei Huang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers (5)

AAAI Conference 2026 · Conference Paper

PEOAT: Personalization-Guided Evolutionary Question Assembly for One-Shot Adaptive Testing

  • Xiaoshan Yu
  • Ziwei Huang
  • Shangshang Yang
  • Ziwen Wang
  • Haiping Ma
  • Xingyi Zhang

With the rapid advancement of intelligent education, Computerized Adaptive Testing (CAT) has attracted increasing attention by integrating educational psychology with deep learning technologies. Unlike traditional paper-and-pencil testing, CAT aims to efficiently and accurately assess examinee abilities by adaptively selecting the most suitable items during the assessment process. However, its real-time and sequential nature presents limitations in practical scenarios, particularly in large-scale assessments where interaction costs are high, or in sensitive domains such as psychological evaluations where minimizing noise and interference is essential. These challenges constrain the applicability of conventional CAT methods in time-sensitive or resource-constrained environments. To this end, we first introduce a novel task called one-shot adaptive testing (OAT), which aims to select a fixed set of optimal items for each test-taker in a single one-time selection. Meanwhile, we propose PEOAT, a Personalization-guided Evolutionary question assembly framework for One-Shot Adaptive Testing, from the perspective of combinatorial optimization. Specifically, we design a personalization-aware initialization strategy that integrates differences between examinee ability and exercise difficulty, using multi-strategy sampling to construct a diverse and informative initial population. Building on this, we propose a cognitive-enhanced evolutionary framework incorporating schema-preserving crossover and cognitively guided mutation to enable efficient exploration through informative signals. To maintain diversity without compromising fitness, we further introduce a diversity-aware environmental selection mechanism. The effectiveness of PEOAT is validated through extensive experiments on two datasets, complemented by case studies that uncover valuable insights.
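
As a concrete illustration of the pipeline the abstract describes, below is a minimal, hypothetical sketch of a PEOAT-style loop under a simple 1PL (Rasch) item response model; the fitness function, the overlap penalty, and all constants are my assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ITEMS, SET_SIZE, POP, GENS = 500, 20, 40, 100
difficulty = rng.normal(0.0, 1.0, N_ITEMS)  # calibrated item difficulties
theta = 0.3                                  # estimated examinee ability

def fisher_info(items):
    """Fitness: summed 1PL Fisher information p(1-p) at ability theta."""
    p = 1.0 / (1.0 + np.exp(-(theta - difficulty[items])))
    return float(np.sum(p * (1.0 - p)))

def init_individual():
    """Personalization-aware init: bias sampling toward items whose
    difficulty is close to the examinee's estimated ability."""
    w = np.exp(-np.abs(theta - difficulty))
    return rng.choice(N_ITEMS, SET_SIZE, replace=False, p=w / w.sum())

def crossover(a, b):
    """Schema-preserving: keep items shared by both parents, then fill
    the remaining slots from the parents' symmetric difference."""
    shared = np.intersect1d(a, b)
    if len(shared) == SET_SIZE:
        return shared.copy()
    pool = np.setdiff1d(np.union1d(a, b), shared)
    fill = rng.choice(pool, SET_SIZE - len(shared), replace=False)
    return np.concatenate([shared, fill])

def mutate(ind, rate=0.1):
    """Cognitively guided mutation: occasionally swap an item for the
    unused item best matched to the examinee's ability."""
    out = ind.copy()
    for i in range(SET_SIZE):
        if rng.random() < rate:
            cands = np.setdiff1d(np.arange(N_ITEMS), out)
            out[i] = cands[np.argmin(np.abs(theta - difficulty[cands]))]
    return out

pop = [init_individual() for _ in range(POP)]
for _ in range(GENS):
    kids = []
    for _ in range(POP):
        i, j = rng.choice(POP, 2, replace=False)
        kids.append(mutate(crossover(pop[i], pop[j])))
    # Diversity-aware environmental selection: greedily keep fit sets,
    # penalizing overlap with sets already selected.
    merged = pop + kids
    scores = np.array([fisher_info(s) for s in merged])
    survivors = []
    while len(survivors) < POP:
        k = int(np.argmax(scores))
        survivors.append(merged[k])
        overlap = np.array([len(np.intersect1d(s, merged[k])) for s in merged])
        scores = scores - 0.01 * overlap
        scores[k] = -np.inf
    pop = survivors

print(sorted(max(pop, key=fisher_info)))  # the assembled one-shot test
```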

NeurIPS Conference 2025 · Conference Paper

A data and task-constrained mechanistic model of the mouse outer retina shows robustness to contrast variations

  • Kyra Kadhim
  • Jonas Beck
  • Ziwei Huang
  • Jakob H Macke
  • Fred Rieke
  • Thomas Euler
  • Michael Deistler
  • Philipp Berens

Visual processing starts in the outer retina, where photoreceptors transform light into electrochemical signals. These signals are modulated by inhibition from horizontal cells and sent to the inner retina via excitatory bipolar cells. The outer retina is thought to play an important role in contrast-invariant coding of visual information, but how the different cell types implement this computation together remains incompletely understood. To understand the role of each cell type, we developed a fully differentiable biophysical model of a circular patch of mouse outer retina. The model includes 200 cone photoreceptors with a realistic phototransduction cascade and ribbon synapses, as well as horizontal and bipolar cells, all with cell-type-specific ion channels. Going beyond decades of work constraining biophysical models of neurons only by experimental data, we used a dual approach, constraining some parameters of the model with available measurements and others by a visual task: (1) we fit the parameters of the cone models to whole-cell patch-clamp measurements of photocurrents and two-photon glutamate imaging measurements of synaptic release; (2) we then trained the spatiotemporal outer retina model, with photoreceptors and the other cell types, to perform a visual classification task with varying contrast and luminance levels. We found that our outer retina model could learn to solve the classification task despite contrast and luminance variance in the stimuli. Testing different cell-type compositions and connectivity patterns, we found that feedback from horizontal cells did not further improve task performance beyond that of excitatory photoreceptors and bipolar cells. This is surprising given that horizontal cells are positioned to mediate communication across cones and that they add to the model's number of trainable parameters. Finally, we found that our model generalized better to out-of-distribution contrast levels than a linear classifier. Our work shows how the nonlinearities found in the outer retina can accomplish contrast-invariant classification and teases apart the contributions of different cell types.
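
The dual data-and-task constraining scheme can be sketched abstractly. The following hypothetical PyTorch snippet replaces the biophysical model with toy modules (`ConeModel`, the linear readout, and the synthetic measurements are all illustrative assumptions) and shows only the two-stage structure: fit to recordings first, then train on a contrast-varied classification task.

```python
import torch
import torch.nn as nn

class ConeModel(nn.Module):
    """Toy stand-in for the differentiable phototransduction cascade."""
    def __init__(self):
        super().__init__()
        self.gain = nn.Parameter(torch.tensor(1.0))
        self.tau = nn.Parameter(torch.tensor(0.1))
    def forward(self, light):                     # light: (batch, time)
        out, state = [], torch.zeros(light.shape[0])
        for t in range(light.shape[1]):           # leaky temporal integration
            state = state + self.tau * (self.gain * light[:, t] - state)
            out.append(state)
        return torch.stack(out, dim=1)            # "photocurrent": (batch, time)

cone = ConeModel()

# Stage 1: constrain cone parameters with (here: synthetic) recordings.
light = torch.rand(32, 100)
measured = 0.8 * light.cumsum(dim=1) / 100        # fake patch-clamp target
opt = torch.optim.Adam(cone.parameters(), lr=1e-2)
for _ in range(200):
    loss = ((cone(light) - measured) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: constrain the full model by a classification task whose
# stimuli vary in contrast and luminance.
readout = nn.Linear(100, 10)                      # stand-in for downstream circuitry
opt = torch.optim.Adam(list(cone.parameters()) + list(readout.parameters()), lr=1e-3)
templates = torch.rand(10, 100)                   # one stimulus pattern per class
for _ in range(200):
    y = torch.randint(0, 10, (32,))
    contrast = 0.2 + 1.8 * torch.rand(32, 1)      # varying contrast levels
    x = contrast * templates[y] + 0.05 * torch.randn(32, 100)
    loss = nn.functional.cross_entropy(readout(cone(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()
```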

AAAI Conference 2025 · Conference Paper

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

  • Wenyi Xiao
  • Ziwei Huang
  • Leilei Gan
  • Wanggui He
  • Haoyuan Li
  • Zhelun Yu
  • Fangxun Shu
  • Hao Jiang

The rapidly developing Large Vision Language Models (LVLMs) still face the hallucination phenomenon, where generated responses do not align with the given contexts, significantly restricting their use. Most previous work detects and mitigates hallucination at a coarse-grained level or requires expensive annotation (e.g., labeling by human experts or proprietary models). To address these issues, we propose detecting and mitigating hallucinations in LVLMs via fine-grained AI feedback. The basic idea is to generate a small sentence-level hallucination annotation dataset with proprietary models, on which we train a detection model that can perform sentence-level hallucination detection. We then propose a detect-then-rewrite pipeline to automatically construct a preference dataset for hallucination mitigation training. Furthermore, we propose differentiating the severity of hallucinations and introduce Hallucination Severity-Aware Direct Preference Optimization (HSA-DPO), which prioritizes the mitigation of critical hallucinations in LVLMs by incorporating hallucination severity into preference learning. Extensive experiments on hallucination detection and mitigation benchmarks demonstrate that our method sets a new state of the art in hallucination detection on MHaluBench, surpassing GPT-4V and Gemini, and reduces the hallucination rate by 36.1% on AMBER and 76.3% on Object HalBench compared to the base model.
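
To make the severity-aware preference objective concrete, here is a hedged sketch of one plausible reading of HSA-DPO, in which a per-pair severity score simply scales the standard DPO loss; the exact formulation is given in the paper, and this weighting is my assumption.

```python
import torch
import torch.nn.functional as F

def hsa_dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, severity, beta=0.1):
    """logp_w / logp_l: policy log-probs of the chosen / rejected response;
    ref_logp_*: the same quantities under the frozen reference model;
    severity: per-pair hallucination severity in [0, 1]."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    per_pair = -F.logsigmoid(beta * margin)       # standard DPO term
    return (severity * per_pair).mean()           # critical hallucinations weigh more

# Usage with dummy batch statistics:
b = torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4)
print(hsa_dpo_loss(*b, severity=torch.tensor([1.0, 0.2, 0.5, 0.9])))
```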

AAAI Conference 2025 · Conference Paper

MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis

  • Wanggui He
  • Siming Fu
  • Mushui Liu
  • Xierui Wang
  • Wenyi Xiao
  • Fangxun Shu
  • Yi Wang
  • Lei Zhang

Auto-regressive models have made significant progress in text-to-image synthesis, yet devising an appropriate model architecture and training strategy to achieve satisfactory performance remains an important avenue of exploration. In this work, we introduce MARS, a novel framework for T2I generation that incorporates a specially designed Semantic Vision-Language Integration Expert (SemVIE). This component integrates pre-trained LLMs by independently processing linguistic and visual information, freezing the textual component while fine-tuning the visual component. This methodology preserves the NLP capabilities of LLMs while imbuing them with exceptional visual understanding. Building upon the powerful base of the pre-trained Qwen-7B, MARS stands out with bilingual generative capabilities for both English and Chinese prompts and the capacity for joint image and text generation. The framework's flexibility also lends itself to adaptation to any-to-any tasks. Furthermore, MARS employs a multi-stage training strategy that first establishes robust image-text alignment through complementary bidirectional tasks and subsequently concentrates on refining the T2I generation process, significantly improving text-image alignment and the granularity of image details. Notably, MARS requires only 9% of the GPU days needed by SD1.5, yet achieves remarkable results across a variety of benchmarks, illustrating its training efficiency and potential for swift deployment in various applications.
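
The freeze-text / fine-tune-visual split is easy to express in code. A minimal, hypothetical sketch follows; `text_backbone` and `visual_expert` are illustrative names and toy modules, not the MARS implementation.

```python
import torch
import torch.nn as nn

class SemVIEBlock(nn.Module):
    """Toy two-branch block: a frozen language path and a trainable visual path."""
    def __init__(self, d=512):
        super().__init__()
        self.text_backbone = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.visual_expert = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
    def forward(self, text_tokens, image_tokens):
        return self.text_backbone(text_tokens), self.visual_expert(image_tokens)

block = SemVIEBlock()
for p in block.text_backbone.parameters():
    p.requires_grad = False                       # preserve the LLM's NLP abilities
trainable = [p for p in block.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-4)  # updates only the visual expert
```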

ICLR Conference 2025 · Conference Paper

Rethinking Light Decoder-based Solvers for Vehicle Routing Problems

  • Ziwei Huang
  • Jianan Zhou 0002
  • Zhiguang Cao
  • Yixin Xu

Light decoder-based solvers have gained popularity for solving vehicle routing problems (VRPs) due to their efficiency and ease of integration with reinforcement learning algorithms. However, they often struggle to generalize to larger problem instances or different VRP variants. This paper revisits light decoder-based approaches, analyzing the implications of their reliance on static embeddings and the inherent challenges that arise. Specifically, we demonstrate that in the light decoder paradigm, the encoder is implicitly tasked with capturing information for all potential decision scenarios during solution construction within a single set of embeddings, resulting in high information density. Furthermore, our empirical analysis reveals that the overly simplistic decoder struggles to effectively utilize this dense information, particularly as task complexity increases, which limits generalization to out-of-distribution (OOD) settings. Building on these insights, we show that enhancing the decoder capacity, through the simple addition of an identity mapping and a feed-forward layer, can considerably alleviate the generalization issue. Experimentally, our method significantly enhances the OOD generalization of light decoder-based approaches on large-scale instances and complex VRP variants, narrowing the gap with the heavy decoder paradigm. Our code is available at: https://github.com/ziweileonhuang/reld-nco.
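
The proposed enhancement, adding an identity mapping and a feed-forward layer to an otherwise light decoder, can be sketched as follows; this is a simplified illustration of the idea, with module and shape choices assumed rather than taken from the paper's code.

```python
import torch
import torch.nn as nn

class EnhancedLightDecoder(nn.Module):
    """Light attention decoder with a residual (identity-mapped) FFN on the query."""
    def __init__(self, d=128):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(d, 4 * d), nn.ReLU(), nn.Linear(4 * d, d))
    def forward(self, query, node_embed, mask):
        # query: (B, d) decoding context; node_embed: (B, N, d) static embeddings
        q = query + self.ffn(query)               # identity mapping + feed-forward
        scores = torch.einsum('bd,bnd->bn', q, node_embed) / node_embed.shape[-1] ** 0.5
        scores = scores.masked_fill(mask, float('-inf'))  # mask visited nodes
        return torch.log_softmax(scores, dim=-1)  # log-probs over next nodes

dec = EnhancedLightDecoder()
logp = dec(torch.randn(2, 128), torch.randn(2, 50, 128),
           torch.zeros(2, 50, dtype=torch.bool))
```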