
Author name cluster

Junhao Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers

5

AAAI Conference 2026 Conference Paper

Flow-Based Knowledge Transfer for Efficient Large Model Distillation

  • Xinye Yang
  • Junhao Wang
  • Rui Li
  • Haosen Sun
  • Xuesheng Zhang
  • Zebang Liu
  • Gaochao Xu
  • Yiwei Chen

Traditional knowledge distillation relies on simple MSE or KL divergence losses that fail to capture the complex distributional relationships between teacher and student model representations. We propose FlowDistill, a novel distillation framework that employs normalizing flows to model and transfer the intricate knowledge distributions from teacher to student models. Our approach introduces three key innovations: (1) Invertible Knowledge Mapping using continuous normalizing flows (CNFs) to learn bijective transformations between teacher and student representation spaces, enabling precise knowledge transfer without information loss, (2) Flow-Guided Progressive Distillation that gradually increases the complexity of knowledge transfer by learning hierarchical flow transformations from simple to complex distributions, and (3) Conditional Flow Networks that adapt knowledge transfer based on input context and task requirements. Unlike previous diffusion-based distillation methods such as DiffKD that suffer from computational overhead due to iterative denoising processes and information loss during noise addition, our flow-based approach provides exact invertible transformations with significantly reduced computational cost. Extensive experiments on ImageNet classification, COCO object detection, and Cityscapes semantic segmentation demonstrate that FlowDistill achieves superior performance with 2.1% accuracy improvement over DiffKD on ResNet-34 to ResNet-18 distillation while reducing inference time by 3.5×. Our method establishes new state-of-the-art results across multiple distillation benchmarks and provides theoretical guarantees for lossless knowledge transfer through invertible flow transformations.
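
To make the flow-based transfer idea concrete, below is a minimal PyTorch sketch in which a small invertible affine-coupling block maps student features into the teacher's representation space and a distillation loss is computed on the transformed features. The coupling design, layer sizes, and MSE matching loss are illustrative assumptions rather than the FlowDistill architecture, and a bijective map requires the student and teacher features to share a dimension (or be projected to a common one first).

    # Sketch only: coupling design, sizes, and loss are assumptions, not FlowDistill.
    import torch
    import torch.nn as nn

    class AffineCoupling(nn.Module):
        """One invertible coupling block: half the dims parameterize an affine map of the rest."""
        def __init__(self, dim, hidden=256):
            super().__init__()
            self.half = dim // 2
            self.net = nn.Sequential(
                nn.Linear(self.half, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * (dim - self.half)),
            )

        def forward(self, x):
            x1, x2 = x[:, :self.half], x[:, self.half:]
            log_s, t = self.net(x1).chunk(2, dim=1)
            log_s = torch.tanh(log_s)            # bound the scales for stability
            y2 = x2 * torch.exp(log_s) + t       # invertible affine transform
            return torch.cat([x1, y2], dim=1), log_s.sum(dim=1)

        def inverse(self, y):
            y1, y2 = y[:, :self.half], y[:, self.half:]
            log_s, t = self.net(y1).chunk(2, dim=1)
            log_s = torch.tanh(log_s)
            x2 = (y2 - t) * torch.exp(-log_s)
            return torch.cat([y1, x2], dim=1)

    def flow_distill_loss(flow, student_feat, teacher_feat):
        """Push student features through the invertible flow and match the teacher."""
        mapped, _log_det = flow(student_feat)
        return nn.functional.mse_loss(mapped, teacher_feat)

    # Toy usage with 512-d features on both sides (a bijective map needs equal dims).
    flow = AffineCoupling(dim=512)
    student, teacher = torch.randn(32, 512), torch.randn(32, 512)
    loss = flow_distill_loss(flow, student, teacher)
    loss.backward()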

ICRA Conference 2025 Conference Paper

Heuristically Guided Compilation for Task Assignment and Path Finding

  • Zheng Chen
  • Changlin Chen
  • Yiran Ni
  • Junhao Wang

We investigate the Combined Target-Assignment and Path-Finding (TAPF) problem, which computes both task assignments and collision-free paths for multiple agents: each agent must select a target from an underlying set, and reaching that target yields a payoff. Each agent also incurs a cost closely related to the time required to reach its goal. The objective is to maximize the minimum gain generated by the agents. We propose a Compilation-Based Approach with Heuristics (TA-CBWH) to approximate the optimal solution, built on two key ideas: (i) for a specific task assignment, we formulate an integer linear program (ILP) and combine its iterative solution with large neighborhood search (LNS) to quickly improve solution quality to near-optimal; (ii) across distinct task assignments, a switching mechanism determines the most promising iteration while progressively eliminating unnecessary task assignments. Comparative experiments demonstrate that TA-CBWH outperforms a wide range of existing approaches across various maps and different numbers of agents.
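
For orientation, the max-min objective described above can be written as a linearized assignment ILP of roughly the following form; this is a sketch with assumed notation and constraints, not the paper's exact formulation.

    \begin{aligned}
    \max_{x,\, z}\quad & z \\
    \text{s.t.}\quad & z \;\le\; \sum_{t \in T} x_{a,t}\,\bigl(p_t - c_{a,t}\bigr) && \forall a \in A \\
    & \sum_{t \in T} x_{a,t} \;=\; 1 && \forall a \in A \\
    & \sum_{a \in A} x_{a,t} \;\le\; 1 && \forall t \in T \\
    & x_{a,t} \in \{0, 1\}
    \end{aligned}

Here x_{a,t} = 1 assigns target t to agent a, p_t is the target's payoff, c_{a,t} is the time-related cost for agent a to reach t, and the auxiliary variable z linearizes the max-min objective; an LNS step would then repeatedly destroy and re-optimize part of an incumbent assignment and its paths to improve solution quality.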

NeurIPS Conference 2025 Conference Paper

SWE-bench Goes Live!

  • LingHao Zhang
  • Shilin He
  • Chaoyun Zhang
  • Yu Kang
  • Bowen Li
  • Chengxing Xie
  • Junhao Wang
  • Maoquan Wang

The issue-resolving task, where a model generates patches to fix real-world bugs, has emerged as a key benchmark for evaluating the capabilities of large language models (LLMs). While SWE-bench has become the dominant benchmark in this domain, it suffers from several limitations: it has not been updated since its release, is restricted to only 12 repositories, and relies heavily on manual effort for constructing test instances and setting up executable environments, significantly limiting its scalability. We present SWE-bench-Live, a live-updatable benchmark designed to address these limitations. SWE-bench-Live currently includes 1,890 tasks derived from real GitHub issues created since 2024, spanning 223 repositories. Each task is accompanied by a dedicated Docker image to ensure reproducible execution. Additionally, we introduce an automated curation pipeline that streamlines the entire process from instance creation to environment setup, removing manual bottlenecks and enabling scalability and continuous updates. We evaluate a range of state-of-the-art models and agent frameworks on SWE-bench-Live, offering detailed empirical insights into their real-world bug-fixing capabilities. By providing a fresh, diverse, and executable benchmark grounded in live repository activity, SWE-bench-Live supports reliable, large-scale assessment of code LLMs and code agents in realistic development settings.
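
As a rough illustration of what a containerized issue-resolving task instance and its evaluation step could look like, here is a short Python sketch; the field names, docker invocation, and test protocol are assumptions for illustration, not the actual SWE-bench-Live schema or harness.

    # Sketch only: schema and docker protocol are assumed, not SWE-bench-Live's.
    import subprocess
    from dataclasses import dataclass

    @dataclass
    class TaskInstance:
        repo: str            # e.g. "owner/project" on GitHub
        issue_id: int        # the real issue the generated patch should resolve
        base_commit: str     # commit the model starts from
        docker_image: str    # dedicated image with a reproducible environment
        test_command: str    # command whose exit status decides pass/fail

    def evaluate(task: TaskInstance, patch: str) -> bool:
        """Apply a model-generated patch inside the task's container and run its tests.

        Assumes the image's working directory is the repository checked out at
        base_commit; `git apply` reads the patch from stdin.
        """
        shell = f"git apply && {task.test_command}"
        proc = subprocess.run(
            ["docker", "run", "--rm", "-i", task.docker_image, "bash", "-lc", shell],
            input=patch.encode(),
            capture_output=True,
        )
        return proc.returncode == 0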

RLDM Conference 2019 Conference Abstract

PAC-Bayesian Analysis of Counterfactual Risk in Stochastic Contextual Bandits

  • Junhao Wang
  • Bogdan Mazoure
  • Gavin McCracken
  • David A Venuto

This work tackles the off-policy evaluation problem within the contextual bandit setting, where only the action and reward recommended by the logging policy were recorded and are thus available at evaluation. This kind of situation is encountered in applications where one wants to compute the optimal policy using data previously collected in an offline manner. Previous work has extended the PAC-Bayesian analysis to this setting, providing bounds on the clipped importance sampling risk estimator using a recent regularization technique known as counterfactual risk minimization. The contribution of this work is to tighten this existing result through the application of various PAC-Bayesian concentration inequalities: Kullback-Leibler divergence, Bernstein, and Azuma-Hoeffding. This yields bounds on the empirical risk estimator that either converge at a faster rate given the amount of prior data, or that are more robust to the clipping factor.
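
For reference, the clipped importance sampling risk estimator that such bounds control is conventionally written as follows; the notation is a standard counterfactual-risk-minimization convention, not taken from the abstract.

    \hat{R}^{M}(\pi) \;=\; \frac{1}{n} \sum_{i=1}^{n} \delta_i \,
    \min\!\left\{ M,\ \frac{\pi(a_i \mid x_i)}{\pi_0(a_i \mid x_i)} \right\}

Here (x_i, a_i, \delta_i) are the logged context, action, and loss for sample i, \pi_0 is the logging policy, \pi is the policy under evaluation, and M is the clipping factor; the PAC-Bayesian bounds discussed above relate this empirical quantity to the true risk through a KL-divergence complexity term over a distribution of policies.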