Author name cluster

Jun Han

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

GeoNum: Bridging Numerical Continuity and Language Semantics via Geometric Embedding

  • Shengkai Jin
  • Tianyu Chen
  • Chonghan Gao
  • Jun Han

Large language models excel at semantic reasoning yet struggle with numerical tasks because tokenization disrupts geometric continuity. Traditional methods fragment numerically close values into inconsistent token sequences, severing the correspondence between numerical proximity and representational similarity, which is essential for numerical cognition. We introduce GeoNum, a geometrically coherent numerical embedding based on polar coordinate decomposition. By encoding integer magnitudes through classification and fractional components via trigonometric regression, GeoNum constructs a continuous manifold where numerical distance is preserved geometrically. A three-stage framework progressively integrates GeoNum into pretrained language models via self-supervised pretraining, projection alignment, and efficient adaptation. Experimental results across diverse arithmetic benchmarks demonstrate consistent gains in high-precision accuracy and improved interpolation and extrapolation, underscoring the promising benefits of geometric continuity for numerical modeling in large language models.
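As a rough illustration of the idea in the abstract above, the sketch below splits a number into an integer part (which could serve as a classification target) and a fractional part mapped to a point on the unit circle (which could serve as a trigonometric regression target). The function names and the exact sine/cosine encoding are assumptions made for illustration; the abstract does not specify GeoNum's actual formulation.

```python
import math


def encode_number(x: float) -> tuple[int, float, float]:
    """Hypothetical encoding: integer class plus (sin, cos) of the fraction."""
    integer_part = math.floor(x)          # candidate classification target
    frac = x - integer_part               # fractional part in [0, 1)
    angle = 2.0 * math.pi * frac          # map the fraction to an angle
    return integer_part, math.sin(angle), math.cos(angle)


def decode_number(integer_part: int, s: float, c: float) -> float:
    """Invert the encoding: recover the fraction from the angle."""
    angle = math.atan2(s, c) % (2.0 * math.pi)   # angle in [0, 2*pi)
    return integer_part + angle / (2.0 * math.pi)


if __name__ == "__main__":
    for value in (3.25, 0.5, -1.75):
        code = encode_number(value)
        print(value, code, decode_number(*code))
```

Under this assumed encoding, nearby fractional values map to nearby points on the circle, which is one simple way numerical proximity can be reflected in the representation.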

TMLR Journal 2025 Journal Article

Guided Discrete Diffusion for Electronic Health Record Generation

  • Jun Han
  • Zixiang Chen
  • Yongqian Li
  • Yiwen Kou
  • Eran Halperin
  • Robert E. Tillman
  • Quanquan Gu

Electronic health records (EHRs) are a pivotal data source that enables numerous applications in computational medicine, e.g., disease progression prediction, clinical trial design, and health economics and outcomes research. Despite wide usability, their sensitive nature raises privacy and confidentiality concerns, which limit potential use cases. To tackle these challenges, we explore the use of generative models to synthesize artificial, yet realistic EHRs. While diffusion-based methods have recently demonstrated state-of-the-art performance in generating other data modalities and overcome the training instability and mode collapse issues that plague previous GAN-based approaches, their applications in EHR generation remain underexplored. The discrete nature of tabular medical code data in EHRs poses challenges for high-quality data generation, especially for continuous diffusion models. To this end, we introduce a novel tabular EHR generation method, EHR-D3PM, which enables both unconditional and conditional generation using the discrete diffusion model. Our experiments demonstrate that EHR-D3PM significantly outperforms existing generative baselines on comprehensive fidelity and utility metrics while maintaining lower attribute and membership vulnerability risks. Furthermore, we show EHR-D3PM is effective as a data augmentation method and enhances performance on downstream tasks when combined with real data.

NeurIPS Conference 2025 Conference Paper

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

  • Hyungjoo Chae
  • Seonghwan Kim
  • Junhee Cho
  • Seungone Kim
  • Seungjun Moon
  • Gyeom Hwangbo
  • Dongha Lim
  • Minjin Kim

Web navigation is a unique domain that can automate many repetitive real-life tasks and is challenging as it requires long-horizon sequential decision making beyond typical multimodal large language model (MLLM) tasks. Yet, specialized reward models for web navigation that can be utilized during both training and test-time have been absent until now. Despite the importance of speed and cost-effectiveness, prior works have utilized MLLMs as reward models, which poses significant constraints for real-world deployment. To address this, in this work, we propose the first process reward model (PRM), called Web-Shepherd, which can assess web navigation trajectories at the step level. To achieve this, we first construct the WebPRM Collection, a large-scale dataset with 40K step-level preference pairs and annotated checklists spanning diverse domains and difficulty levels. Next, we also introduce WebRewardBench, the first meta-evaluation benchmark for evaluating PRMs. In our experiments, we observe that our Web-Shepherd achieves about 30 points better accuracy compared to using GPT-4o on WebRewardBench. Furthermore, when testing on WebArena-lite by using GPT-4o-mini as the policy and Web-Shepherd as the verifier, we achieve 10.9 points better performance, at 10x lower cost compared to using GPT-4o-mini as the verifier. Our model, dataset, and code are publicly available at https://github.com/kyle8581/Web-Shepherd.

ICLR Conference 2022 Conference Paper

Generative Principal Component Analysis

  • Zhaoqiang Liu
  • Jiulong Liu
  • Subhroshekhar Ghosh
  • Jun Han
  • Jonathan Scarlett

In this paper, we study the problem of principal component analysis with generative modeling assumptions, adopting a general model for the observed matrix that encompasses notable special cases, including spiked matrix recovery and phase retrieval. The key assumption is that the first principal eigenvector lies near the range of an $L$-Lipschitz continuous generative model with bounded $k$-dimensional inputs. We propose a quadratic estimator, and show that it enjoys a statistical rate of order $\sqrt{\frac{k\log L}{m}}$, where $m$ is the number of samples. Moreover, we provide a variant of the classic power method, which projects the calculated data onto the range of the generative model during each iteration. We show that under suitable conditions, this method converges exponentially fast to a point achieving the above-mentioned statistical rate. This rate is conjectured (Aubin et al., 2019; Cocola et al., 2020) to be the best possible even when we only restrict to the special case of spiked matrix models. We perform experiments on various image datasets for spiked matrix and phase retrieval models, and illustrate performance gains of our method over the classic power method and the truncated power method devised for sparse principal component analysis.
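The power-method variant described in the abstract above can be sketched as follows: each iteration multiplies by the observed matrix, projects the result onto a constraint set, and renormalizes. As a simplifying assumption, the projection here is onto the span of a few orthonormal vectors, standing in for the range of a generative model; projecting onto the range of a real (nonlinear) generative model is harder and is not shown. All function names are hypothetical.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def normalize(v):
    """Scale v to unit Euclidean norm (assumes v is nonzero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project_onto_span(basis, v):
    """Project v onto the span of orthonormal basis vectors.

    Stand-in (assumption) for projecting onto a generative model's range.
    """
    out = [0.0] * len(v)
    for q in basis:
        coeff = sum(qi * vi for qi, vi in zip(q, v))
        out = [o + coeff * qi for o, qi in zip(out, q)]
    return out


def projected_power_method(V, basis, w0, iters=50):
    """Power iteration with a projection step after each multiplication."""
    w = normalize(project_onto_span(basis, w0))
    for _ in range(iters):
        w = normalize(project_onto_span(basis, matvec(V, w)))
    return w


if __name__ == "__main__":
    V = [[3.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
    basis = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # constraint excludes the top eigenvector
    print(projected_power_method(V, basis, [1.0, 1.0, 1.0]))
```

In this toy setup the unconstrained power method would converge to the first coordinate axis, while the projected variant converges to the leading direction inside the constraint set.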

IJCAI Conference 2022 Conference Paper

Projected Gradient Descent Algorithms for Solving Nonlinear Inverse Problems with Generative Priors

  • Zhaoqiang Liu
  • Jun Han

In this paper, we propose projected gradient descent (PGD) algorithms for signal estimation from noisy nonlinear measurements. We assume that the unknown signal lies near the range of a Lipschitz continuous generative model with bounded inputs. In particular, we consider two cases when the nonlinear link function is either unknown or known. For unknown nonlinearity, we make the assumption of sub-Gaussian observations and propose a linear least-squares estimator. We show that when there is no representation error, the sensing vectors are Gaussian, and the number of samples is sufficiently large, with high probability, a PGD algorithm converges linearly to a point achieving the optimal statistical rate using arbitrary initialization. For known nonlinearity, we assume monotonicity, and make much weaker assumptions on the sensing vectors and allow for representation error. We propose a nonlinear least-squares estimator that is guaranteed to enjoy an optimal statistical rate. A corresponding PGD algorithm is provided and is shown to also converge linearly to the estimator using arbitrary initialization. In addition, we present experimental results on image datasets to demonstrate the performance of our PGD algorithms.

ICLR Conference 2021 Conference Paper

Disentangled Recurrent Wasserstein Autoencoder

  • Jun Han
  • Martin Renqiang Min
  • Ligong Han
  • Li Erran Li
  • Xuan Zhang

Learning disentangled representations leads to interpretable models and facilitates data generation with style transfer, which has been extensively studied on static data such as images in an unsupervised learning framework. However, only a few works have explored unsupervised disentangled sequential representation learning due to challenges of generating sequential data. In this paper, we propose recurrent Wasserstein Autoencoder (R-WAE), a new framework for generative modeling of sequential data. R-WAE disentangles the representation of an input sequence into static and dynamic factors (i.e., time-invariant and time-varying parts). Our theoretical analysis shows that R-WAE minimizes an upper bound of a penalized form of the Wasserstein distance between the model distribution and the sequential data distribution, and simultaneously maximizes the mutual information between input data and each of the disentangled latent factors. This is superior to (recurrent) VAE, which does not explicitly enforce mutual information maximization between input data and disentangled latent representations. When the number of actions in sequential data is available as weak supervision information, R-WAE is extended to learn a categorical latent representation of actions to improve its disentanglement. Experiments on a variety of datasets show that our models outperform other baselines with the same settings in terms of disentanglement and unconditional video generation both quantitatively and qualitatively.

ICRA Conference 2020 Conference Paper

LINS: A Lidar-Inertial State Estimator for Robust and Efficient Navigation

  • Chao Qin
  • Haoyang Ye
  • Christian E. Pranata
  • Jun Han
  • Shuyang Zhang
  • Ming Liu

We present LINS, a lightweight lidar-inertial state estimator, for real-time ego-motion estimation. The proposed method enables robust and efficient navigation for ground vehicles in challenging environments, such as feature-less scenes, via fusing a 6-axis IMU and a 3D lidar in a tightly-coupled scheme. An iterated error-state Kalman filter (ESKF) is designed to correct the estimated state recursively by generating new feature correspondences in each iteration, and to keep the system computationally tractable. Moreover, we use a robocentric formulation that represents the state in a moving local frame in order to prevent filter divergence over long runs. To validate robustness and generalizability, extensive experiments are performed in various scenarios. Experimental results indicate that LINS offers performance comparable to state-of-the-art lidar-inertial odometry in terms of stability and accuracy and has an order-of-magnitude improvement in speed.

AAAI Conference 2019 Conference Paper

Biomedical Image Segmentation via Representative Annotation

  • Hao Zheng
  • Lin Yang
  • Jianxu Chen
  • Jun Han
  • Yizhe Zhang
  • Peixian Liang
  • Zhuo Zhao
  • Chaoli Wang

Deep learning has been applied successfully to many biomedical image segmentation tasks. However, due to the diversity and complexity of biomedical image data, manual annotation for training common deep learning models is very time-consuming and labor-intensive, especially because normally only biomedical experts can annotate image data well. Human experts are often involved in a long and iterative process of annotation, as in active learning type annotation schemes. In this paper, we propose representative annotation (RA), a new deep learning framework for reducing annotation effort in biomedical image segmentation. RA uses unsupervised networks for feature extraction and selects representative image patches for annotation in the latent space of learned feature descriptors, which implicitly characterizes the underlying data while minimizing redundancy. A fully convolutional network (FCN) is then trained using the annotated selected image patches for image segmentation. Our RA scheme offers three compelling advantages: (1) it leverages the ability of deep neural networks to learn better representations of image data; (2) it performs one-shot selection for manual annotation and frees annotators from the iterative process of common active learning based annotation schemes; (3) it can be deployed to 3D images with simple extensions. We evaluate our RA approach using three datasets (two 2D and one 3D) and show our framework yields competitive segmentation results compared with state-of-the-art methods.

NeurIPS Conference 2019 Conference Paper

Deep Generative Video Compression

  • Salvator Lombardo
  • Jun Han
  • Christopher Schroers
  • Stephan Mandt

The usage of deep generative models for image compression has led to impressive performance gains over classical codecs while neural video compression is still in its infancy. Here, we propose an end-to-end, deep generative modeling approach to compress temporal sequences with a focus on video. Our approach builds upon variational autoencoder (VAE) models for sequential data and combines them with recent work on neural image compression. The approach jointly learns to transform the original sequence into a lower-dimensional representation as well as to discretize and entropy code this representation according to predictions of the sequential VAE. Rate-distortion evaluations on small videos from public data sets with varying complexity and diversity show that our model yields competitive results when trained on generic video content. Extreme compression performance is achieved when training the model on specialized content.

NeurIPS Conference 2016 Conference Paper

Bootstrap Model Aggregation for Distributed Statistical Learning

  • Jun Han
  • Qiang Liu

In distributed or privacy-preserving learning, we are often given a set of probabilistic models estimated from different local repositories, and asked to combine them into a single model that gives efficient statistical estimation. A simple method is to linearly average the parameters of the local models, which, however, tends to be degenerate or not applicable on non-convex models, or models with different parameter dimensions. A more practical strategy is to generate bootstrap samples from the local models, and then learn a joint model based on the combined bootstrap set. Unfortunately, the bootstrap procedure introduces additional noise and can significantly deteriorate the performance. In this work, we propose two variance reduction methods to correct the bootstrap noise, including a weighted M-estimator that is both statistically efficient and practically powerful. Both theoretical and empirical analyses are provided to demonstrate our methods.

IJCAI Conference 2013 Conference Paper

Crowdsourcing-Assisted Query Structure Interpretation

  • Jun Han
  • Ju Fan
  • Lizhu Zhou

Structured Web search incorporating data from structured sources into search engine results has attracted much attention from both academic and industrial communities. To understand user’s intent, query structure interpretation is proposed to analyze the structure of queries in a query log and map query terms to the semantically relevant attributes of data sources in a target domain. Existing methods assume all queries should be classified to the target domain, and thus they are limited when interpreting queries from different domains in real query logs. To address the problem, we introduce a human-machine hybrid method by utilizing crowdsourcing platforms. Our method selects a small number of query terms and asks the crowdsourcing workers to interpret them, and then infers the interpretations based on the crowdsourcing results. To improve the performance, we propose an iterative probabilistic inference method based on a similarity graph of query terms, and select the most useful query terms for crowdsourcing by considering their domain-relevance and gained benefit. We evaluate our method on a real query log, and the experimental results show that our method outperforms the state-of-the-art method.

EAAI Journal 1996 Journal Article

Optimization of feedforward neural networks

  • Jun Han
  • Claudio Moraga
  • Stefan Sinne

This paper presents some novel approaches in the design of neural networks with one or two hidden layers trained by the backpropagation algorithm. First, hybrid neural networks that have different activation functions for different layers in fully connected feedforward neural networks are introduced. Second, a variant sigmoid function with three parameters is discussed. The parameters are the dynamic range, symmetry and slope of the function respectively. It is illustrated how these parameters influence the speed of backpropagation learning, and a parametric feedforward network with different parameter configurations in different layers is introduced. By regulating and modifying parameter configurations of the sigmoid function in different layers the error signal problem, oscillation problem and asymmetrical input problem can be reduced. Furthermore, hybrid optimization methods for the dynamic parameters are introduced: Genetic algorithms are used to optimize the initial parameter configuration. The dynamic parameters are adjusted on-line using gradient descent methods. Sequential adjustment algorithms are derived in order to avoid the moving target problem and to increase the stability of the gradient descent methods. The new schemes have advantages in both convergence speed and generalization capability. Experimental results on the two-spirals problem are provided.