Author name cluster

Xi Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers

2 author rows

EAAI Journal 2026 Journal Article

A Center-Focused Transformer for hyperspectral image classification

Chaoxu Yang
Jia Duan
Xi Liu
Lianchong Zhang
Jiangbing Sun
Yan Zhang
Wei Ren

EAAI Journal 2026 Journal Article

Ground-to-Aerial Scene Adaptation: Unsupervised drone video action recognition via domain adaptation

Feng Yang
Zhijia Li
Xi Liu
Fulin Luo
Anyong Qin
Tiecheng Song
Yue Zhao
Chenqiang Gao

EAAI Journal 2025 Journal Article

BeltDiff: Diffusion-based self-labeled generation system for conveyor belt damage detection

Peixian Zhuang
Yuanxiu Cai
Xi Liu
Xianchao Zheng
Fuheng Xiao
Jiangyun Li

NeurIPS Conference 2025 Conference Paper

Bézier Splatting for Fast and Differentiable Vector Graphics Rendering

Xi Liu
Chaoyi Zhou
Nanxuan Zhao
Siyu Huang

Differentiable vector graphics (VGs) are widely used in image vectorization and vector synthesis, while existing representations are costly to optimize and struggle to achieve high-quality rendering results for high-resolution images. This work introduces a new differentiable VG representation, dubbed Bézier Splatting, that enables fast yet high-fidelity VG rasterization. Bézier Splatting samples 2D Gaussians along Bézier curves, which naturally provide positional gradients at object boundaries. Thanks to the efficient splatting-based differentiable rasterizer, Bézier Splatting achieves 30× and 150× faster per forward and backward rasterization step for open curves compared to DiffVG. Additionally, we introduce an adaptive pruning and densification strategy that dynamically adjusts the spatial distribution of curves to escape local minima, further improving VG quality. Furthermore, our new VG representation supports conversion to standard XML-based SVG format, enhancing interoperability with existing VG tools and pipelines. Experimental results show that Bézier Splatting significantly outperforms existing methods with better visual fidelity and significant optimization speedup.
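The idea of placing 2D Gaussians along a Bézier curve can be illustrated with a minimal sketch. It only computes candidate Gaussian centers by evaluating a curve at evenly spaced parameter values. The cubic degree, the uniform spacing in t, and the function names `cubic_bezier` and `sample_gaussian_centers` are assumptions made for illustration; the abstract does not specify the paper's sampling scheme, Gaussian covariances, or rasterizer.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    s = 1.0 - t
    # Bernstein basis weights for a cubic curve.
    b = (s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3)
    x = b[0] * p0[0] + b[1] * p1[0] + b[2] * p2[0] + b[3] * p3[0]
    y = b[0] * p0[1] + b[1] * p1[1] + b[2] * p2[1] + b[3] * p3[1]
    return (x, y)


def sample_gaussian_centers(control_points, n):
    """Return n points on the curve, evenly spaced in t (hypothetical helper).

    Assumption: each returned point would serve as the center of one
    2D Gaussian; the spacing rule is illustrative, not the paper's.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    p0, p1, p2, p3 = control_points
    return [cubic_bezier(p0, p1, p2, p3, i / (n - 1)) for i in range(n)]


# Usage: five centers along an arch-shaped curve.
centers = sample_gaussian_centers([(0, 0), (0, 1), (1, 1), (1, 0)], 5)
```

Because every center is a polynomial function of the control points, gradients with respect to the centers can flow back to the control points, which is the property a differentiable rasterizer would rely on.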

ICLR Conference 2025 Conference Paper

Latent Radiance Fields with 3D-aware 2D Representations

Chaoyi Zhou
Xi Liu
Feng Luo
Siyu Huang

Latent 3D reconstruction has shown great promise in empowering 3D semantic understanding and 3D generation by distilling 2D features into the 3D space. However, existing approaches struggle with the domain gap between 2D feature space and 3D representations, resulting in degraded rendering performance. To address this challenge, we propose a novel framework that integrates 3D awareness into the 2D latent space. The framework consists of three stages: (1) a correspondence-aware autoencoding method that enhances the 3D consistency of 2D latent representations, (2) a latent radiance field (LRF) that lifts these 3D-aware 2D representations into 3D space, and (3) a VAE-Radiance Field (VAE-RF) alignment strategy that improves image decoding from the rendered 2D representations. Extensive experiments demonstrate that our method outperforms the state-of-the-art latent 3D reconstruction approaches in terms of synthesis performance and cross-dataset generalizability across diverse indoor and outdoor scenes. To our knowledge, this is the first work showing that radiance field representations constructed from 2D latent representations can yield photorealistic 3D reconstruction performance.

NeurIPS Conference 2024 Conference Paper

3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors

Xi Liu
Chaoyi Zhou
Siyu Huang

Novel-view synthesis aims to generate novel views of a scene from multiple input images or videos, and recent advancements like 3D Gaussian splatting (3DGS) have achieved notable success in producing photorealistic renderings with efficient pipelines. However, generating high-quality novel views under challenging settings, such as sparse input views, remains difficult due to insufficient information in under-sampled areas, often resulting in noticeable artifacts. This paper presents 3DGS-Enhancer, a novel pipeline for enhancing the representation quality of 3DGS representations. We leverage 2D video diffusion priors to address the challenging 3D view consistency problem, reformulating it as achieving temporal consistency within a video generation process. 3DGS-Enhancer restores view-consistent latent features of rendered novel views and integrates them with the input views through a spatial-temporal decoder. The enhanced views are then used to fine-tune the initial 3DGS model, significantly improving its rendering performance. Extensive experiments on large-scale datasets of unbounded scenes demonstrate that 3DGS-Enhancer yields superior reconstruction performance and high-fidelity rendering results compared to state-of-the-art methods. The project webpage is https://xiliu8006.github.io/3DGS-Enhancer-project.

ICML Conference 2024 Conference Paper

Wukong: Towards a Scaling Law for Large-Scale Recommendation

Buyun Zhang
Liang Luo
Yuxin Chen 0001
Jade Nie
Xi Liu
Shen Li
Yanli Zhao
Yuchen Hao

Scaling laws play an instrumental role in the sustainable improvement in model quality. Unfortunately, recommendation models to date do not exhibit such laws similar to those observed in the domain of large language models, due to the inefficiencies of their upscaling mechanisms. This limitation poses significant challenges in adapting these models to increasingly complex real-world datasets. In this paper, we propose an effective network architecture based purely on stacked factorization machines, and a synergistic upscaling strategy, collectively dubbed Wukong, to establish a scaling law in the domain of recommendation. Wukong's unique design makes it possible to capture diverse interactions of any order simply through taller and wider layers. We conducted extensive evaluations on six public datasets, and our results demonstrate that Wukong consistently outperforms state-of-the-art models quality-wise. Further, we assessed Wukong's scalability on an internal, large-scale dataset. The results show that Wukong retains its superiority in quality over state-of-the-art models, while holding the scaling law across two orders of magnitude in model complexity, extending beyond 100 GFLOP/example, where prior art falls short.

ICRA Conference 2023 Conference Paper

Multiple Surgical Instruments Tracking-By-Prediction With Graph Hierarchy

Rui Guo
Xi Liu
Ziheng Wang
Anthony M. Jarc

Recent research has dramatically changed the landscape of computer vision tasks in multi-agent tracking. Nevertheless, in robot-assisted surgery research, reliable surgical instrument tracking remains a challenge due to the high complexity of state modeling for the hierarchical structure of the instrument versus decoupling the spatial-temporal correlations naturally embedded in the task. In this paper, we present a new tracking paradigm that integrates trajectory prediction to reduce the data association error propagated from false detections. As a key component of the system, the proposed predictor disentangles hierarchical modeling and agent kinematic learning by introducing an inductive attention mechanism in a spatial-temporal graph network. Experiments on real anatomical datasets show that our tracking-by-prediction scheme improves overall localization accuracy over the frames by up to 81% in comparison to generic tracking pipelines, even those with transductive graph representation learning, with a large margin of gain in terms of precise localization.

NeurIPS Conference 2022 Conference Paper

Learning from Few Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales

Tao Liu
P. R. Kumar
Ruida Zhou
Xi Liu

Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful. Particularly important is the ability to incorporate domain knowledge of invariances, e.g., translational invariance of images. Kernels based on the maximum similarity over a group of transformations are not generally positive definite. Perhaps it is for this reason that they have not been studied theoretically. We address this lacuna and show that positive definiteness indeed holds with high probability for kernels based on the maximum similarity in the small training sample set regime of interest, and that they do yield the best results in that regime. We also show how additional properties such as their ability to incorporate local features at multiple spatial scales, e.g., as done in CNNs through max pooling, and to provide the benefits of composition through the architecture of multiple layers, can also be embedded into SVMs. We verify through experiments on widely available image sets that the resulting SVMs do provide superior accuracy in comparison to well-established deep neural network benchmarks for small sample sizes.
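A kernel based on the maximum similarity over a group of transformations can be sketched as follows. The sketch uses 1D signals, cyclic shifts as the transformation group, and an RBF base kernel; all three choices, along with the names `rbf`, `cyclic_shifts`, and `max_similarity_kernel`, are assumptions made for illustration and are not taken from the paper.

```python
import math


def rbf(x, y, gamma=0.5):
    """Standard RBF similarity between two equal-length sequences."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def cyclic_shifts(x):
    """All cyclic translations of a list (the assumed transformation group)."""
    return [x[k:] + x[:k] for k in range(len(x))]


def max_similarity_kernel(x, y, gamma=0.5):
    """Best base-kernel value over every translation of y.

    Taking the maximum makes the similarity translation invariant, but the
    resulting function is not guaranteed to be positive definite in general.
    """
    return max(rbf(x, shifted, gamma) for shifted in cyclic_shifts(y))


# Usage: two copies of the same pulse at different positions.
x = [1.0, 0.0, 0.0, 0.0]
y = [0.0, 0.0, 1.0, 0.0]
plain = rbf(x, y)                        # penalizes the misalignment
invariant = max_similarity_kernel(x, y)  # finds the aligning shift
```

The plain RBF value treats the two pulses as different signals, while the maximum over shifts recognizes them as the same pattern, which is the kind of invariance the abstract describes.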

NeurIPS Conference 2021 Conference Paper

Reinforced Few-Shot Acquisition Function Learning for Bayesian Optimization

Bing-Jing Hsieh
Ping-Chun Hsieh
Xi Liu

Bayesian optimization (BO) conventionally relies on handcrafted acquisition functions (AFs) to sequentially determine the sample points. However, it has been widely observed in practice that the best-performing AF in terms of regret can vary significantly under different types of black-box functions. It has remained a challenge to design one AF that can attain the best performance over a wide variety of black-box functions. This paper aims to attack this challenge through the perspective of reinforced few-shot AF learning (FSAF). Specifically, we first connect the notion of AFs with Q-functions and view a deep Q-network (DQN) as a surrogate differentiable AF. While it is a natural idea to combine DQN and an existing few-shot learning method, we identify that such a direct combination does not perform well due to severe overfitting, which is particularly critical in BO due to the need for a versatile sampling policy. To address this, we present a Bayesian variant of DQN with the following three features: (i) It learns a distribution of Q-networks as AFs based on the Kullback-Leibler regularization framework. This inherently provides the uncertainty required in sampling for BO and mitigates overfitting. (ii) For the prior of the Bayesian DQN, we propose to use a demo policy induced by an off-the-shelf AF for better training stability. (iii) On the meta-level, we leverage the meta-loss of Bayesian model-agnostic meta-learning, which serves as a natural companion to the proposed FSAF. Moreover, with the proper design of the Q-networks, FSAF is general-purpose in that it is agnostic to the dimension and the cardinality of the input domain. Through extensive experiments, we demonstrate that FSAF achieves comparable or better regret than the state-of-the-art benchmarks on a wide variety of synthetic and real-world test functions.

AAAI Conference 2021 Conference Paper

Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits

Yu-Heng Hung
Ping-Chun Hsieh
Xi Liu
P. R. Kumar

Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control literature, we propose novel learning algorithms to handle the explore-exploit trade-off in linear bandit problems as well as generalized linear bandit problems. We develop novel index policies that we prove achieve order-optimality, and show that they achieve empirical performance competitive with the state-of-the-art benchmark methods in extensive experiments. The new policies achieve this with low computation time per pull for linear bandits, thereby yielding both favorable regret and computational efficiency.

IJCAI Conference 2018 Conference Paper

MUSCAT: Multi-Scale Spatio-Temporal Learning with Application to Climate Modeling

Jianpeng Xu
Xi Liu
Tyler Wilson
Pang-Ning Tan
Pouyan Hatami
Lifeng Luo

In climate and environmental sciences, vast amounts of spatio-temporal data have been generated at varying spatial resolutions from satellite observations and computer models. Integrating such diverse sources of data has proven to be useful for building prediction models, as the multi-scale data may capture different aspects of the Earth system. In this paper, we present a novel framework called MUSCAT for predictive modeling of multi-scale, spatio-temporal data. MUSCAT performs a joint decomposition of multiple tensors from different spatial scales, taking into account the relationships between the variables. The latent factors derived from the joint tensor decomposition are used to train the spatial and temporal prediction models at different scales for each location. The outputs from this ensemble of spatial and temporal models are aggregated to generate future predictions. An incremental learning algorithm is also proposed to handle the massive size of the tensors. Experimental results on real-world data from the United States Historical Climate Network (USHCN) showed that MUSCAT outperformed other competing methods in more than 70% of the locations.