Author name cluster

Wei Xing

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

26 papers
2 author rows

Possible papers (26)

AAAI Conference 2026 Conference Paper

Inpaint-Anywhere: Zero-Shot Multi-Identity Inpainting with Efficient Diffusion Transformer

  • Junsheng Luan
  • Lei Zhao
  • Wei Xing

Subject-driven generation, which aims to synthesize visual content for a given identity V* with specific attributes, has garnered increasing attention in recent years. While existing methods demonstrate impressive identity consistency for both single and multiple identities, they often lack user-specified spatial control. Recent approaches, such as OminiControl-2 and EasyControl, enable inpainting conditioned on a single identity but fall short in multi-identity scenarios. In this paper, we introduce BoundID, a dataset synthesis pipeline for generating multi-identity images with bounding box annotations, and Inpaint-Anywhere, a diffusion transformer framework for multi-identity inpainting. Given multiple identity references and corresponding masks, our method simultaneously generates all desired identities at precise locations while achieving both high identity and prompt fidelity. Extensive experiments show that Inpaint-Anywhere achieves state-of-the-art performance in multi-identity inpainting.

NeurIPS Conference 2025 Conference Paper

CAMO: Convergence-Aware Multi-Fidelity Bayesian Optimization

  • Wei Xing
  • Zhenjie Lu
  • Akeel Shah

Existing Multi-fidelity Bayesian Optimization (MFBO) methods ignore the convergence behavior of the multi-fidelity surrogate as the fidelity increases, leading to inefficient exploration and suboptimal performance. We introduce CAMO (Convergence-Aware Multi-fidelity Optimization), a principled framework based on Linear Fidelity Differential Equations (LFiDEs) that explicitly encodes convergence of fidelity-indexed outputs and employs a closed-form nonstationary kernel. We rigorously prove existence and pointwise/uniform convergence to the high-fidelity surrogate under mild restrictions, and provide new convergence results for general FiDEs using smooth, non-smooth and even non-convex Lyapunov functions, establishing a bridge between MFBO and the theory of subgradient flows in non-smooth optimisation. Combined with a fidelity-aware acquisition function, CAMO outperforms state-of-the-art MFBO methods on a majority of synthetic and real-world benchmarks, with up to a four-fold improvement in optimisation performance and a dramatic speed-up in convergence. CAMO offers a tractable and theoretically grounded approach to convergence-aware MFBO.
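
A minimal sketch of the convergence-encoding idea, assuming a scalar linear fidelity ODE with a stable fixed point; the paper's actual LFiDE form is not given in this abstract:

```latex
% Hedged illustration: a fidelity-indexed output f(t) that provably
% converges to a high-fidelity limit f_inf as the fidelity t grows.
\[
\frac{\mathrm{d}f(t)}{\mathrm{d}t} = -\lambda\,\bigl(f(t) - f_{\infty}\bigr),
\qquad \lambda > 0,
\]
\[
f(t) = f_{\infty} + \bigl(f(0) - f_{\infty}\bigr)e^{-\lambda t},
\qquad \lim_{t \to \infty} f(t) = f_{\infty}.
\]
% Convergence follows from the Lyapunov function V(f) = (f - f_inf)^2,
% whose derivative along trajectories is dV/dt = -2*lambda*V <= 0.
```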

AAAI Conference 2025 Conference Paper

Cascaded Diffusion Models for Virtual Try-On: Improving Control and Resolution

  • Guangyuan Li
  • Yongkang Wang
  • Junsheng Luan
  • Lei Zhao
  • Wei Xing
  • Huaizhong Lin
  • Binkai Ou

Previous virtual try-on methods have employed ControlNet architecture in exemplar-based inpainting diffusion models to guide the generation of try-on images, preserving the garment's features and enhancing the realism of the generated images. While these methods have maintained the identity of the garment and improved the naturalness of the generated images, they still face the following limitations: (1) For garments with complex features, such as intricate text, patterns, and uncommon styles, they struggle to retain these detailed features in the generated try-on images. (2) They are limited to generating try-on images at a maximum resolution of 1K, which may not meet the demands of real-world scenarios, where higher resolutions might be required. To address the aforementioned issues, in this paper, we propose a Cascaded Diffusion Model for virtual try-on to enhance both image controllability and resolution. We call it CDM-VTON. Specifically, we design two diffusion models: the Multi-Conditioned Diffusion Model (MC-DM) and the Super-Resolution Diffusion Model (SR-DM). The former generates low-resolution try-on images while preserving the garment's complex features, and the latter enhances the resolution of these images. Additionally, we incorporate a multi-control integration module in the MC-DM, which injects multiple control conditions into a frozen denoising U-Net to ensure that the generated try-on images retain complex garment features. Our experimental results demonstrate that our method outperforms previous approaches in preserving garment details and generating authentic virtual try-on images, both qualitatively and quantitatively.

AAAI Conference 2024 Conference Paper

ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank

  • Zhanjie Zhang
  • Quanwei Zhang
  • Wei Xing
  • Guangyuan Li
  • Lei Zhao
  • Jiakai Sun
  • Zehua Lan
  • Junsheng Luan

Artistic style transfer aims to repaint the content image with the learned artistic style. Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches. Small model-based approaches can preserve the content structure, but fail to produce highly realistic stylized images and introduce artifacts and disharmonious patterns; pre-trained large-scale model-based approaches can generate highly realistic stylized images but struggle with preserving the content structure. To address the above issues, we propose ArtBank, a novel artistic style transfer framework, to generate highly realistic stylized images while preserving the content structure of the content images. Specifically, to fully exploit the knowledge embedded in pre-trained large-scale models, an Implicit Style Prompt Bank (ISPB), a set of trainable parameter matrices, is designed to learn and store knowledge from the collection of artworks and serve as a visual prompt that guides pre-trained large-scale models to generate highly realistic stylized images while preserving content structure. Besides, to accelerate training of the above ISPB, we propose a novel Spatial-Statistical-based self-Attention Module (SSAM). Qualitative and quantitative experiments demonstrate the superiority of our proposed method over state-of-the-art artistic style transfer methods. Code is available at https://github.com/Jamie-Cheung/ArtBank.
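
A minimal sketch of what an implicit style prompt bank could look like, assuming one trainable matrix per style prepended to frozen text-encoder embeddings; the module name, shapes, and injection point are illustrative assumptions, not the paper's code:

```python
# Hedged sketch: one learnable prompt matrix per artistic style,
# prepended to the (frozen) text embeddings that condition a
# pre-trained diffusion model. Only the bank is trained.
import torch
import torch.nn as nn

class ImplicitStylePromptBank(nn.Module):
    def __init__(self, num_styles: int, prompt_len: int = 8, dim: int = 768):
        super().__init__()
        # One (prompt_len x dim) trainable matrix per style.
        self.bank = nn.ParameterList(
            [nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
             for _ in range(num_styles)]
        )

    def forward(self, text_emb: torch.Tensor, style_id: int) -> torch.Tensor:
        # text_emb: (batch, seq_len, dim) from a frozen text encoder.
        prompt = self.bank[style_id].unsqueeze(0).expand(text_emb.size(0), -1, -1)
        # Prepend the style prompt so cross-attention can attend to it.
        return torch.cat([prompt, text_emb], dim=1)

bank = ImplicitStylePromptBank(num_styles=10)
conditioned = bank(torch.randn(2, 77, 768), style_id=3)
print(conditioned.shape)  # torch.Size([2, 85, 768])
```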

AAAI Conference 2024 Conference Paper

Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation

  • Tianyi Chu
  • Wei Xing
  • Jiafu Chen
  • Zhizhong Wang
  • Jiakai Sun
  • Lei Zhao
  • Haibo Chen
  • Huaizhong Lin

Existing generative adversarial network (GAN) based conditional image generative models typically produce fixed output for the same conditional input, which is unreasonable for highly subjective tasks, such as large-mask image inpainting or style transfer. On the other hand, GAN-based diverse image generative methods require retraining/fine-tuning the network or designing complex noise injection functions, which is computationally expensive, task-specific, or struggle to generate high-quality results. Given that many deterministic conditional image generative models have been able to produce high-quality yet fixed results, we raise an intriguing question: is it possible for pre-trained deterministic conditional image generative models to generate diverse results without changing network structures or parameters? To answer this question, we re-examine the conditional image generation tasks from the perspective of adversarial attack and propose a simple and efficient plug-in projected gradient descent (PGD) like method for diverse and controllable image generation. The key idea is attacking the pre-trained deterministic generative models by adding a micro perturbation to the input condition. In this way, diverse results can be generated without any adjustment of network structures or fine-tuning of the pre-trained models. In addition, we can also control the diverse results to be generated by specifying the attack direction according to a reference text or image. Our work opens the door to applying adversarial attack to low-level vision tasks, and experiments on various conditional image generation tasks demonstrate the effectiveness and superiority of the proposed method.
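
A minimal sketch of the PGD-like perturbation idea, assuming a differentiable frozen generator and an untargeted distance loss; names and hyperparameters are illustrative:

```python
# Hedged sketch: perturb the *condition* of a frozen deterministic
# generator to push its output away from the original result, yielding
# a different-but-valid sample without touching the model's weights.
import torch

def diverse_generate(generator, cond, steps=10, alpha=1e-3, eps=1e-2):
    generator.eval()
    with torch.no_grad():
        base_out = generator(cond)  # the fixed "deterministic" result
    # Random init inside the eps-ball; at delta = 0 the distance loss
    # sits at a minimum with zero gradient, so we must start off it.
    delta = torch.empty_like(cond).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        out = generator(cond + delta)
        # Ascend on the distance to the original output (untargeted);
        # a reference image/text loss could steer the direction instead.
        loss = torch.nn.functional.mse_loss(out, base_out)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()  # PGD step
            delta.clamp_(-eps, eps)             # keep the perturbation micro
            delta.grad.zero_()
    with torch.no_grad():
        return generator(cond + delta)
```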

AAAI Conference 2024 Conference Paper

PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping

  • Jiafu Chen
  • Wei Xing
  • Jiakai Sun
  • Tianyi Chu
  • Yiling Huang
  • Boyan Ji
  • Lei Zhao
  • Huaizhong Lin

3D scene stylization refers to transforming the appearance of a 3D scene to match a given style image, ensuring that images rendered from different viewpoints exhibit the same style as the given style image, while maintaining the 3D consistency of the stylized scene. Several existing methods have obtained impressive results in stylizing 3D scenes. However, the models proposed by these methods need to be re-trained when applied to a new scene. In other words, their models are coupled with a specific scene and cannot adapt to arbitrary other scenes. To address this issue, we propose a novel 3D scene stylization framework to transfer an arbitrary style to an arbitrary scene, without any style-related or scene-related re-training. Concretely, we first map the appearance of the 3D scene into a 2D style pattern space, which realizes complete disentanglement of the geometry and appearance of the 3D scene and makes our model generalize to arbitrary 3D scenes. Then we stylize the appearance of the 3D scene in the 2D style pattern space via a prompt-based 2D stylization algorithm. Experimental results demonstrate that our proposed framework is superior to SOTA methods in both visual quality and generalization.

IJCAI Conference 2024 Conference Paper

Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt

  • Zhanjie Zhang
  • Quanwei Zhang
  • Huaizhong Lin
  • Wei Xing
  • Juncheng Mo
  • Shuaicheng Huang
  • Jinheng Xie
  • Guangyuan Li

Artistic style transfer aims to transfer the learned artistic style onto an arbitrary content image, generating artistic stylized images. Existing generative adversarial network-based methods fail to generate highly realistic stylized images and always introduce obvious artifacts and disharmonious patterns. Recently, large-scale pre-trained diffusion models opened up a new way for generating highly realistic artistic stylized images. However, diffusion model-based methods generally fail to preserve the content structure of input content images well, introducing some undesired content structure and style patterns. To address the above problems, we propose a novel pre-trained diffusion-based artistic style transfer method, called LSAST, which can generate highly realistic artistic stylized images while preserving the content structure of input content images well, without bringing obvious artifacts and disharmonious style patterns. Specifically, we introduce a Step-aware and Layer-aware Prompt Space, a set of learnable prompts, which can learn the style information from the collection of artworks and dynamically adjust the input images' content structure and style pattern. To train our prompt space, we propose a novel inversion method, called Step-aware and Layer-aware Prompt Inversion, which allows the prompt space to learn the style information of the artworks collection. In addition, we inject a pre-trained conditional branch of ControlNet into our LSAST, which further improves our framework's ability to maintain content structure. Extensive experiments demonstrate that our proposed method can generate more highly realistic artistic stylized images than the state-of-the-art artistic style transfer methods. Code is available at https://github.com/Jamie-Cheung/LSAST.

NeurIPS Conference 2023 Conference Paper

ContinuAR: Continuous Autoregression For Infinite-Fidelity Fusion

  • Wei Xing
  • Yuxin Wang
  • Zheng Xing

Multi-fidelity fusion has become an important surrogate technique, which provides insights into expensive computer simulations and effectively improves decision-making, e.g., optimization, with less computational cost. Multi-fidelity fusion is much more computationally efficient compared to traditional single-fidelity surrogates. Despite the fast advancement of multi-fidelity fusion techniques, they lack a systematic framework to make use of the fidelity indicator, deal with high-dimensional and arbitrary data structures, and scale well to infinite-fidelity problems. In this work, we first generalize the popular autoregression (AR) to derive a novel linear fidelity differential equation (FiDE), paving the way to tractable infinite-fidelity fusion. We generalize FiDE to a high-dimensional system, which also provides a unifying framework to seamlessly bridge the gap between many multi- and single-fidelity GP-based models. We then propose ContinuAR, a rank-1 approximation solution to FiDEs, which is tractable to train, compatible with arbitrary multi-fidelity data structures, linearly scalable in the output dimension, and, most importantly, delivers consistent SOTA performance with a significant margin over baseline methods. Compared to the SOTA infinite-fidelity fusion method, IFC, ContinuAR achieves up to 4x improvement in accuracy and 62,500x speedup in training time.
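
For reference, the discrete autoregression being generalized (the classic Kennedy-O'Hagan form), followed by a hedged reading of a continuous-fidelity counterpart; the paper's exact FiDE is not given in this abstract:

```latex
% Classic discrete AR across fidelities t = 1, 2, ...:
\[
f_{t}(\mathbf{x}) = \rho\, f_{t-1}(\mathbf{x}) + \delta_{t}(\mathbf{x}),
\qquad \delta_{t} \sim \mathcal{GP}(0, k_{t}).
\]
% Hedged continuous-fidelity reading: letting the fidelity index become
% continuous turns the recursion into a linear ODE in t:
\[
\frac{\partial f(t, \mathbf{x})}{\partial t}
= a\, f(t, \mathbf{x}) + \delta(t, \mathbf{x}).
\]
```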

AAAI Conference 2023 Conference Paper

Generative Image Inpainting with Segmentation Confusion Adversarial Training and Contrastive Learning

  • Zhiwen Zuo
  • Lei Zhao
  • Ailin Li
  • Zhizhong Wang
  • Zhanjie Zhang
  • Jiafu Chen
  • Wei Xing
  • Dongming Lu

This paper presents a new adversarial training framework for image inpainting with segmentation confusion adversarial training (SCAT) and contrastive learning. SCAT plays an adversarial game between an inpainting generator and a segmentation network, which provides pixel-level local training signals and can adapt to images with free-form holes. By combining SCAT with standard global adversarial training, the new adversarial training framework exhibits the following three advantages simultaneously: (1) the global consistency of the repaired image, (2) the local fine texture details of the repaired image, and (3) the flexibility of handling images with free-form holes. Moreover, we propose the textural and semantic contrastive learning losses to stabilize and improve our inpainting model's training by exploiting the feature representation space of the discriminator, in which the inpainting images are pulled closer to the ground truth images but pushed farther from the corrupted images. The proposed contrastive losses better guide the repaired images to move from the corrupted image data points to the real image data points in the feature representation space, resulting in more realistic completed images. We conduct extensive experiments on two benchmark datasets, demonstrating our model's effectiveness and superiority both qualitatively and quantitatively.
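
A minimal sketch of the SCAT objectives, assuming a per-pixel segmentation network trained to localize generated pixels and a generator trained to confuse it; architectures and loss weights are assumptions, not the paper's code:

```python
# Hedged sketch of segmentation confusion adversarial training (SCAT):
# a segmentation net S labels which pixels were filled in by the
# inpainting generator G, while G tries to make S predict "real"
# everywhere, giving a pixel-level local training signal.
import torch
import torch.nn.functional as F

def scat_seg_loss(seg_net, real_img, inpainted_img, hole_mask):
    # hole_mask: (B, 1, H, W), 1 where pixels were generated, 0 where real.
    pred_real = seg_net(real_img)
    pred_fake = seg_net(inpainted_img.detach())
    # S should output 0 on real images and reproduce the hole mask on fakes,
    # which naturally adapts to free-form holes.
    return (F.binary_cross_entropy_with_logits(pred_real, torch.zeros_like(pred_real))
            + F.binary_cross_entropy_with_logits(pred_fake, hole_mask))

def scat_gen_loss(seg_net, inpainted_img):
    # G "confuses" S into labelling every pixel as real (0).
    pred = seg_net(inpainted_img)
    return F.binary_cross_entropy_with_logits(pred, torch.zeros_like(pred))
```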

AAAI Conference 2023 Conference Paper

MicroAST: Towards Super-fast Ultra-Resolution Arbitrary Style Transfer

  • Zhizhong Wang
  • Lei Zhao
  • Zhiwen Zuo
  • Ailin Li
  • Haibo Chen
  • Wei Xing
  • Dongming Lu

Arbitrary style transfer (AST) transfers arbitrary artistic styles onto content images. Despite the recent rapid progress, existing AST methods are either incapable or too slow to run at ultra-resolutions (e.g., 4K) with limited resources, which heavily hinders their further applications. In this paper, we tackle this dilemma by learning a straightforward and lightweight model, dubbed MicroAST. The key insight is to completely abandon the use of cumbersome pre-trained Deep Convolutional Neural Networks (e.g., VGG) at inference. Instead, we design two micro encoders (content and style encoders) and one micro decoder for style transfer. The content encoder aims at extracting the main structure of the content image. The style encoder, coupled with a modulator, encodes the style image into learnable dual-modulation signals that modulate both intermediate features and convolutional filters of the decoder, thus injecting more sophisticated and flexible style signals to guide the stylizations. In addition, to boost the ability of the style encoder to extract more distinct and representative style signals, we also introduce a new style signal contrastive loss in our model. Compared to the state of the art, our MicroAST not only produces visually superior results but also is 5-73 times smaller and 6-18 times faster, for the first time enabling super-fast (about 0.5 seconds) AST at 4K ultra-resolutions.
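
A minimal sketch of the dual-modulation idea, assuming the style code scales both per-channel features and per-filter convolution weights; the shapes and exact modulation form are illustrative assumptions:

```python
# Hedged sketch: a decoder conv layer whose incoming features and
# whose own filters are both modulated by a style code (one style
# per forward pass, for simplicity).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualModConv(nn.Module):
    def __init__(self, in_ch, out_ch, style_dim=128):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.02)
        self.to_feat_mod = nn.Linear(style_dim, in_ch)   # feature-wise scale
        self.to_filt_mod = nn.Linear(style_dim, out_ch)  # filter-wise scale

    def forward(self, x, style_code):
        # Modulate incoming features per channel...
        feat_scale = self.to_feat_mod(style_code).view(1, -1, 1, 1)
        x = x * (1 + feat_scale)
        # ...and modulate the convolution filters per output channel.
        filt_scale = self.to_filt_mod(style_code).view(-1, 1, 1, 1)
        w = self.weight * (1 + filt_scale)
        return F.conv2d(x, w, padding=1)

layer = DualModConv(64, 64)
y = layer(torch.randn(1, 64, 32, 32), torch.randn(128))
```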

IJCAI Conference 2023 Conference Paper

TeSTNeRF: Text-Driven 3D Style Transfer via Cross-Modal Learning

  • Jiafu Chen
  • Boyan Ji
  • Zhanjie Zhang
  • Tianyi Chu
  • Zhiwen Zuo
  • Lei Zhao
  • Wei Xing
  • Dongming Lu

Text-driven 3D style transfer aims at stylizing a scene according to the text and generating arbitrary novel views with consistency. Simply combining image/video style transfer methods and novel view synthesis methods results in flickering when changing viewpoints, while existing 3D style transfer methods learn styles from images instead of texts. To address this problem, we for the first time design an efficient text-driven model for 3D style transfer, named TeSTNeRF, to stylize the scene using texts via cross-modal learning: we leverage an advanced text encoder to embed the texts in order to control 3D style transfer and align the input text and output stylized images in latent space. Furthermore, to obtain better visual results, we introduce style supervision, learning feature statistics from style images and utilizing 2D stylization results to rectify abrupt color spill. Extensive experiments demonstrate that TeSTNeRF significantly outperforms existing methods and provides a new way to guide 3D style transfer.

IJCAI Conference 2023 Conference Paper

VGOS: Voxel Grid Optimization for View Synthesis from Sparse Inputs

  • Jiakai Sun
  • Zhanjie Zhang
  • Jiafu Chen
  • Guangyuan Li
  • Boyan Ji
  • Lei Zhao
  • Wei Xing

Neural Radiance Fields (NeRF) has shown great success in novel view synthesis due to its state-of-the-art quality and flexibility. However, NeRF requires dense input views (tens to hundreds) and a long training time (hours to days) for a single scene to generate high-fidelity images. Although using voxel grids to represent the radiance field can significantly accelerate the optimization process, we observe that for sparse inputs, the voxel grids are more prone to overfitting to the training views and will have holes and floaters, which leads to artifacts. In this paper, we propose VGOS, an approach for fast (3-5 minutes) radiance field reconstruction from sparse inputs (3-10 views) to address these issues. To improve the performance of voxel-based radiance fields in sparse input scenarios, we propose two methods: (a) We introduce an incremental voxel training strategy, which prevents overfitting by suppressing the optimization of peripheral voxels in the early stage of reconstruction. (b) We use several regularization techniques to smooth the voxels, which avoids degenerate solutions. Experiments demonstrate that VGOS achieves state-of-the-art performance for sparse inputs with super-fast convergence. Code will be available at https://github.com/SJoJoK/VGOS.
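
A minimal sketch of the incremental voxel training strategy, assuming a trainable central region that grows over iterations; the actual schedule in VGOS may differ:

```python
# Hedged sketch: early in training, zero out gradients for voxels
# outside a central region, then grow the trainable region, so
# peripheral voxels (where floaters tend to appear) stay frozen at first.
import torch

def incremental_voxel_mask(grid_shape, step, total_steps, device="cpu"):
    # Trainable fraction of each axis grows linearly from 50% to 100%
    # over the first 30% of training (an assumed schedule).
    frac = min(1.0, 0.5 + 0.5 * step / max(1, int(0.3 * total_steps)))
    mask = torch.zeros(grid_shape, device=device)
    slices = []
    for s in grid_shape:
        keep = max(1, int(s * frac))
        start = (s - keep) // 2
        slices.append(slice(start, start + keep))
    mask[tuple(slices)] = 1.0
    return mask

# Usage: multiply the voxel grid's gradient by the mask before the
# optimizer step.
voxels = torch.randn(64, 64, 64, requires_grad=True)
mask = incremental_voxel_mask(voxels.shape, step=100, total_steps=10000)
# ... after loss.backward():
# voxels.grad *= mask
```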

IJCAI Conference 2022 Conference Paper

DivSwapper: Towards Diversified Patch-based Arbitrary Style Transfer

  • Zhizhong Wang
  • Lei Zhao
  • Haibo Chen
  • Zhiwen Zuo
  • Ailin Li
  • Wei Xing
  • Dongming Lu

Gram-based and patch-based approaches are two important research lines of style transfer. Recent diversified Gram-based methods have been able to produce multiple and diverse stylized outputs for the same content and style images. However, as another widespread research interest, the diversity of patch-based methods remains challenging due to the stereotyped style swapping process based on nearest patch matching. To resolve this dilemma, in this paper, we dive into the crux of existing patch-based methods and propose a universal and efficient module, termed DivSwapper, for diversified patch-based arbitrary style transfer. The key insight is that neural patches with higher activation values contribute more to diversity. Our DivSwapper is plug-and-play and can be easily integrated into existing patch-based and Gram-based methods to generate diverse results for arbitrary styles. We conduct theoretical analyses and extensive experiments to demonstrate the effectiveness of our method, and compared with state-of-the-art algorithms, it shows superiority in diversity, quality, and efficiency.
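
A minimal sketch of diversified patch swapping, assuming classic style-swap matching perturbed by noise weighted by patch activation norms; this is an illustrative reading of the "higher activation contributes more to diversity" intuition, not the paper's DivSwapper module:

```python
# Hedged sketch: match each content-feature patch to a style-feature
# patch by normalized cross-correlation, perturb the matching scores
# with activation-weighted noise, then reconstruct from chosen patches.
import torch
import torch.nn.functional as F

def diversified_style_swap(content_feat, style_feat, patch=3, noise_scale=0.1):
    # content_feat, style_feat: (1, C, H, W) deep features, same C.
    c, pad = content_feat.size(1), patch // 2
    style_patches = F.unfold(style_feat, patch, padding=pad)   # (1, C*p*p, L)
    style_patches = style_patches.squeeze(0).t().contiguous()  # (L, C*p*p)
    norms = style_patches.norm(dim=1, keepdim=True)
    filters = (style_patches / norms.clamp(min=1e-8)).view(-1, c, patch, patch)
    scores = F.conv2d(content_feat, filters, padding=pad)      # (1, L, H, W)
    # Patches with larger activation norms receive more score noise.
    scores = scores + noise_scale * torch.rand_like(scores) * norms.view(1, -1, 1, 1)
    idx = scores.argmax(dim=1, keepdim=True)                   # best patch per location
    one_hot = torch.zeros_like(scores).scatter_(1, idx, 1.0)
    # Paste the selected (unnormalized) patches back, averaging overlaps.
    recon = F.conv_transpose2d(one_hot, style_patches.view(-1, c, patch, patch),
                               padding=pad)
    overlap = F.conv_transpose2d(one_hot, torch.ones_like(filters), padding=pad)
    return recon / overlap.clamp(min=1e-8)

out = diversified_style_swap(torch.randn(1, 64, 16, 16), torch.randn(1, 64, 16, 16))
```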

NeurIPS Conference 2022 Conference Paper

GAR: Generalized Autoregression for Multi-Fidelity Fusion

  • Yuxin Wang
  • Zheng Xing
  • Wei Xing

In many scientific research and engineering applications, where repeated simulations of complex systems are conducted, a surrogate is commonly adopted to quickly estimate the whole system. To reduce the expensive cost of generating training examples, it has become a promising approach to combine the results of low-fidelity (fast but inaccurate) and high-fidelity (slow but accurate) simulations. Despite the fast development of multi-fidelity fusion techniques, most existing methods require particular data structures and do not scale well to high-dimensional output. To resolve these issues, we generalize the classic autoregression (AR), which is widely used due to its simplicity, robustness, accuracy, and tractability, and propose generalized autoregression (GAR) using tensor formulation and latent features. GAR can deal with arbitrary dimensional outputs and arbitrary multi-fidelity data structures to satisfy the demand of multi-fidelity fusion for complex problems; it admits a fully tractable likelihood and posterior, requiring no approximate inference, and scales well to high-dimensional problems. Furthermore, we prove the autokrigeability theorem based on GAR in the multi-fidelity case and develop CIGAR, a simplified GAR with the same predictive mean accuracy but significantly less computation. In experiments on canonical PDEs and scientific computational examples, the proposed method consistently outperforms the SOTA methods by a large margin (up to 6x improvement in RMSE) with only a few high-fidelity training samples.
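
The Kronecker identities that typically make tensor-formulated GP likelihoods tractable for high-dimensional outputs (general facts; their exact role in GAR is a hedged reading of this abstract):

```latex
\[
(A \otimes B)\operatorname{vec}(X) = \operatorname{vec}\bigl(B X A^{\top}\bigr),
\qquad
(A \otimes B)^{-1} = A^{-1} \otimes B^{-1},
\]
% so a structured covariance such as K_input (x) K_output over N*D
% outputs never needs to be formed or inverted explicitly.
```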

IJCAI Conference 2022 Conference Paper

Style Fader Generative Adversarial Networks for Style Degree Controllable Artistic Style Transfer

  • Zhiwen Zuo
  • Lei Zhao
  • Shuobin Lian
  • Haibo Chen
  • Zhizhong Wang
  • Ailin Li
  • Wei Xing
  • Dongming Lu

Artistic style transfer is the task of synthesizing content images with learned artistic styles. Recent studies have shown the potential of Generative Adversarial Networks (GANs) for producing artistically rich stylizations. Despite the promising results, they usually fail to control the generated images' style degree, which is inflexible and limits their applicability for practical use. To address the issue, in this paper, we propose a novel method that for the first time allows adjusting the style degree for existing GAN-based artistic style transfer frameworks in real time after training. Our method introduces two novel modules into existing GAN-based artistic style transfer frameworks: a Style Scaling Injection (SSI) module and a Style Degree Interpretation (SDI) module. The SSI module accepts the value of Style Degree Factor (SDF) as the input and outputs parameters that scale the feature activations in existing models, offering control signals to alter the style degrees of the stylizations. The SDI module interprets the output probabilities of a multi-scale content-style binary classifier as the style degrees, providing a mechanism to parameterize the style degree of the stylizations. Moreover, we show that after training our method can enable existing GAN-based frameworks to produce over-stylizations. The proposed method can facilitate many existing GAN-based artistic style transfer frameworks with marginal extra training overheads and modifications. Extensive qualitative evaluations on two typical GAN-based style transfer models demonstrate the effectiveness of the proposed method in providing style degree control.
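
A minimal sketch of a Style Scaling Injection module, assuming a scalar style degree factor mapped by a small MLP to per-channel activation scales; everything here is an illustrative assumption:

```python
# Hedged sketch: the SDF scalar is turned into per-channel scales that
# multiply feature activations of a trained stylization network.
import torch
import torch.nn as nn

class StyleScalingInjection(nn.Module):
    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, channels)
        )

    def forward(self, feat: torch.Tensor, sdf: float) -> torch.Tensor:
        # sdf near 0 keeps features content-like, 1 gives full style,
        # and values above 1 would over-stylize.
        s = torch.tensor([[sdf]], dtype=feat.dtype, device=feat.device)
        scale = self.mlp(s).view(1, -1, 1, 1)
        return feat * scale

ssi = StyleScalingInjection(256)
out = ssi(torch.randn(1, 256, 64, 64), sdf=0.7)
```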

AAAI Conference 2022 Conference Paper

Texture Reformer: Towards Fast and Universal Interactive Texture Transfer

  • Zhizhong Wang
  • Lei Zhao
  • Haibo Chen
  • Ailin Li
  • Zhiwen Zuo
  • Wei Xing
  • Dongming Lu

In this paper, we present the texture reformer, a fast and universal neural-based framework for interactive texture transfer with user-specified guidance. The challenges lie in three aspects: 1) the diversity of tasks, 2) the simplicity of guidance maps, and 3) the execution efficiency. To address these challenges, our key idea is to use a novel feed-forward multi-view and multi-stage synthesis procedure consisting of I) a global view structure alignment stage, II) a local view texture refinement stage, and III) a holistic effect enhancement stage to synthesize high-quality results with coherent structures and fine texture details in a coarse-to-fine fashion. In addition, we also introduce a novel learning-free view-specific texture reformation (VSTR) operation with a new semantic map guidance strategy to achieve more accurate semantic-guided and structure-preserved texture transfer. The experimental results on a variety of application scenarios demonstrate the effectiveness and superiority of our framework. And compared with the state-of-the-art interactive texture transfer algorithms, it not only achieves higher quality results but, more remarkably, also is 2-5 orders of magnitude faster.

NeurIPS Conference 2021 Conference Paper

Artistic Style Transfer with Internal-external Learning and Contrastive Learning

  • Haibo Chen
  • Lei Zhao
  • Zhizhong Wang
  • Huiming Zhang
  • Zhiwen Zuo
  • Ailin Li
  • Wei Xing
  • Dongming Lu

Although existing artistic style transfer methods have achieved significant improvement with deep neural networks, they still suffer from artifacts such as disharmonious colors and repetitive patterns. Motivated by this, we propose an internal-external style transfer method with two contrastive losses. Specifically, we utilize internal statistics of a single style image to determine the colors and texture patterns of the stylized image, and in the meantime, we leverage the external information of the large-scale style dataset to learn the human-aware style information, which makes the color distributions and texture patterns in the stylized image more reasonable and harmonious. In addition, we argue that existing style transfer methods only consider the content-to-stylization and style-to-stylization relations, neglecting the stylization-to-stylization relations. To address this issue, we introduce two contrastive losses, which pull the multiple stylization embeddings closer to each other when they share the same content or style, but push them farther apart otherwise. We conduct extensive experiments, showing that our proposed method can not only produce visually more harmonious and satisfying artistic images, but also promote the stability and consistency of rendered video clips.
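
A minimal sketch of a stylization-to-stylization contrastive loss in InfoNCE form, assuming embeddings of stylizations that share a style are the positives; the paper's exact loss and embedding network are not specified in this abstract:

```python
# Hedged sketch: stylization embeddings sharing the same style id are
# pulled together, all other stylizations pushed apart.
import torch
import torch.nn.functional as F

def style_contrastive_loss(emb: torch.Tensor, style_ids: torch.Tensor, tau=0.1):
    # emb: (N, D) embeddings of N stylized images; style_ids: (N,) labels.
    emb = F.normalize(emb, dim=1)
    sim = emb @ emb.t() / tau                                  # (N, N)
    n = emb.size(0)
    mask_self = torch.eye(n, dtype=torch.bool, device=emb.device)
    pos = (style_ids.unsqueeze(0) == style_ids.unsqueeze(1)) & ~mask_self
    sim = sim.masked_fill(mask_self, float("-inf"))            # drop self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    # Negative mean log-probability of the positive pairs.
    return -(log_prob[pos].sum() / pos.sum().clamp(min=1))

loss = style_contrastive_loss(torch.randn(8, 128),
                              torch.tensor([0, 0, 1, 1, 2, 2, 3, 3]))
```

An analogous term with content ids as the grouping would give the content-sharing counterpart.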

JBHI Journal 2021 Journal Article

Two-Way MR-Forest Based Growing Path Classification for Malignancy Estimation of Pulmonary Nodules

  • Hongbo Zhu
  • Guangjie Han
  • Chuan Lin
  • Min Wang
  • Mohsen Guizani
  • Jianxia Hou
  • Wei Xing

This paper proposes a two-way multi-ringed forest (TMR-Forest) to estimate the malignancy of pulmonary nodules for false positive reduction (FPR). Based on our previous work on a deep decision framework, named MR-Forest, we generate a growing path mode on a predefined pseudo-timeline of $L$ time slots to build pseudo-spatiotemporal features. It synchronously works with FPR based on MR-Forest to help predict the labels from a dynamic perspective. Concretely, Mask R-CNN is first used to recommend the bounding boxes of ROIs and classify their pathological features. Afterward, hierarchical attribute matching is introduced to obtain the input ROIs' attribute layouts and select the candidates for their growing path generation. The selected ROIs can replace the fixed-sized ROIs' fitting results at different time slots for data augmentation. A two-stage counterfactual path elimination is used to screen out the input paths of the cascade forest. Finally, a simple label selection strategy is executed to output the predicted label indicating the input nodule's malignancy. On 1034 scans of the merged dataset, the framework reports more accurate malignancy labels, achieving a better CPM score of 0.912, which exceeds those of MR-Forest and 3DDCNNs by about 2.8% and 4.7%, respectively.

UAI Conference 2020 Conference Paper

Adversarial Learning for 3D Matching

  • Wei Xing
  • Brian D. Ziebart

Structured prediction of objects in spaces that are inherently difficult to search or compactly characterize is a particularly challenging task. For example, though bipartite matchings in two dimensions can be tractably optimized and learned, the higher-dimensional generalization—3D matchings—are NP-hard to optimally obtain and the set of potential solutions cannot be compactly characterized. Though approximation is therefore necessary, prevalent structured prediction methods inherit the weaknesses they possess in the two-dimensional setting either suffering from inconsistency or intractability—even when the approximations are sufficient. In this paper, we explore extending an adversarial approach to learning bipartite matchings that avoids these weaknesses to the three dimensional setting. We assess the benefits compared to margin-based methods on a three-frame tracking problem.

AAAI Conference 2020 Conference Paper

Infinite ShapeOdds: Nonparametric Bayesian Models for Shape Representations

  • Wei Xing
  • Shireen Elhabian
  • Robert Kirby
  • Ross T. Whitaker
  • Shandian Zhe

Learning compact representations for shapes (binary images) is important for many applications. Although neural network models are very powerful, they usually involve many parameters, require substantial tuning efforts and easily overfit small datasets, which are common in shape-related applications. The state-of-the-art approach, ShapeOdds, as a latent Gaussian model, can effectively prevent overfitting and is more robust. Nonetheless, it relies on a linear projection assumption and is incapable of capturing intrinsic nonlinear shape variations, and hence may lead to inferior representations and structure discovery. To address these issues, we propose Infinite ShapeOdds (InfShapeOdds), a Bayesian nonparametric shape model, which is flexible enough to capture complex shape variations and discover hidden cluster structures, while still avoiding overfitting. Specifically, we use matrix Gaussian priors, nonlinear feature mappings and the kernel trick to generalize ShapeOdds to a shape-variate Gaussian process model, which can grasp various nonlinear correlations among the pixels within and across (different) shapes. To further discover the hidden structures in data, we place a Dirichlet process mixture (DPM) prior over the representations to jointly infer the cluster number and memberships. Finally, we exploit the Kronecker-product structure in our model to develop an efficient, truncated variational expectation-maximization algorithm for model estimation. On synthetic and real-world data, we show the advantage of our method in both representation learning and latent structure discovery.

NeurIPS Conference 2020 Conference Paper

Multi-Fidelity Bayesian Optimization via Deep Neural Networks

  • Shibo Li
  • Wei Xing
  • Robert Kirby
  • Shandian Zhe

Bayesian optimization (BO) is a popular framework for optimizing black-box functions. In many applications, the objective function can be evaluated at multiple fidelities to enable a trade-off between the cost and accuracy. To reduce the optimization cost, many multi-fidelity BO methods have been proposed. Despite their success, these methods either ignore or over-simplify the strong, complex correlations across the fidelities. While the acquisition function is therefore easy and convenient to calculate, these methods can be inefficient in estimating the objective function. To address this issue, we propose Deep Neural Network Multi-Fidelity Bayesian Optimization (DNN-MFBO) that can flexibly capture all kinds of complicated relationships between the fidelities to improve the objective function estimation and hence the optimization performance. We use sequential, fidelity-wise Gauss-Hermite quadrature and moment-matching to compute a mutual information-based acquisition function in a tractable and highly efficient way. We show the advantages of our method in both synthetic benchmark datasets and real-world applications in engineering design.
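
A minimal sketch of the Gauss-Hermite building block, computing the expectation of a nonlinear function of a Gaussian by quadrature (generic numerics, not the paper's full fidelity-wise moment-matching pipeline):

```python
# Hedged sketch: E[g(x)] for x ~ N(mu, sigma^2) via Gauss-Hermite
# quadrature, the kind of step used to propagate a Gaussian through
# one fidelity's nonlinear output to the next.
import numpy as np

def gauss_hermite_expectation(g, mu, sigma, n=20):
    # Change of variables x = mu + sqrt(2) * sigma * t turns the Gaussian
    # expectation into the standard Gauss-Hermite form (weight e^{-t^2}).
    t, w = np.polynomial.hermite.hermgauss(n)
    return (w * g(mu + np.sqrt(2.0) * sigma * t)).sum() / np.sqrt(np.pi)

# Example: E[x^2] for N(1, 0.5^2) should be mu^2 + sigma^2 = 1.25.
print(gauss_hermite_expectation(lambda x: x**2, mu=1.0, sigma=0.5))
```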

IJCAI Conference 2020 Conference Paper

Scalable Gaussian Process Regression Networks

  • Shibo Li
  • Wei Xing
  • Robert M. Kirby
  • Shandian Zhe

Gaussian process regression networks (GPRN) are powerful Bayesian models for multi-output regression, but their inference is intractable. To address this issue, existing methods use a fully factorized structure (or a mixture of such structures) over all the outputs and latent functions for posterior approximation, which, however, can miss the strong posterior dependencies among the latent variables and hurt the inference quality. In addition, the updates of the variational parameters are inefficient and can be prohibitively expensive for a large number of outputs. To overcome these limitations, we propose a scalable variational inference algorithm for GPRN, which not only captures the abundant posterior dependencies but also is much more efficient for massive outputs. We tensorize the output space and introduce tensor/matrix-normal variational posteriors to capture the posterior correlations and to reduce the parameters. We jointly optimize all the parameters and exploit the inherent Kronecker product structure in the variational model evidence lower bound to accelerate the computation. We demonstrate the advantages of our method in several real-world applications.

AAAI Conference 2018 Conference Paper

ARC: Adversarial Robust Cuts for Semi-Supervised and Multi-Label Classification

  • Sima Behpour
  • Wei Xing
  • Brian Ziebart

Many structured prediction tasks arising in computer vision and natural language processing tractably reduce to making minimum cost cuts in graphs with edge weights learned using maximum margin methods. Unfortunately, the hinge loss used to construct these methods often provides a particularly loose bound on the loss function of interest (e.g., the Hamming loss). We develop Adversarial Robust Cuts (ARC), an approach that poses the learning task as a minimax game between predictor and “label approximator” based on minimum cost graph cuts. Unlike maximum margin methods, this game-theoretic perspective always provides meaningful bounds on the Hamming loss. We conduct multi-label and semi-supervised binary prediction experiments that demonstrate the benefits of our approach.

UAI Conference 2015 Conference Paper

Adversarial Cost-Sensitive Classification

  • Kaiser Asif
  • Wei Xing
  • Sima Behpour
  • Brian D. Ziebart

In many classification settings, mistakes incur different application-dependent penalties based on the predicted and actual class labels. Cost-sensitive classifiers minimizing these penalties are needed. We propose a robust minimax approach for producing classifiers that directly minimize the cost of mistakes as a convex optimization problem. This is in contrast to previous methods that minimize the empirical risk using a convex surrogate for the cost of mistakes, since minimizing the empirical risk of the actual cost-sensitive loss is generally intractable. By treating properties of the training data as uncertain, our approach avoids these computational difficulties. We develop theory and algorithms for our approach and demonstrate its benefits on cost-sensitive classification tasks.

NeurIPS Conference 2015 Conference Paper

Adversarial Prediction Games for Multivariate Losses

  • Hong Wang
  • Wei Xing
  • Kaiser Asif
  • Brian Ziebart

Multivariate loss functions are used to assess performance in many modern prediction tasks, including information retrieval and ranking applications. Convex approximations are typically optimized in their place to avoid NP-hard empirical risk minimization problems. We propose to approximate the training data instead of the loss function by posing multivariate prediction as an adversarial game between a loss-minimizing prediction player and a loss-maximizing evaluation player constrained to match specified properties of training data. This avoids the non-convexity of empirical risk minimization, but game sizes are exponential in the number of predicted variables. We overcome this intractability using the double oracle constraint generation method. We demonstrate the efficiency and predictive performance of our approach on tasks evaluated using the precision at k, the F-score and the discounted cumulative gain.
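
A minimal sketch of the double oracle method on a small zero-sum matrix game, assuming best responses are enumerable; the paper applies this constraint generation to games that are exponentially large, so this shows only the core loop:

```python
# Hedged sketch: maintain small restricted strategy sets, solve the
# restricted zero-sum game by LP, and add each player's best response
# against the opponent's current mixture until neither improves.
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    # Row player maximizes min_j p^T A[:, j]; LP variables are (p, v).
    m, n = A.shape
    c = np.concatenate([np.zeros(m), [-1.0]])       # minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])       # v - p^T A[:, j] <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(m), [0.0]])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, 1)] * m + [(None, None)])
    return res.x[:m], res.x[m]

def double_oracle(A, tol=1e-8):
    rows, cols = [0], [0]                           # restricted supports
    while True:
        sub = A[np.ix_(rows, cols)]
        p, v = solve_zero_sum(sub)
        q, _ = solve_zero_sum(-sub.T)               # column player's mixture
        p_full = np.zeros(A.shape[0]); p_full[rows] = p
        q_full = np.zeros(A.shape[1]); q_full[cols] = q
        br_row = int(np.argmax(A @ q_full))         # best responses in the full game
        br_col = int(np.argmin(p_full @ A))
        grew = False
        if br_row not in rows and (A @ q_full)[br_row] > v + tol:
            rows.append(br_row); grew = True
        if br_col not in cols and (p_full @ A)[br_col] < v - tol:
            cols.append(br_col); grew = True
        if not grew:
            return p_full, q_full, v

A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])  # rock-paper-scissors
p, q, v = double_oracle(A)  # converges to the uniform equilibrium, v = 0
```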