Author name cluster

Haibo Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers

1 author row

AAAI Conference 2026 Conference Paper

RMLer: Synthesizing Novel Objects Across Diverse Categories via Reinforcement Mixing Learning

Jun Li
Zikun Chen
Haibo Chen
Shuo Chen
Jian Yang

Novel object synthesis by integrating distinct textual concepts from diverse categories remains a significant challenge in text-to-image generation. Existing methods often suffer from insufficient concept mixing, lack of rigorous evaluation, and suboptimal outputs, resulting in conceptual imbalance, superficial combinations, or mere juxtapositions. To address these limitations, we propose Reinforcement Mixing Learning (RMLer), a framework that formulates cross-category concept fusion as a reinforcement learning problem: mixed features serve as states, mixing strategies as actions, and visual outcomes as rewards. Specifically, we design an MLP policy network to predict dynamic coefficients for blending cross-category text embeddings. We further introduce visual rewards based on (1) semantic similarity and (2) compositional balance between the fused object and its constituent concepts, and optimize the policy via proximal policy optimization. At inference time, a selection strategy leverages these rewards to curate the highest-quality fused objects. Extensive experiments demonstrate that RMLer synthesizes coherent, high-fidelity objects from diverse categories and consistently outperforms existing methods. Our work provides a robust framework for generating novel visual concepts, with promising applications in film, gaming, and design.

PDF Details DOI

NeurIPS Conference 2025 Conference Paper

Out-of-Distribution Generalized Graph Anomaly Detection with Homophily-aware Environment Mixup

Sibo Tian
Xin Wang
Zeyang Zhang
Haibo Chen
Wenwu Zhu

Graph anomaly detection (GAD) is widely prevalent in scenarios such as financial fraud detection, anti-money laundering, and social bot detection. However, structural distribution shifts are commonly observed in real-world GAD data due to selection bias, resulting in reduced homophily. Existing GAD methods tend to rely on homophilic shortcuts when trained on high-homophily structures, limiting their ability to generalize well to data with low homophily under structural distribution shifts. In this study, we propose to handle structural distribution shifts by generating novel environments characterized by diverse homophilic structures and utilizing invariant patterns, i. e. , features and structures with the capability of stable prediction across structural distribution shifts, which face two challenges: (1) How to discover invariant patterns from entangled features and structures, as structures are sensitive to varying homophilic distributions. (2) How to systematically construct new environments with diverse homophilic structures. To address these challenges, we propose the Ego-Neighborhood Disentangled Encoder with Homophily-aware Environment Mixup (HEM), which effectively handles structural distribution shifts in GAD by discovering invariant patterns. Specifically, we first propose an ego-neighborhood disentangled encoder to decouple the learning of feature embeddings and structural embeddings, which facilitates subsequent improvements in the invariance of structural embeddings for prediction. Next, we introduce a homophily-aware environment mixup that dynamically adjusts edge weights through adversarial learning, effectively generating environments with diverse structural distributions. Finally, we iteratively train the classifier and environment mixup via adversarial training, simultaneously improving the diversity of constructed environments and discovering invariant patterns under structural distribution shifts. Extensive experiments on real-world datasets demonstrate that our method outperforms existing baselines and achieves state-of-the-art performance under structural distribution shift conditions.

PDF Details

AAAI Conference 2024 Conference Paper

Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation

Tianyi Chu
Wei Xing
Jiafu Chen
Zhizhong Wang
Jiakai Sun
Lei Zhao
Haibo Chen
Huaizhong Lin

Existing generative adversarial network (GAN) based conditional image generative models typically produce fixed output for the same conditional input, which is unreasonable for highly subjective tasks, such as large-mask image inpainting or style transfer. On the other hand, GAN-based diverse image generative methods require retraining/fine-tuning the network or designing complex noise injection functions, which is computationally expensive, task-specific, or struggle to generate high-quality results. Given that many deterministic conditional image generative models have been able to produce high-quality yet fixed results, we raise an intriguing question: is it possible for pre-trained deterministic conditional image generative models to generate diverse results without changing network structures or parameters? To answer this question, we re-examine the conditional image generation tasks from the perspective of adversarial attack and propose a simple and efficient plug-in projected gradient descent (PGD) like method for diverse and controllable image generation. The key idea is attacking the pre-trained deterministic generative models by adding a micro perturbation to the input condition. In this way, diverse results can be generated without any adjustment of network structures or fine-tuning of the pre-trained models. In addition, we can also control the diverse results to be generated by specifying the attack direction according to a reference text or image. Our work opens the door to applying adversarial attack to low-level vision tasks, and experiments on various conditional image generation tasks demonstrate the effectiveness and superiority of the proposed method.

PDF Details DOI

AAAI Conference 2024 Conference Paper

PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping

Jiafu Chen
Wei Xing
Jiakai Sun
Tianyi Chu
Yiling Huang
Boyan Ji
Lei Zhao
Huaizhong Lin

3D scene stylization refers to transform the appearance of a 3D scene to match a given style image, ensuring that images rendered from different viewpoints exhibit the same style as the given style image, while maintaining the 3D consistency of the stylized scene. Several existing methods have obtained impressive results in stylizing 3D scenes. However, the mod- els proposed by these methods need to be re-trained when applied to a new scene. In other words, their models are cou- pled with a specific scene and cannot adapt to arbitrary other scenes. To address this issue, we propose a novel 3D scene stylization framework to transfer an arbitrary style to an ar- bitrary scene, without any style-related or scene-related re- training. Concretely, we first map the appearance of the 3D scene into a 2D style pattern space, which realizes complete disentanglement of the geometry and appearance of the 3D scene and makes our model be generalized to arbitrary 3D scenes. Then we stylize the appearance of the 3D scene in the 2D style pattern space via a prompt-based 2D stylization al- gorithm. Experimental results demonstrate that our proposed framework is superior to SOTA methods in both visual qual- ity and generalization.

PDF Details DOI

AAAI Conference 2023 Conference Paper

MicroAST: Towards Super-fast Ultra-Resolution Arbitrary Style Transfer

Zhizhong Wang
Lei Zhao
Zhiwen Zuo
Ailin Li
Haibo Chen
Wei Xing
Dongming Lu

Arbitrary style transfer (AST) transfers arbitrary artistic styles onto content images. Despite the recent rapid progress, existing AST methods are either incapable or too slow to run at ultra-resolutions (e.g., 4K) with limited resources, which heavily hinders their further applications. In this paper, we tackle this dilemma by learning a straightforward and lightweight model, dubbed MicroAST. The key insight is to completely abandon the use of cumbersome pre-trained Deep Convolutional Neural Networks (e.g., VGG) at inference. Instead, we design two micro encoders (content and style encoders) and one micro decoder for style transfer. The content encoder aims at extracting the main structure of the content image. The style encoder, coupled with a modulator, encodes the style image into learnable dual-modulation signals that modulate both intermediate features and convolutional filters of the decoder, thus injecting more sophisticated and flexible style signals to guide the stylizations. In addition, to boost the ability of the style encoder to extract more distinct and representative style signals, we also introduce a new style signal contrastive loss in our model. Compared to the state of the art, our MicroAST not only produces visually superior results but also is 5-73 times smaller and 6-18 times faster, for the first time enabling super-fast (about 0.5 seconds) AST at 4K ultra-resolutions.

PDF Details DOI

IJCAI Conference 2022 Conference Paper

DivSwapper: Towards Diversified Patch-based Arbitrary Style Transfer

Zhizhong Wang
Lei Zhao
Haibo Chen
Zhiwen Zuo
Ailin Li
Wei Xing
Dongming Lu

Gram-based and patch-based approaches are two important research lines of style transfer. Recent diversified Gram-based methods have been able to produce multiple and diverse stylized outputs for the same content and style images. However, as another widespread research interest, the diversity of patch-based methods remains challenging due to the stereotyped style swapping process based on nearest patch matching. To resolve this dilemma, in this paper, we dive into the crux of existing patch-based methods and propose a universal and efficient module, termed DivSwapper, for diversified patch-based arbitrary style transfer. The key insight is to use an essential intuition that neural patches with higher activation values could contribute more to diversity. Our DivSwapper is plug-and-play and can be easily integrated into existing patch-based and Gram-based methods to generate diverse results for arbitrary styles. We conduct theoretical analyses and extensive experiments to demonstrate the effectiveness of our method, and compared with state-of-the-art algorithms, it shows superiority in diversity, quality, and efficiency.

PDF Details DOI

IJCAI Conference 2022 Conference Paper

Style Fader Generative Adversarial Networks for Style Degree Controllable Artistic Style Transfer

Zhiwen Zuo
Lei Zhao
Shuobin Lian
Haibo Chen
Zhizhong Wang
Ailin Li
Wei Xing
Dongming Lu

Artistic style transfer is the task of synthesizing content images with learned artistic styles. Recent studies have shown the potential of Generative Adversarial Networks (GANs) for producing artistically rich stylizations. Despite the promising results, they usually fail to control the generated images' style degree, which is inflexible and limits their applicability for practical use. To address the issue, in this paper, we propose a novel method that for the first time allows adjusting the style degree for existing GAN-based artistic style transfer frameworks in real time after training. Our method introduces two novel modules into existing GAN-based artistic style transfer frameworks: a Style Scaling Injection (SSI) module and a Style Degree Interpretation (SDI) module. The SSI module accepts the value of Style Degree Factor (SDF) as the input and outputs parameters that scale the feature activations in existing models, offering control signals to alter the style degrees of the stylizations. And the SDI module interprets the output probabilities of a multi-scale content-style binary classifier as the style degrees, providing a mechanism to parameterize the style degree of the stylizations. Moreover, we show that after training our method can enable existing GAN-based frameworks to produce over-stylizations. The proposed method can facilitate many existing GAN-based artistic style transfer frameworks with marginal extra training overheads and modifications. Extensive qualitative evaluations on two typical GAN-based style transfer models demonstrate the effectiveness of the proposed method for gaining style degree control for them.

PDF Details DOI

AAAI Conference 2022 Conference Paper

Texture Reformer: Towards Fast and Universal Interactive Texture Transfer

Zhizhong Wang
Lei Zhao
Haibo Chen
Ailin Li
Zhiwen Zuo
Wei Xing
Dongming Lu

In this paper, we present the texture reformer, a fast and universal neural-based framework for interactive texture transfer with user-specified guidance. The challenges lie in three aspects: 1) the diversity of tasks, 2) the simplicity of guidance maps, and 3) the execution efficiency. To address these challenges, our key idea is to use a novel feed-forward multiview and multi-stage synthesis procedure consisting of I) a global view structure alignment stage, II) a local view texture refinement stage, and III) a holistic effect enhancement stage to synthesize high-quality results with coherent structures and fine texture details in a coarse-to-fine fashion. In addition, we also introduce a novel learning-free view-specific texture reformation (VSTR) operation with a new semantic map guidance strategy to achieve more accurate semanticguided and structure-preserved texture transfer. The experimental results on a variety of application scenarios demonstrate the effectiveness and superiority of our framework. And compared with the state-of-the-art interactive texture transfer algorithms, it not only achieves higher quality results but, more remarkably, also is 2-5 orders of magnitude faster.

PDF Details

NeurIPS Conference 2021 Conference Paper

Artistic Style Transfer with Internal-external Learning and Contrastive Learning

Haibo Chen
Lei Zhao
Zhizhong Wang
Huiming Zhang
Zhiwen Zuo
Ailin Li
Wei Xing
Dongming Lu

Although existing artistic style transfer methods have achieved significant improvement with deep neural networks, they still suffer from artifacts such as disharmonious colors and repetitive patterns. Motivated by this, we propose an internal-external style transfer method with two contrastive losses. Specifically, we utilize internal statistics of a single style image to determine the colors and texture patterns of the stylized image, and in the meantime, we leverage the external information of the large-scale style dataset to learn the human-aware style information, which makes the color distributions and texture patterns in the stylized image more reasonable and harmonious. In addition, we argue that existing style transfer methods only consider the content-to-stylization and style-to-stylization relations, neglecting the stylization-to-stylization relations. To address this issue, we introduce two contrastive losses, which pull the multiple stylization embeddings closer to each other when they share the same content or style, but push far away otherwise. We conduct extensive experiments, showing that our proposed method can not only produce visually more harmonious and satisfying artistic images, but also promote the stability and consistency of rendered video clips.

PDF Details