Arrow Research

Author name cluster

Xiaogang Jin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
1 author row

Possible papers (8)

AAAI Conference 2026 Conference Paper

F.A.C.U.L.: Language-Based Interaction with AI Companions in Gaming

  • Wenya Wei
  • Sipeng Yang
  • Qixian Zhou
  • Ruochen Liu
  • Xuelei Zhang
  • Yifu Yuan
  • Yan Jiang
  • Yongle Luo

In cooperative video games, traditional AI companions are deployed to assist players, who control them using hotkeys or command wheels to issue predefined commands such as "attack", "defend", or "retreat". Despite their simplicity, these methods, which lack target specificity, limit players' ability to give complex tactical instructions and hinder immersive gameplay experiences. To address this, we propose the FPS AI Companion who Understands Language (F.A.C.U.L.), the first real-time AI system that enables players to communicate and collaborate with AI companions using natural language. By integrating natural language processing with a confidence-based framework, F.A.C.U.L. efficiently decomposes complex commands and interprets player intent. It also employs a dynamic entity retrieval method for environmental awareness, aligning human intentions with decision-making. Unlike traditional rule-based systems, our method supports real-time language interactions, enabling players to issue complex commands such as "clear the second floor", "take cover behind that tree", or "retreat to the river". The system provides real-time behavioral responses and vocal feedback, ensuring seamless tactical collaboration. Using the popular FPS game Arena Breakout: Infinite as a case study, we present comparisons demonstrating the efficacy of our approach and discuss the advantages and limitations of AI companions based on real-world user feedback.
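The confidence-gated interpretation the abstract describes can be caricatured in a few lines. This is a toy sketch, not the F.A.C.U.L. pipeline: the real system interprets free-form language with a language model and a dynamic entity retriever, whereas the `INTENTS` vocabulary, the keyword scoring, and the `threshold` below are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical intent vocabulary: stands in for a learned intent model.
INTENTS = {
    "attack": {"attack", "engage", "fire"},
    "defend": {"defend", "hold", "cover"},
    "retreat": {"retreat", "fall", "back"},
    "move": {"go", "clear", "move"},
}

@dataclass
class Command:
    intent: str
    target: str
    confidence: float

def interpret(utterance: str, entities: List[str],
              threshold: float = 0.5) -> Optional[Command]:
    """Score each intent by keyword overlap, attach the first known entity
    mentioned, and reject low-confidence parses (a crude stand-in for the
    paper's confidence-based framework)."""
    words = set(utterance.lower().split())
    scores = {name: len(words & kws) / len(kws) for name, kws in INTENTS.items()}
    intent, score = max(scores.items(), key=lambda kv: kv[1])
    target = next((e for e in entities if e in utterance.lower()), "")
    if score < threshold:
        return None  # too uncertain: ask the player to rephrase
    return Command(intent, target, score)
```

A parse whose confidence falls below the threshold returns `None`, i.e. the companion would ask for clarification rather than act on a guess.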

AAAI Conference 2024 Conference Paper

A General Implicit Framework for Fast NeRF Composition and Rendering

  • Xinyu Gao
  • Ziyi Yang
  • Yunlu Zhao
  • Yuxiang Sun
  • Xiaogang Jin
  • Changqing Zou

A variety of Neural Radiance Fields (NeRF) methods have recently achieved remarkable success in high rendering speed. However, current accelerating methods are specialized and incompatible with various implicit methods, preventing real-time composition over various types of NeRF works. Because NeRF relies on sampling along rays, it is possible to provide general guidance for acceleration. To that end, we propose a general implicit pipeline for composing NeRF objects quickly. Our method enables the casting of dynamic shadows within or between objects using analytical light sources while allowing multiple NeRF objects to be seamlessly placed and rendered together with any arbitrary rigid transformations. Mainly, our work introduces a new surface representation known as Neural Depth Fields (NeDF) that quickly determines the spatial relationship between objects by allowing direct intersection computation between rays and implicit surfaces. It leverages an intersection neural network to query NeRF for acceleration instead of depending on an explicit spatial structure. Our proposed method is the first to enable both the progressive and interactive composition of NeRF objects. Additionally, it serves as a previewing plugin for a range of existing NeRF works.
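The depth-query idea is easy to demonstrate if the learned Neural Depth Field is replaced by an analytic surface. In the sketch below, `sphere_depth` stands in for the intersection network (the function names, the sphere geometry, and `composite_order` are illustrative assumptions): given a ray, it returns the hit distance directly, so multiple objects can be depth-sorted for composition without marching samples or maintaining an explicit spatial structure.

```python
import numpy as np

def sphere_depth(o, d, center=np.zeros(3), radius=1.0):
    """Closed-form ray/sphere hit distance; a stand-in for the learned
    Neural Depth Field, which would answer the same query with an MLP."""
    oc = o - center
    b = np.dot(oc, d)                               # d assumed unit-length
    disc = b * b - (np.dot(oc, oc) - radius ** 2)
    if disc < 0:
        return np.inf                               # ray misses the surface
    t = -b - np.sqrt(disc)
    return t if t > 0 else np.inf                   # ignore hits behind origin

def composite_order(ray_o, ray_d, objects):
    """Sort objects front-to-back by queried depth: the spatial-relationship
    test the abstract describes, with no octree or BVH required."""
    hits = [(sphere_depth(ray_o, ray_d, c, r), i)
            for i, (c, r) in enumerate(objects)]
    return [i for t, i in sorted(hits) if np.isfinite(t)]
```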

NeurIPS Conference 2024 Conference Paper

MaskFactory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation

  • Haotian Qian
  • YD Chen
  • Shengtao Lou
  • Fahad S. Khan
  • Xiaogang Jin
  • Deng-Ping Fan

Dichotomous Image Segmentation (DIS) tasks require highly precise annotations, and traditional dataset creation methods are labor intensive, costly, and require extensive domain expertise. Although using synthetic data for DIS is a promising solution to these challenges, current generative models and techniques struggle with the issues of scene deviations, noise-induced errors, and limited training sample variability. To address these issues, we introduce a novel approach, MaskFactory, which provides a scalable solution for generating diverse and precise datasets, markedly reducing preparation time and costs. We first introduce a general mask editing method that combines rigid and non-rigid editing techniques to generate high-quality synthetic masks. Specifically, rigid editing leverages geometric priors from diffusion models to achieve precise viewpoint transformations under zero-shot conditions, while non-rigid editing employs adversarial training and self-attention mechanisms for complex, topologically consistent modifications. Then, we generate pairs of high-resolution images and accurate segmentation masks using a multi-conditional control generation method. Finally, our experiments on the widely-used DIS5K dataset benchmark demonstrate superior performance in quality and efficiency compared to existing methods. The code is available at https://qian-hao-tian.github.io/MaskFactory/.

NeurIPS Conference 2024 Conference Paper

RobIR: Robust Inverse Rendering for High-Illumination Scenes

  • Ziyi Yang
  • Yanzhen Chen
  • Xinyu Gao
  • Yazhen Yuan
  • Yu Wu
  • Xiaowei Zhou
  • Xiaogang Jin

Implicit representation has opened up new possibilities for inverse rendering. However, existing implicit neural inverse rendering methods struggle to handle strongly illuminated scenes with significant shadows and slight reflections. The existence of shadows and reflections can lead to an inaccurate understanding of the scene, making precise factorization difficult. To this end, we present RobIR, an implicit inverse rendering approach that uses ACES tone mapping and regularized visibility estimation to reconstruct accurate BRDF of the object. By accurately modeling the indirect radiance field, normal, visibility, and direct light simultaneously, we are able to accurately decouple environment lighting and the object's PBR materials without imposing strict constraints on the scene. Even in high-illumination scenes with shadows and specular reflections, our method can recover high-quality albedo and roughness with no shadow interference. RobIR outperforms existing methods in both quantitative and qualitative evaluations.
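ACES tone mapping itself is a standard, concrete operation. The sketch below uses Narkowicz's widely used curve fit to the ACES filmic tonescale; the paper does not specify which ACES variant it applies, so treat the exact constants as an assumption. The point is the shape of the curve: it compresses HDR radiance into [0, 1], which is what tames strongly illuminated pixels during optimization.

```python
import numpy as np

def aces_tonemap(x):
    """Narkowicz's rational fit to the ACES filmic curve. Maps linear HDR
    radiance into [0, 1]; highlights saturate smoothly instead of blowing
    out, while dark values pass through nearly linearly."""
    x = np.asarray(x, dtype=np.float64)
    return np.clip(x * (2.51 * x + 0.03) / (x * (2.43 * x + 0.59) + 0.14),
                   0.0, 1.0)
```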

AAAI Conference 2024 Conference Paper

SocialCVAE: Predicting Pedestrian Trajectory via Interaction Conditioned Latents

  • Wei Xiang
  • Haoteng YIN
  • He Wang
  • Xiaogang Jin

Pedestrian trajectory prediction is a key technology in many applications for providing insights into human behavior and anticipating human future motions. Most existing empirical models are explicitly formulated by observed human behaviors using explicable mathematical terms with deterministic nature, while recent work has focused on developing hybrid models combined with learning-based techniques for powerful expressiveness while maintaining explainability. However, the deterministic nature of the learned steering behaviors from the empirical models limits the models' practical performance. To address this issue, this work proposes the social conditional variational autoencoder (SocialCVAE) for predicting pedestrian trajectories, which employs a CVAE to explore behavioral uncertainty in human motion decisions. SocialCVAE learns socially reasonable motion randomness by utilizing a socially explainable interaction energy map as the CVAE's condition, which illustrates the future occupancy of each pedestrian's local neighborhood area. The energy map is generated using an energy-based interaction model, which anticipates the energy cost (i.e., repulsion intensity) of pedestrians' interactions with neighbors. Experimental results on two public benchmarks including 25 scenes demonstrate that SocialCVAE significantly improves prediction accuracy compared with the state-of-the-art methods, with up to 16.85% improvement in Average Displacement Error (ADE) and 69.18% improvement in Final Displacement Error (FDE). Code is available at: https://github.com/ViviXiang/SocialCVAE.
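A minimal version of an interaction energy map can be written directly. This is a toy stand-in: the grid size, extent, and inverse-distance cost below are invented, and the paper derives its cost from an energy-based interaction model over anticipated motions. Still, it shows the data structure: each cell of the ego pedestrian's local neighborhood accumulates a repulsion cost from every neighbor, and the flattened map would then serve as the CVAE's condition.

```python
import numpy as np

def interaction_energy_map(ego, neighbors, grid=8, extent=2.0):
    """Toy interaction energy map: for each cell of the ego pedestrian's
    local (2*extent)-wide neighborhood, sum an inverse-distance repulsion
    cost to every neighbor's anticipated position."""
    xs = np.linspace(-extent, extent, grid) + ego[0]
    ys = np.linspace(-extent, extent, grid) + ego[1]
    gx, gy = np.meshgrid(xs, ys)
    cells = np.stack([gx, gy], axis=-1)        # (grid, grid, 2) cell centers
    energy = np.zeros((grid, grid))
    for n in neighbors:
        d = np.linalg.norm(cells - np.asarray(n), axis=-1)
        energy += 1.0 / (d + 1e-3)             # repulsion intensity per cell
    return energy
```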

NeurIPS Conference 2024 Conference Paper

Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting

  • Ziyi Yang
  • Xinyu Gao
  • Yang-Tian Sun
  • Yi-Hua Huang
  • Xiaoyang Lyu
  • Wen Zhou
  • Shaohui Jiao
  • Xiaojuan Qi

The recent advancements in 3D Gaussian splatting (3D-GS) have not only facilitated real-time rendering through modern GPU rasterization pipelines but have also attained state-of-the-art rendering quality. Nevertheless, despite its exceptional rendering quality and performance on standard datasets, 3D-GS frequently encounters difficulties in accurately modeling specular and anisotropic components. This issue stems from the limited ability of spherical harmonics (SH) to represent high-frequency information. To overcome this challenge, we introduce Spec-Gaussian, an approach that utilizes an anisotropic spherical Gaussian (ASG) appearance field instead of SH for modeling the view-dependent appearance of each 3D Gaussian. Additionally, we have developed a coarse-to-fine training strategy to improve learning efficiency and eliminate floaters caused by overfitting in real-world scenes. Our experimental results demonstrate that our method surpasses existing approaches in terms of rendering quality. Thanks to ASG, we have significantly improved the ability of 3D-GS to model scenes with specular and anisotropic components without increasing the number of 3D Gaussians. This improvement extends the applicability of 3D-GS to handle intricate scenarios with specular and anisotropic surfaces.
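The primitive being swapped in for SH is the anisotropic spherical Gaussian, whose standard form (Xu et al. 2013) is simple to evaluate. In the paper a small appearance network predicts the lobe parameters per 3D Gaussian; here they are passed in by hand, and unequal bandwidths `lam` and `mu` are what give the lobe its anisotropy.

```python
import numpy as np

def asg(v, frame, amplitude=1.0, lam=5.0, mu=5.0):
    """Evaluate one anisotropic spherical Gaussian lobe. `frame` rows are
    the tangent (x), bitangent (y), and lobe axis (z); `lam` and `mu` are
    the bandwidths along x and y, so lam != mu makes the lobe anisotropic."""
    x, y, z = frame
    v = v / np.linalg.norm(v)
    smooth = max(np.dot(v, z), 0.0)            # clamp to the upper hemisphere
    return amplitude * smooth * np.exp(-lam * np.dot(v, x) ** 2
                                       - mu * np.dot(v, y) ** 2)
```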

AAAI Conference 2020 Conference Paper

Progressive Feature Polishing Network for Salient Object Detection

  • Bo Wang
  • Quan Chen
  • Min Zhou
  • Zhiqiang Zhang
  • Xiaogang Jin
  • Kun Gai

Feature matters for salient object detection. Existing methods mainly focus on designing a sophisticated structure to incorporate multi-level features and filter out cluttered features. We present Progressive Feature Polishing Network (PFPN), a simple yet effective framework to progressively polish the multi-level features to be more accurate and representative. By employing multiple Feature Polishing Modules (FPMs) in a recurrent manner, our approach is able to detect salient objects with fine details without any post-processing. An FPM updates the features of each level in parallel by directly incorporating all higher level context information. Moreover, it can keep the dimensions and hierarchical structures of the feature maps, which makes it flexible to be integrated with any CNN-based models. Empirical experiments show that our results improve monotonically as the number of FPMs increases. Without bells and whistles, PFPN outperforms the state-of-the-art methods significantly on five benchmark datasets under various evaluation metrics. Our code is available at: https://github.com/chenquan-cq/PFPN.
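The polishing step can be sketched with plain arrays. This is a schematic of the stated dataflow, not the released model: `upsample`, the random 1x1-conv weights, and the level shapes are illustrative assumptions. What the sketch preserves is the key property from the abstract: each level is updated from itself plus all higher (coarser) levels while keeping its own shape, which is why the modules can be stacked recurrently.

```python
import numpy as np

def upsample(f, size):
    """Nearest-neighbour upsample of a (C, H, W) map to (C, size, size)."""
    c, h, w = f.shape
    return f.repeat(size // h, axis=1).repeat(size // w, axis=2)

def polish(features, rng):
    """One toy Feature Polishing Module pass over coarse-to-fine levels:
    every level is re-estimated from itself plus all coarser levels,
    fused by a random 1x1-conv analogue that restores the original
    channel count, so output shapes match input shapes exactly."""
    out = []
    for i, f in enumerate(features):
        ctx = [f] + [upsample(g, f.shape[1]) for g in features[i + 1:]]
        stacked = np.concatenate(ctx, axis=0)          # (C_total, H, W)
        w = rng.standard_normal((f.shape[0], stacked.shape[0])) * 0.1
        out.append(np.tensordot(w, stacked, axes=1))   # back to (C_i, H, W)
    return out
```

Because shapes are preserved, `polish(polish(features, rng), rng)` is valid, mirroring how multiple FPMs are chained.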

EAAI Journal 2014 Journal Article

From design methodology to evolutionary design: An interactive creation of marble-like textile patterns

  • Shufang Lu
  • P.Y. Mok
  • Xiaogang Jin

In this paper, by the integration of design methodology theories with evolutionary computation, a new design system is developed to evolve preferred designs on complex marbling patterns using interactive ‘perceptual selection’. The system is formulated in a way to assist the productive–deductive–inductive design reasoning process of the users. Therefore, complex mathematical functions do not cognitively overload the designers, who are released for more critical tasks of aesthetic assessment and new design rules induction. With the implementation on a graphics-processing unit (GPU), real-time complex marbling patterns can be created by the system. The system encourages creativity in the design process and accelerates new design generation. In addition, the resulting patterns fulfil the textile industry requirements of repeat and can be output as vector images.
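The interactive loop at the core of such a system is a conventional interactive genetic algorithm. The sketch below is generic rather than the paper's system: flat parameter vectors as genomes, uniform crossover, and Gaussian mutation are all assumptions (the actual genomes encode marbling operations rendered on the GPU). The user's 'perceptual selection' supplies the fitness signal by choosing which individuals parent the next generation.

```python
import random

def evolve(population, preferred_idx, rng, mutation=0.1):
    """One interactive-evolution step: the user's picked individuals become
    parents; children are uniform crossovers of two parents plus Gaussian
    mutation. Genomes are plain parameter vectors (a toy stand-in for
    marbling pattern parameters)."""
    parents = [population[i] for i in preferred_idx]
    children = []
    while len(children) < len(population):
        a, b = (rng.sample(parents, 2) if len(parents) > 1
                else (parents[0], parents[0]))
        child = [(x if rng.random() < 0.5 else y) + rng.gauss(0, mutation)
                 for x, y in zip(a, b)]
        children.append(child)
    return children
```

Each round, the system would render every child as a candidate pattern, and the designer's next selection drives the following generation.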