Arrow Research search

Author name cluster

Maoguo Gong

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

19 papers
2 author rows

Possible papers


AAAI Conference 2025 Conference Paper

AdvDisplay: Adversarial Display Assembled by Thermoelectric Cooler for Fooling Thermal Infrared Detectors

  • Hao Li
  • Fanggao Wan
  • Yue Su
  • Yue Wu
  • Mingyang Zhang
  • Maoguo Gong

When current physical adversarial patches fail to deceive thermal infrared detectors, existing techniques must implement adversarial attacks from scratch, including digital patch generation, material production, and physical deployment. Besides, it is difficult to finely regulate infrared radiation. To address these issues, this paper designs an adversarial thermal display (AdvDisplay) by assembling thermoelectric coolers (TECs) into an array. Specifically, to reduce the gap between patches in the physical and digital worlds and decrease the power consumption of the AdvDisplay device, a heat transfer loss and an electric power loss are designed to guide the patch optimization. In addition, a precise temperature control scheme for AdvDisplay is proposed based on proportional-integral-derivative (PID) control. Owing to the accurate temperature regulation and the reusability of AdvDisplay, our method improves both the attack success rate and the efficiency of physical deployments. Extensive experimental results indicate that the proposed method possesses superior adversarial effectiveness compared to other methods and demonstrates strong robustness in physical attacks.
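PID temperature control of the kind the paper mentions can be sketched as a discrete control loop; the gains, setpoint, and first-order TEC response below are illustrative assumptions, not values from the paper.

```python
def pid_step(error, state, kp=2.0, ki=0.5, kd=0.1, dt=0.1):
    """One discrete PID update; state carries (integral, previous error)."""
    integral, prev_error = state
    integral += error * dt
    derivative = (error - prev_error) / dt
    u = kp * error + ki * integral + kd * derivative
    return u, (integral, error)

def simulate(setpoint=25.0, temp=40.0, steps=200):
    """Drive a toy first-order TEC cell toward the setpoint temperature."""
    state = (0.0, 0.0)
    for _ in range(steps):
        u, state = pid_step(setpoint - temp, state)
        temp += 0.05 * u  # assumed first-order thermal response of one cell
    return temp
```

With these toy gains the loop settles close to the 25-degree setpoint within 200 steps; in AdvDisplay each TEC cell would be regulated toward the temperature its optimized patch pixel prescribes.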

AAAI Conference 2025 Conference Paper

FedFSL-CFRD: Personalized Federated Few-Shot Learning with Collaborative Feature Representation Disentanglement

  • Shanfeng Wang
  • Jianzhao Li
  • Zaitian Liu
  • Yourun Zhang
  • Maoguo Gong

Federated few-shot learning (FedFSL) aims to enable clients to obtain personalized generalization models for unseen categories with only a small number of referenceable samples in the distributed collaborative training paradigm. Most existing FedFSL algorithms suffer from domain bias and feature coupling in the presence of data heterogeneity and sample scarcity. In this work, we propose a collaborative feature representation disentanglement (CFRD) scheme for FedFSL to address these issues. In the first feature representation disentanglement, after each client receives the global aggregation parameters, the original feature representation is decoupled into global communal features and local personality features with personalized bias representation, to maintain both global consistency and local relevance. In the second feature representation disentanglement, performed on the few-shot metric space, category-independent information is encoded by class-specific and class-irrelevant reconstructions to separate the discriminative features. The proposed scheme collaboratively accomplishes global domain-bias feature disentanglement and local category-degradation feature disentanglement from client-wise and class-wise perspectives. Experiments on three few-shot benchmark datasets conforming to the FedFSL paradigm demonstrate that our proposed method outperforms state-of-the-art approaches in both global generality and local specificity.

AAAI Conference 2025 Conference Paper

MUCD: Unsupervised Point Cloud Change Detection via Masked Consistency

  • Yue Wu
  • Zhipeng Wang
  • Yongzhe Yuan
  • Maoguo Gong
  • Hao Li
  • Mingyang Zhang
  • Wenping Ma
  • Qiguang Miao

3D Change Detection (3DCD) has gradually become another research hotspot after image change detection. Recent works focus on using artificial labels for supervised or weakly-supervised training of siamese networks to segment changed points. However, labeling every point of multi-temporal point clouds is very expensive and time-consuming. In addition, these works lack effective self-supervised signals, and existing self-supervised signals often fail to capture sufficiently rich change information. To solve this problem, we assume that a powerful representation of 3D objects should model the consistency information of unchanged regions and distinguish different objects. Based on this assumption, we propose a new unsupervised framework called MUCD to learn change information of multi-temporal point clouds through bidirectional optimization of a change segmentor and a feature extractor. The training of the network is divided into two stages. We first design a foreknowledge point contrastive loss based on the characteristics of the 3DCD task to initialize the feature extractor, and then propose a masked consistency loss to further learn the shared geometric information of unchanged regions in the multi-temporal point clouds, utilizing it as a free and powerful supervised signal to train the change segmentor. In the inference stage, only the segmentor is used: it takes multi-temporal point clouds as input and produces the change segmentation result. Extensive experiments are conducted on SLPCCD and Urb3DCD, two real-world datasets of streets and urban buildings, to verify that our proposed unsupervised method is highly competitive and even outperforms supervised methods in scenes where semantic information changes occur, exhibiting better generalization ability and robustness.

AAAI Conference 2025 Conference Paper

Partial Point Cloud Registration with Multi-view 2D Image Learning

  • Yue Zhang
  • Yue Wu
  • Wenping Ma
  • Maoguo Gong
  • Hao Li
  • Biao Hou

Learning representations from large amounts of 2D image data has shown promising performance, yet very few works apply these representations to point cloud registration. In this paper, we explore how to leverage 2D information to assist point cloud registration, and propose IAPReg, an Image-Assisted Partial 3D point cloud Registration framework built on multi-view images generated from the input point cloud. It is expected to enrich 3D information with 2D knowledge, and to leverage 2D knowledge to assist point cloud registration. Specifically, we create multi-view depth maps by projecting the input point cloud from several specific views, and then extract 2D and 3D features using well-established models. To fuse the information learned from the 2D and 3D modalities, an inter-modality multi-view learning module is proposed to enhance geometric information and complement semantic information. Weighted SVD is a common method to reduce the impact of inaccurate correspondences on registration; however, determining the correspondence weights is not trivial. Therefore, we design a 2D-weighted SVD method, where 2D knowledge is employed to provide the weight information of correspondences. Extensive experiments show that our method outperforms state-of-the-art methods without additional 2D training data.
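The weighted SVD step referred to above can be sketched as a weighted Kabsch solve; the synthetic noise-free correspondences and uniform weights below are illustrative stand-ins for IAPReg's 2D-derived weights.

```python
import numpy as np

def weighted_svd(src, dst, w):
    """Estimate R, t minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2."""
    w = w / w.sum()
    mu_s = (w[:, None] * src).sum(0)          # weighted centroids
    mu_d = (w[:, None] * dst).sum(0)
    H = (w[:, None] * (src - mu_s)).T @ (dst - mu_d)  # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, mu_d - R @ mu_s

rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
dst = src @ R_true.T + t_true                 # exact correspondences, no outliers
R_est, t_est = weighted_svd(src, dst, np.ones(50))
```

Down-weighting suspected outlier correspondences (e.g. from 2D cues) simply means passing a non-uniform `w`; the closed-form solve is otherwise unchanged.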

NeurIPS Conference 2025 Conference Paper

PointTruss: K-Truss for Point Cloud Registration

  • Yue Wu
  • Jun Jiang
  • Yongzhe Yuan
  • Maoguo Gong
  • Qiguang Miao
  • Hao Li
  • Mingyang Zhang
  • Wenping Ma

Point cloud registration is a fundamental task in 3D computer vision. Recent advances have shown that graph-based methods are effective for outlier rejection in this context. However, existing clique-based methods impose overly strict constraints and are NP-hard, making it difficult to achieve both robustness and efficiency. While the k-core reduces computational complexity, it considers only node degree and ignores higher-order topological structures such as triangles, limiting its effectiveness in complex scenarios. To overcome these limitations, we introduce the $k$-truss from graph theory into point cloud registration, leveraging triangle support as a constraint for inlier selection. We further propose a consensus voting-based low-scale sampling strategy to efficiently extract the structural skeleton of the point cloud prior to $k$-truss decomposition. Additionally, we design a spatial distribution score that balances coverage and uniformity of inliers, preventing selections that concentrate on sparse local clusters. Extensive experiments on KITTI, 3DMatch, and 3DLoMatch demonstrate that our method consistently outperforms both traditional and learning-based approaches in various indoor and outdoor scenarios, achieving state-of-the-art results.
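The triangle-support constraint behind the $k$-truss can be sketched directly: an edge survives in the $k$-truss only while it participates in at least $k-2$ triangles among surviving edges. The toy graph below is illustrative, not a registration compatibility graph.

```python
from itertools import combinations

def k_truss(edge_list, k):
    """Return the k-truss: edges supported by at least k-2 triangles."""
    edges = {frozenset(e) for e in edge_list}
    while True:
        adj = {}
        for e in edges:
            u, v = tuple(e)
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        # triangle support of {u, v} = number of common neighbors of u and v
        drop = set()
        for e in edges:
            u, v = tuple(e)
            if len(adj[u] & adj[v]) < k - 2:
                drop.add(e)
        if not drop:
            return {tuple(sorted(e)) for e in edges}
        edges -= drop  # removing edges can lower the support of others: iterate

# 4-clique {0,1,2,3} with a pendant edge (3,4): the pendant edge lies in no
# triangle, so it falls out of the 3-truss while the clique survives.
truss = k_truss(list(combinations(range(4), 2)) + [(3, 4)], 3)
```

In the registration setting each node would be a putative correspondence and each edge a pairwise compatibility; inliers tend to survive truss peeling because they mutually support many triangles.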

AAAI Conference 2025 Conference Paper

Where Precision Meets Efficiency: Transformation Diffusion Model for Point Cloud Registration

  • Yongzhe Yuan
  • Yue Wu
  • Xiaolong Fan
  • Maoguo Gong
  • Qiguang Miao
  • Wenping Ma

We propose a transformation diffusion model for point cloud registration to balance precision and efficiency. Our method formulates point cloud registration as a denoising diffusion process from a noisy transformation to the object transformation, which is represented by a quaternion and a translation. Specifically, in the training stage, the object transformation diffuses from the ground-truth transformation to a random distribution, and the model learns to reverse this noising process. In the sampling stage, the model progressively refines a randomly generated transformation into the optimal transformation. We derive the variational bound in closed form for training and provide an instantiation of the model. Our diffusion model maps the transformation into a latent space, and splits it into two components (rotation and translation) based on the fact that they belong to different solution spaces. In addition, our work provides the following crucial findings: (i) Point cloud registration, one of the representative discriminative tasks, can be solved in a generative way and mapped into a latent space to obtain a new unified probabilistic formulation. (ii) Our model, the Transformation Diffusion Model (TDM), can serve as a plug-and-play agent for point cloud registration, making our method applicable to different deep registration networks. Experimental results on synthetic and real-world datasets demonstrate that, in both correspondence-free and correspondence-based scenarios, TDM achieves performance improvements exceeding 60% and higher efficiency simultaneously.
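The forward (noising) half of such a transformation diffusion can be sketched in closed form: the 7-dimensional quaternion-plus-translation vector is blended with Gaussian noise according to a variance schedule, and the rotation part is re-projected onto unit quaternions. The linear beta schedule and the re-projection step are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def forward_diffuse(q0, t0, step, betas, rng):
    """Diffuse a ground-truth transformation (q0, t0) to timestep `step`."""
    alpha_bar = np.prod(1.0 - betas[:step + 1])  # cumulative signal retention
    x0 = np.concatenate([q0, t0])                # 7-dim: quaternion + translation
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    q, t = xt[:4], xt[4:]
    return q / np.linalg.norm(q), t              # keep the rotation on the unit sphere

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)            # assumed linear schedule
q_noisy, t_noisy = forward_diffuse(np.array([1.0, 0.0, 0.0, 0.0]),
                                   np.zeros(3), 999, betas, rng)
```

A denoising network would be trained to invert exactly this process, and at inference would walk a random 7-vector back toward a valid transformation step by step.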

AAAI Conference 2024 Conference Paper

Enhancing Hyperspectral Images via Diffusion Model and Group-Autoencoder Super-resolution Network

  • Zhaoyang Wang
  • Dongyang Li
  • Mingyang Zhang
  • Hao Luo
  • Maoguo Gong

Existing hyperspectral image (HSI) super-resolution (SR) methods struggle to effectively capture the complex spectral-spatial relationships and low-level details, while diffusion models are a promising class of generative models known for their exceptional performance in modeling complex relations and learning high- and low-level visual features. The direct application of diffusion models to HSI SR is hampered by challenges such as difficulties in model convergence and protracted inference time. In this work, we introduce a novel Group-Autoencoder (GAE) framework that synergistically combines with the diffusion model to construct a highly effective HSI SR model (DMGASR). Our proposed GAE framework encodes high-dimensional HSI data into a low-dimensional latent space where the diffusion model works, thereby alleviating the difficulty of training the diffusion model while maintaining band correlation and considerably reducing inference time. Experimental results on both natural and remote sensing hyperspectral datasets demonstrate that the proposed method is superior to other state-of-the-art methods both visually and metrically.

AAAI Conference 2024 Conference Paper

Entropy Induced Pruning Framework for Convolutional Neural Networks

  • Yiheng Lu
  • Ziyu Guan
  • Yaming Yang
  • Wei Zhao
  • Maoguo Gong
  • Cai Xu

Structured pruning techniques have achieved great compression performance on convolutional neural networks for image classification tasks. However, the majority of existing methods are sensitive to the model parameters, and their pruning results may be unsatisfactory when the original model is trained poorly. That is, they need the original model to be fully trained to obtain useful weight information, which is time-consuming and makes the effectiveness of the pruning results dependent on the degree of model optimization. To address this issue, we propose a novel metric named Average Filter Information Entropy (AFIE). It decomposes the weight matrix of each layer into a low-rank space and quantifies the filter importance based on the distribution of the normalized eigenvalues. Intuitively, the eigenvalues capture the covariance among filters, and could therefore be a good guide for pruning. Since the distribution of eigenvalues is robust to the updating of parameters, AFIE can yield a stable evaluation of the importance of each filter regardless of whether the original model is fully trained. We implement our AFIE-based pruning method for three popular CNN models, AlexNet, VGG-16, and ResNet-50, and test them on three widely used image datasets, MNIST, CIFAR-10, and ImageNet, respectively. The experimental results are encouraging: we observe that even when the original model is trained for only one epoch, the AFIE score of each filter remains identical to that of the fully trained model. This demonstrates the effectiveness of the proposed pruning method.
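An AFIE-style score can be sketched as follows: flatten a conv layer's filters into a matrix, take its singular spectrum, normalize it into a distribution, and measure entropy per filter. The exact formulation below is a simplified assumption of the idea, not the paper's definition.

```python
import numpy as np

def afie(weight):
    """Average information entropy of a (filters x fan-in) weight matrix's spectrum."""
    s = np.linalg.svd(weight, compute_uv=False)
    p = s / s.sum()                          # normalized spectrum as a distribution
    entropy = -(p * np.log(p + 1e-12)).sum()
    return entropy / weight.shape[0]         # average per filter

rng = np.random.default_rng(0)
redundant = np.tile(rng.normal(size=(1, 27)), (16, 1))  # 16 copies of one 3x3x3 filter
diverse = rng.normal(size=(16, 27))                     # 16 independent filters
```

A rank-deficient (redundant) layer concentrates its spectrum on one value and scores near zero, while a diverse layer spreads it out and scores higher, matching the intuition that low-entropy layers can be pruned harder.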

AAAI Conference 2024 Conference Paper

M3SOT: Multi-Frame, Multi-Field, Multi-Space 3D Single Object Tracking

  • Jiaming Liu
  • Yue Wu
  • Maoguo Gong
  • Qiguang Miao
  • Wenping Ma
  • Cai Xu
  • Can Qin

3D Single Object Tracking (SOT) stands as a forefront task of computer vision, proving essential for applications like autonomous driving. Sparse and occluded data in scene point clouds introduce variations in the appearance of tracked objects, adding complexity to the task. In this research, we unveil M3SOT, a novel 3D SOT framework, which synergizes multiple input frames (template sets), multiple receptive fields (continuous contexts), and multiple solution spaces (distinct tasks) in ONE model. Remarkably, M3SOT pioneers modeling temporality, contexts, and tasks directly from point clouds, revisiting a perspective on the key factors influencing SOT. To this end, we design a transformer-based network centered on point cloud targets in the search area, aggregating diverse contextual representations and propagating target cues by employing historical frames. As M3SOT spans varied processing perspectives, we have streamlined the network, trimming its depth and optimizing its structure, to ensure a lightweight and efficient deployment for SOT applications. We posit that, backed by practical construction, M3SOT sidesteps the need for complex frameworks and auxiliary components to deliver sterling results. Extensive experiments on benchmarks such as KITTI, nuScenes, and Waymo Open Dataset demonstrate that M3SOT achieves state-of-the-art performance at 38 FPS. Our code and models are available at https://github.com/ywu0912/TeamCode.git.

AAAI Conference 2024 Conference Paper

Neural Gaussian Similarity Modeling for Differential Graph Structure Learning

  • Xiaolong Fan
  • Maoguo Gong
  • Yue Wu
  • Zedong Tang
  • Jieyi Liu

Graph Structure Learning (GSL) has demonstrated considerable potential in the analysis of graph-unknown non-Euclidean data across a wide range of domains. However, constructing an end-to-end graph structure learning model poses a challenge due to the impediment of gradient flow caused by the nearest neighbor sampling strategy. In this paper, we construct a differential graph structure learning model by replacing the non-differentiable nearest neighbor sampling with a differentiable sampling using the reparameterization trick. Under this framework, we argue that the act of sampling nearest neighbors may not invariably be essential, particularly in instances where node features exhibit a significant degree of similarity. To alleviate this issue, the bell-shaped Gaussian Similarity (GauSim) modeling is proposed to sample non-nearest neighbors. To adaptively model the similarity, we further propose Neural Gaussian Similarity (NeuralGauSim) with learnable parameters featuring flexible sampling behaviors. In addition, we develop a scalable method by transferring the large-scale graph to the transition graph to significantly reduce the complexity. Experimental results demonstrate the effectiveness of the proposed methods.
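The bell-shaped similarity sampling described above can be sketched like this: edge logits follow a Gaussian bump over pairwise cosine similarity (rather than always favoring the nearest neighbors), and sampling is made differentiable in spirit via the Gumbel-softmax reparameterization. The mu, sigma, and temperature values are illustrative assumptions.

```python
import numpy as np

def gaussian_similarity_sample(x, mu=0.5, sigma=0.2, tau=0.5, rng=None):
    """Row-stochastic soft adjacency sampled from bell-shaped similarity logits."""
    rng = rng or np.random.default_rng()
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    sim = (x @ x.T) / (norms @ norms.T)              # pairwise cosine similarity
    logit = -((sim - mu) ** 2) / (2 * sigma ** 2)    # Gaussian bump peaking at sim == mu
    gumbel = -np.log(-np.log(rng.uniform(size=logit.shape)))  # reparameterized noise
    soft = np.exp((logit + gumbel) / tau)
    return soft / soft.sum(axis=1, keepdims=True)    # softmax over candidate neighbors

rng = np.random.default_rng(0)
A = gaussian_similarity_sample(rng.normal(size=(8, 16)), rng=rng)
```

Because the noise enters additively and the rest is a softmax, gradients can flow back to the node features, unlike hard nearest-neighbor selection.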

ICML Conference 2024 Conference Paper

PointMC: Multi-instance Point Cloud Registration based on Maximal Cliques

  • Yue Wu 0004
  • Xidao Hu
  • Yongzhe Yuan
  • Xiaolong Fan
  • Maoguo Gong
  • Hao Li 0009
  • Mingyang Zhang 0002
  • Qiguang Miao

Multi-instance point cloud registration is the problem of estimating multiple rigid transformations between two point clouds. Existing solutions rely on ambiguous global spatial consistency and the time-consuming clustering of high-dimensional correspondence features, making it difficult to handle registration scenarios where multiple instances overlap. To address these problems, we propose a maximal-clique-based multi-instance point cloud registration framework called PointMC. The key idea is to search for maximal cliques on the correspondence compatibility graph to estimate multiple transformations, and to cluster these transformations into groups corresponding to different instances, so as to efficiently and accurately estimate all poses. PointMC leverages a correspondence embedding module that relies on local spatial consistency to effectively eliminate outliers, and the extracted discriminative features empower the network to avoid missed pose detections in scenarios involving multiple overlapping instances. We conduct comprehensive experiments on both synthetic and real-world datasets, and the results show that the proposed PointMC yields remarkable performance improvements.
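Maximal-clique search on a compatibility graph can be sketched with the classic Bron-Kerbosch recursion; the tiny graph below, two triangles sharing an edge, is illustrative only, not a correspondence graph.

```python
def maximal_cliques(adj):
    """Bron-Kerbosch enumeration of maximal cliques; adj maps node -> neighbor set."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}                   # v handled: exclude from later branches
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

# two triangles sharing the edge (1, 2)
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
cliques = maximal_cliques(adj)
```

In a multi-instance setting, each maximal clique of mutually compatible correspondences would vote for one candidate rigid transformation, and the candidates are then clustered per instance.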

IJCAI Conference 2018 Conference Paper

Feature Hashing for Network Representation Learning

  • Qixiang Wang
  • Shanfeng Wang
  • Maoguo Gong
  • Yue Wu

The goal of network representation learning is to embed nodes so as to encode the proximity structures of a graph into a continuous low-dimensional feature space. In this paper, we propose a novel algorithm called node2hash, based on feature hashing, for generating node embeddings. This approach follows the encoder-decoder framework, which has two main mapping functions. The first is an encoder that maps each node into a high-dimensional vector. The second is a decoder that hashes these vectors into a lower-dimensional feature space. More specifically, we first derive a proximity measurement called expected distance, which combines the position distribution and co-occurrence statistics of nodes over random walks, to build a proximity matrix; we then introduce a set of T different hash functions into feature hashing to generate uniformly distributed vector representations of nodes from the proximity matrix. Compared with existing state-of-the-art network representation learning approaches, node2hash shows competitive performance on multi-class node classification and link prediction tasks on three real-world networks from various domains.
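Feature hashing of the kind node2hash builds on can be sketched with T hash functions plus a sign hash; the MD5-based bucket construction below is a generic assumption for illustration, not the paper's exact scheme.

```python
import hashlib
import numpy as np

def hash_embed(vec, dim=8, T=2):
    """Fold a high-dimensional proximity row into `dim` buckets via T hash functions."""
    out = np.zeros(dim)
    for t in range(T):
        for j, value in enumerate(vec):
            h = hashlib.md5(f"{t}:{j}".encode()).digest()
            bucket = h[0] % dim                    # coordinate this feature lands in
            sign = 1.0 if h[1] % 2 == 0 else -1.0  # sign hash cancels collision bias
            out[bucket] += sign * value
    return out / T

proximity = np.arange(100, dtype=float)  # stand-in for one node's proximity-matrix row
emb = hash_embed(proximity)
```

The mapping needs no training and is deterministic, so two nodes with similar proximity rows land near each other in the hashed space.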

IJCAI Conference 2017 Conference Paper

DRLnet: Deep Difference Representation Learning Network and An Unsupervised Optimization Framework

  • Puzhao Zhang
  • Maoguo Gong
  • Hui Zhang
  • Jia Liu

Change detection and analysis (CDA) is an important research topic in the joint interpretation of spatial-temporal remote sensing images. The core of CDA is to effectively represent the difference and measure the degree of difference between bi-temporal images. In this paper, we propose a novel difference representation learning network (DRLnet) and an effective optimization framework without any supervision. Difference measurement, difference representation learning, and unsupervised clustering are combined into a single model, i.e., DRLnet, which is driven to learn clustering-friendly and discriminative difference representations (DRs) for different types of changes. Further, DRLnet is extended into a recurrent learning framework to update and reuse limited training samples and to prevent the semantic gaps caused by the saltation in the number of change types from the over-clustering stage to the desired one. Experimental results verify the effectiveness of the proposed framework.

IJCAI Conference 2017 Conference Paper

Modeling Hebb Learning Rule for Unsupervised Learning

  • Jia Liu
  • Maoguo Gong
  • Qiguang Miao

This paper models the Hebb learning rule and proposes a neuron learning machine (NLM). The Hebb learning rule describes the plasticity of the connection between presynaptic and postsynaptic neurons, and it is itself unsupervised; it formulates the updating gradient of the connecting weight in artificial neural networks. In this paper, we construct an objective function by modeling the Hebb rule. We make a hypothesis to simplify the model and introduce a correlation-based constraint according to the hypothesis and the stability of solutions. By analysis from the perspectives of maintaining abstract information and increasing the energy-based probability of the observed data, we find that this biologically inspired model has the capability of learning useful features. NLM can also be stacked to learn hierarchical features and reformulated into a convolutional version to extract features from 2-dimensional data. Experiments on single-layer and deep networks demonstrate the effectiveness of NLM in unsupervised feature learning.
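The basic Hebbian update the abstract builds on can be sketched as an outer-product weight change plus decay for stability; the tanh activation, learning rate, and decay below are illustrative assumptions, not the NLM objective itself.

```python
import numpy as np

def hebb_step(W, pre, eta=0.1, decay=0.01):
    """Hebbian update: weight change proportional to post * pre, with weight decay."""
    post = np.tanh(W @ pre)                  # postsynaptic activations
    return W + eta * np.outer(post, pre) - decay * W

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 6))       # 6 presynaptic -> 4 postsynaptic neurons
x = rng.normal(size=6)
for _ in range(20):
    W = hebb_step(W, x)                      # repeated presentation strengthens W
```

Without the decay term, pure Hebbian growth is unbounded; constraining the weights is exactly the kind of stability concern the paper's correlation-based constraint addresses.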

AAAI Conference 2017 Short Paper

Neuron Learning Machine for Representation Learning

  • Jia Liu
  • Maoguo Gong
  • Qiguang Miao

This paper presents a novel neuron learning machine (NLM) which can extract hierarchical features from data. We focus on the single-layer neural network architecture and propose to model the network based on the Hebbian learning rule, which describes how a synaptic weight changes with the activations of presynaptic and postsynaptic neurons. We model the learning rule as the objective function by considering the simplicity of the network and the stability of solutions. We make a hypothesis and introduce a correlation-based constraint according to the hypothesis. We find that this biologically inspired model has the ability to learn useful features from the perspective of retaining abstract information. NLM can also be stacked to learn hierarchical features and reformulated into a convolutional version to extract features from 2-dimensional data.

IJCAI Conference 2017 Conference Paper

Self-paced Convolutional Neural Networks

  • Hao Li
  • Maoguo Gong

Convolutional neural networks (CNNs) have achieved breakthrough performance in many pattern recognition tasks. In order to distinguish reliable data from noisy and confusing data, we improve CNNs with self-paced learning (SPL) to enhance their learning robustness. In the proposed self-paced convolutional network (SPCN), each sample is assigned a weight that reflects the easiness of the sample. A dynamic self-paced function is then incorporated into the learning objective of the CNN to jointly learn the parameters of the CNN and the latent weight variables. SPCN learns samples from easy to complex, and the sample weights dynamically control the learning rates so that the model converges to better values. To gain more insight into SPCN, theoretical studies are conducted to show that SPCN converges to a stationary solution and is robust to noisy and confusing data. Experimental results on the MNIST and rectangles datasets demonstrate that the proposed method outperforms baseline methods.
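The sample weighting at the heart of SPL can be sketched with the classic hard self-paced regularizer: a sample receives weight 1 when its loss is below the pace parameter lambda and 0 otherwise, so raising lambda gradually admits harder samples. The losses below are illustrative.

```python
import numpy as np

def spl_weights(losses, lam):
    """Hard self-paced weighting: 1 for easy samples (loss < lam), else 0."""
    return (losses < lam).astype(float)

losses = np.array([0.1, 0.4, 0.9, 2.0])
easy_only = spl_weights(losses, 0.5)    # small pace: only the easiest samples train
with_harder = spl_weights(losses, 1.5)  # pace increased: harder samples admitted
```

SPCN's dynamic self-paced function replaces this hard 0/1 rule with a soft weighting learned jointly with the CNN parameters, but the easy-to-complex curriculum is the same.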

AAAI Conference 2016 Conference Paper

Multi-Objective Self-Paced Learning

  • Hao Li
  • Maoguo Gong
  • Deyu Meng
  • Qiguang Miao

Current self-paced learning (SPL) regimes adopt a greedy strategy to obtain the solution with a gradually increasing pace parameter, but it is difficult to determine where to optimally terminate this increasing process. Besides, most SPL implementations are very sensitive to initialization and lack a theoretical result clarifying where SPL converges to as the pace parameter increases. In this paper, we propose a novel multi-objective self-paced learning (MOSPL) method to address these issues. Specifically, we decompose the objective function into two terms, the loss and the self-paced regularizer, and treat the problem as a compromise between these two objectives. This naturally reformulates the SPL problem as a standard multi-objective problem. A multi-objective evolutionary algorithm is used to optimize the two objectives simultaneously to facilitate the rational selection of a proper pace parameter. The proposed technique is capable of improving a set of solutions with respect to a range of pace parameters by finely compromising these solutions in between, making them perform robustly even under bad initialization. A good solution can then be naturally selected from this set by making use of some off-the-shelf tools in multi-objective optimization. Experimental results on matrix factorization and action recognition demonstrate the superiority of the proposed method against the existing issues in current SPL research.

NeurIPS Conference 2005 Conference Paper

Response Analysis of Neuronal Population with Synaptic Depression

  • Wentao Huang
  • Licheng Jiao
  • Shan Tan
  • Maoguo Gong

In this paper, we aim at analyzing the characteristics of neuronal population responses to instantaneous or time-dependent inputs and the role of synapses in neural information processing. We derive an evolution equation of the membrane potential density function with synaptic depression, and obtain formulas for analytically computing the response of the instantaneous firing rate. Through a technical analysis, we arrive at several significant conclusions: the background inputs play an important role in information processing and act as a switch between temporal integration and coincidence detection; the role of synapses can be regarded as a spatio-temporal filter, and the spatial distribution of synapses and the spatial and temporal relations of inputs are important in neural information processing; and the instantaneous input frequency can affect the response amplitude and phase delay.