Arrow Research search

Author name cluster

Guangyong Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

26 papers
2 author rows

Possible papers

26

AAAI Conference 2026 Conference Paper

Unifying Multi-View Knowledge for Graph Learning via Model Collaboration

  • Zhihao Wu
  • Jielong Lu
  • Zihan Fang
  • Jinyu Cai
  • Guangyong Chen
  • Jiajun Bu
  • Haishuai Wang

With the increasing scale and complexity of graph data, node attributes are also becoming richer and more complex, particularly in the form of informative text. Classic GNNs equipped with shallow attribute encoders are no longer sufficient to handle such data independently, making model collaboration across heterogeneous architectures an inevitable trend. Recently, the integration of Large Language Models (LLMs) and GNNs has attracted significant attention, yet the inherent disparity between these models remains a key challenge. Promising solutions have considered fine-tuning Small Language Models (SLMs) to bridge the gap between GNNs and frozen LLMs. However, this introduces another problem: these heterogeneous models bring complementary knowledge, but how to effectively integrate them and allow mutual refinement becomes a significant research gap. To address these challenges, we introduce COLA, a collaborative large–small model framework that enables seamless cooperation among semantic LLMs, task-specific fine-tuned SLMs, and structure-aware GNNs. COLA features a unique Consensus–Complement Coordination Mechanism (C3M), wherein its Mixture-of-Coordinators (MoC) architecturally aligns the LLM and SLM. Built upon this, a flexible graph-knowledge infusion strategy encourages the joint alignment and graph knowledge learning of textual representations. Extensive evaluations across nine diverse datasets show that COLA consistently achieves state-of-the-art performance, validating the effectiveness and generality of our collaborative paradigm.

ECAI Conference 2025 Conference Paper

DiTAC: Discrete Teamwork Abstraction for Ad Hoc Collaboration

  • Jing Wang 0055
  • Pengjie Gu
  • Mengchen Zhao
  • Guangyong Chen
  • Furui Liu
  • Pheng-Ann Heng

Training autonomous agents to collaborate with unknown teammates in cooperative multi-agent environments remains a fundamental challenge in ad hoc teamwork research. Conventional approaches rely heavily on online interactions with arbitrary teammates under the assumption of full observability. However, in real-world scenarios, teammate policies are often inaccessible, making historical trajectory rollouts a more practical alternative. We propose DiTAC, a method that learns discrete teamwork abstractions for ad hoc collaboration by automatically extracting latent cooperation patterns from short trajectory segments and adapting effectively to diverse teammate behaviors. To mitigate the out-of-distribution challenge, we constrain learned representations within a discrete codebook. Furthermore, we employ a masked bidirectional transformer architecture to infer teammate behaviors from local observations, thereby relaxing the full observability assumption. Empirical results demonstrate that DiTAC significantly outperforms existing baselines and its variants across widely-used ad hoc teamwork tasks.

AAMAS Conference 2025 Conference Paper

Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer

  • Yaodong Yang
  • Guangyong Chen
  • Hongyao Tang
  • Furui Liu
  • Danruo Deng
  • Pheng-Ann Heng

Overestimation in single-agent reinforcement learning has been extensively studied. In contrast, overestimation in the multiagent setting has received comparatively little attention, although it increases with the number of agents and leads to severe learning instability. Previous works concentrate on reducing overestimation in the estimation process of the target Q-value; they ignore the follow-up optimization process of the online Q-network, making it hard to fully address the complex multiagent overestimation problem. To solve this challenge, in this study, we first establish an iterative estimation-optimization analysis framework for multiagent value-mixing Q-learning. Our analysis reveals that multiagent overestimation not only comes from the computation of the target Q-value but also accumulates in the online Q-network’s optimization. Motivated by this, we propose the Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer algorithm to tackle multiagent overestimation from two aspects. First, we extend the random ensemble technique into the estimation of target individual and global Q-values to derive a lower update target. Second, we propose a novel hypernet regularizer on hypernetwork weights and biases to constrain the optimization of the online global Q-network and prevent overestimation accumulation. Extensive experiments in MPE and SMAC show that the proposed method successfully addresses overestimation across various tasks.
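The two ingredients named in the abstract can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the function names, the subset size, and the L1 form of the regularizer are assumptions made here for concreteness.

```python
import numpy as np

def random_ensemble_target(q_next, rewards, gamma=0.99, subset_size=2, rng=None):
    """Lower update target: element-wise minimum over a random subset of
    ensemble next-state Q-value estimates (q_next: ensemble x batch)."""
    if rng is None:
        rng = np.random.default_rng(0)
    idx = rng.choice(len(q_next), size=subset_size, replace=False)
    return rewards + gamma * q_next[idx].min(axis=0)

def hypernet_regularizer(weights, biases, coef=1e-3):
    """Penalty on hypernetwork-generated mixing weights and biases, meant to
    constrain the online global Q-network's optimization (L1 form assumed)."""
    return coef * (np.abs(weights).sum() + np.abs(biases).sum())
```

Taking the minimum over a random subset rather than the full ensemble trades off how aggressively the target is lowered against estimation variance.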

IJCAI Conference 2025 Conference Paper

MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models

  • Donghao Zhou
  • Jiancheng Huang
  • Jinbin Bai
  • Jiaze Wang
  • Hao Chen
  • Guangyong Chen
  • Xiaowei Hu
  • Pheng-Ann Heng

Text-to-image diffusion models can generate high-quality images but lack fine-grained control of visual concepts, limiting their creativity. Thus, we introduce component-controllable personalization, a new task that enables users to customize and reconfigure individual components within concepts. This task faces two challenges: semantic pollution, where undesired elements disrupt the target concept, and semantic imbalance, which causes disproportionate learning of the target concept and component. To address these, we design MagicTailor, a framework that uses Dynamic Masked Degradation to adaptively perturb unwanted visual semantics and Dual-Stream Balancing for more balanced learning of desired visual semantics. The experimental results show that MagicTailor achieves superior performance in this task and enables more personalized and creative image generation.

AAAI Conference 2025 Conference Paper

MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding

  • Jiaze Wang
  • Yi Wang
  • Ziyu Guo
  • Renrui Zhang
  • Donghao Zhou
  • Guangyong Chen
  • Anfeng Liu
  • Pheng-Ann Heng

We introduce MM-Mixing, a multi-modal mixing alignment framework for 3D understanding. MM-Mixing applies mixing-based methods to multi-modal data, preserving and optimizing cross-modal connections while enhancing diversity and improving alignment across modalities. Our proposed two-stage training pipeline combines feature-level and input-level mixing to optimize the 3D encoder. The first stage employs feature-level mixing with contrastive learning to align 3D features with their corresponding modalities. The second stage incorporates both feature-level and input-level mixing, introducing mixed point cloud inputs to further refine 3D feature representations. MM-Mixing enhances intermodality relationships, promotes generalization, and ensures feature consistency while providing diverse and realistic training samples. We demonstrate that MM-Mixing significantly improves baseline performance across various learning scenarios, including zero-shot 3D classification, linear probing 3D classification, and cross-modal 3D shape retrieval. Notably, we improved the zero-shot classification accuracy on ScanObjectNN from 51.3% to 61.9%, and on Objaverse-LVIS from 46.8% to 51.4%. Our findings highlight the potential of multi-modal mixing-based alignment to significantly advance 3D object recognition and understanding while remaining straightforward to implement and integrate into existing frameworks.

JBHI Journal 2025 Journal Article

Multi-Scale Spatio-Temporal Transformer-Based Imbalanced Longitudinal Learning for Glaucoma Forecasting From Irregular Time Series Images

  • Xikai Yang
  • Jian Wu
  • Xi Wang
  • Yuchen Yuan
  • Jinpeng Li
  • Guangyong Chen
  • Ning Li Wang
  • Pheng-Ann Heng

Glaucoma is one of the major eye diseases that leads to progressive optic nerve fiber damage and irreversible blindness, afflicting millions of individuals. Glaucoma forecasting enables early screening and intervention for potential patients, helping to prevent further deterioration of the disease. It leverages a series of historical fundus images of an eye and forecasts the likelihood of glaucoma occurrence in the future. However, the irregular sampling nature and the imbalanced class distribution are two challenges in the development of disease forecasting approaches. To this end, we introduce the Multi-scale Spatio-temporal Transformer Network (MST-former) based on the transformer architecture tailored for sequential image inputs, which can effectively learn representative semantic information from sequential images on both temporal and spatial dimensions. Specifically, we employ a multi-scale structure to extract features at various resolutions, which can largely exploit rich spatial information encoded in each image. Besides, we design a time distance matrix to scale time attention in a non-linear manner, which could effectively deal with the irregularly sampled data. Furthermore, we introduce a temperature-controlled Balanced Softmax Cross-entropy loss to address the class imbalance issue. Extensive experiments on the Sequential fundus Images for Glaucoma Forecast (SIGF) dataset demonstrate the superiority of the proposed MST-former method, achieving an AUC of 96.6% for glaucoma forecasting. Besides, our method shows excellent generalization capability on the Alzheimer's Disease Neuroimaging Initiative (ADNI) MRI dataset, with an accuracy of 88.2% for mild cognitive impairment and Alzheimer's disease prediction, outperforming the compared method by a large margin. A series of ablation studies further verifies the contribution of our proposed components in addressing the irregular-sampling and class-imbalance problems.
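The temperature-controlled Balanced Softmax cross-entropy mentioned above can be sketched as follows. This is a hedged NumPy illustration; the function name, signature, and the exact placement of the temperature `tau` on the log-prior term are assumptions, not the paper's code.

```python
import numpy as np

def balanced_softmax_ce(logits, labels, class_counts, tau=1.0):
    """Balanced Softmax cross-entropy: shift each logit by tau * log(prior)
    so that majority classes gain no unfair advantage during training.
    tau is the temperature controlling the strength of the prior shift."""
    prior = class_counts / class_counts.sum()
    shifted = logits + tau * np.log(prior)             # (batch, num_classes)
    shifted = shifted - shifted.max(axis=1, keepdims=True)  # numerical stability
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()
```

With balanced class counts the prior shift is constant across classes and this sketch reduces to standard cross-entropy; with imbalanced counts, minority-class samples incur a larger loss, counteracting the skew.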

NeurIPS Conference 2025 Conference Paper

Protein Inverse Folding From Structure Feedback

  • Junde Xu
  • Zijun Gao
  • Xinyi Zhou
  • Xingyi Cheng
  • Le Song
  • Guangyong Chen
  • Pheng-Ann Heng
  • Jiezhong Qiu

The inverse folding problem, aiming to design amino acid sequences that fold into desired three-dimensional structures, is pivotal for various biotechnological applications. Here, we introduce a novel approach leveraging Direct Preference Optimization (DPO) to fine-tune an inverse folding model using feedback from a protein folding model. Given a target protein structure, we begin by sampling candidate sequences from the inverse-folding model, then predict the three-dimensional structure of each sequence with the folding model to generate pairwise structural-preference labels. These labels are used to fine-tune the inverse-folding model under the DPO objective. Our results on the CATH 4.2 test set demonstrate that DPO fine-tuning not only improves sequence recovery of baseline models but also leads to a significant improvement in average TM-score, from 0.77 to 0.81, indicating enhanced structural similarity. Furthermore, iterative application of our DPO-based method on challenging protein structures yields substantial gains, with an average TM-score increase of 79.5% relative to the baseline model. This work establishes a promising direction for enhancing protein sequence design from structure feedback by effectively utilizing preference optimization.
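At the gradient step, the sample-predict-label-finetune loop described above reduces to the standard DPO objective on each (winner, loser) sequence pair. A minimal NumPy sketch, assuming sequence-level log-likelihoods are already available; the function name and the beta default are illustrative:

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss on one preference pair. logp_* are sequence log-likelihoods
    under the model being tuned; ref_logp_* come from the frozen reference
    copy. Winner/loser labels would come from comparing the predicted
    structures' TM-scores against the target structure."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log sigmoid(margin)
```

The loss falls as the tuned model raises the winner's likelihood relative to the loser's, measured against the frozen reference.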

NeurIPS Conference 2025 Conference Paper

What We Miss Matters: Learning from the Overlooked in Point Cloud Transformers

  • Yi Wang
  • Jiaze Wang
  • Ziyu Guo
  • Renrui Zhang
  • Donghao Zhou
  • Guangyong Chen
  • Anfeng Liu
  • Pheng-Ann Heng

Point Cloud Transformers have become a cornerstone in 3D representation for their ability to model long-range dependencies via self-attention. However, these models tend to overemphasize salient regions while neglecting other informative regions, which limits feature diversity and compromises robustness. To address this challenge, we introduce BlindFormer, a novel contrastive attention learning framework that redefines saliency by explicitly incorporating features typically neglected by the model. The proposed Attentional Blindspot Mining (ABM) suppresses highly attended regions during training, thereby guiding the model to explore its own blind spots. This redirection of attention expands the model’s perceptual field and uncovers richer geometric cues. To consolidate these overlooked features, BlindFormer employs Blindspot-Aware Joint Optimization (BJO), a joint learning objective that integrates blindspot feature alignment with the original pretext task. BJO enhances feature discrimination while preserving performance on the primary task, leading to more robust and generalizable representations. We validate BlindFormer on several challenging benchmarks and demonstrate consistent performance gains across multiple Transformer backbones. Notably, it improves Point-MAE by +13.4% and PointGPT-S by +6.3% on OBJ-BG under Gaussian noise. These results highlight the importance of mitigating attentional biases in 3D representation learning, revealing BlindFormer’s superior ability to handle perturbations and improve feature discrimination.

AAAI Conference 2024 Conference Paper

ANEDL: Adaptive Negative Evidential Deep Learning for Open-Set Semi-supervised Learning

  • Yang Yu
  • Danruo Deng
  • Furui Liu
  • Qi Dou
  • Yueming Jin
  • Guangyong Chen
  • Pheng Ann Heng

Semi-supervised learning (SSL) methods assume that labeled data, unlabeled data and test data are from the same distribution. Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers). Most previous works focused on outlier detection via binary classifiers, which suffer from insufficient scalability and inability to distinguish different types of uncertainty. In this paper, we propose a novel framework, Adaptive Negative Evidential Deep Learning (ANEDL), to tackle these limitations. Concretely, we first introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference. Furthermore, we propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers. As demonstrated empirically, our proposed method outperforms existing state-of-the-art methods across four datasets.

AAAI Conference 2024 Conference Paper

DR-Label: Label Deconstruction and Reconstruction of GNN Models for Catalysis Systems

  • Bowen Wang
  • Chen Liang
  • Jiaze Wang
  • Jiezhong Qiu
  • Furui Liu
  • Shaogang Hao
  • Dong Li
  • Guangyong Chen

Attaining the equilibrium geometry of a catalyst-adsorbate system is key to fundamentally assessing its effective properties, such as adsorption energy. While machine learning methods with advanced representation or supervision strategies have been applied to boost and guide the relaxation processes of catalysis systems, existing methods that produce linearly aggregated geometry predictions are susceptible to edge-representation ambiguity and are therefore vulnerable to graph variations. In this paper, we present DR-Label, a novel graph neural network (GNN) supervision and prediction strategy. Our approach mitigates the multiplicity of solutions in edge representation and encourages model predictions that are independent of graph structural variations. DR-Label first Deconstructs finer-grained equilibrium-state information for the model by projecting the node-level supervision signal onto each edge. Conversely, the model Reconstructs a more robust equilibrium-state prediction by converting edge-level predictions to the node level via a sphere-fitting algorithm. When applied to three fundamentally different models, DR-Label consistently enhanced performance. Leveraging the graph-structure invariance of the DR-Label strategy, we further propose DRFormer, which applies explicit intermediate positional updates and achieves a new state-of-the-art performance on the Open Catalyst 2020 (OC20) dataset and the Cu-based single-atom alloys CO adsorption (SAA) dataset. We expect our work to highlight vital principles for advancing geometric GNN models for catalysis systems and beyond. Our code is available at https://github.com/bowenwang77/DR-Label

ICRA Conference 2024 Conference Paper

LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery

  • Kexin Chen 0003
  • Yuyang Du 0001
  • Tao You
  • Mobarakol Islam
  • Ziyu Guo
  • Yueming Jin
  • Guangyong Chen
  • Pheng-Ann Heng

Visual question answering (VQA) can be fundamentally crucial for promoting robotic-assisted surgical education. In practice, the needs of trainees are constantly evolving, such as learning more surgical types and adapting to new surgical instruments/techniques. Therefore, continually updating the VQA system by a sequential data stream from multiple resources is demanded in robotic surgery to address new tasks. In surgical scenarios, the privacy issue of patient data often restricts the availability of old data when updating the model, necessitating an exemplar-free continual learning (CL) setup. However, prior studies overlooked two vital problems of the surgical domain: i) large domain shifts from diverse surgical operations collected from multiple departments or clinical centers, and ii) severe data imbalance arising from the uneven presence of surgical instruments or activities during surgical procedures. This paper proposes to address these two problems with a multimodal large language model (LLM) and an adaptive weight assignment methodology. We first develop a new multi-teacher CL framework that leverages a multimodal LLM as the additional teacher. The strong generalization ability of the LLM can bridge the knowledge gap when domain shifts and data imbalances occur. We then put forth a novel data processing method that transforms complex LLM embeddings into logits compatible with our CL framework. We also design an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of the old CL model. Finally, we construct a new dataset for surgical VQA tasks. Extensive experimental results demonstrate the superiority of our method to other advanced CL models.

AAAI Conference 2024 Conference Paper

PointPatchMix: Point Cloud Mixing with Patch Scoring

  • Yi Wang
  • Jiaze Wang
  • Jinpeng Li
  • Zixu Zhao
  • Guangyong Chen
  • Anfeng Liu
  • Pheng Ann Heng

Data augmentation is an effective regularization strategy for mitigating overfitting in deep neural networks, and it plays a crucial role in 3D vision tasks, where point cloud data is relatively limited. While mixing-based augmentation has shown promise for point clouds, previous methods mix point clouds either at the block level or the point level, which constrains their ability to strike a balance between generating diverse training samples and preserving the local characteristics of point clouds. The significance of each component of a point cloud has not been fully considered, as not all parts contribute equally to the classification task, and some parts may contain unimportant or redundant information. To overcome these challenges, we propose PointPatchMix, a novel approach that mixes point clouds at the patch level and integrates a patch scoring module to generate content-based targets for mixed point clouds. Our approach preserves local features at the patch level, while the patch scoring module assigns targets based on content-based significance scores from a pre-trained teacher model. We evaluate PointPatchMix on two benchmark datasets, ModelNet40 and ScanObjectNN, and demonstrate significant improvements over various baselines on both synthetic and real-world datasets, as well as in few-shot settings. With Point-MAE as our baseline, our model surpasses previous methods by a significant margin. Furthermore, our approach shows strong generalization across various point cloud methods and enhances the robustness of the baseline model. Code is available at https://jiazewang.com/projects/pointpatchmix.html.
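The content-based target assignment described above can be sketched as follows. This is a hypothetical NumPy helper, assuming per-patch significance scores from the teacher model and a boolean mask marking which patches of the mixed cloud come from each source; names and signature are illustrative:

```python
import numpy as np

def patchmix_target(scores_a, scores_b, mask, label_a, label_b, num_classes):
    """Content-based mixed target: each source cloud's label weight is the
    summed teacher-assigned significance of the patches it contributes,
    rather than the raw patch count. mask[i] is True where patch i of the
    mixed cloud comes from cloud A."""
    w_a = scores_a[mask].sum()
    w_b = scores_b[~mask].sum()
    total = w_a + w_b
    target = np.zeros(num_classes)
    target[label_a] += w_a / total
    target[label_b] += w_b / total
    return target
```

Weighting by significance rather than patch count means a mixed cloud dominated by uninformative patches of one source still gets a target skewed toward the more informative source.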

ICML Conference 2024 Conference Paper

Sample-Efficient Multiagent Reinforcement Learning with Reset Replay

  • Yaodong Yang 0002
  • Guangyong Chen
  • Jianye Hao
  • Pheng-Ann Heng

The popularity of multiagent reinforcement learning (MARL) is growing rapidly with the demand for real-world tasks that require swarm intelligence. However, a noticeable drawback of MARL is its low sample efficiency, which leads to a huge number of interactions with the environment. Surprisingly, few MARL works focus on this practical problem, especially in the parallel environment setting, which greatly hampers the application of MARL in the real world. In response to this gap, we propose Multiagent Reinforcement Learning with Reset Replay (MARR) to greatly improve the sample efficiency of MARL by enabling MARL training at a high replay ratio in the parallel environment setting for the first time. To achieve this, a reset strategy is first introduced to maintain network plasticity, ensuring that MARL continually learns with a high replay ratio. Second, MARR incorporates a data augmentation technique to boost sample efficiency further. Extensive experiments in SMAC and MPE show that MARR significantly improves the performance of various MARL approaches with much fewer environment interactions.

EAAI Journal 2023 Journal Article

Robust and fuzzy ensemble framework via spectral learning for random projection-based fuzzy-c-means clustering

  • Zhaoyin Shi
  • Long Chen
  • Junwei Duan
  • Guangyong Chen
  • Kai Zhao

The ensembles of random projection-based fuzzy-c-means (RP-FCM) can handle high-dimensional data efficiently. However, the performance of these ensemble frameworks is still hindered by several issues, such as misaligned membership matrices, information loss of co-similar matrices, large storage space, and unstable ensemble results due to the additional re-clustering. To address these issues, we propose a robust and fuzzy ensemble framework via spectral learning for RP-FCM clustering. After using random projection to generate datasets of different dimensionality and obtaining the membership matrices via fuzzy-c-means, we first convert these membership matrices into regularized graphs and approximate the affinity matrices of these graphs by spectral matrices. This step not only avoids the alignment problems of membership matrices but also avoids storing large-scale graphs. The spectral matrices of the same size are used as the features of membership matrices for the ensemble, avoiding the possible information loss from applying co-similar matrix transformations. More importantly, an optimization model is designed in our framework to learn the fusion of spectral features. In this model, the proportion of each base clustering is adjusted adaptively through a fuzzification exponent, and the effect of outliers is also suppressed by a robust norm. Finally, the Laplacian rank constraint in the model guarantees that the ensemble achieves the exact final partition. An efficient algorithm for this model is derived, and its time complexity and convergence are also analyzed. Competitive experimental results on benchmark data demonstrate the effectiveness of the proposed ensemble framework in comparison to state-of-the-art methods.

ICML Conference 2023 Conference Paper

Uncertainty Estimation by Fisher Information-based Evidential Deep Learning

  • Danruo Deng
  • Guangyong Chen
  • Yang Yu
  • Furui Liu
  • Pheng-Ann Heng

Uncertainty estimation is a key factor that makes deep learning reliable in practical applications. Recently proposed evidential neural networks explicitly account for different uncertainties by treating the network’s outputs as evidence to parameterize a Dirichlet distribution, and achieve impressive performance in uncertainty estimation. However, for samples with high data uncertainty but annotated with a one-hot label, the evidence-learning process for the mislabeled classes is over-penalized and remains hindered. To address this problem, we propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL). In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network focus more on the representation learning of uncertain classes. The generalization ability of our network is further improved by optimizing the PAC-Bayesian bound. As demonstrated empirically, our proposed method consistently outperforms traditional EDL-related algorithms in multiple uncertainty estimation tasks, especially in the more challenging few-shot classification settings.
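The evidential head this method builds on follows the standard EDL parameterization, which is easy to sketch (the FIM-based reweighting itself is not shown here). A NumPy illustration under the usual alpha = evidence + 1 convention; the function name is an assumption:

```python
import numpy as np

def dirichlet_uncertainty(evidence):
    """EDL head: non-negative evidence parameterizes a Dirichlet with
    alpha = evidence + 1. Returns (expected class probabilities, vacuity
    uncertainty K / sum(alpha)); uncertainty approaches 1 with no evidence."""
    alpha = evidence + 1.0
    strength = alpha.sum(axis=-1, keepdims=True)
    prob = alpha / strength
    uncertainty = evidence.shape[-1] / strength.squeeze(-1)
    return prob, uncertainty
```

Unlike a softmax head, this separates the predicted distribution from how much total evidence backs it, which is what lets EDL distinguish types of uncertainty.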

IROS Conference 2022 Conference Paper

Pseudo-label Guided Cross-video Pixel Contrast for Robotic Surgical Scene Segmentation with Limited Annotations

  • Yang Yu 0070
  • Zixu Zhao
  • Yueming Jin
  • Guangyong Chen
  • Qi Dou 0001
  • Pheng-Ann Heng

Surgical scene segmentation is fundamentally crucial for prompting cognitive assistance in robotic surgery. However, annotating surgical video pixel-wise in a frame-by-frame manner is expensive and time-consuming. To greatly reduce the labeling burden, in this work, we study semi-supervised scene segmentation from robotic surgical video, which is practically essential yet rarely explored before. We consider a clinically suitable annotation situation under equidistant sampling. We then propose PGV-CL, a novel pseudo-label guided cross-video contrast learning method to boost scene segmentation. It effectively leverages unlabeled data for trusty and global model regularization that produces more discriminative feature representations. Concretely, for trusty representation learning, we propose to incorporate pseudo labels to instruct pair selection, obtaining more reliable representation pairs for pixel contrast. Moreover, we expand the representation learning space from the previous image level to cross-video, which can capture global semantics to benefit the learning process. We extensively evaluate our method on the public robotic surgery dataset EndoVis18 and the public cataract dataset CaDIS. Experimental results demonstrate the effectiveness of our method, consistently outperforming state-of-the-art semi-supervised methods under different labeling ratios, and even surpassing fully supervised training on EndoVis18 with 10.1% labeling. Our code is available at https://github.com/yangyu-cuhk/PGV-CL.

NeurIPS Conference 2022 Conference Paper

Transformer-based Working Memory for Multiagent Reinforcement Learning with Action Parsing

  • Yaodong Yang
  • Guangyong Chen
  • Weixun Wang
  • Xiaotian Hao
  • Jianye Hao
  • Pheng-Ann Heng

Learning in real-world multiagent tasks is challenging due to the usual partial observability of each agent. Previous efforts alleviate partial observability with historical hidden states in Recurrent Neural Networks; however, they do not consider multiagent characteristics such as observations consisting of a number of object entities or action spaces showing clear entity interactions. To tackle these issues, we propose the Agent Transformer Memory (ATM) network with a transformer-based memory. First, ATM utilizes the transformer to enable the unified processing of the factored environmental entities and memory. Inspired by the human working-memory process, where a limited capacity of information temporarily held in mind can effectively guide decision-making, ATM updates its fixed-capacity memory with the working-memory updating schema. Second, as each agent action has its particular interaction entities in the environment, ATM parses the action space to introduce the action’s semantic inductive bias by binding each action with its specified involved entity to predict the state-action value or logit. Extensive experiments on the challenging SMAC and Level-Based Foraging environments validate that ATM can boost existing multiagent RL algorithms with impressive learning acceleration and performance improvement.

AAAI Conference 2021 Conference Paper

Beyond Class-Conditional Assumption: A Primary Attempt to Combat Instance-Dependent Label Noise

  • Pengfei Chen
  • Junjie Ye
  • Guangyong Chen
  • Jingwei Zhao
  • Pheng-Ann Heng

Supervised learning under label noise has seen numerous advances recently, while existing theoretical findings and empirical results broadly build on the class-conditional noise (CCN) assumption that the noise is independent of input features given the true label. In this work, we present a theoretical hypothesis test and prove that noise in real-world datasets is unlikely to be CCN, which confirms that label noise should depend on the instance and justifies the urgent need to go beyond the CCN assumption. The theoretical results motivate us to study the more general and practically relevant instance-dependent noise (IDN). To stimulate the development of theory and methodology on IDN, we formalize an algorithm to generate controllable IDN and present both theoretical and empirical evidence to show that IDN is semantically meaningful and challenging. As a primary attempt to combat IDN, we present a tiny algorithm termed self-evolution average label (SEAL), which not only stands out under IDN with various noise fractions, but also improves the generalization on the real-world noise benchmark Clothing1M. Our code is released. Notably, our theoretical analysis in Section 2 provides rigorous motivations for studying IDN, which is an important topic that deserves more research attention in the future.

NeurIPS Conference 2021 Conference Paper

Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning

  • Danruo Deng
  • Guangyong Chen
  • Jianye Hao
  • Qiong Wang
  • Pheng-Ann Heng

Backpropagation networks are notably susceptible to catastrophic forgetting, where networks tend to forget previously learned skills upon learning new ones. To address this 'sensitivity-stability' dilemma, most previous efforts have been devoted to minimizing the empirical risk with different parameter regularization terms and episodic memory, but rarely explore the weight loss landscape. In this paper, we investigate the relationship between the weight loss landscape and sensitivity-stability in the continual learning scenario, based on which we propose a novel method, Flattening Sharpness for Dynamic Gradient Projection Memory (FS-DGPM). In particular, we introduce a soft weight to represent the importance of each basis representing past tasks in GPM, which can be adaptively learned during the learning process, so that less important bases can be dynamically released to improve the sensitivity of new-skill learning. We further introduce Flattening Sharpness (FS) to reduce the generalization gap by explicitly regulating the flatness of the weight loss landscape of all seen tasks. As demonstrated empirically, our proposed method consistently outperforms baselines with a superior ability to learn new skills while alleviating forgetting effectively.

AAAI Conference 2021 Conference Paper

Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction

  • Hongyao Tang
  • Zhaopeng Meng
  • Guangyong Chen
  • Pengfei Chen
  • Chen Chen
  • Yaodong Yang
  • Luo Zhang
  • Wulong Liu

The value function is the central notion of Reinforcement Learning (RL). Value estimation, especially with function approximation, can be challenging since it involves the stochasticity of environmental dynamics and reward signals that can be sparse and delayed in some cases. A typical model-free RL algorithm usually estimates the values of a policy by Temporal Difference (TD) or Monte Carlo (MC) algorithms directly from rewards, without explicitly taking dynamics into consideration. In this paper, we propose Value Decomposition with Future Prediction (VDFP), providing an explicit two-step understanding of the value estimation process: 1) first foresee the latent future, and 2) then evaluate it. We analytically decompose the value function into a latent future-dynamics part and a policy-independent trajectory-return part, inducing a way to model latent dynamics and returns separately in value estimation. Further, we derive a practical deep RL algorithm consisting of a convolutional model to learn compact trajectory representations from past experiences, a conditional variational auto-encoder to predict the latent future dynamics, and a convex return model that evaluates trajectory representations. In experiments, we empirically demonstrate the effectiveness of our approach for both off-policy and on-policy RL in several OpenAI Gym continuous control tasks as well as a few challenging variants with delayed reward.

ICLR Conference 2021 Conference Paper

Noise against noise: stochastic label noise helps combat inherent label noise

  • Pengfei Chen 0003
  • Guangyong Chen
  • Junjie Ye 0002
  • Jingwei Zhao
  • Pheng-Ann Heng

The noise in stochastic gradient descent (SGD) provides a crucial implicit regularization effect, previously studied in optimization by analyzing the dynamics of parameter updates. In this paper, we are interested in learning with noisy labels, where we have a collection of samples with potential mislabeling. We show that a previously rarely discussed SGD noise, induced by stochastic label noise (SLN), mitigates the effects of inherent label noise. In contrast, the common SGD noise directly applied to model parameters does not. We formalize the differences and connections of SGD noise variants, showing that SLN induces SGD noise dependent on the sharpness of output landscape and the confidence of output probability, which may help escape from sharp minima and prevent overconfidence. SLN not only improves generalization in its simplest form but also boosts popular robust training methods, including sample selection and label correction. Specifically, we present an enhanced algorithm by applying SLN to label correction. Our code is released.
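In its simplest form, stochastic label noise amounts to perturbing the one-hot targets with fresh zero-mean Gaussian noise at every SGD step. The sketch below illustrates that recipe on a softmax-regression toy problem; the model, `sigma`, learning rate, and data are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sln_step(W, X, y_onehot, lr=0.1, sigma=0.1, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    # fresh zero-mean Gaussian noise on the targets at every step (SLN)
    y_noisy = y_onehot + sigma * rng.normal(size=y_onehot.shape)
    p = softmax(X @ W)
    gradW = X.T @ (p - y_noisy) / len(X)  # cross-entropy gradient vs noisy target
    return W - lr * gradW

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))
labels = (X @ rng.normal(size=(5, 3))).argmax(1)  # linearly realizable labels
Y = np.eye(3)[labels]
W = np.zeros((5, 3))
for _ in range(100):
    W = sln_step(W, X, Y, rng=rng)
acc = (softmax(X @ W).argmax(1) == labels).mean()
```

Because the injected noise is zero-mean and resampled each step, its expectation leaves the gradient unbiased while perturbing individual updates, which is the mechanism the abstract credits with escaping sharp minima and preventing overconfidence.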

AAAI Conference 2021 Conference Paper

Robustness of Accuracy Metric and its Inspirations in Learning with Noisy Labels

  • Pengfei Chen
  • Junjie Ye
  • Guangyong Chen
  • Jingwei Zhao
  • Pheng-Ann Heng

For multi-class classification under class-conditional label noise, we prove that the accuracy metric itself can be robust. We concretize this finding’s inspiration in two essential aspects: training and validation, with which we address critical issues in learning with noisy labels. For training, we show that maximizing training accuracy on sufficiently many noisy samples yields an approximately optimal classifier. For validation, we prove that a noisy validation set is reliable, addressing the critical demand of model selection in scenarios like hyperparameter tuning and early stopping. Previously, model selection using noisy validation samples has not been theoretically justified. We verify our theoretical results and additional claims with extensive experiments. We show characterizations of models trained with noisy labels, motivated by our theoretical results, and verify the utility of a noisy validation set by showing the impressive performance of a framework termed noisy best teacher and student (NTS). Our code is released.

ICML Conference 2020 Conference Paper

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

  • Yaodong Yang 0002
  • Jianye Hao
  • Guangyong Chen
  • Hongyao Tang
  • Yingfeng Chen
  • Yujing Hu
  • Changjie Fan
  • Zhongyu Wei

Recently, deep multiagent reinforcement learning (MARL) has become a highly active research area, as many real-world problems can be inherently viewed as multiagent systems. A particularly interesting and widely applicable class of problems is the partially observable cooperative multiagent setting, in which a team of agents learns to coordinate their behaviors conditioned on their private observations and a commonly shared global reward signal. One natural solution is the centralized training and decentralized execution paradigm; during centralized training, a key challenge is multiagent credit assignment: how to allocate the global rewards to individual agent policies for better coordination toward maximizing system-level benefits. In this paper, we propose a new method called Q-value Path Decomposition (QPD) to decompose the system’s global Q-values into individual agents’ Q-values. Unlike previous works, which restrict the representational relation between the individual Q-values and the global one, we bring the integrated gradient attribution technique into deep MARL to directly decompose global Q-values along trajectory paths and assign credits to agents. We evaluate QPD on the challenging StarCraft II micromanagement tasks and show that QPD achieves state-of-the-art performance in both homogeneous and heterogeneous multiagent scenarios compared with existing cooperative MARL algorithms.
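The integrated gradient attribution technique mentioned above can be sketched in a few lines: the gradient of a scalar output is integrated along a straight path from a baseline to the input, so the attributions sum to the output difference (the completeness axiom). The quadratic toy function below is an assumption for illustration, not QPD's global Q-network.

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=64):
    # midpoint-rule approximation of the path integral of the gradient
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# toy scalar "global Q": f(x) = sum of squares, so its gradient is 2x
f = lambda x: np.sum(x ** 2)
grad_f = lambda x: 2 * x
x = np.array([1.0, -2.0, 3.0])
attr = integrated_gradients(grad_f, x, np.zeros(3))
# completeness: attr.sum() == f(x) - f(baseline)
```

In the credit-assignment setting, each coordinate of `attr` would correspond to one agent's features, and its share of the global Q-value serves as that agent's credit.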

ICML Conference 2019 Conference Paper

Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels

  • Pengfei Chen 0003
  • Benben Liao
  • Guangyong Chen
  • Shengyu Zhang 0002

Noisy labels are ubiquitous in real-world datasets, which poses a challenge for robustly training deep neural networks (DNNs), as DNNs usually have a high capacity to memorize the noisy labels. In this paper, we find that the test accuracy can be quantitatively characterized in terms of the noise ratio in datasets. In particular, the test accuracy is a quadratic function of the noise ratio in the case of symmetric noise, which explains previously published experimental findings. Based on our analysis, we apply cross-validation to randomly split noisy datasets, which identifies most of the samples that have correct labels. We then adopt the Co-teaching strategy, which takes full advantage of the identified samples to train DNNs robustly against noisy labels. Compared with extensive state-of-the-art methods, our strategy consistently improves the generalization performance of DNNs under both synthetic and real-world training noise.
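The cross-validation idea can be sketched as follows: split the noisy set in two, fit a model on one half, and keep the samples in the other half whose given label agrees with the model's prediction. The nearest-centroid classifier, cluster geometry, and 30% noise rate below are illustrative stand-ins for the DNN and datasets used in the paper.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, X_test):
    # a deliberately simple classifier standing in for the trained DNN
    centroids = np.stack([X_train[y_train == c].mean(0) for c in np.unique(y_train)])
    d = ((X_test[:, None, :] - centroids[None]) ** 2).sum(-1)
    return d.argmin(1)

rng = np.random.default_rng(0)
n = 200
clean = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2)) + 3 * clean[:, None]  # two well-separated clusters
noisy = clean.copy()
flip = rng.random(n) < 0.3                        # 30% symmetric label noise
noisy[flip] ^= 1

half = n // 2
pred = nearest_centroid_predict(X[:half], noisy[:half], X[half:])
keep = pred == noisy[half:]                       # prediction matches given label
purity = (noisy[half:][keep] == clean[half:][keep]).mean()
```

Samples whose noisy label agrees with a model trained on the other split are mostly correctly labeled, so `purity` should exceed the raw 70% clean rate; the kept subset is what a Co-teaching-style procedure would then train on.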

TAAS Journal 2017 Journal Article

Efficient and Robust Emergence of Norms through Heuristic Collective Learning

  • Jianye Hao
  • Jun Sun
  • Guangyong Chen
  • Zan Wang
  • Chao Yu
  • Zhong Ming

In multiagent systems, social norms serve as an important technique for regulating agents’ behaviors to ensure effective coordination among agents without a centralized controlling mechanism. In such a distributed environment, it is important to investigate how a desirable social norm can be synthesized in a bottom-up manner among agents through repeated local interactions and learning techniques. In this article, we propose two novel learning strategies under the collective learning framework, collective learning EV-l and collective learning EV-g, to efficiently facilitate the emergence of social norms. Extensive simulation results show that both learning strategies can support the emergence of desirable social norms more efficiently and are applicable in a wider range of multiagent interaction scenarios compared with previous work. We investigate the influence of different topologies, showing that the performance of all strategies is robust across different network topologies. The influences of a number of key factors (neighborhood size, action space, population size, fixed agents, and isolated subpopulations) on norm emergence performance are investigated as well.

ICML Conference 2017 Conference Paper

Learning to Aggregate Ordinal Labels by Maximizing Separating Width

  • Guangyong Chen
  • Shengyu Zhang 0002
  • Di Lin 0002
  • Hui Huang 0004
  • Pheng-Ann Heng

While crowdsourcing has been a cost- and time-efficient method to label massive samples, one critical issue is quality control, for which the key challenge is to infer the ground truth from noisy or even adversarial data provided by various users. A large class of crowdsourcing problems, such as those involving age, grade, level, or stage, have an ordinal structure in their labels. Based on a technique of sampling estimated labels from the posterior distribution, we define a novel separating width among the labeled observations to characterize the quality of sampled labels, and develop an efficient algorithm to optimize it by solving multiple linear decision boundaries and adjusting prior distributions. Our algorithm is empirically evaluated on several real-world datasets and demonstrates its superiority over state-of-the-art methods.