Author name cluster

Yangyang Cheng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

1 author row

IJCAI Conference 2020 Conference Paper

Feature Augmented Memory with Global Attention Network for VideoQA

Jiayin Cai
Chun Yuan
Cheng Shi
Lei Li
Yangyang Cheng
Ying Shan

Recently, Recurrent Neural Network (RNN) based methods and Self-Attention (SA) based methods have achieved promising performance in Video Question Answering (VideoQA). Despite the success of these works, RNN-based methods tend to forget the global semantic contents due to the inherent drawbacks of the recurrent units themselves, while SA-based methods cannot precisely capture the dependencies of the local neighborhood, leading to insufficient modeling for temporal order. To tackle these problems, we propose a novel VideoQA framework which progressively refines the representations of videos and questions from fine to coarse grain in a sequence-sensitive manner. Specifically, our model improves the feature representations via the following two steps: (1) introducing two fine-grained feature-augmented memories to strengthen the information augmentation of video and text which can improve memory capacity by memorizing more relevant and targeted information. (2) appending the self-attention and co-attention module to the memory output thus the module is able to capture global interaction between high-level semantic informations. Experimental results show that our approach achieves state-of-the-art performance on VideoQA benchmark datasets.

PDF Details DOI

AAAI Conference 2020 Conference Paper

Self-Attention ConvLSTM for Spatiotemporal Prediction

Zhihui Lin
Maomao Li
Zhuobin Zheng
Yangyang Cheng
Chun Yuan

Spatiotemporal prediction is challenging due to the complex dynamic motion and appearance changes. Existing work concentrates on embedding additional cells into the standard ConvLSTM to memorize spatial appearances during the prediction. These models always rely on the convolution layers to capture the spatial dependence, which are local and inef- ﬁcient. However, long-range spatial dependencies are significant for spatial applications. To extract spatial features with both global and local dependencies, we introduce the selfattention mechanism into ConvLSTM. Speciﬁcally, a novel self-attention memory (SAM) is proposed to memorize features with long-range dependencies in terms of spatial and temporal domains. Based on the self-attention, SAM can produce features by aggregating features across all positions of both the input itself and memory features with pair-wise similarity scores. Moreover, the additional memory is updated by a gating mechanism on aggregated features and an established highway with the memory of the previous time step. Therefore, through SAM, we can extract features with longrange spatiotemporal dependencies. Furthermore, we embed the SAM into a standard ConvLSTM to construct a selfattention ConvLSTM (SA-ConvLSTM) for the spatiotemporal prediction. In experiments, we apply the SA-ConvLSTM to perform frame prediction on the MovingMNIST and KTH datasets and trafﬁc ﬂow prediction on the TexiBJ dataset. Our SA-ConvLSTM achieves state-of-the-art results on both datasets with fewer parameters and higher time efﬁciency than previous state-of-the-art method.

PDF Details

AAAI Conference 2019 Conference Paper

Self-Supervised Mixture-of-Experts by Uncertainty Estimation

Zhuobin Zheng
Chun Yuan
Xinrui Zhu
Zhihui Lin
Yangyang Cheng
Cheng Shi
Jiahui Ye

Learning related tasks in various domains and transferring exploited knowledge to new situations is a significant challenge in Reinforcement Learning (RL). However, most RL algorithms are data inefficient and fail to generalize in complex environments, limiting their adaptability and applicability in multi-task scenarios. In this paper, we propose Self- Supervised Mixture-of-Experts (SUM), an effective algorithm driven by predictive uncertainty estimation for multitask RL. SUM utilizes a multi-head agent with shared parameters as experts to learn a series of related tasks simultaneously by Deep Deterministic Policy Gradient (DDPG). Each expert is extended by predictive uncertainty estimation on known and unknown states to enhance the Q-value evaluation capacity against overfitting and the overall generalization ability. These enable the agent to capture and diffuse the common knowledge across different tasks improving sample efficiency in each task and the effectiveness of expert scheduling across multiple tasks. Instead of task-specific design as common MoEs, a self-supervised gating network is adopted to determine a potential expert to handle each interaction from unseen environments and calibrated completely by the uncertainty feedback from the experts without explicit supervision. To alleviate the imbalanced expert utilization as the crux of MoE, optimization is accomplished via decayedmasked experience replay, which encourages both diversification and specialization of experts during different periods. We demonstrate that our approach learns faster and achieves better performance by efficient transfer and robust generalization, outperforming several related methods on extended OpenAI Gym’s MuJoCo multi-task environments.

PDF Details

IJCAI Conference 2018 Conference Paper

Self-Adaptive Double Bootstrapped DDPG

Zhuobin Zheng
Chun Yuan
Zhihui Lin
Yangyang Cheng
Hanghao Wu

Deep Deterministic Policy Gradient (DDPG) algorithm has been successful for state-of-the-art performance in high-dimensional continuous control tasks. However, due to the complexity and randomness of the environment, DDPG tends to suffer from inefficient exploration and unstable training. In this work, we propose Self-Adaptive Double Bootstrapped DDPG (SOUP), an algorithm that extends DDPG to bootstrapped actor-critic architecture. SOUP improves the efficiency of exploration by multiple actor heads capturing more potential actions and multiple critic heads evaluating more reasonable Q-values collaboratively. The crux of double bootstrapped architecture is to tackle the fluctuations in performance, caused by multiple heads of spotty capacity varying throughout training. To alleviate the instability, a self-adaptive confidence mechanism is introduced to dynamically adjust the weights of bootstrapped heads and enhance the ensemble performance effectively and efficiently. We demonstrate that SOUP achieves faster learning by at least 45% while improving cumulative reward and stability substantially in comparison to vanilla DDPG on OpenAI Gym's MuJoCo environments.

PDF Details