Author name cluster

Guoli Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers

2 author rows

ECAI Conference 2025 Conference Paper

AIRES: A General Framework for Efficient Intrinsic Rewards Based on Attention Mechanisms

Xin Liu
Jie Tan
Li Shen
Xu Wang
Guoli Wu
Xiaoguang Ren
Huadong Dai

Efficient exploration in high-dimensional observation spaces remains a critical challenge in deep reinforcement learning, particularly in scenarios with sparse extrinsic rewards. A promising approach is to encourage exploration by estimating intrinsic rewards based on the novelty of observations. However, there is a gap between the observed novelty and the actual effectiveness of exploration, as both environmental stochasticity and the agent’s actions may influence observations. To accurately evaluate the novelty contributed by agent exploration in intrinsic rewards, we propose the AIRES (Attention-driven Intrinsic Reward for Exploration Strategy) framework. AIRES leverages attention mechanisms to analyze the relationships within trajectory sequences generated by agent-environment interactions, employing attention weights to quantify the relevance of observations to actions. By applying attention weights to intrinsic rewards, the novelty brought by agent exploration is enhanced and the impact of environmental stochasticity is reduced. Extensive experiments demonstrate that AIRES significantly enhances the performance of prominent intrinsic reward methods, establishing it as a robust and scalable solution for efficient exploration.
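The reweighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how attention weights are computed or normalized, so the softmax normalization, the rescaling by trajectory length, and the names `softmax`, `reweight_intrinsic_rewards`, and `relevance_scores` are all assumptions made for illustration. The relevance scores stand in for whatever an attention model would produce.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweight_intrinsic_rewards(intrinsic_rewards, relevance_scores):
    """Scale per-step novelty rewards by attention-style weights.

    intrinsic_rewards: novelty bonus for each step of a trajectory.
    relevance_scores:  hypothetical raw scores for how strongly each
                       observation relates to the agent's actions
                       (assumed to come from an attention model).
    """
    if len(intrinsic_rewards) != len(relevance_scores):
        raise ValueError("rewards and scores must have the same length")
    if not intrinsic_rewards:
        return []
    weights = softmax(relevance_scores)
    n = len(weights)
    # Multiply by n so the average weight is 1 (an assumed convention):
    # uniform scores then leave the rewards unchanged.
    return [r * w * n for r, w in zip(intrinsic_rewards, weights)]


if __name__ == "__main__":
    novelty = [0.5, 0.5, 0.5]
    scores = [0.0, 2.0, -1.0]  # step 1 is most action-relevant
    print(reweight_intrinsic_rewards(novelty, scores))
```

Under these assumptions, steps whose observations are more strongly tied to the agent's actions keep a larger share of the novelty bonus, while steps attributed mostly to environmental noise are down-weighted.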


AAAI Conference 2025 Conference Paper

Contrastive Representation for Interactive Recommendation

Jingyu Li
Zhiyong Feng
Dongxiao He
Hongqi Chen
Qinghang Gao
Guoli Wu

Interactive Recommendation (IR) has gained significant attention recently for its capability to quickly capture dynamic interest and optimize both short- and long-term objectives. IR agents are typically implemented through Deep Reinforcement Learning (DRL), because DRL is inherently compatible with the dynamic nature of IR. However, DRL is not yet well suited to IR: the large action space and poor sample efficiency make training DRL recommender agents challenging. The key point is that useful features cannot be extracted as high-quality representations for the recommender agent to optimize its policy. To tackle this problem, we propose Contrastive Representation for Interactive Recommendation (CRIR). CRIR efficiently extracts latent, high-level preference ranking features from explicit interaction, and leverages the features to enhance users’ representation. Specifically, CRIR provides representation through one representation network, and refines it through our proposed Preference Ranking Contrastive Learning (PRCL). The key insight of PRCL is that it can perform contrastive learning without relying on computations involving high-level representations or large potential action sets. Furthermore, we propose a data exploiting mechanism and an agent training mechanism to better adapt CRIR to the DRL backbone. Extensive experiments demonstrate our method's superior sample efficiency when training a DRL-based IR agent.
