Author name cluster

Shoujin Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

18 papers
1 author row

Possible papers

AAAI Conference 2026 Conference Paper

SAR: A Structure-Aligned Reasoning Framework for Temporal Knowledge Graph Question Answering

  • Qianyi Hu
  • Jiaxue Liu
  • Xinhui Tu
  • Shoujin Wang

Large language models (LLMs) augmented with retrieval have shown impressive performance in open-domain question answering, yet struggle significantly with temporal knowledge graph question answering (TKGQA). The core issue lies in structural misalignment: treating structured, temporally sensitive graph queries as plain text often causes LLMs to retrieve or reason with semantically similar but structurally incorrect facts, resulting in critical inaccuracies. To address this, we introduce SAR (Structure-Aligned Reasoning), a novel TKGQA framework that integrates LLM reasoning tightly with the explicit subject–predicate–object–time schema inherent in knowledge graphs. SAR employs an LLM agent to first decompose natural language questions into structured queries, clearly delineating entities, relationships, and temporal constraints. It then conducts schema-consistent, time-aware retrieval from the knowledge graph to acquire candidate quadruples, which guide a subsequent iterative ReAct-style reasoning process by the LLM. A final verification stage ensures that proposed answers strictly adhere to temporal conditions, reinforcing accuracy and temporal coherence. Experiments on two benchmark datasets, MultiTQ and CronQuestions, demonstrate SAR’s effectiveness, achieving the best results. Specifically, with GPT-4.1, SAR achieves 78.2% Hits@1 on MultiTQ, significantly outperforming existing methods, and similarly establishes a new performance record on CronQuestions. Our results underscore the critical importance of structural alignment in temporal reasoning tasks, particularly in handling complex queries involving multiple temporal constraints and multi-hop reasoning.
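The schema-consistent, time-aware retrieval step described above can be illustrated with a minimal sketch. This is not the SAR implementation: the `Quad` and `StructuredQuery` types, the `matches` and `retrieve` helpers, the year-level time granularity, and the toy facts are all assumptions made for illustration, and the LLM-based decomposition, ReAct-style reasoning, and verification stages are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Quad:
    """One subject-predicate-object-time fact (hypothetical layout)."""
    subject: str
    predicate: str
    obj: str
    year: int


@dataclass(frozen=True)
class StructuredQuery:
    """A decomposed question; None means the slot is unconstrained."""
    subject: Optional[str] = None
    predicate: Optional[str] = None
    obj: Optional[str] = None
    start_year: Optional[int] = None
    end_year: Optional[int] = None


def matches(fact: Quad, query: StructuredQuery) -> bool:
    """True if the fact agrees with every filled slot and the time window."""
    slots_ok = all(
        wanted is None or wanted == actual
        for wanted, actual in (
            (query.subject, fact.subject),
            (query.predicate, fact.predicate),
            (query.obj, fact.obj),
        )
    )
    after_start = query.start_year is None or fact.year >= query.start_year
    before_end = query.end_year is None or fact.year <= query.end_year
    return slots_ok and after_start and before_end


def retrieve(kg: List[Quad], query: StructuredQuery) -> List[Quad]:
    """Return candidate quadruples that are structurally and temporally valid."""
    return [fact for fact in kg if matches(fact, query)]


# Toy knowledge graph with made-up entities.
kg = [
    Quad("entity_a", "visits", "entity_b", 2010),
    Quad("entity_a", "visits", "entity_c", 2014),
    Quad("entity_a", "hosts", "entity_b", 2014),
    Quad("entity_d", "visits", "entity_b", 2014),
]
query = StructuredQuery(
    subject="entity_a", predicate="visits", start_year=2012, end_year=2015
)
candidates = retrieve(kg, query)
```

Filtering on explicit slots rather than on text similarity is what keeps a fact such as `("entity_a", "hosts", "entity_b", 2014)` out of the candidate set, even though it shares most of its tokens with the query.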

NeurIPS Conference 2025 Conference Paper

Revealing Multimodal Causality with Large Language Models

  • Jin Li
  • Shoujin Wang
  • Qi Zhang
  • Feng Liu
  • Tongliang Liu
  • Longbing Cao
  • Shui Yu
  • Fang Chen

Uncovering cause-and-effect mechanisms from data is fundamental to scientific progress. While large language models (LLMs) show promise for enhancing causal discovery (CD) from unstructured data, their application to the increasingly prevalent multimodal setting remains a critical challenge. Even with the advent of multimodal LLMs (MLLMs), their efficacy in multimodal CD is hindered by two primary limitations: (1) difficulty in exploring intra- and inter-modal interactions for comprehensive causal variable identification; and (2) insufficiency to handle structural ambiguities with purely observational data. To address these challenges, we propose MLLM-CD, a novel framework for multimodal causal discovery from unstructured data. It consists of three key components: (1) a novel contrastive factor discovery module to identify genuine multimodal factors based on the interactions explored from contrastive sample pairs; (2) a statistical causal structure discovery module to infer causal relationships among discovered factors; and (3) an iterative multimodal counterfactual reasoning module to refine the discovery outcomes iteratively by incorporating the world knowledge and reasoning capabilities of MLLMs. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed MLLM-CD in revealing genuine factors and causal relationships among them from multimodal unstructured data. The implementation code and data are available at https://github.com/JinLi-i/MLLM-CD.

NeurIPS Conference 2024 Conference Paper

NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction

  • Zixuan Gong
  • Guangyin Bao
  • Qi Zhang
  • Zhongwei Wan
  • Duoqian Miao
  • Shoujin Wang
  • Lei Zhu
  • Changwei Wang

Reconstruction of static visual stimuli from non-invasive brain activity (fMRI) has achieved great success, owing to advanced deep learning models such as CLIP and Stable Diffusion. However, research on fMRI-to-video reconstruction remains limited, since decoding the spatiotemporal perception of continuous visual experiences is formidably challenging. We contend that the key to addressing these challenges lies in accurately decoding both high-level semantics and low-level perception flows, as perceived by the brain in response to video stimuli. To this end, we propose NeuroClips, an innovative framework to decode high-fidelity and smooth video from fMRI. NeuroClips utilizes a semantics reconstructor to reconstruct video keyframes, guiding semantic accuracy and consistency, and employs a perception reconstructor to capture low-level perceptual details, ensuring video smoothness. During inference, it adopts a pre-trained T2V diffusion model injected with both keyframes and low-level perception flows for video reconstruction. Evaluated on a publicly available fMRI-video dataset, NeuroClips achieves smooth high-fidelity video reconstruction of up to 6s at 8FPS, gaining significant improvements over state-of-the-art models in various metrics, e.g., a 128% improvement in SSIM and an 81% improvement in spatiotemporal metrics. Our project is available at https://github.com/gongzix/NeuroClips.

TIST Journal 2024 Journal Article

Trustworthy Recommender Systems

  • Shoujin Wang
  • Xiuzhen Zhang
  • Yan Wang
  • Francesco Ricci

Recommender systems (RSs) aim to help users effectively retrieve items of interest from a large catalogue. For quite a long time, researchers and practitioners have focused on developing accurate RSs. Recent years have witnessed an increasing number of threats to RSs, coming from attacks, system- and user-generated noise, and various types of bias. As a result, it has become clear that the focus on RS accuracy is too narrow, and that research must consider other important factors, particularly trustworthiness. A trustworthy recommender system (TRS) should not only be accurate but also transparent, unbiased, fair, and robust to noise and attacks. These observations have led to a paradigm shift in RS research: from accuracy-oriented RSs to TRSs. However, there is a lack of a systematic overview and discussion of the literature in this novel and fast-developing field of TRSs. To this end, in this article, we provide an overview of TRSs, including a discussion of the motivation and basic concepts of TRSs, a presentation of the challenges in building TRSs, and a perspective on the future directions in this area. We also provide a novel conceptual framework to support the construction of TRSs.

AAMAS Conference 2023 Conference Paper

Efficient Interactive Recommendation via Huffman Tree-based Policy Learning

  • Longxiang Shi
  • Zilin Zhang
  • Shoujin Wang
  • Binbin Zhou
  • Minghui Wu
  • Cheng Yang
  • Shijian Li

Interactive recommender systems (IRSs) are an essential part of our daily life, as they can suggest items to persistently satisfy our demands. Due to their interactive nature, conventional static recommendation methods such as matrix factorization and content-based filtering are ineffective at capturing the dynamic preferences of users. Recently, reinforcement learning (RL) has shown great potential in addressing the challenges in IRSs, since it can capture users’ dynamic preferences and model the long-term profit of user-item interactions. However, millions of items in real-world IRSs lead to a large discrete action space in the RL setting, rendering RL-based IRSs inefficient and hindering their widespread application. This inefficiency issue has not been well addressed in the literature. To address it, we propose a novel Huffman Tree Policy Recommendation (HTPR) framework. Specifically, a novel policy learning network based on a newly designed Huffman tree is proposed for policy representation learning, which effectively improves the learning efficiency. Moreover, a novel parameter-sharing scheme is devised to further reduce unnecessary computations. Extensive experiments on two real-world benchmark datasets demonstrate the superiority of HTPR over the state-of-the-art IRS methods in terms of both recommendation accuracy and efficiency.
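As a rough illustration of why a Huffman tree can shrink a large discrete action space, the sketch below builds a Huffman code over items so that choosing one item becomes a short sequence of binary decisions. It is not the HTPR policy network: the `build_huffman_codes` helper, the use of interaction counts as tree weights, and the toy counts are assumptions made for illustration only.

```python
import heapq
from itertools import count


def build_huffman_codes(item_weights):
    """Return {item: bit-string} for a Huffman tree built over item weights."""
    tie = count()  # tie-breaker so heapq never has to compare the dict payloads
    heap = [(weight, next(tie), {item: ""}) for item, weight in item_weights.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single item still needs one (trivial) decision.
        return {item: "0" for item in heap[0][2]}
    while len(heap) > 1:
        w_left, _, left = heapq.heappop(heap)
        w_right, _, right = heapq.heappop(heap)
        # Prefix the two lightest subtrees with one more binary decision each.
        merged = {item: "0" + code for item, code in left.items()}
        merged.update({item: "1" + code for item, code in right.items()})
        heapq.heappush(heap, (w_left + w_right, next(tie), merged))
    return heap[0][2]


# Hypothetical interaction counts for four items.
popularity = {"item_a": 50, "item_b": 30, "item_c": 15, "item_d": 5}
codes = build_huffman_codes(popularity)
decisions_per_item = {item: len(code) for item, code in codes.items()}
```

Under this weighting, frequently chosen items sit near the root and need fewer binary decisions than rare ones, so the expected number of decisions per recommendation is lower than with a single flat choice over all items.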

NeurIPS Conference 2023 Conference Paper

Frequency-domain MLPs are More Effective Learners in Time Series Forecasting

  • Kun Yi
  • Qi Zhang
  • Wei Fan
  • Shoujin Wang
  • Pengyang Wang
  • Hui He
  • Ning An
  • Defu Lian

Time series forecasting has played a key role in different industries, including finance, traffic, energy, and healthcare. While the existing literature has designed many sophisticated architectures based on RNNs, GNNs, or Transformers, another kind of approach based on multi-layer perceptrons (MLPs) has been proposed with simple structure, low complexity, and superior performance. However, most MLP-based forecasting methods suffer from point-wise mappings and an information bottleneck, which largely hinder forecasting performance. To overcome this problem, we explore a novel direction of applying MLPs in the frequency domain for time series forecasting. We investigate the learned patterns of frequency-domain MLPs and discover two inherent characteristics that benefit forecasting: (i) global view: the frequency spectrum gives MLPs a complete view of signals and lets them learn global dependencies more easily, and (ii) energy compaction: frequency-domain MLPs concentrate on a smaller key part of the frequency components with compact signal energy. Then, we propose FreTS, a simple yet effective architecture built upon Frequency-domain MLPs for Time Series forecasting. FreTS mainly involves two stages: (i) Domain Conversion, which transforms time-domain signals into complex numbers in the frequency domain; and (ii) Frequency Learning, which applies our redesigned MLPs to learn the real and imaginary parts of the frequency components. These stages, operating on both inter-series and intra-series scales, further contribute to channel-wise and time-wise dependency learning. Extensive experiments on 13 real-world benchmarks (including 7 benchmarks for short-term forecasting and 6 benchmarks for long-term forecasting) demonstrate our consistent superiority over state-of-the-art methods. Code is available at this repository: https://github.com/aikunyi/FreTS.
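The two stages named above, domain conversion and learning on real and imaginary parts, can be sketched in plain Python. This is a simplified illustration rather than the FreTS architecture: the naive `dft`/`idft` helpers, the `frequency_layer` function with one complex weight per frequency (no bias, no nonlinearity, no dense mixing), and the toy cosine series are all assumptions.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform: time domain -> complex spectrum."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform: complex spectrum -> time domain."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n
        for t in range(n)
    ]


def frequency_layer(spectrum, w_real, w_imag):
    """Multiply each frequency by a complex weight, using only real arithmetic
    on the real and imaginary parts: (a + bi)(wr + wi*i)."""
    return [
        complex(c.real * wr - c.imag * wi, c.real * wi + c.imag * wr)
        for c, wr, wi in zip(spectrum, w_real, w_imag)
    ]


# Toy series: one full cosine cycle over 8 time steps.
series = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = dft(series)
magnitudes = [abs(c) for c in spectrum]

# Identity weights (1 + 0i) leave the spectrum unchanged, so the inverse
# transform recovers the original series.
mixed = frequency_layer(spectrum, [1.0] * 8, [0.0] * 8)
restored = [value.real for value in idft(mixed)]
```

For this series all of the spectral energy lands in two of the eight frequency bins, which is a small instance of the energy-compaction property: a layer acting on the spectrum only has to attend to a few components, and each component summarizes the whole time window.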

IJCAI Conference 2022 Conference Paper

A Probabilistic Code Balance Constraint with Compactness and Informativeness Enhancement for Deep Supervised Hashing

  • Qi Zhang
  • Liang Hu
  • Longbing Cao
  • Chongyang Shi
  • Shoujin Wang
  • Dora D. Liu

Building on deep representation learning, deep supervised hashing has achieved promising performance in tasks like similarity retrieval. However, conventional code balance constraints (i.e., bit balance and bit uncorrelation), imposed to avoid overfitting and improve hash code quality, are unsuitable for deep supervised hashing owing to their inefficiency and the impracticality of simultaneously learning deep data representations and hash functions. To address this issue, we propose probabilistic code balance constraints on deep supervised hashing to force each hash code to conform to a discrete uniform distribution. Accordingly, a Wasserstein regularizer aligns the distribution of generated hash codes to a uniform distribution. Theoretical analyses reveal that the proposed constraints form a general deep hashing framework for both bit balance and bit uncorrelation and for maximizing the mutual information between data inputs and their corresponding hash codes. Extensive empirical analyses on two benchmark datasets further demonstrate the enhanced compactness and informativeness of hash codes for deep supervised hashing, which improves retrieval performance (code available at: https://github.com/mumuxi/dshwr).

IJCAI Conference 2022 Conference Paper

Modeling Spatio-temporal Neighbourhood for Personalized Point-of-interest Recommendation

  • Xiaolin Wang
  • Guohao Sun
  • Xiu Fang
  • Jian Yang
  • Shoujin Wang

Point-of-interest (POI) recommendations can help users explore attractive locations and play an important role in location-based social networks (LBSNs). In POI recommendation, the results are largely affected by users' preferences. However, existing POI methods model users and locations almost separately, and so cannot capture users' personal and dynamic preferences for locations. In addition, they ignore users' acceptance of the distance/time of a location. To overcome the limitations of the existing methods, we first introduce a knowledge graph with temporal information (known as a TKG) into POI recommendation, including both users and locations with timestamps. Then, based on the TKG, we propose a Spatial-Temporal Graph Convolutional Attention Network (STGCAN), a novel network that learns users' preferences on the TKG by dynamically capturing spatial-temporal neighbourhoods. Specifically, in STGCAN, we construct receptive fields on the TKG to aggregate the neighbourhoods of users and locations, respectively, at each timestamp. We also measure the spatial-temporal interval as users' acceptance of distance/time with self-attention. Experiments on three real-world datasets demonstrate that the proposed model outperforms the state-of-the-art POI recommendation approaches.

IS Journal 2021 Journal Article

Concept Representation by Learning Explicit and Implicit Concept Couplings

  • Wenpeng Lu
  • Yuteng Zhang
  • Shoujin Wang
  • Heyan Huang
  • Qian Liu
  • Sheng Luo

Generating the precise semantic representation of a word or concept is a fundamental task in natural language processing. Recent studies which incorporate semantic knowledge into word embedding have shown their potential in improving the semantic representation of a concept. However, existing approaches only achieved limited performance improvement as they usually 1) model a word’s semantics from some explicit aspects while ignoring the intrinsic aspects of the word, 2) treat semantic knowledge as a supplement of word embeddings, and 3) consider partial relations between concepts while ignoring rich coupling relations between them, such as explicit concept co-occurrences in descriptive texts in a corpus as well as concept hyperlink relations in a knowledge network, and implicit couplings between concept co-occurrences and hyperlinks. In human consciousness, a concept is always associated with various couplings that exist within/between descriptive texts and knowledge networks, which inspires us to capture as many concept couplings as possible for building a more informative concept representation. We thus propose a neural coupled concept representation (CoupledCR) framework and its instantiation: a coupled concept embedding (CCE) model. CCE first learns two types of explicit couplings that are based on concept co-occurrences and hyperlink relations, respectively, and then learns a type of high-level implicit couplings between these two types of explicit couplings for better concept representation. Extensive experimental results on six real-world datasets show that CCE significantly outperforms eight state-of-the-art word embeddings and semantic representation methods.

IJCAI Conference 2021 Conference Paper

Graph Learning based Recommender Systems: A Review

  • Shoujin Wang
  • Liang Hu
  • Yan Wang
  • Xiangnan He
  • Quan Z. Sheng
  • Mehmet A. Orgun
  • Longbing Cao
  • Francesco Ricci

Recent years have witnessed the fast development of the emerging topic of Graph Learning based Recommender Systems (GLRS). GLRS mainly employ advanced graph learning approaches to model users’ preferences and intentions as well as items’ characteristics and popularity for Recommender Systems (RS). Differently from other approaches, including content-based filtering and collaborative filtering, GLRS are built on graphs where the important objects, e.g., users, items, and attributes, are either explicitly or implicitly connected. With the rapid development of graph learning techniques, exploring and exploiting homogeneous or heterogeneous relations in graphs is a promising direction for building more effective RS. In this paper, we provide a systematic review of GLRS, by discussing how they extract knowledge from graphs to improve the accuracy, reliability and explainability of the recommendations. First, we characterize and formalize GLRS, and then summarize and categorize the key challenges and main progress in this novel research area.

IS Journal 2021 Journal Article

Hierarchical Attentive Transaction Embedding With Intra- and Inter-Transaction Dependencies for Next-Item Recommendation

  • Shoujin Wang
  • Longbing Cao
  • Liang Hu
  • Shlomo Berkovsky
  • Xiaoshui Huang
  • Lin Xiao
  • Wenpeng Lu

A transaction-based recommender system (TBRS) aims to predict the next item by modeling dependencies in transactional data. Generally, two kinds of dependencies are considered: intra-transaction dependence and inter-transaction dependence. Most existing TBRSs recommend the next item by only modeling the intra-transaction dependence within the current transaction while ignoring inter-transaction dependence with recent transactions that may also affect the next item. However, as not all recent transactions are relevant to the current and next items, the relevant ones should be identified and prioritized. In this article, we propose a novel hierarchical attentive transaction embedding (HATE) model to tackle these issues. Specifically, a two-level attention mechanism integrates both item embedding and transaction embedding to build an attentive context representation that incorporates both intra- and inter-transaction dependencies. With the learned context representation, HATE then recommends the next item. Experimental evaluations on two real-world transaction datasets show that HATE significantly outperforms the state-of-the-art methods in terms of recommendation accuracy.

AAAI Conference 2020 Conference Paper

Intention Nets: Psychology-Inspired User Choice Behavior Modeling for Next-Basket Prediction

  • Shoujin Wang
  • Liang Hu
  • Yan Wang
  • Quan Z. Sheng
  • Mehmet Orgun
  • Longbing Cao

Human behaviors are complex and are often observed as a sequence of heterogeneous actions. In this paper, we take user choices for shopping baskets as a typical case to study the complexity of user behaviors. Most existing approaches model user behaviors in a mechanical way, namely treating a user action sequence as homogeneous sequential data, such as hourly temperatures, which fails to consider the complexity in user behaviors. In fact, users’ choices are driven by certain underlying intentions (e.g., feeding the baby or relieving pain) according to psychological theories. Moreover, the durations of the intentions that drive user actions are quite different; some of them may be persistent while others may be transient. According to psychological theories, we develop a hierarchical framework to describe the goal, intentions and action sequences, based on which we design Intention Nets (IntNet). In IntNet, multiple Action Chain Nets are constructed to model the user actions driven by different intentions, and a specially designed Persistent-Transient Intention Unit models the different intention durations. We apply IntNet to next-basket prediction, a recent challenging task in recommender systems. Extensive experiments on real-world datasets show the superiority of our psychology-inspired model IntNet over the state-of-the-art approaches.

IJCAI Conference 2020 Conference Paper

Intention2Basket: A Neural Intention-driven Approach for Dynamic Next-basket Planning

  • Shoujin Wang
  • Liang Hu
  • Yan Wang
  • Quan Z. Sheng
  • Mehmet Orgun
  • Longbing Cao

User purchase behaviours are complex and dynamic, and are usually observed as multiple choice actions across a sequence of shopping baskets. Most of the existing next-basket prediction approaches model user actions as homogeneous sequence data without considering complex and heterogeneous user intentions, impeding a deep understanding of user behaviours from the perspective of inner human drivers and thus reducing the prediction performance. Psychological theories have indicated that user actions are essentially driven by certain underlying intentions (e.g., diet and entertainment). Moreover, different intentions may influence each other, while different choices usually have different utilities for accomplishing an intention. Inspired by such psychological insights, we formalize next-basket prediction as an Intention Recognition, Modelling and Accomplishing problem and further design the Intention2Basket (Int2Ba for short) model. In Int2Ba, an Intention Recognizer, a Coupled Intention Chain Net, and a Dynamic Basket Planner are specifically designed to respectively recognize, model and accomplish the heterogeneous intentions behind a sequence of baskets to better plan the next basket. Extensive experiments on real-world datasets show the superiority of Int2Ba over the state-of-the-art approaches.

IJCAI Conference 2019 Conference Paper

Modeling Multi-Purpose Sessions for Next-Item Recommendations via Mixture-Channel Purpose Routing Networks

  • Shoujin Wang
  • Liang Hu
  • Yan Wang
  • Quan Z. Sheng
  • Mehmet Orgun
  • Longbing Cao

A session-based recommender system (SBRS) suggests the next item by modeling the dependencies between items in a session. Most existing SBRSs assume the items inside a session are associated with one (implicit) purpose. However, this may not always be true in reality, and a session may often consist of multiple subsets of items for different purposes (e.g., breakfast and decoration). Specifically, items (e.g., bread and milk) in a subset have strong purpose-specific dependencies whereas items (e.g., bread and vase) from different subsets have much weaker or even no dependencies due to the difference of purposes. Therefore, we propose a mixture-channel model to accommodate the multi-purpose item subsets for more precisely representing a session. Filling gaps in existing SBRSs, this model recommends more diverse items to satisfy different purposes. Accordingly, we design effective mixture-channel purpose routing networks (MCPRN) with a purpose routing network to detect the purposes of each item and assign it to the corresponding channels. Moreover, a purpose-specific recurrent network is devised to model the dependencies between items within each channel for a specific purpose. The experimental results show the superiority of MCPRN over the state-of-the-art methods in terms of both recommendation accuracy and diversity.

IJCAI Conference 2019 Conference Paper

Sequential Recommender Systems: Challenges, Progress and Prospects

  • Shoujin Wang
  • Liang Hu
  • Yan Wang
  • Longbing Cao
  • Quan Z. Sheng
  • Mehmet Orgun

The emerging topic of sequential recommender systems (SRSs) has attracted increasing attention in recent years. Different from the conventional recommender systems (RSs) including collaborative filtering and content-based filtering, SRSs try to understand and model the sequential user behaviors, the interactions between users and items, and the evolution of users’ preferences and item popularity over time. SRSs involve the above aspects for more precise characterization of user contexts, intent and goals, and item consumption trend, leading to more accurate, customized and dynamic recommendations. In this paper, we provide a systematic review on SRSs. We first present the characteristics of SRSs, and then summarize and categorize the key challenges in this research area, followed by the corresponding research progress consisting of the most recent and representative developments on this topic. Finally, we discuss the important research directions in this vibrant area.

AAAI Conference 2018 Conference Paper

Attention-Based Transactional Context Embedding for Next-Item Recommendation

  • Shoujin Wang
  • Liang Hu
  • Longbing Cao
  • Xiaoshui Huang
  • Defu Lian
  • Wei Liu

To recommend the next item to a user in a transactional context is practical yet challenging in applications such as marketing campaigns. Transactional context refers to the items that are observable in a transaction. Most existing transaction-based recommender systems (TBRSs) make recommendations by mainly considering recently occurring items instead of all the ones observed in the current context. Moreover, they often assume a rigid order between items within a transaction, which is not always practical. More importantly, a long transaction often contains many items irrelevant to the next choice, which tends to overwhelm the influence of a few truly relevant ones. Therefore, we posit that a good TBRS should not only consider all the observed items in the current transaction but also weight them with different relevance to build an attentive context that outputs the proper next item with a high probability. To this end, we design an effective attention-based transaction embedding model (ATEM) for context embedding to weight each observed item in a transaction without assuming order. The empirical study on real-world transaction datasets shows that ATEM significantly outperforms the state-of-the-art methods in terms of both accuracy and novelty.
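The idea of weighting every observed item by relevance, without assuming an order, can be sketched with a softmax attention over item embeddings. This is only an illustration, not the ATEM model: the `attentive_context` helper, the dot-product scoring against a single attention vector, and the toy two-dimensional embeddings are assumptions.

```python
import math


def attentive_context(item_embeddings, attention_vector):
    """Return (weights, context): softmax relevance weights over the items
    and the weighted sum of their embeddings."""
    scores = [
        sum(e * a for e, a in zip(embedding, attention_vector))
        for embedding in item_embeddings
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]
    dim = len(attention_vector)
    context = [
        sum(w * embedding[d] for w, embedding in zip(weights, item_embeddings))
        for d in range(dim)
    ]
    return weights, context


# Toy embeddings for three items observed in one transaction.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention = [2.0, 0.0]
weights, context = attentive_context(embeddings, attention)

# Reordering the items permutes the weights but leaves the context unchanged.
_, context_reversed = attentive_context(embeddings[::-1], attention)
```

Because the context is a weighted sum, it does not depend on the position of an item in the transaction, and low-scoring items contribute little even when there are many of them.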

IJCAI Conference 2017 Conference Paper

Diversifying Personalized Recommendation with User-session Context

  • Liang Hu
  • Longbing Cao
  • Shoujin Wang
  • Guandong Xu
  • Jian Cao
  • Zhiping Gu

Recommender systems (RS) have become an integral part of our daily life. However, most current RS repeatedly recommend items to users with similar profiles. We argue that recommendation should be diversified by leveraging session contexts with personalized user profiles. For this, current session-based RS (SBRS) often assume a rigidly ordered sequence over data, which does not fit many real-world cases. Moreover, personalization is often omitted in current SBRS. Accordingly, a personalized SBRS over relaxedly ordered user-session contexts is more pragmatic. In doing so, deep-structured models tend to be too complex to serve online SBRS owing to the large number of users and items. Therefore, we design an efficient SBRS with shallow wide-in-wide-out networks, inspired by the successful experience in modern language modeling. The experiments on a real-world e-commerce dataset show the superiority of our model over the state-of-the-art methods.