Arrow Research search

Author name cluster

Ruochen Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
1 author row

Possible papers (6)

AAAI Conference 2026 Conference Paper

F.A.C.U.L.: Language-Based Interaction with AI Companions in Gaming

  • Wenya Wei
  • Sipeng Yang
  • Qixian Zhou
  • Ruochen Liu
  • Xuelei Zhang
  • Yifu Yuan
  • Yan Jiang
  • Yongle Luo

In cooperative video games, traditional AI companions are deployed to assist players, who control them using hotkeys or command wheels to issue predefined commands such as "attack", "defend", or "retreat". Despite their simplicity, these methods lack target specificity, limiting players' ability to give complex tactical instructions and hindering immersive gameplay. To address this, we propose the FPS AI Companion who Understands Language (F.A.C.U.L.), the first real-time AI system that enables players to communicate and collaborate with AI companions using natural language. By integrating natural language processing with a confidence-based framework, F.A.C.U.L. efficiently decomposes complex commands and interprets player intent. It also employs a dynamic entity retrieval method for environmental awareness, aligning human intentions with decision-making. Unlike traditional rule-based systems, our method supports real-time language interaction, enabling players to issue complex commands such as "clear the second floor", "take cover behind that tree", or "retreat to the river". The system provides real-time behavioral responses and vocal feedback, ensuring seamless tactical collaboration. Using the popular FPS game Arena Breakout: Infinite as a case study, we present comparisons demonstrating the efficacy of our approach and discuss the advantages and limitations of AI companions based on real-world user feedback.

NeurIPS Conference 2024 Conference Paper

Fine Tuning Out-of-Vocabulary Item Recommendation with User Sequence Imagination

  • Ruochen Liu
  • Hao Chen
  • Yuanchen Bei
  • Qijie Shen
  • Fangwei Zhong
  • Senzhang Wang
  • Jianxin Wang

Recommending out-of-vocabulary (OOV) items is a challenging problem since in-vocabulary (IV) items have well-trained behavioral embeddings while OOV items only have content features. Current OOV recommendation models often generate "makeshift" embeddings for OOV items from content features and then jointly recommend with the "makeshift" OOV item embeddings and the behavioral IV item embeddings. However, merely using the "makeshift" embeddings results in suboptimal recommendation performance due to the substantial gap between the content features and the behavioral embeddings. To bridge the gap, we propose a novel User Sequence IMagination (USIM) fine-tuning framework, which first imagines the user sequences and then refines the generated OOV embeddings with the user behavioral embeddings. Specifically, we frame user sequence imagination as a reinforcement learning problem and develop a recommendation-focused reward function that evaluates to what extent a user can help recommend the OOV items. Besides, we propose an embedding-driven transition function to model the embedding transition after imagining a user. USIM has been deployed on a prominent e-commerce platform for months, offering recommendations for millions of OOV items and billions of users. Extensive experiments demonstrate that USIM outperforms traditional generative models in OOV item recommendation across both traditional and GNN-based collaborative filtering models.
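As a rough, assumption-laden illustration of the RL framing this abstract describes (a recommendation-focused reward plus an embedding-driven transition), the following toy sketch greedily imagines the most helpful user and refines an OOV embedding toward it. All names, shapes, and the greedy policy are hypothetical, not the paper's actual method:

```python
import numpy as np

# Toy sketch of the USIM-style loop (illustrative assumptions throughout).
rng = np.random.default_rng(0)
dim = 8
user_embs = rng.normal(size=(100, dim))  # behavioral embeddings of known users
oov_emb = rng.normal(size=dim)           # "makeshift" OOV embedding from content features

def reward(oov_vec, user_vec):
    # Recommendation-focused reward (assumed here to be cosine similarity):
    # how well the imagined user's behavioral embedding aligns with the OOV item.
    return float(user_vec @ oov_vec /
                 (np.linalg.norm(user_vec) * np.linalg.norm(oov_vec)))

def transition(oov_vec, user_vec, lr=0.1):
    # Embedding-driven transition: nudge the OOV embedding toward the
    # imagined user's behavioral embedding.
    return oov_vec + lr * (user_vec - oov_vec)

# Greedy rollout: repeatedly imagine the highest-reward user and refine.
for _ in range(5):
    rewards = [reward(oov_emb, u) for u in user_embs]
    best = int(np.argmax(rewards))
    oov_emb = transition(oov_emb, user_embs[best])
```

The real framework would learn a policy rather than act greedily, but the sketch shows how a reward over imagined users can drive OOV embeddings toward the behavioral embedding space.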

IJCAI Conference 2024 Conference Paper

SaSDim: Self-Adaptive Noise Scaling Diffusion Model for Spatial Time Series Imputation

  • Shunyang Zhang
  • Senzhang Wang
  • Xianzhen Tan
  • Renzhi Wang
  • Ruochen Liu
  • Jian Zhang
  • Jianxin Wang

Spatial time series imputation is of great importance to various real-world applications. As state-of-the-art generative models, diffusion models (e.g., CSDI) have outperformed statistical and autoregressive models in time series imputation. However, diffusion models may introduce unstable noise owing to the inherent uncertainty in sampling, causing the generated noise to deviate from the intended Gaussian distribution and, consequently, the imputed data to deviate from the real data. To this end, we propose a Self-adaptive noise Scaling Diffusion Model named SaSDim for spatial time series imputation. Specifically, we introduce a novel Probabilistic High-Order SDE Solver Module that scales the noise to follow the standard Gaussian distribution. The noise scaling operation helps the noise prediction module of the diffusion model estimate the variance of the noise more accurately. To effectively learn the spatial and temporal features, we also propose a Spatial-guided Global Convolution Module (SgGConv) that learns multi-periodic temporal dependencies via the Fast Fourier Transform and dynamic spatial dependencies via dynamic graph convolution. Extensive experiments conducted on three real-world spatial time series datasets verify the effectiveness of SaSDim.
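The core noise-scaling intuition in this abstract, rescaling sampled noise so it matches the standard Gaussian the diffusion process assumes, can be caricatured in a few lines. This is a deliberately simplified standardization, not SaSDim's probabilistic high-order SDE solver:

```python
import numpy as np

# Illustrative only: sampled noise whose scale has drifted from N(0, 1).
rng = np.random.default_rng(2)
eps = 1.3 * rng.normal(size=10_000) + 0.2   # "unstable" noise, wrong mean/scale

# Scale it back toward the standard Gaussian the diffusion model expects,
# so the noise-prediction module sees noise with the intended variance.
eps_scaled = (eps - eps.mean()) / eps.std()
```

In the actual model this correction is learned and applied inside the SDE solver at each diffusion step rather than as a one-shot empirical standardization.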

IS Journal 2022 Journal Article

A Stable Deep Reinforcement Learning Framework for Recommendation

  • Ruochen Liu
  • Dawei Jiang
  • Xilong Zhang

Recommender systems (RSs) address the problem of information overload, which is crucial in industrial settings. Recently, combining reinforcement learning (RL) with RSs has attracted researchers' attention. These new methods model the interaction between the RS and users as a sequential decision-making process. However, such studies suffer from several disadvantages: 1) they fail to model the accumulated long-term interest tied to high reward, and 2) the algorithms require large amounts of interaction data to learn a good strategy and are unstable in recommendation scenarios. In this article, we propose a stable reinforcement learning framework for recommendation. We redefine the Markov decision process of RL-based recommendation and add a stable module to model users' high-feedback behavior. Second, an advanced RL algorithm is introduced to ensure stability and exploration. Experiments verify the effectiveness of the proposed algorithm.

NeurIPS Conference 2022 Conference Paper

MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing

  • Zelun Luo
  • Zane Durante
  • Linden Li
  • Wanze Xie
  • Ruochen Liu
  • Emily Jin
  • Zhuoyi Huang
  • Lun Yu Li

Video-language models (VLMs), large models pre-trained on numerous but noisy video-text pairs from the internet, have revolutionized activity recognition through their remarkable generalization and open-vocabulary capabilities. While complex human activities are often hierarchical and compositional, most existing tasks for evaluating VLMs focus only on high-level video understanding, making it difficult to accurately assess and interpret the ability of VLMs to understand complex and fine-grained human activities. Inspired by the recently proposed MOMA framework, we define activity graphs as a single universal representation of human activities that encompasses video understanding at the activity, sub-activity, and atomic action level. We redefine activity parsing as the overarching task of activity graph generation, requiring understanding human activities across all three levels. To facilitate the evaluation of models on activity parsing, we introduce MOMA-LRG (Multi-Object Multi-Actor Language-Refined Graphs), a large dataset of complex human activities with activity graph annotations that can be readily transformed into natural language sentences. Lastly, we present a model-agnostic and lightweight approach to adapting and evaluating VLMs by incorporating structured knowledge from activity graphs into VLMs, addressing the individual limitations of language and graphical models. We demonstrate strong performance on few-shot activity parsing, and our framework is intended to foster future research in the joint modeling of videos, graphs, and language.

NeurIPS Conference 2021 Conference Paper

TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers

  • Lianmin Zheng
  • Ruochen Liu
  • Junru Shao
  • Tianqi Chen
  • Joseph Gonzalez
  • Ion Stoica
  • Ameer Haj-Ali

Search-based tensor compilers can greatly accelerate the execution of machine learning models by generating high-performance tensor programs, such as matrix multiplications and convolutions. These compilers take a high-level mathematical expression as input and search for the fastest low-level implementations. At the core of the search procedure is a cost model which estimates the performance of different candidates to reduce the frequency of time-consuming on-device measurements. There has been a growing interest in using machine learning techniques to learn a cost model to ease the effort of building an analytical model. However, a standard dataset for pre-training and benchmarking learned cost models is lacking. We introduce TenSet, a large-scale tensor program performance dataset. TenSet contains 52 million program performance records collected from 6 hardware platforms. We provide comprehensive studies on how to learn and evaluate the cost models, including data collection, model architectures, loss functions, transfer learning, and evaluation metrics. We also show that a cost model pre-trained on TenSet can accelerate the search time in the state-of-the-art tensor compiler by up to 10×. The dataset is available at https://github.com/tlc-pack/tenset.
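The learned-cost-model idea this abstract describes, fitting a regressor on (program features, measured latency) records and using it to rank candidates without on-device measurement, can be sketched with synthetic data. Everything here (features, labels, the linear model) is an illustrative assumption; real systems like the one benchmarked by TenSet use tree ensembles or neural networks over compiler-extracted features:

```python
import numpy as np

# Synthetic "performance records": program features -> measured latency.
rng = np.random.default_rng(1)
n_records, n_feats = 500, 16
X = rng.normal(size=(n_records, n_feats))          # program features
w_true = rng.normal(size=n_feats)
y = X @ w_true + 0.1 * rng.normal(size=n_records)  # noisy measured latency

# Fit a linear cost model by least squares (stand-in for trees/MLPs).
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Rank unseen candidate programs by predicted latency; a search-based
# compiler would then measure only the top few on real hardware.
candidates = rng.normal(size=(20, n_feats))
pred = candidates @ w
best = np.argsort(pred)[:5]   # the 5 predicted-cheapest candidates
```

Pre-training on a large record set and transferring to a new hardware target, as the abstract studies, amounts to reusing such a fitted model as the starting point for fine-tuning.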