Arrow Research search

Author name cluster

Yiming Xia

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
1 author row

Possible papers

AAAI Conference 2026 Conference Paper

Belief-Driven Value Alignment for Human-Robot Collaboration

  • Saisai Li
  • Bing Shi
  • Yiming Xia
  • Xiao Su

As intelligent systems advance rapidly, human-robot collaboration is becoming increasingly important. Ensuring that an intelligent agent's behavior matches human intentions and value preferences is crucial for effective collaboration; this is termed the value alignment problem. Within the Reinforcement Learning (RL) paradigm, value alignment typically relies on pre-designed reward functions, and Cooperative Inverse Reinforcement Learning (CIRL) is often used to model value alignment as a human-robot game. However, existing works often assume that the human is perfectly rational and has full access to the robot's belief about the human's preferences. To address this limitation, we propose a Particle Filter-based Hierarchical Dynamic Programming algorithm (PFHDP). By modeling the robot's belief state, this algorithm ensures that the human's estimate of the robot's belief is updated correctly. This allows the human to adopt more targeted pedagogical behaviors to guide the robot based on her understanding of the robot's current belief, achieving belief alignment between human and robot and thereby promoting value alignment more effectively. Furthermore, we run experiments to evaluate the proposed method in two cooperative scenarios against several typical benchmark approaches. The experimental results show that our method can strengthen the alignment of belief states between human and robot, leading to enhanced value alignment.
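The abstract does not describe PFHDP in detail, so the following is only a generic sketch of the particle-filter belief update that such a method could build on, not the authors' algorithm. It assumes a hypothetical two-action setting ("left" or "right"), a hypothetical noisy-rational (Boltzmann) human model with rationality parameter `beta`, and particles that are candidate values of the human's preference; all function names are illustrative.

```python
import math
import random

def update_belief(particles, weights, likelihood, observation):
    """Reweight particles by the likelihood of the observation, then normalize."""
    new_weights = [w * likelihood(observation, p) for p, w in zip(particles, weights)]
    total = sum(new_weights)
    if total == 0.0:
        # Observation impossible under every particle: keep the prior weights.
        return list(weights)
    return [w / total for w in new_weights]

def resample(particles, weights, rng):
    """Draw a new, equally weighted particle set in proportion to the weights."""
    n = len(particles)
    drawn = rng.choices(particles, weights=weights, k=n)
    return drawn, [1.0 / n] * n

def boltzmann_likelihood(action, theta, beta=2.0):
    """Hypothetical noisy-rational human: reward 1 for the action equal to theta."""
    rewards = {a: (1.0 if a == theta else 0.0) for a in ("left", "right")}
    z = sum(math.exp(beta * r) for r in rewards.values())
    return math.exp(beta * rewards[action]) / z

# Robot's belief over the human's preferred action, as weighted particles.
particles = ["left", "right", "left", "right"]
weights = [0.25] * 4

# The robot observes the human choosing "left" and updates its belief.
weights = update_belief(particles, weights, boltzmann_likelihood, "left")

# Optional resampling step (seeded only to make the sketch reproducible).
resampled, uniform = resample(particles, weights, random.Random(0))
```

After one observation of "left", the weight shifts toward the "left" particles without ruling out "right", which reflects the imperfect-rationality assumption. A human who tracks the same update can estimate what the robot currently believes.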

EAAI Journal 2024 Journal Article

A vehicle value based ride-hailing order matching and dispatching algorithm

  • Bing Shi
  • Yiming Xia
  • Shuai Xu
  • Yikai Luo

Online ride-hailing has become one of the most important modes of transportation. In a ride-hailing system, the key issues are how to match orders with vehicles and how to dispatch idle vehicles efficiently so as to maximize social welfare. However, matching and dispatching decisions in the current round may affect ride-hailing supply and demand in future rounds, since they change the future distribution of vehicles across geographical zones. In fact, vehicles in different zones at different times may have different values for the matching and dispatching results. In this paper, we use a vehicle value function to characterize the spatio-temporal value of vehicles in each zone, and then use it to design an order matching and idle vehicle dispatching algorithm that improves long-term social welfare. In addition, during order matching, passengers may untruthfully report the maximum price they are willing to pay in order to maximize their own profits, which can distort the matching and may reduce long-term social welfare. Therefore, we design a VCG-based pricing algorithm to prevent such strategic behavior by passengers. We further run experiments to evaluate the proposed algorithm. The experimental results show that our algorithm outperforms the state-of-the-art algorithm by 11.7% in social welfare and by 11.1% in service ratio. This work can provide useful insights for online ride-hailing platforms designing practical order matching and pricing strategies.
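To illustrate why VCG-style pricing discourages misreporting, here is a minimal single-round sketch. It is not the paper's algorithm: it ignores the vehicle value function and long-term effects, uses brute-force matching, and assumes hypothetical inputs `bids` (each passenger's reported maximum price) and `cost[p][v]` (the cost of serving passenger `p` with vehicle `v`). Each matched passenger pays the welfare loss their presence imposes on everyone else (the Clarke pivot payment).

```python
from itertools import permutations

def best_welfare(bids, cost, exclude=None):
    """Brute-force the welfare-maximizing matching; returns (welfare, assignment)."""
    passengers = [i for i in range(len(bids)) if i != exclude]
    n_vehicles = len(cost[0])
    # Pad with None so that a passenger may stay unmatched.
    slots = list(range(n_vehicles)) + [None] * len(passengers)
    best = (0.0, {})
    for perm in set(permutations(slots, len(passengers))):
        assignment = {p: v for p, v in zip(passengers, perm) if v is not None}
        welfare = sum(bids[p] - cost[p][v] for p, v in assignment.items())
        if welfare > best[0]:
            best = (welfare, assignment)
    return best

def vcg_payments(bids, cost):
    """Return the optimal assignment and each matched passenger's VCG payment."""
    welfare, assignment = best_welfare(bids, cost)
    payments = {}
    for p in assignment:
        welfare_without_p, _ = best_welfare(bids, cost, exclude=p)
        # Welfare of everyone except p in the chosen matching.
        others_with_p = welfare - bids[p]
        payments[p] = welfare_without_p - others_with_p
    return assignment, payments

# Two passengers compete for one vehicle (illustrative numbers only).
bids = [10, 8]
cost = [[2], [3]]
assignment, payments = vcg_payments(bids, cost)
```

In this example passenger 0 wins the vehicle and pays 7, which is the lowest bid at which passenger 0 would still be matched. The payment does not depend on passenger 0's own report, so overstating or understating the maximum price cannot lower it.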