Arrow Research search

Author name cluster

Pan Hui

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

LLM Safety in Judicial AI: A Stress Test of Social Media Influence on Real-World Judgments

  • Yixuan Xie
  • Yang He
  • Xiaoyu Yang
  • Xu Gai
  • Pan Hui

Integrating Large Language Models (LLMs) into judicial decision-making demands rigorous safety examination against non-legal influences. This paper presents a novel stress test where we evaluate LLM-generated labor dispute outcomes by introducing social media sentiment as an external pressure, critically comparing them against 10,000 real-world court judgments from China Judgments Online (CJOL). Our findings reveal significant LLM safety vulnerabilities: models exhibit inherent deviations from real rulings, and public opinion substantially amplifies these discrepancies, leading to unstable and often inflated compensation predictions. Furthermore, these safety risks are compounded across low-skilled occupational categories and emotionally charged topics. This study uncovers critical threats to judicial integrity and public trust, underscoring the urgent need for robust safeguards against non-legal influences in AI legal systems.

IJCAI Conference 2025 Conference Paper

ContextAware: A Multi-Agent Framework for Detecting Harmful Image-Based Comments on Social Media

  • Zheng Wei
  • Mingchen Li
  • Pu Zhang
  • Xinyu Liu
  • Huamin Qu
  • Pan Hui

Detecting hidden stigmatization in social media poses significant challenges due to semantic misalignments between textual and visual modalities, as well as the subtlety of implicit stigmatization. Traditional approaches often fail to capture these complexities in real-world, multimodal content. To address this gap, we introduce ContextAware, an agent-based framework that leverages specialized modules to collaboratively process and analyze images, textual context, and social interactions. Our approach begins by clustering image embeddings to identify recurring content, activating high-likes agents for deeper analysis of images receiving substantial user engagement, while comprehensive agents handle lower-engagement images. By integrating case-based learning, textual sentiment, and vision-language models (VLMs), ContextAware refines its detection of harmful content. We evaluate ContextAware on a self-collected Douyin dataset focused on interracial relationships, comprising 871 short videos and 885,502 comments, of which a notable portion are image-based. Experimental results show that ContextAware not only outperforms state-of-the-art methods in accuracy and F1 score but also effectively detects implicit stigmatization within the highly contextual environment of social media. Our findings underscore the importance of agent-based architectures and multimodal alignment in capturing nuanced, culturally specific forms of harmful content.

NeurIPS Conference 2025 Conference Paper

PyraMotion: Attentional Pyramid-Structured Motion Integration for Co-Speech 3D Gesture Synthesis

  • Zhizhuo Yin
  • Yuk Hang Tsui
  • Pan Hui

Generating full-body human gestures encompassing face, body, hands, and global movements from audio is crucial yet challenging for virtual avatar creation. Existing systems tokenize gestures frame-wise, predicting the tokens of each frame from the input audio. However, expressive human gestures consist of varied patterns with different frame lengths, and different body parts exhibit motion patterns of varying durations. Existing systems fail to capture motion patterns across body parts and temporal scales due to the fixed frame-count setting of their gesture tokens. Inspired by the success of the feature pyramid technique in multi-scale visual information extraction, we propose a novel framework named PyraMotion and an adaptive multi-scale feature capturing model called Attentive Pyramidal VQ-VAE (APVQ-VAE). Objective and subjective experiments demonstrate that PyraMotion outperforms state-of-the-art methods in generating natural and expressive full-body human gestures. Extensive ablation experiments highlight that the self-adaptive integration through attention maps contributes to performance.

IJCAI Conference 2024 Conference Paper

From Pixels to Progress: Generating Road Network from Satellite Imagery for Socioeconomic Insights in Impoverished Areas

  • Yanxin Xi
  • Yu Liu
  • Zhicheng Liu
  • Sasu Tarkoma
  • Pan Hui
  • Yong Li

The Sustainable Development Goals (SDGs) aim to resolve societal challenges, such as eradicating poverty and improving the lives of vulnerable populations in impoverished areas. Those areas rely on road infrastructure construction to promote accessibility and economic development. Although public data such as OpenStreetMap can be used to monitor road status, data completeness in impoverished areas is limited. Meanwhile, the development of deep learning techniques and satellite imagery shows excellent potential for earth monitoring. To tackle the challenge of road network assessment in impoverished areas, we develop a systematic road extraction framework combining an encoder-decoder architecture and morphological operations on satellite imagery, offering an integrated workflow for interdisciplinary researchers. Extensive experiments of road network extraction on real-world data in impoverished regions achieve a 42.7% enhancement in the F1-score over the baseline methods and reconstruct about 80% of the actual roads. We also propose a comprehensive road network dataset covering an area of approximately 794,178 km² and 17.048 million people in 382 impoverished counties in China. The generated dataset is further utilized to conduct socioeconomic analysis in impoverished counties, showing that road network construction positively impacts regional economic development. The technical appendix, code, and generated dataset can be found at https://github.com/tsinghua-fib-lab/Road_network_extraction_impoverished_counties.

ICRA Conference 2024 Conference Paper

OmniColor: A Global Camera Pose Optimization Approach of LiDAR-360Camera Fusion for Colorizing Point Clouds

  • Bonan Liu
  • Guoyang Zhao
  • Jianhao Jiao
  • Guang Cai
  • Chengyang Li
  • Handi Yin
  • Yuyang Wang
  • Ming Liu 0001

A colored point cloud, as a simple and efficient 3D representation, has many advantages in various fields, including robotic navigation and scene reconstruction. This representation is now commonly used in 3D reconstruction tasks relying on cameras and LiDARs. However, fusing data from these two types of sensors is poorly performed in many existing frameworks, leading to unsatisfactory mapping results, mainly due to inaccurate camera poses. This paper presents OmniColor, a novel and efficient algorithm to colorize point clouds using an independent 360-degree camera. Given a LiDAR-based point cloud and a sequence of panorama images with initial coarse camera poses, our objective is to jointly optimize the poses of all frames for mapping images onto geometric reconstructions. Our pipeline works in an off-the-shelf manner that does not require any feature extraction or matching process. Instead, we find optimal poses by directly maximizing the photometric consistency of LiDAR maps. In experiments, we show that our method can overcome the severe visual distortion of omnidirectional images and greatly benefit from the wide field of view (FOV) of 360-degree cameras to reconstruct various scenarios with accuracy and stability. The code will be released at https://github.com/liubonan123/OmniColor/.

IJCAI Conference 2024 Conference Paper

VulnerabilityMap: An Open Framework for Mapping Vulnerability among Urban Disadvantaged Populations in the United States

  • Lin Chen
  • Yong Li
  • Pan Hui

Cities are crucibles of numerous opportunities, but also hotbeds of inequality. The plight of disadvantaged populations who are "left behind" within urban environments has been an increasingly pressing concern, which poses substantial threats to the realization of the UN SDG agenda. However, a comprehensive framework for studying this urban dilemma is currently absent, preventing researchers from developing AI models for social good prediction and intervention. To fill this gap, we construct VulnerabilityMap, a framework to meticulously dissect the challenges faced by urban disadvantaged populations, unraveling their vulnerability to a spectrum of shocks and stresses that are categorized through the prism of Maslow's hierarchy of needs. Specifically, we systematically collect large-scale multi-sourced census and web-based data covering more than 328 million people in the United States regarding demographic features, neighborhood environments, offline mobility behaviors, and online social connections. These features are further related to vulnerability outcomes from short-term shocks such as COVID-19 and long-term physiological, social, and self-actualization stresses. Leveraging our framework, we construct machine learning models that exhibit strong performance in predicting vulnerability outcomes from various disadvantage features, which shows the promising utility of our framework to support targeted AI models. Moreover, we provide model-based explainability analysis to interpret the reasons underlying model predictions, shedding light on intricate social factors that trap certain populations inside vulnerable situations. Our constructed dataset is publicly available at https://github.com/LinChen-65/VulnerabilityMap/.

TIST Journal 2023 Journal Article

Learning Representations of Satellite Imagery by Leveraging Point-of-Interests

  • Tong Li
  • Yanxin Xi
  • Huandong Wang
  • Yong Li
  • Sasu Tarkoma
  • Pan Hui

Satellite imagery depicts the Earth’s surface remotely and provides comprehensive information for many applications, such as land use monitoring and urban planning. Existing studies on unsupervised representation learning for satellite images only take into account the images’ geographic information, ignoring human activity factors. To bridge this gap, we propose using the Point-of-Interest (POI) data to capture human factors and designing a contrastive learning-based framework to consolidate the representation of satellite imagery with POI information. Besides, we introduce a season-invariant representation learning model on satellite imagery, considering that human factors are mostly unchanging with respect to seasons. An attention model is designed at last to merge the representations from the geographic, seasonal, and POI perspectives adaptively. On the basis of real-world datasets collected from Beijing, we evaluate our method for predicting socioeconomic indicators. The results show that the representation containing POI information outperforms the geographic representation in estimating commercial activity-related indicators. Our proposed attentional framework can estimate the socioeconomic indicators with an R² of 0.874 and outperforms the baseline methods. Furthermore, we explore the differences in the representations of satellite images with varying socioeconomic statuses. Finally, we investigate the impact of geographic and POI perspective information in the representation learning process, as well as the effect of satellite imagery on various spatial resolutions.

TIST Journal 2023 Journal Article

You Are How You Use Apps: User Profiling Based on Spatiotemporal App Usage Behavior

  • Tong Li
  • Yong Li
  • Mingyang Zhang
  • Sasu Tarkoma
  • Pan Hui

Mobile apps have become an indispensable part of people’s daily lives. Users determine what apps to use and when and where to use them based on their tastes, interests, and personal demands, depending on their personality traits. This article aims to infer user profiles from their spatiotemporal mobile app usage behavior. Specifically, we first transform mobile app usage records into a heterogeneous graph. On the graph, nodes represent users, apps, locations, and time slots. Edges describe the co-occurrence of entities in usage records. We then develop a multi-relational heterogeneous graph attention network (MRel-HGAN), an end-to-end system for user profiling. MRel-HGAN first adopts a neighbor sampling strategy based on bootstrapping to sample heavily connected neighbors of a fixed size for each node. Next, we design a relational graph convolutional operation and a multi-relational attention operation. Through such modules, MRel-HGAN can generate node embedding by sufficiently leveraging the rich semantic information of the multi-relational structure in the mobile app usage graph. Experimental results on real-world mobile app usage datasets show the effectiveness and superiority of our MRel-HGAN in the user profiling task for attributes of gender and age.
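The graph construction step in this abstract (users, apps, locations, and time slots as nodes; co-occurrence in a usage record as edges) can be pictured with a minimal sketch. The record layout, the `build_usage_graph` helper, and the choice to weight edges by co-occurrence count are all assumptions made for illustration; the abstract does not specify them, and this is not the MRel-HGAN implementation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical record layout: (user, app, location, time_slot).
NODE_TYPES = ("user", "app", "location", "time_slot")


def build_usage_graph(records):
    """Build a heterogeneous co-occurrence graph from usage records.

    Returns (nodes, edges), where nodes is a set of (node_type, value)
    pairs and edges is a Counter mapping an unordered node pair to the
    number of records in which the two entities appear together.
    """
    nodes = set()
    edges = Counter()
    for record in records:
        entities = list(zip(NODE_TYPES, record))
        nodes.update(entities)
        # Every pair of entities in the same record co-occurs once.
        for a, b in combinations(entities, 2):
            edges[frozenset((a, b))] += 1
    return nodes, edges


records = [
    ("u1", "maps", "office", "09:00"),
    ("u1", "chat", "office", "09:00"),
    ("u2", "maps", "home", "20:00"),
]
nodes, edges = build_usage_graph(records)
```

Typing each node as a `(node_type, value)` pair keeps, for example, a user and an app with the same identifier distinct, which is one simple way to preserve the multi-relational structure the abstract mentions.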

TIST Journal 2022 Journal Article

Hierarchical Multi-agent Model for Reinforced Medical Resource Allocation with Imperfect Information

  • Qianyue Hao
  • Fengli Xu
  • Lin Chen
  • Pan Hui
  • Yong Li

With the advent of the COVID-19 pandemic, the shortage in medical resources became increasingly evident. Therefore, efficient strategies for medical resource allocation are urgently needed. However, conventional rule-based methods employed by public health experts have limited capability in dealing with the complex and dynamic pandemic-spreading situation. In addition, model-based optimization methods such as dynamic programming (DP) fail to work since we cannot obtain a precise model in real-world situations most of the time. Model-free reinforcement learning (RL) is a powerful tool for decision-making; however, three key challenges exist in solving this problem via RL: (1) complex situations and countless choices for decision-making in the real world; (2) imperfect information due to the latency of pandemic spreading; and (3) limitations on conducting experiments in the real world since we cannot set up pandemic outbreaks arbitrarily. In this article, we propose a hierarchical RL framework with several specially designed components. We design a decomposed action space with a corresponding training algorithm to deal with the countless choices, ensuring efficient and real-time strategies. We design a recurrent neural network–based framework to utilize the imperfect information obtained from the environment. We also design a multi-agent voting method, which modifies the decision-making process considering the randomness during model training and, thus, improves the performance. We build a pandemic-spreading simulator based on real-world data, serving as the experimental platform. We then conduct extensive experiments. The results show that our method outperforms all baselines, reducing infections and deaths by 14.25% on average without the multi-agent voting method and by up to 15.44% with it.
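The abstract does not describe how the multi-agent voting works, so the following is only a generic sketch of the idea of combining several independently trained agents: each agent proposes an action, and the most frequently proposed one is chosen. The `majority_vote` helper, the plurality rule, the tie-breaking rule, and the action names are assumptions for illustration, not the authors' method.

```python
from collections import Counter


def majority_vote(actions):
    """Return the action proposed by the most agents.

    Ties are broken in favor of the action that appears earliest in
    the list of proposals (an arbitrary choice made for this sketch).
    """
    if not actions:
        raise ValueError("no actions to vote on")
    counts = Counter(actions)
    best = max(counts.values())
    for action in actions:
        if counts[action] == best:
            return action


# Hypothetical proposals from three agents trained with different seeds.
proposals = ["region_a", "region_b", "region_a"]
chosen = majority_vote(proposals)
```

Voting over agents trained with different random seeds is one common way to reduce the variance that training randomness introduces into a single agent's decisions.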

NeurIPS Conference 2021 Conference Paper

Automorphic Equivalence-aware Graph Neural Network

  • Fengli Xu
  • Quanming Yao
  • Pan Hui
  • Yong Li

Distinguishing the automorphic equivalence of nodes in a graph plays an essential role in many scientific domains, e.g., computational biology and social network analysis. However, existing graph neural networks (GNNs) fail to capture such an important property. To make GNNs aware of automorphic equivalence, we first introduce a localized variant of this concept: ego-centered automorphic equivalence (Ego-AE). Then, we design a novel variant of GNN, i.e., GRAPE, that uses learnable AE-aware aggregators to explicitly differentiate the Ego-AE of each node's neighbors with the aid of various subgraph templates. While the design of subgraph templates can be hard, we further propose a genetic algorithm to automatically search for them from graph data. Moreover, we theoretically prove that GRAPE is expressive in terms of generating distinct representations for nodes with different Ego-AE features, which fills a fundamental gap in existing GNN variants. Finally, we empirically validate our model on eight real-world graph datasets, including social networks, e-commerce co-purchase networks, and citation networks, and show that it consistently outperforms existing GNNs. The source code is publicly available at https://github.com/tsinghua-fib-lab/GRAPE.

IJCAI Conference 2020 Conference Paper

Multi-View Joint Graph Representation Learning for Urban Region Embedding

  • Mingyang Zhang
  • Tong Li
  • Yong Li
  • Pan Hui

The increasing amount of urban data enables us to investigate urban dynamics, assist urban planning, and eventually, make our cities more livable and sustainable. In this paper, we focus on learning an embedding space from urban data for urban regions. For the first time, we propose a multi-view joint learning model to learn comprehensive and representative urban region embeddings. We first model different types of region correlations based on both human mobility and inherent region properties. Then, we apply a graph attention mechanism in learning region representations from each view of the built correlations. Moreover, we introduce a joint learning module that boosts the region embedding learning by sharing cross-view information and fuses multi-view embeddings by learning adaptive weights. Finally, we exploit the learned embeddings in the downstream applications of land usage classification and crime prediction in urban areas with real-world data. Extensive experiment results demonstrate that by exploiting our proposed joint learning model, the performance is improved by a large margin on both tasks compared with the state-of-the-art methods.

IJCAI Conference 2019 Conference Paper

A Decomposition Approach for Urban Anomaly Detection Across Spatiotemporal Data

  • Mingyang Zhang
  • Tong Li
  • Hongzhi Shi
  • Yong Li
  • Pan Hui

Urban anomalies such as abnormal crowd flows and traffic accidents could result in loss of life or property if not handled properly. Detecting urban anomalies at an early stage is important to minimize the adverse effects. However, urban anomaly detection is difficult due to two challenges: a) the criteria for urban anomalies vary across locations and times; b) urban anomalies of different types may show different signs. In this paper, we propose a decomposition approach to address these two challenges. Specifically, we decompose urban dynamics into the normal component and the abnormal component. The normal component is determined solely by spatiotemporal features, while the abnormal component is caused by anomalous events. Then, we extract spatiotemporal features and estimate the normal component accordingly. Finally, we derive the abnormal component to identify anomalies. We evaluate our method using both real-world and synthetic datasets. The results show that our method can detect meaningful events and outperforms state-of-the-art anomaly detection methods by a large margin.
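The decomposition described in this abstract (observed dynamics = normal component + abnormal component) can be illustrated with a deliberately simple sketch. Here the normal component is assumed to be the mean of all observations sharing a location and time slot, and an observation is flagged when its residual exceeds a multiple of the overall residual spread. The estimator, the threshold, the function names, and the sample data are assumptions for illustration; the paper's own estimator is not given in the abstract.

```python
from collections import defaultdict
from statistics import mean, pstdev


def split_normal_abnormal(observations):
    """Split each observation into a normal part and a residual.

    observations: list of (location, time_slot, value) tuples.
    Returns a list of (location, time_slot, normal, abnormal) tuples,
    where normal is the mean value for that (location, time_slot) and
    abnormal is value minus normal.
    """
    groups = defaultdict(list)
    for loc, slot, value in observations:
        groups[(loc, slot)].append(value)
    normal = {key: mean(values) for key, values in groups.items()}
    return [
        (loc, slot, normal[(loc, slot)], value - normal[(loc, slot)])
        for loc, slot, value in observations
    ]


def flag_anomalies(observations, threshold=2.0):
    """Return indices whose residual exceeds threshold * residual spread."""
    parts = split_normal_abnormal(observations)
    residuals = [abnormal for _, _, _, abnormal in parts]
    spread = pstdev(residuals)
    if spread == 0:
        return []
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


# Hypothetical crowd counts at one station in the same daily time slot.
observations = [("station", 8, v) for v in (100, 102, 98, 101, 99, 200)]
flagged = flag_anomalies(observations)
```

Because the mean is itself pulled toward outliers, a more robust estimate of the normal component (for example a median) may be preferable in practice; the mean is used here only to keep the sketch short.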

IJCAI Conference 2019 Conference Paper

FaRM: Fair Reward Mechanism for Information Aggregation in Spontaneous Localized Settings

  • Moin Hussain Moti
  • Dimitris Chatzopoulos
  • Pan Hui
  • Sujit Gujar

Although peer prediction markets are widely used in crowdsourcing to aggregate information from agents, they often fail to reward the participating agents equitably. Honest agents can be wrongly penalized if randomly paired with dishonest ones. In this work, we introduce selective and cumulative fairness. We characterize a mechanism as fair if it satisfies both notions and present FaRM, a representative mechanism we designed. FaRM is a Nash incentive mechanism that focuses on information aggregation for spontaneous local activities which are accessible to a limited number of agents, without assuming any prior knowledge of the event. All the agents in the vicinity observe the same information. FaRM uses (i) a report strength score to remove the risk of random pairing with dishonest reporters, (ii) a consistency score to measure an agent's history of accurate reports and distinguish valuable reports, (iii) a reliability score to estimate the probability that an agent colludes with nearby agents and to prevent agents from being swayed, and (iv) a location robustness score to filter out agents who try to participate without being present in the considered setting. Together, report strength, consistency, and reliability represent a fair reward given to agents based on their reports.