Arrow Research search

Author name cluster

Zihan Ma

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
1 author row

Possible papers

6

AAAI Conference 2026 Conference Paper

Splat-SAP: Feed-Forward Gaussian Splatting for Human-Centered Scene with Scale-Aware Point Map Reconstruction

  • Boyao Zhou
  • Shunyuan Zheng
  • Zhanfeng Liao
  • Zihan Ma
  • Hanzhang Tu
  • Boning Liu
  • Yebin Liu

We present Splat-SAP, a feed-forward approach to render novel views of human-centered scenes from binocular cameras with large sparsity. Gaussian Splatting has shown its promising potential in rendering tasks, but it typically necessitates per-scene optimization with dense input views. Although some recent approaches achieve feed-forward Gaussian Splatting rendering through geometry priors obtained by multi-view stereo, such approaches still require largely overlapped input views to establish the geometry prior. To bridge this gap, we leverage pixel-wise point map reconstruction to represent geometry which is robust to large sparsity for its independent view modeling. In general, we propose a two-stage learning strategy. In stage 1, we transform the point map into real space via an iterative affinity learning process, which facilitates camera control in the following. In stage 2, we project point maps of two input views onto the target view plane and refine such geometry via stereo matching. Furthermore, we anchor Gaussian primitives on this refined plane in order to render high-quality images. As a metric representation, the scale-aware point map in stage 1 is trained in a self-supervised manner without 3D supervision and stage 2 is supervised with photo-metric loss. We collect multi-view human-centered data and demonstrate that our method improves both the stability of point map reconstruction and the visual quality of free-viewpoint rendering.

AAAI Conference 2026 Conference Paper

TrajEvo: Trajectory Prediction Heuristics Design via LLM-driven Evolution

  • Zhikai Zhao
  • Chuanbo Hua
  • Federico Berto
  • Kanghoon Lee
  • Zihan Ma
  • Jiachen Li
  • Jinkyoo Park

Trajectory prediction is a crucial task in modeling human behavior, especially in safety-critical fields such as social robotics and autonomous vehicle navigation. Traditional heuristics based on handcrafted rules often lack accuracy, while recently proposed deep learning approaches suffer from computational cost, slow inference speed, lack of explainability, and generalization issues that limit their practical adoption in such environments. In this paper, we introduce TrajEvo, a framework that leverages Large Language Models (LLMs) to automatically design trajectory prediction heuristics. TrajEvo employs an evolutionary algorithm to generate and refine prediction heuristics from past trajectory data. We introduce a Cross-Generation Elite Sampling to promote population diversity and a Statistics Feedback Loop allowing the LLM to analyze alternative predictions. Our evaluations show TrajEvo outperforms previous heuristic methods on various real-world datasets, and remarkably outperforms both heuristics and deep learning methods when generalizing to an unseen real-world dataset. TrajEvo represents a first step toward automated design of fast, explainable, and generalizable trajectory prediction heuristics. We make our source code publicly available to foster future research.

AAAI Conference 2025 Conference Paper

Each Fake News Is Fake in Its Own Way: An Attribution Multi-Granularity Benchmark for Multimodal Fake News Detection

  • Hao Guo
  • Zihan Ma
  • Zhi Zeng
  • Minnan Luo
  • Weixin Zeng
  • Jiuyang Tang
  • Xiang Zhao

Social platforms, while facilitating access to information, have also become saturated with a plethora of fake news, resulting in negative consequences. Automatic multimodal fake news detection is a worthwhile pursuit. Existing multimodal fake news datasets only provide binary labels of real or fake. However, real news is alike, while each fake news is fake in its own way. These datasets fail to reflect the mixed nature of various types of multimodal fake news. To bridge the gap, we construct an attributing multi-granularity multimodal fake news detection dataset AMG, revealing the inherent fake pattern. Furthermore, we propose a multi-granularity clue alignment model MGCA to achieve multimodal fake news detection and attribution. Experimental results demonstrate that AMG is a challenging dataset, and its attribution setting opens up new avenues for future research.

NeurIPS Conference 2025 Conference Paper

Rethinking Verification for LLM Code Generation: From Generation to Testing

  • Zihan Ma
  • Taolin Zhang
  • Junnan Liu
  • Wenwei Zhang
  • Minnan Luo
  • Songyang Zhang
  • Kai Chen

Large language models (LLMs) have recently achieved notable success in code‑generation benchmarks such as HumanEval and LiveCodeBench. However, a detailed examination reveals that these evaluation suites often comprise only a limited number of homogeneous test cases, resulting in subtle faults going undetected. This not only artificially inflates measured performance but also compromises accurate reward estimation in reinforcement learning frameworks utilizing verifiable rewards (RLVR). To address these critical shortcomings, we systematically investigate the test-case generation (TCG) task by proposing multi-dimensional metrics designed to rigorously quantify test-suite thoroughness. Furthermore, we introduce a human-LLM collaborative method (SAGA), leveraging human programming expertise with LLM reasoning capability, aimed at significantly enhancing both the coverage and the quality of generated test cases. In addition, we develop a TCGBench to facilitate the study of the TCG task. Experiments show that SAGA achieves a detection rate of 90. 62\% and a verifier accuracy of 32. 58\% on TCGBench. The Verifier Accuracy (Verifier Acc) of the code generation evaluation benchmark synthesized by SAGA is 10. 78\% higher than that of LiveCodeBench-v6. These results demonstrate the effectiveness of our proposed method. We hope this work contributes to building a scalable foundation for reliable LLM code evaluation, further advancing RLVR in code generation, and paving the way for automated adversarial test synthesis and adaptive benchmark integration.

AAMAS Conference 2024 Conference Paper

HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent Pathfinding

  • Huijie Tang
  • Federico Berto
  • Zihan Ma
  • Chuanbo Hua
  • Kyuree Ahn
  • Jinkyoo Park

Large-scale multi-agent pathfinding (MAPF) presents significant challenges in several areas. As systems grow in complexity with a multitude of autonomous agents operating simultaneously, efficient and collision-free coordination becomes paramount. Traditional algorithms often fall short in scalability, especially in intricate scenarios. Reinforcement Learning (RL) has shown potential to address the intricacies of MAPF; however, it has also been shown to struggle with scalability, demanding intricate implementation, lengthy training, and often exhibiting unstable convergence, limiting its practical application. In this paper, we introduce Heuristics-Informed Multi- Agent Pathfinding (HiMAP), a novel scalable approach that employs imitation learning with heuristic guidance in a decentralized manner. We train on small-scale instances using a heuristic policy as a teacher that maps each single agent observation information to an action probability distribution. During pathfinding, we adopt several inference techniques to improve performance. With a simple training scheme and implementation, HiMAP demonstrates competitive results in terms of success rate and scalability in the field of imitation-learning-only MAPF, showing the potential of imitation-learning-only MAPF equipped with inference techniques.

NeurIPS Conference 2022 Conference Paper

TwiBot-22: Towards Graph-Based Twitter Bot Detection

  • Shangbin Feng
  • Zhaoxuan Tan
  • Herun Wan
  • Ningnan Wang
  • Zilong Chen
  • Binchi Zhang
  • Qinghua Zheng
  • Wenqian Zhang

Twitter bot detection has become an increasingly important task to combat misinformation, facilitate social media moderation, and preserve the integrity of the online discourse. State-of-the-art bot detection methods generally leverage the graph structure of the Twitter network, and they exhibit promising performance when confronting novel Twitter bots that traditional methods fail to detect. However, very few of the existing Twitter bot detection datasets are graph-based, and even these few graph-based datasets suffer from limited dataset scale, incomplete graph structure, as well as low annotation quality. In fact, the lack of a large-scale graph-based Twitter bot detection benchmark that addresses these issues has seriously hindered the development and evaluation of novel graph-based bot detection approaches. In this paper, we propose TwiBot-22, a comprehensive graph-based Twitter bot detection benchmark that presents the largest dataset to date, provides diversified entities and relations on the Twitter network, and has considerably better annotation quality than existing datasets. In addition, we re-implement 35 representative Twitter bot detection baselines and evaluate them on 9 datasets, including TwiBot-22, to promote a fair comparison of model performance and a holistic understanding of research progress. To facilitate further research, we consolidate all implemented codes and datasets into the TwiBot-22 evaluation framework, where researchers could consistently evaluate new models and datasets. The TwiBot-22 Twitter bot detection benchmark and evaluation framework are publicly available at \url{https: //twibot22. github. io/}.