Author name cluster

Haiyang Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
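The grouping described above can be illustrated with a minimal sketch. The row layout, the `cluster_by_exact_name` helper, and the sample rows are hypothetical and are not taken from Arrow itself; the sketch only shows what a case-insensitive exact match does and does not merge.

```python
from collections import defaultdict


def cluster_by_exact_name(author_rows):
    """Group author rows whose names are identical once case is ignored.

    Hypothetical helper: no identity disambiguation is attempted, so two
    different people who share a name land in the same cluster.
    """
    clusters = defaultdict(list)
    for row in author_rows:
        # casefold() is the caseless-comparison variant of lower()
        key = row["name"].strip().casefold()
        clusters[key].append(row)
    return dict(clusters)


# Illustrative rows only; the ids are placeholders.
rows = [
    {"name": "Haiyang Liu", "row_id": "a"},
    {"name": "HAIYANG LIU", "row_id": "b"},
    {"name": "Jingwen Liu", "row_id": "c"},
]
clusters = cluster_by_exact_name(rows)
```

Because the key is the whole name, near matches such as initials or reordered names stay in separate clusters, which is why a page like this one is not a full identity profile.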

4 papers

2 author rows

ICRA Conference 2025 Conference Paper

RenderWorld: World Model with Self-Supervised 3D Label

Ziyang Yan
Wenzhen Dong
Yihua Shao
Yuhang Lu
Haiyang Liu
Jingwen Liu
Haozhe Wang 0002
Zhe Wang

End-to-end autonomous driving with vision only is not only more cost-effective than LiDAR-vision fusion but also more reliable than traditional methods. To achieve an economical and robust purely visual autonomous driving system, we propose RenderWorld, a vision-only end-to-end autonomous driving framework that generates 3D occupancy labels using a self-supervised Gaussian-based Img2Occ Module, encodes the labels with AM-VAE, and uses a world model for forecasting and planning. RenderWorld employs Gaussian Splatting to represent 3D scenes and render 2D images, which greatly improves segmentation accuracy and reduces GPU memory consumption compared with NeRF-based methods. By applying AM-VAE to encode air and non-air separately, RenderWorld achieves more fine-grained scene element representation, leading to state-of-the-art performance in both 4D occupancy forecasting and motion planning with an autoregressive world model.

ICLR Conference 2025 Conference Paper

TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation

Haiyang Liu
Xingchao Yang
Tomoya Akiyama
Yuantian Huang
Qiaoge Li
Shigeru Kuriyama
Takafumi Taketomi

We present TANGO, a framework for generating co-speech body-gesture videos. Given a few-minute, single-speaker reference video and target speech audio, TANGO produces high-fidelity videos with synchronized body gestures. TANGO builds on Gesture Video Reenactment (GVR), which splits and retrieves video clips using a directed graph structure, representing video frames as nodes and valid transitions as edges. We address two key limitations of GVR: audio-motion misalignment and visual artifacts in GAN-generated transition frames. In particular, i) we propose retrieving gestures using latent feature distance to improve cross-modal alignment. To ensure the latent features effectively model the relationship between speech audio and gesture motion, we implement a hierarchical joint embedding space (AuMoClip); ii) we introduce a diffusion-based model to generate high-quality transition frames. Our diffusion model, Appearance Consistent Interpolation (ACInterp), is built upon AnimateAnyone and includes a reference motion module and homography background flow to preserve appearance consistency between generated and reference videos. By integrating these components into the graph-based retrieval framework, TANGO reliably produces realistic, audio-synchronized videos and outperforms all existing generative and retrieval methods. Our code, pretrained models, and datasets are publicly available at https://github.com/CyberAgentAILab/TANGO.
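The graph-based retrieval idea in this abstract (frames as nodes, valid transitions as edges, gestures chosen by latent feature distance) can be sketched roughly as follows. This is not TANGO's actual algorithm: the greedy walk, the Euclidean distance, the `greedy_retrieve` name, and the toy 2-D features are all assumptions made for illustration, and the sketch assumes audio and motion features already live in one joint embedding space.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_retrieve(edges, motion_feats, audio_feats, start):
    """Walk a directed graph, stepping to the successor whose motion
    feature is closest to the next audio feature (hypothetical greedy rule)."""
    path = [start]
    current = start
    for query in audio_feats[1:]:
        successors = edges.get(current, [])
        if not successors:
            break  # dead end; a real system would need a fallback here
        current = min(successors, key=lambda n: euclidean(motion_feats[n], query))
        path.append(current)
    return path


# Toy graph: node -> list of nodes reachable by a valid transition.
edges = {0: [1, 2], 1: [2, 3], 2: [3], 3: []}
# Toy latent features, assumed to share one embedding space.
motion_feats = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}
audio_feats = [(0.0, 0.0), (0.9, 0.1), (0.1, 0.9), (1.0, 1.0)]

path = greedy_retrieve(edges, motion_feats, audio_feats, start=0)
print(path)  # [0, 1, 2, 3]
```

A greedy walk is the simplest choice and can get stuck in dead ends; a search over whole paths would avoid that at higher cost.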

YNIMG Journal 2025 Journal Article

The association among individual gray matter volume of frontal-limbic circuitry, fatigue susceptibility, and comorbid neuropsychiatric symptoms following COVID-19

Xuan Niu
Wenrui Bao
Zhaoyao Luo
Pang Du
Heping Zhou
Haiyang Liu
Baoqi Wang
Huawen Zhang

BACKGROUND: Fatigue is often accompanied by comorbid sleep disturbance and psychiatric distress following COVID-19 infection. However, identifying individuals at risk for developing post-COVID fatigue remains challenging. This study aimed to identify the neurobiological markers underlying fatigue susceptibility and further investigate their effect on COVID-19-related neuropsychiatric symptoms. METHODS: Individuals following a mild SARS-CoV-2 infection (COV+) underwent neuropsychiatric measurements (n = 335) and MRI scans (n = 271) within 1 month (baseline), and 191 (70.5%) of the individuals were followed up 3 months after infection. Sixty-seven healthy controls (COV-) completed the same recruitment protocol. RESULTS: Whole-brain voxel-wise analysis showed that gray matter volume (GMV) during the acute phase did not differ between the COV+ and COV- groups. GMV in the right dorsolateral prefrontal cortex (DLPFC) and the left dorsal anterior cingulate cortex (dACC), assigned to the frontal and limbic systems respectively, was associated with fatigue severity only in the COV+ group at baseline. Furthermore, fatigue mediated the associations between volume differences in fatigue susceptibility and COVID-related sleep, post-traumatic stress disorder, anxiety and depression. Crucially, the initial GMV in the right DLPFC can predict fatigue symptoms 3 months after infection. CONCLUSIONS: We provide novel evidence on the neuroanatomical basis of fatigue vulnerability and emphasize that acute fatigue is an important link between early GMV in the frontal-limbic regions and comorbid neuropsychiatric symptoms at baseline and 3 months after infection. Our findings highlight the role of the frontal-limbic system in predisposing individuals to develop post-COVID fatigue.

EAAI Journal 2024 Journal Article

IDPonzi: An interpretable detection model for identifying smart Ponzi schemes

Xia Feng
Qichen Shi
Xingye Li
Haiyang Liu
Liangmin Wang

Ponzi schemes are deceptive financial scams that lure users with the promise of high profits, resulting in substantial losses for global investors. With the advent of blockchain technologies, these traditional scams have moved from offline operations to blockchain systems. In the blockchain environment, Ponzi schemes often take the form of high-return investment contracts. Existing approaches for detecting smart Ponzi schemes rely on machine learning techniques that analyze smart contracts’ operation codes or transaction histories. However, approaches that rely on the frequency distribution of opcodes lack interpretability. Additionally, transaction-based methods require a significant number of generated transactions, limiting their ability to promptly detect newly deployed smart contracts. These limitations render current detection methods inefficient in identifying Ponzi schemes. This paper proposes IDPonzi, a novel interpretable model for detecting smart Ponzi schemes in the blockchain. We refine the dataset by eliminating duplicate contracts to enhance the detection capability, resulting in more compact and discriminative samples. We then utilize a classification algorithm to analyze the features extracted from the operation codes of contracts, accurately identifying Ponzi schemes. Specifically, we employ the Shapley Additive exPlanation (SHAP) method to interpret predictions for individual samples and conduct a dependency analysis for four pairs of features. Experimental results demonstrate that IDPonzi achieves impressive effectiveness, with a precision of 99%, recall of 85%, and F-score of 92%, outperforming existing approaches.
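As a quick arithmetic check on the reported metrics, the sketch below recomputes the F-score from the stated precision and recall. It assumes "F-score" means the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the `f_score` helper is illustrative only.

```python
def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision and recall (assumed definition)."""
    return 2 * precision * recall / (precision + recall)


f1 = f_score(0.99, 0.85)
print(f"{f1:.3f}")  # 0.915
```

The result, about 91.5%, is close to the reported 92%; the small gap is plausibly due to the precision and recall having been rounded before publication.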
