Author name cluster

Ke Ma

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers

2 author rows

AAAI Conference 2026 Conference Paper

LidarPainter: One-Step Away from Any Lidar View to Novel Guidance

Yuzhou Ji
Ke Ma
Hong Cai
Anchun Zhang
Lizhuang Ma
Xin Tan

Dynamic driving scene reconstruction is of great importance in fields like digital twin system and autonomous driving simulation. However, unacceptable degradation occurs when the view deviates from the input trajectory, leading to corrupted background and vehicle models. To improve reconstruction quality on novel trajectory, existing methods are subject to various limitations including inconsistency, deformation, and time consumption. This paper proposes LidarPainter, a one-step diffusion model that recovers consistent driving views from sparse LiDAR condition and artifact-corrupted renderings in real-time, enabling high-fidelity lane shifts in driving scene reconstruction. Extensive experiments show that LidarPainter outperforms state-of-the-art methods in speed, quality and resource efficiency, specifically 7× faster than StreetCrafter with only one fifth of GPU memory required. LidarPainter also supports stylized generation using text prompts such as “foggy” and “night”, allowing for a diverse expansion of the existing asset library.

AAAI Conference 2026 Conference Paper

Phys-Liquid: A Physics-Informed Dataset for Estimating 3D Geometry and Volume of Transparent Deformable Liquids

Ke Ma
Yizhou Fang
Jean-Baptiste Weibel
Shuai Tan
Xinggang Wang
Yang Xiao
Yi Fang
Tian Xia

Estimating the geometric and volumetric properties of transparent deformable liquids is challenging due to optical complexities and dynamic surface deformations induced by container movements. Autonomous robots performing precise liquid manipulation tasks—such as dispensing, aspiration, and mixing—must handle containers in ways that inevitably induce these deformations, complicating accurate liquid state assessment. Current datasets lack comprehensive physics-informed simulation data representing realistic liquid behaviors under diverse dynamic scenarios. To bridge this gap, we introduce Phys-Liquid, a physics-informed dataset comprising 97,200 simulation images and corresponding 3D meshes, capturing liquid dynamics across multiple laboratory scenes, lighting conditions, liquid colors, and container rotations. To validate the realism and effectiveness of Phys-Liquid, we propose a four-stage reconstruction and estimation pipeline involving liquid segmentation, multi-view mask generation, 3D mesh reconstruction, and real-world scaling. Experimental results demonstrate improved accuracy and consistency in reconstructing liquid geometry and volume, outperforming existing benchmarks. The dataset and associated validation methods facilitate future advancements in transparent liquid perception tasks.

ICRA Conference 2025 Conference Paper

A Visual Servo System for Robotic on-Orbit Servicing Based on 3D Perception of Non-Cooperative Satellite

Panpan Zhao
Li Jin
Yeheng Chen
Jiachen Li
Xiuqiang Song
Wenxuan Chen
Nan Li
Wenjuan Du

The 3D perception of satellites, including both their shape and pose, is a key foundation for robotic on-orbit servicing. However, the demanding space environment, such as intense and dim illumination, presents significant challenges. Previous non-cooperative methods focus on specific geometric features like solar panel brackets or docking rings, overlooking the satellite's overall shape and increasing the risk of collisions during grasping. Additionally, satellites are often weakly textured, limiting the accuracy of 3D perception. To address these issues, we propose, for the first time, a 3D perception-based visual servo system of non-cooperative satellites. This system combines reconstruction and tracking to enhance shape perception and pose estimation accuracy in orbital conditions. Specifically, we employ an alternating iterative strategy to simultaneously reconstruct and track the satellite and introduce a novel constraint to fuse different cues under extreme conditions. Further, we develop a simulation environment platform, a dual-arm microgravity grasping system, and an online monitoring module to enhance system capabilities for on-orbit servicing. Synthetic and real-world datasets from the simulation environment are also created for experimental validation. Results show that each module of our system achieves state-of-the-art performance.

AAAI Conference 2025 Conference Paper

Exploring Query Efficient Data Generation Towards Data-Free Model Stealing in Hard Label Setting

Gaozheng Pei
Shaojie Lyu
Ke Ma
Pinci Yang
Qianqian Xu
Yingfei Sun

Data-free model stealing involves replicating the functionality of a target model into a substitute model without accessing the target model's structure, parameters, or training data. Instead, the adversary can only access the target model's predictions for generated samples. Once the substitute model closely approximates the behavior of the target model, attackers can exploit its white-box characteristics for subsequent malicious activities, such as adversarial attacks. Existing methods within cooperative game frameworks often produce samples with high confidence for the prediction of the substitute model, which makes it difficult for the substitute model to replicate the behavior of the target model. This paper presents a new data-free model stealing approach called Query Efficient Data Generation (QEDG). We introduce two distinct loss functions to ensure the generation of sufficient samples that closely and uniformly align with the target model's decision boundary across multiple classes. Building on the limitation of current methods, which typically yield only one piece of supervised information per query, we propose the query-free sample augmentation that enables the acquisition of additional supervised information without increasing the number of queries. Motivated by theoretical analysis, we adopt the consistency rate metric, which more accurately evaluates the similarity between the substitute and target models. We conducted extensive experiments to verify the effectiveness of our proposed method, which achieved better performance with fewer queries compared to the state-of-the-art methods on the real MLaaS scenario and five datasets.

ICRA Conference 2025 Conference Paper

Real-World Automated Vehicle Longitudinal Stability Analysis: Controller Design and Field Test

Ke Ma
Yuqin Zhang
Hang Zhou
Zhaohui Liang
Xiaopeng Li 0020

Although extensive research has been conducted on modeling the stable longitudinal controller of automated vehicles (AVs) to dampen traffic oscillations, the real-world performance of these controllers in actual vehicles remains uncertain. In the operation of real-world AVs, the delay between actual dynamics and the commands prevents the controller's command from being effectively implemented to dampen traffic oscillations. Thus, this study adapts the designed controllers within an AV test platform to compare the theoretically stable conditions with the actual oscillation dampening performance. Initially, we compute the stable conditions for both the traditional car-following controller, which assumes no delay, and the longitudinal controller that accounts for the dynamic response of the vehicle. Through empirical experiments, we demonstrate that the longitudinal controller predicts vehicle stability more accurately than conventional car-following controller, showing an improvement from an average prediction accuracy rate of 0.59 to 0.91. Also, the experiments uncover specific delays inherent in dynamics systems, with a response delay of 0.34 seconds. Our work makes two principal contributions to the field of AV control systems. First, it empirically validates that the longitudinal model, which accounts for the vehicle's dynamic responses, offers a more precise representation of vehicular behavior. Second, the relatively brief response delay identified expands the stability region, thereby enhancing vehicle control and safety. The longitudinal controller is critical for enhancing AV performance and reliability in dampening traffic oscillations.

NeurIPS Conference 2024 Conference Paper

HAWK: Learning to Understand Open-World Video Anomalies

Jiaqi Tang
Hao Lu
Ruizheng Wu
Xiaogang Xu
Ke Ma
Cheng Fang
Bin Guo
Jiangbo Lu

Video Anomaly Detection (VAD) systems can autonomously monitor and identify disturbances, reducing the need for manual labor and associated costs. However, current VAD systems are often limited by their superficial semantic understanding of scenes and minimal user interaction. Additionally, the prevalent data scarcity in existing datasets restricts their applicability in open-world scenarios. In this paper, we introduce HAWK, a novel framework that leverages interactive large Visual Language Models (VLM) to interpret video anomalies precisely. Recognizing the difference in motion information between abnormal and normal videos, HAWK explicitly integrates motion modality to enhance anomaly identification. To reinforce motion attention, we construct an auxiliary consistency loss within the motion and video space, guiding the video branch to focus on the motion modality. Moreover, to improve the interpretation of motion-to-language, we establish a clear supervisory relationship between motion and its linguistic representation. Furthermore, we have annotated over 8,000 anomaly videos with language descriptions, enabling effective training across diverse open-world scenarios, and also created 8,000 question-answering pairs for users' open-world questions. The final results demonstrate that HAWK achieves SOTA performance, surpassing existing baselines in both video description generation and question-answering. Our codes/dataset/demo will be released at https://github.com/jqtangust/hawk.

NeurIPS Conference 2024 Conference Paper

Slicing Vision Transformer for Flexible Inference

Yitian Zhang
Huseyin Coskun
Xu Ma
Huan Wang
Ke Ma
Xi Chen
Derek H. Hu
Yun Fu

Vision Transformers (ViT) is known for its scalability. In this work, we target to scale down a ViT to fit in an environment with dynamic-changing resource constraints. We observe that smaller ViTs are intrinsically the sub-networks of a larger ViT with different widths. Thus, we propose a general framework, named Scala, to enable a single network to represent multiple smaller ViTs with flexible inference capability, which aligns with the inherent design of ViT to vary from widths. Concretely, Scala activates several subnets during training, introduces Isolated Activation to disentangle the smallest sub-network from other subnets, and leverages Scale Coordination to ensure each sub-network receives simplified, steady, and accurate learning objectives. Comprehensive empirical validations on different tasks demonstrate that with only one-shot training, Scala learns slimmable representation without modifying the original ViT structure and matches the performance of Separate Training. Compared with the prior art, Scala achieves an average improvement of 1.6% on ImageNet-1K with fewer parameters.

JBHI Journal 2022 Journal Article

Efficient Subject-Independent Detection of Anterior Cruciate Ligament Deficiency Based on Marine Predator Algorithm and Support Vector Machine

Gengyuan Wang
Xiaolong Zeng
Guanquan Lai
Guoqing Zhong
Ke Ma
Yu Zhang

Anterior cruciate ligament (ACL) deficiency not only reduces knee stability, but also increases the risk of more disease and impairs daily life, thus requiring efficient detection of ACL deficiency. To build an efficient subject-independent ACL deficiency detection model, this study proposes a new method called SVM-MPA that fuses marine predator algorithm (MPA) and support vector machine (SVM) for simultaneous feature selection, hyperparameter optimization and classification. 35 ACL-deficient (ACLD) and 35 ACL-intact (ACLI) participants were recruited to collect 6-degree-of-freedom knee kinematic data. Then, 216-dimensional multi-domain features covering time domain, frequency domain, time-frequency domain and nonlinearity were extracted. The error rate of SVM classification based on 5-fold cross-validation was used to construct the fitness of MPA, and MPA served to select features and optimize two hyperparameters for SVM. The majority voting strategy-based post-processing was introduced to convert the gait cycle-level to knee-level ACL deficiency detection. Comparing with 7 well-known meta-heuristic algorithms and running all 20 times, the best average gait cycle-level ACL deficiency detection performance (sensitivity: 96.78±0.4.84%, specificity: 99.43±5.70%, and accuracy: 98.48±1.70%) was obtained using the proposed method. With post-processing, this study improved the best (final) detection performance (sensitivity: 97.78±4.97%, specificity: 100±0.00%, and accuracy: 99.13±1.94%). These results demonstrate the feasibility and effectiveness of the proposed method and show that an efficient subject-independent ACL deficiency detection model can be constructed using the proposed method, which makes it possible to provide a non-invasive, objective and accurate preoperative auxiliary detection method for diagnosing ACL deficiency clinically.

IJCAI Conference 2022 Conference Paper

Quaternion Ordinal Embedding

Wenzheng Hou
Qianqian Xu
Ke Ma
Qianxiu Hao
Qingming Huang

Ordinal embedding (OE) aims to project objects into a low-dimensional space while preserving their ordinal constraints as well as possible. Generally speaking, a reasonable OE algorithm should simultaneously capture a) semantic meaning and b) the ordinal relationship of the objects. However, most of the existing methods merely focus on b). To address this issue, our goal in this paper is to seek a generic OE method to embrace the two features simultaneously. We argue that different dimensions of vector-based embedding are naturally entangled with each other. To realize a), we expect to decompose the D dimensional embedding space into D different semantic subspaces, where each subspace is associated with a matrix representation. Unfortunately, introducing a matrix-based representation requires far more complex parametric space than its vector-based counterparts. Thanks to the algebraic property of quaternions, we are able to find a more efficient way to represent a matrix with quaternions. For b), inspired by the classic chordal Grassmannian distance, a new distance function is defined to measure the distance between different quaternions/matrices, on top of which we construct a generic OE loss function. Experimental results for different tasks on both simulated and real-world datasets verify the effectiveness of our proposed method.

AAAI Conference 2021 Conference Paper

What to Select: Pursuing Consistent Motion Segmentation from Multiple Geometric Models

Yangbangyan Jiang
Qianqian Xu
Ke Ma
Zhiyong Yang
Xiaochun Cao
Qingming Huang

Motion segmentation aims at separating motions of different moving objects in a video sequence. Facing the complicated real-world scenes, recent studies reveal that combining multiple geometric models would be a more effective way than just employing a single one. This motivates a new wave of model-fusion based motion segmentation methods. However, the vast majority of models of this kind merely seek consensus in spectral embeddings. We argue that a simple consensus might be insufficient to filter out the harmful information which is either unreliable or semantically unrelated to the segmentation task. Therefore, how to automatically select valuable patterns across multiple models should be regarded as a key challenge here. In this paper, we present a novel geometric-model-fusion framework for motion segmentation, which targets at constructing a consistent affinity matrix across all the geometric models. Specifically, it incorporates the structural information shared by affinity matrices to select those semantically consistent entries. Meanwhile, a multiplicative decomposition scheme is adopted to ensure structural consistency among multiple affinities. To solve this problem, an alternative optimization scheme is proposed, together with a proof of its global convergence. Experiments on four real-world benchmarks show the superiority of the proposed method.

JBHI Journal 2020 Journal Article

Real-Time Detection of Compensatory Patterns in Patients With Stroke to Reduce Compensation During Robotic Rehabilitation Therapy

Siqi Cai
Guofeng Li
Enze Su
Xuyang Wei
Shuangyuan Huang
Ke Ma
Haiqing Zheng
Longhan Xie

Objectives: Compensations are commonly employed by patients with stroke during rehabilitation without therapist supervision, leading to suboptimal recovery outcomes. This study investigated the feasibility of the real-time monitoring of compensation in patients with stroke by using pressure distribution data and machine learning algorithms. Whether trunk compensation can be reduced by combining the online detection of compensation and haptic feedback of a rehabilitation robot was also investigated. Methods: Six patients with stroke did three forms of reaching movements while pressure distribution data were recorded as Dataset1. A support vector machine (SVM) classifier was trained with features extracted from Dataset1. Then, two other patients with stroke performed reaching tasks, and the SVM classifier trained by Dataset1 was employed to classify the compensatory patterns online. Based on the real-time monitoring of compensation, a rehabilitation robot provided an assistive force to patients with stroke to reduce compensations. Results: Good classification performance (F1 score > 0.95) was obtained in both offline and online compensation analysis using the SVM classifier and pressure distribution data of patients with stroke. Based on the real-time detection of compensatory patterns, the angles of trunk rotation, trunk lean-forward and trunk-scapula elevation decreased by 46.95%, 32.35% and 23.75%, respectively. Conclusion: High classification accuracies verified the feasibility of detecting compensation in patients with stroke based on pressure distribution data. Since the validity and reliability of the online detection of compensation has been verified, this classifier can be incorporated into a rehabilitation robot to reduce trunk compensations in patients with stroke.

AAAI Conference 2019 Conference Paper

Less but Better: Generalization Enhancement of Ordinal Embedding via Distributional Margin

Ke Ma
Qianqian Xu
Zhiyong Yang
Xiaochun Cao

In the absence of prior knowledge, ordinal embedding methods obtain new representation for items in a low-dimensional Euclidean space via a set of quadruple-wise comparisons. These ordinal comparisons often come from human annotators, and sufficient comparisons induce the success of classical approaches. However, collecting a large number of labeled data is known as a hard task, and most of the existing work pay little attention to the generalization ability with insufficient samples. Meanwhile, recent progress in large margin theory discloses that rather than just maximizing the minimum margin, both the margin mean and variance, which characterize the margin distribution, are more crucial to the overall generalization performance. To address the issue of insufficient training samples, we propose a margin distribution learning paradigm for ordinal embedding, entitled Distributional Margin based Ordinal Embedding (DMOE). Precisely, we first define the margin for ordinal embedding problem. Secondly, we formulate a concise objective function which avoids maximizing margin mean and minimizing margin variance directly but exhibits the similar effect. Moreover, an Augmented Lagrange Multiplier based algorithm is customized to seek the optimal solution of DMOE effectively. Experimental studies on both simulated and real-world datasets are provided to show the effectiveness of the proposed algorithm.

AAAI Conference 2019 Conference Paper

Robust Ordinal Embedding from Contaminated Relative Comparisons

Ke Ma
Qianqian Xu
Xiaochun Cao

Existing ordinal embedding methods usually follow a two-stage routine: outlier detection is first employed to pick out the inconsistent comparisons; then an embedding is learned from the clean data. However, learning in a multi-stage manner is well-known to suffer from sub-optimal solutions. In this paper, we propose a unified framework to jointly identify the contaminated comparisons and derive reliable embeddings. The merits of our method are three-fold: (1) By virtue of the proposed unified framework, the sub-optimality of traditional methods is largely alleviated; (2) The proposed method is aware of global inconsistency by minimizing a corresponding cost, while traditional methods only involve local inconsistency; (3) Instead of considering the nuclear norm heuristics, we adopt an exact solution for rank equality constraint. Our studies are supported by experiments with both simulated examples and real-world data. The proposed framework provides us a promising tool for robust ordinal embedding from the contaminated comparisons.

AAAI Conference 2018 Conference Paper

Stochastic Non-Convex Ordinal Embedding With Stabilized Barzilai-Borwein Step Size

Ke Ma
Jinshan Zeng
Jiechao Xiong
Qianqian Xu
Xiaochun Cao
Wei Liu
Yuan Yao

Learning representation from relative similarity comparisons, often called ordinal embedding, gains rising attention in recent years. Most of the existing methods are batch methods designed mainly based on the convex optimization, say, the projected gradient descent method. However, they are generally time-consuming due to that the singular value decomposition (SVD) is commonly adopted during the update, especially when the data size is very large. To overcome this challenge, we propose a stochastic algorithm called SVRG-SBB, which has the following features: (a) SVD-free via dropping convexity, with good scalability by the use of stochastic algorithm, i.e., stochastic variance reduced gradient (SVRG), and (b) adaptive step size choice via introducing a new stabilized Barzilai-Borwein (SBB) method as the original version for convex problems might fail for the considered stochastic non-convex optimization problem. Moreover, we show that the proposed algorithm converges to a stationary point at a rate O(1/T) in our setting, where T is the number of total iterations. Numerous simulations and real-world data experiments are conducted to show the effectiveness of the proposed algorithm via comparing with the state-of-the-art methods, particularly, much lower computational cost with good prediction performance.
