Arrow Research search

Author name cluster

Qian Ma

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers

7

IROS Conference 2025 Conference Paper

LaneMind: Seeing Lanes Like Human Drivers

  • Zhengyan Qian
  • Qian Ma

Accurate lane detection is critical for autonomous driving safety. In recent years, anchor-based detection methods have made significant progress. However, existing frameworks struggle in complex scenarios such as nighttime or dazzling-light environments. Additionally, these methods exhibit limited geometric modeling and extrapolation capabilities for curvature variations in curved lanes. To tackle these challenges, we propose LaneMind, an innovative framework that combines human visual perception principles with advanced geometric modeling. Our approach features a dual-path architecture with a cross-path attention mechanism, enabling simultaneous local feature extraction and global structure modeling. The network outputs a confidence heatmap, followed by a skeleton-guided regression module that extracts medial-axis skeletons from high-probability lane regions to precisely localize lanes while maintaining topological continuity. Experimental results demonstrate that LaneMind achieves competitive performance across various benchmarks, particularly excelling in challenging curved-lane scenarios and adverse lighting conditions. The framework's robust performance and accurate detection quality highlight its potential for real-world autonomous driving applications.
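A minimal sketch of turning a lane confidence heatmap into lane points. The paper extracts medial-axis skeletons; the per-row argmax below is only an illustrative stand-in for that step, with a tiny synthetic heatmap as input.

```python
def lane_points_from_heatmap(heatmap, threshold=0.5):
    """Toy stand-in for skeleton-guided localization: for each image
    row, keep the single highest-confidence column if it exceeds the
    threshold. LaneMind uses medial-axis skeletons instead; this
    simplification just illustrates heatmap -> point decoding."""
    points = []
    for y, row in enumerate(heatmap):
        x = max(range(len(row)), key=lambda i: row[i])
        if row[x] >= threshold:
            points.append((x, y))
    return points

# 4x5 synthetic heatmap with a lane drifting right as y grows
heatmap = [
    [0.1, 0.9, 0.2, 0.0, 0.0],
    [0.0, 0.3, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.4, 0.9, 0.1],
    [0.0, 0.0, 0.2, 0.3, 0.7],
]
print(lane_points_from_heatmap(heatmap))  # [(1, 0), (2, 1), (3, 2), (4, 3)]
```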

NeurIPS Conference 2025 Conference Paper

LBMKGC: Large Model-Driven Balanced Multimodal Knowledge Graph Completion

  • Yuan Guo
  • Qian Ma
  • Hui Li
  • Qiao Ning
  • Furui Zhan
  • Yu Gu
  • Ge Yu
  • Shikai Guo

Multi-modal Knowledge Graph Completion (MMKGC) aims to predict missing entities, relations, or attributes in knowledge graphs by collaboratively modeling the triple structure and the multimodal information (e.g., text, images, videos) associated with entities. This approach facilitates the automatic discovery of previously unobserved factual knowledge. However, existing MMKGC methods encounter several critical challenges: (i) the imbalance of inter-entity information across different modalities; (ii) the heterogeneity of intra-entity multimodal information; and (iii) for a given entity, the informational contributions of different modalities are inconsistent across contexts. In this paper, we propose a novel Large model-driven Balanced Multimodal Knowledge Graph Completion framework, termed LBMKGC. To bridge the semantic gap between heterogeneous modalities, LBMKGC semantically aligns the multimodal embeddings of entities using the CLIP (Contrastive Language-Image Pre-Training) model. Furthermore, LBMKGC adaptively fuses multimodal embeddings with relational guidance by distinguishing between the perceptual and conceptual attributes of triples. Finally, extensive experiments conducted against 21 state-of-the-art baselines demonstrate that LBMKGC achieves superior performance across diverse datasets and scenarios while maintaining efficiency and generalizability. Our code and data are publicly available at: https://github.com/guoynow/LBMKGC.
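A minimal sketch of adaptive multimodal fusion. This is not LBMKGC's actual mechanism: the relevance scores, which in the paper would be derived from the relational context, are plain numbers here, and the fusion is a simple softmax-weighted sum of per-modality embeddings.

```python
import math

def fuse_modalities(modal_embs, relevance):
    """Illustrative adaptive fusion: weight each modality embedding
    by a softmax over per-modality relevance scores, then sum.
    `modal_embs` maps modality name -> embedding (list of floats);
    `relevance` lists one score per modality, in the same order."""
    mx = max(relevance)
    exps = [math.exp(r - mx) for r in relevance]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(next(iter(modal_embs.values())))
    fused = [0.0] * dim
    for w, emb in zip(weights, modal_embs.values()):
        for j in range(dim):
            fused[j] += w * emb[j]
    return weights, fused

weights, fused = fuse_modalities(
    {"structure": [1.0, 0.0], "text": [0.0, 1.0]},
    relevance=[0.0, 0.0],  # equal relevance -> equal weights
)
print(weights)  # [0.5, 0.5]
```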

NeurIPS Conference 2025 Conference Paper

Vocabulary In-Context Learning in Transformers: Benefits of Positional Encoding

  • Qian Ma
  • Ruoxiang Xu
  • Yongqiang Cai

Numerous studies have demonstrated that the Transformer architecture possesses the capability for in-context learning (ICL). In scenarios involving function approximation, context can serve as a control parameter for the model, endowing it with the universal approximation property (UAP). In practice, context is represented by tokens from a finite set, referred to as a vocabulary; this is the case considered in this paper, i.e., vocabulary in-context learning (VICL). We demonstrate that VICL in single-layer Transformers without positional encoding does not possess the UAP; however, the UAP can be achieved when positional encoding is included. Several sufficient conditions for the positional encoding are provided. Our findings reveal the benefits of positional encoding, from an approximation-theory perspective, in the setting of in-context learning.
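A minimal sketch of why positional encoding matters here: without it, two occurrences of the same vocabulary token are identical inputs. The standard sinusoidal encoding below is just one concrete instance; the paper's sufficient conditions cover a broader class of encodings.

```python
import math

def sinusoidal_pe(pos, d_model):
    """Standard sinusoidal positional encoding: alternating
    sin/cos of pos scaled by geometrically spaced frequencies."""
    pe = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        pe.append(math.sin(angle))
        if i + 1 < d_model:
            pe.append(math.cos(angle))
    return pe

# The same vocabulary token at two positions: adding PE makes the
# model's inputs position-dependent, breaking the symmetry.
tok = [0.5, 0.5, 0.5, 0.5]
x0 = [t + p for t, p in zip(tok, sinusoidal_pe(0, 4))]
x1 = [t + p for t, p in zip(tok, sinusoidal_pe(1, 4))]
print(x0 != x1)  # True
```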

EAAI Journal 2024 Journal Article

Structuring Meaningful Code Review Automation in Developer Community

  • Zhenzhen Cao
  • Sijia Lv
  • Xinlong Zhang
  • Hui Li
  • Qian Ma
  • Tingting Li
  • Cheng Guo
  • Shikai Guo

Software code review is a crucial quality-assurance procedure for software systems. As a result, several automated code review models have been proposed that jointly consider the reviewer's comments and the code. Notably, these previous models have not solved the problem of insufficient diversity in the generated code, which can lead to low accuracy of the generated modified code. Therefore, we introduce a method called SMILER (Structuring Meaningful Code Review) to improve the effectiveness of code review by enhancing the diversity of the generated code. Specifically, SMILER consists of two models, each of which comprises four components: an encoder, a decoder, a prior net, and a posterior net. The encoder and decoder learn parameters and generate candidate code for automating the code review process. In the prior net and posterior net, Gaussian noise is introduced to increase the diversity of the generated code and improve the performance of the model. Experimental studies on 17,194 code pairs and triplets demonstrate that SMILER outperforms state-of-the-art models, in terms of perfect prediction, from the perspectives of both the reviewer and the developer.
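A minimal sketch of how prior/posterior nets inject Gaussian noise in CVAE-style generators: the reparameterization trick samples a latent code z = mu + sigma * eps with eps ~ N(0, 1), so repeated decoding yields diverse outputs. This is a generic sketch of the technique, not SMILER's actual networks.

```python
import math
import random

def reparameterize(mu, logvar, rng):
    """Reparameterization trick: z = mu + exp(logvar / 2) * eps,
    eps ~ N(0, 1). The Gaussian noise diversifies downstream
    generation while keeping the sampling step differentiable
    with respect to mu and logvar."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

rng = random.Random(0)
mu, logvar = [0.0, 1.0], [0.0, 0.0]   # sigma = 1 in both dimensions
samples = [reparameterize(mu, logvar, rng) for _ in range(3)]
# Each call yields a different latent code centered on mu,
# which a decoder would turn into a different candidate revision.
print(len(samples), samples[0] != samples[1])
```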

EAAI Journal 2023 Journal Article

MIVAE: Multiple Imputation based on Variational Auto-Encoder

  • Qian Ma
  • Xia Li
  • Mei Bai
  • Xite Wang
  • Bo Ning
  • Guanyu Li

Nowadays, missing-value (MV) imputation has become one of the research hotspots in the field of data quality, since MVs are prevalent in real-world datasets and pose challenges to advanced data analytics algorithms. To impute the MVs, most existing approaches directly derive one estimation for each MV, which is categorized as single imputation (SI). However, SI ignores the uncertainty of the MVs and thereby usually yields unsatisfactory imputation results compared with multiple imputation (MI). To capture the uncertainty of the MVs, MI algorithms derive multiple candidate estimations for each MV. Nevertheless, existing MI approaches are few due to the complicated data-handling process. Accordingly, in this paper, building on the Variational Auto-Encoder (VAE) model, we propose a new MI approach, namely MIVAE (Multiple Imputation based on Variational Auto-Encoder), to impute MVs in tabular data. In MIVAE, we first add a corrupted input layer (where synthetic MVs are introduced) adjacent to the original input layer, making the model capable of handling the MV issue. Then, we obtain multiple rather than single candidate estimations for each data sample from the posterior distribution of the latent variables learned by our designed model. In this way, multiple imputation is effectively implemented and the uncertainty of the MVs is captured. Next, to obtain satisfactory imputation results, we add a data-analysis layer at the end of the network to intelligently integrate the multiple candidate estimations. Finally, experimental results over four real-world datasets demonstrate that MIVAE achieves significantly higher imputation accuracy than existing solutions, and that MIVAE is capable of handling both numerical and categorical tabular data.
For example, the imputation accuracy of MIVAE improves by up to about 40% and 30% compared with PMM and MIWAE (the state-of-the-art MI approaches), respectively, over the CropMapping dataset. Moreover, we train a MIVAE model on each of three datasets containing MVs. Leveraging the trained MIVAE, the classification performance over the imputed data is similar to that over the complete data.
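A minimal sketch of the multiple-imputation idea: draw several candidate estimates per missing value and pool them, rather than committing to a single point estimate. The `posterior_draw` sampler here is a hypothetical stand-in for MIVAE's learned latent posterior.

```python
import random
import statistics

def multiple_impute(value, posterior_draw, m=5, rng=None):
    """Toy MI step: if `value` is missing (None), draw m candidate
    estimates from `posterior_draw` (a stand-in posterior sampler)
    and pool them by their mean; observed values pass through.
    MIVAE instead samples multiple latent codes from a learned VAE
    posterior and integrates candidates with a data-analysis layer."""
    if value is not None:
        return value, []
    candidates = [posterior_draw(rng) for _ in range(m)]
    return statistics.mean(candidates), candidates

rng = random.Random(42)
imputed, cands = multiple_impute(None, lambda r: r.gauss(3.0, 0.5),
                                 m=5, rng=rng)
print(len(cands))  # 5
```

The spread of the m candidates is what single imputation throws away: it is exactly the per-value uncertainty that downstream analyses can propagate.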

EAAI Journal 2021 Journal Article

AONet: Active Offset Network for crowd flow prediction

  • Dafeng Wang
  • Qian Ma
  • Naiyao Wang
  • Xuanzhe Fan
  • MingYu Lu
  • Hongbo Liu

Predicting crowd flow is of great importance to public safety and traffic management, yet crowd flow is difficult to predict accurately and in a timely manner due to the uncertainty of future positions. In this paper, we propose a novel Active Offset Network (AONet), in which an ActiveGRU (Active Gate Recurrent Unit) is designed to predict the variation of pedestrians' positions in the crowd flow. Its location-variant recurrent structure is implemented by applying convolution operations to low-dimensional spatio-temporal sequences to obtain fractional offset locations. The sampling locations are then determined by bilinear interpolation at these fractional offset locations. Moreover, a probabilistic sparse strategy is introduced to reduce the links between sampling locations during supervised training. Finally, experiments on popular benchmarks demonstrate that our method can actively characterize the future positions of pedestrians, and that the proposed AONet is superior to state-of-the-art baselines in terms of both accuracy and computational savings.
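A minimal sketch of the bilinear-interpolation step the abstract mentions: reading a value from a grid at a fractional (offset) location by blending its four integer neighbours. A simplified illustration, assuming the queried location stays inside the grid.

```python
def bilinear_sample(grid, x, y):
    """Bilinear interpolation at a fractional location (x, y):
    blend the four surrounding grid cells, weighted by the
    fractional parts fx, fy. `grid` is a 2D list indexed as
    grid[row][col]."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(grid[0]) - 1)
    y1 = min(y0 + 1, len(grid) - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * grid[y0][x0] + fx * grid[y0][x1]
    bot = (1 - fx) * grid[y1][x0] + fx * grid[y1][x1]
    return (1 - fy) * top + fy * bot

grid = [[0.0, 1.0],
        [2.0, 3.0]]
print(bilinear_sample(grid, 0.5, 0.5))  # 1.5
```

Because the result varies smoothly with (x, y), a network predicting the fractional offsets can be trained end to end through this sampling step, which is what makes fractional offset locations usable at all.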

TCS Journal 2019 Journal Article

Index set expressions can represent temporal logic formulas

  • Zhenhua Duan
  • Cong Tian
  • Nan Zhang
  • Qian Ma
  • Hongwei Du

In Temporal Logic (TL), a well-formed formula is generally formed by applying the rules of its syntax finitely many times. However, under some circumstances, formulas such as those expressed by index set expressions, although constructed by applying the syntax rules infinitely many times, may still be well-formed, since equivalent concise-syntax formulas can be found for them. With this motivation, this paper investigates the relationship between formulas specified by index set expressions and the concise syntax by means of a fixed-point approach. Firstly, we present two kinds of formulas, namely ⋁_{i∈ℕ₀} ◯ⁱQ and ⋁_{i∈ℕ₀} Qⁱ, and prove that they are indeed well-formed by showing that they are equivalent to the formulas ◇Q and Q* respectively. Further, we generalize ⋁_{i∈ℕ₀} ◯ⁱQ to ⋁_{i∈ℕ₀} P(i) ∧ ◯ⁱQ and explore the least and greatest fixed points of the abstract equation X ≡ Q ∨ (P ∧ ◯X). Based on these, some well-formed special instances of ⋁_{i∈ℕ₀} P(i) ∧ ◯ⁱQ are obtained. Besides, with the index-set-expression technique, we equivalently represent the 'U' (strong until) and 'W' (weak until) constructs of propositional Linear Temporal Logic (LTL) within Propositional Projection Temporal Logic (PPTL).
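The fixed-point reading of the first family can be written out explicitly; the unfolding below uses only the equivalences stated in the abstract, with P ≡ true specializing the abstract equation:

```latex
% Taking P \equiv \mathrm{true} in X \equiv Q \lor (P \land \bigcirc X)
% gives the familiar unfolding of "eventually":
\Diamond Q \;\equiv\; Q \lor \bigcirc \Diamond Q
           \;\equiv\; \bigvee_{i \in \mathbb{N}_0} \bigcirc^{i} Q ,
```

where ◇Q is the least solution of the equation, matching the claim that the infinite disjunction is well-formed because a concise-syntax equivalent exists.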