Arrow Research search

Author name cluster

Xiaochen Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers

5

EAAI Journal 2024 Journal Article

Graph Confident Learning for Software Vulnerability Detection

  • Qian Wang
  • Zhengdao Li
  • Hetong Liang
  • Xiaowei Pan
  • Hui Li
  • Tingting Li
  • Xiaochen Li
  • Chenchen Li

Code vulnerability exposes millions of software to the possibility of being attacked, as evidence every year on increasing reports of security issues, such as information leaks, system compromise, and denial of service. Despite with many vulnerability detection models proposed so far, their effectiveness is still limited due to the ignorance of syntactic structural information analysis in source code and the improper handling of labeling errors. To address these issues, we propose the Graph Confident Learning for Software Vulnerability Detection (GCL4SVD) model, a machine learning model to detect software vulnerability in the development phase. It comprises two components: code graph embedding and graph confident learning denoising. To address the syntactic structural information analysis limitation, the code graph embedding component extracts the structure and semantic information of source code with a sliding window mechanism, and then encodes source code into a graph structure to capture the patterns and characteristics of code vulnerabilities. Additionally, the graph confident learning denoising component identifies labeling errors to improve the quality of training set. Experimental results show that GCL4SVD outperforms the state-of-the-art vulnerability detection models on four open source datasets by 3. 7%, 3. 3%, 2. 5%, 0. 8% in terms of Accuracy, respectively, and by 10. 2%, 21. 8%, 8. 2%, 11. 2% in terms of F1-score.

IJCAI Conference 2024 Conference Paper

VF-Detector: Making Multi-Granularity Code Changes on Vulnerability Fix Detector Robust to Mislabeled Changes

  • Zhenkan Fu
  • Shikai Guo
  • Hui Li
  • Rong Chen
  • Xiaochen Li
  • He Jiang

As software development projects increasingly rely on open-source software, users face the risk of security vulnerabilities from third-party libraries. To address label and character noise in code changes, we present VF-Detector to automatically identifying bug-fix commits in actual noise development environment. VF-Detector consists of three componments: Data Pre-processing (DP), Vulnerability Confidence Computation (VCC) and Confidence Learning Denoising (CLD). The DP component is responsible for preprocessing code change data. The VCC component calculates code change confidence value for each bug-fix by extracting features at various granularity levels. The CLD component removes noise and enhances model robustness by pruning noisy data with confidence values and performing effort-aware adjustments. Experimental results demonstrate VF-Detector's superiority over state-of-the-art methods in EffortCost@L and Popt@L metrics on Java and Python datasets. The improvements were 6. 5% and 5% for Java, and 23. 4% and 17. 8% for Python.

ICML Conference 2023 Conference Paper

Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization

  • Stone Tao
  • Xiaochen Li
  • Tongzhou Mu
  • Zhiao Huang
  • Yuzhe Qin
  • Hao Su 0001

Training long-horizon robotic policies in complex physical environments is essential for many applications, such as robotic manipulation. However, learning a policy that can generalize to unseen tasks is challenging. In this work, we propose to achieve one-shot task generalization by decoupling plan generation and plan execution. Specifically, our method solves complex long-horizon tasks in three steps: build a paired abstract environment by simplifying geometry and physics, generate abstract trajectories, and solve the original task by an abstract-to-executable trajectory translator. In the abstract environment, complex dynamics such as physical manipulation are removed, making abstract trajectories easier to generate. However, this introduces a large domain gap between abstract trajectories and the actual executed trajectories as abstract trajectories lack low-level details and are not aligned frame-to-frame with the executed trajectory. In a manner reminiscent of language translation, our approach leverages a seq-to-seq model to overcome the large domain gap between the abstract and executable trajectories, enabling the low-level policy to follow the abstract trajectory. Experimental results on various unseen long-horizon tasks with different robot embodiments demonstrate the practicability of our methods to achieve one-shot task generalization.