Author name cluster

Junjie Xu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers

2 author rows

TIST Journal 2025 Journal Article

A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness

Fali Wang
Zhiwei Zhang
Xianren Zhang
Zongyu Wu
TzuHao Mo
Qiuhao Lu
Wanjing Wang
Rui Li

Large language models (LLMs) have demonstrated emergent abilities in text generation, question answering, and reasoning, facilitating various tasks and domains. Despite their proficiency in various tasks, LLMs like PaLM 540B and Llama-3.1 405B face limitations due to large parameter sizes and computational demands, often requiring cloud API use, which raises privacy concerns, limits real-time applications on edge devices, and increases fine-tuning costs. Additionally, LLMs often underperform in specialized domains such as healthcare and law due to insufficient domain-specific knowledge, necessitating specialized models. Therefore, Small Language Models (SLMs) are increasingly favored for their low inference latency, cost-effectiveness, efficient development, and easy customization and adaptability. These models are particularly well-suited for resource-limited environments and domain knowledge acquisition, addressing LLMs’ challenges and proving ideal for applications that require localized data handling for privacy, minimal inference latency for efficiency, and domain knowledge acquisition through lightweight fine-tuning. The rising demand for SLMs has spurred extensive research and development. However, a comprehensive survey investigating issues related to the definition, acquisition, application, enhancement, and reliability of SLMs remains lacking, prompting us to conduct a detailed survey on these topics. The definition of SLMs varies widely; thus, to standardize, we propose defining SLMs by their capability to perform specialized tasks and suitability for resource-constrained settings, setting boundaries based on the minimal size for emergent abilities and the maximum size sustainable under resource constraints. For other aspects, we provide a taxonomy of relevant models/methods and develop general frameworks for each category to enhance and utilize SLMs effectively.
We have compiled the collected SLM models and related methods on GitHub: https://github.com/FairyFali/SLMs-Survey.


ICLR Conference 2025 Conference Paper

Beyond Sequence: Impact of Geometric Context for RNA Property Prediction

Junjie Xu
Artem Moskalev
Tommaso Mansi
Mangal Prakash
Rui Liao

Accurate prediction of RNA properties, such as stability and interactions, is crucial for advancing our understanding of biological processes and developing RNA-based therapeutics. RNA structures can be represented as 1D sequences, 2D topological graphs, or 3D all-atom models, each offering different insights into its function. Existing works predominantly focus on 1D sequence-based models, which overlook the geometric context provided by 2D and 3D geometries. This study presents the first systematic evaluation of incorporating explicit 2D and 3D geometric information into RNA property prediction, considering not only performance but also real-world challenges such as limited data availability, partial labeling, sequencing noise, and computational efficiency. To this end, we introduce a newly curated set of RNA datasets with enhanced 2D and 3D structural annotations, providing a resource for model evaluation on RNA data. Our findings reveal that models with explicit geometry encoding generally outperform sequence-based models, with an average prediction RMSE reduction of around 12% across various RNA tasks and excelling in low-data and partial labeling regimes, underscoring the value of explicitly incorporating geometric context. On the other hand, geometry-unaware sequence-based models are more robust under sequencing noise but often require around 2-5x training data to match the performance of geometry-aware models. Our study offers further insights into the trade-offs between different RNA representations in practical applications and addresses a significant gap in evaluating deep learning models for RNA tasks.


NeurIPS Conference 2025 Conference Paper

DualEquiNet: A Dual-Space Hierarchical Equivariant Network for Large Biomolecules

Junjie Xu
Jiahao Zhang
Mangal Prakash
Xiang Zhang
Suhang Wang

Geometric graph neural networks (GNNs) that respect E(3) symmetries have achieved strong performance on small molecule modeling, but they face scalability and expressiveness challenges when applied to large biomolecules such as RNA and proteins. These systems require models that can simultaneously capture fine-grained atomic interactions, long-range dependencies across spatially distant components, and biologically relevant hierarchical structure—such as atoms forming residues, which in turn form higher-order domains. Existing geometric GNNs, which typically operate exclusively in either Euclidean or Spherical Harmonics space, are limited in their ability to capture both the fine-scale atomic details and the long-range, symmetry-aware dependencies required for modeling the multi-scale structure of large biomolecules. We introduce DualEquiNet, a Dual-Space Hierarchical Equivariant Network that constructs complementary representations in both Euclidean and Spherical Harmonics spaces to capture local geometry and global symmetry-aware features. DualEquiNet employs bidirectional cross-space message passing and a novel Cross-Space Interaction Pooling mechanism to hierarchically aggregate atomic features into biologically meaningful units, such as residues, enabling efficient and expressive multi-scale modeling for large biomolecular systems. DualEquiNet achieves state-of-the-art performance on multiple existing benchmarks for RNA property prediction and protein modeling, and outperforms prior methods on two newly introduced 3D structural benchmarks demonstrating its broad effectiveness across a range of large biomolecule modeling tasks.


ICML Conference 2025 Conference Paper

Geometric Hyena Networks for Large-scale Equivariant Learning

Artem Moskalev
Mangal Prakash
Junjie Xu
Tianyu Cui
Rui Liao
Tommaso Mansi

Processing global geometric context while preserving equivariance is crucial when modeling biological, chemical, and physical systems. Yet, this is challenging due to the computational demands of equivariance and global context at scale. Standard methods such as equivariant self-attention suffer from quadratic complexity, while local methods such as distance-based message passing sacrifice global information. Inspired by the recent success of state-space and long-convolutional models, we introduce Geometric Hyena, the first equivariant long-convolutional model for geometric systems. Geometric Hyena captures global geometric context at sub-quadratic complexity while maintaining equivariance to rotations and translations. Evaluated on all-atom property prediction of large RNA molecules and full protein molecular dynamics, Geometric Hyena outperforms existing equivariant models while requiring significantly less memory and compute than equivariant self-attention. Notably, our model processes the geometric context of $30k$ tokens $20 \times$ faster than the equivariant transformer and allows $72 \times$ longer context within the same budget.


ICRA Conference 2025 Conference Paper

Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences

Ziang Liu 0019
Junjie Xu
Xingjiao Wu
Jing Yang 0023
Liang He 0001

Preference-based reinforcement learning (PBRL) learns directly from the preferences of human teachers regarding agent behaviors without needing meticulously designed reward functions. However, existing PBRL methods often learn primarily from explicit preferences, neglecting the possibility that teachers may choose equal preferences. This neglect may hinder the understanding of the agent regarding the task perspective of the teacher, leading to the loss of important information. To address this issue, we introduce the Equal Preference Learning Task, which optimizes the neural network by promoting similar reward predictions when the behaviors of two agents are labeled as equal preferences. Building on this task, we propose a novel PBRL method, Multi-Type Preference Learning (MTPL), which allows simultaneous learning from equal preferences while leveraging existing methods for learning from explicit preferences. To validate our approach, we design experiments applying MTPL to four existing state-of-the-art baselines across ten locomotion and robotic manipulation tasks in the DeepMind Control Suite. The experimental results indicate that simultaneous learning from both equal and explicit preferences enables the PBRL method to more comprehensively understand the feedback from teachers, thereby enhancing feedback efficiency. Project page: https://github.com/FeiCuiLengMMbb/paper_MTPL
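The idea of learning from both explicit and equal preferences can be illustrated with a minimal sketch. This is not the authors' implementation: the Bradley-Terry style cross-entropy for explicit preferences is a common choice in preference-based RL and is assumed here, and the squared-difference penalty for equal preferences is only one plausible way to "promote similar reward predictions". All function names and the `equal_weight` parameter are hypothetical.

```python
import math

def explicit_preference_loss(rewards_a, rewards_b, a_preferred):
    """Cross-entropy under an assumed Bradley-Terry model on segment returns."""
    diff = sum(rewards_a) - sum(rewards_b)
    # s is the return margin in favour of the segment the teacher preferred.
    s = diff if a_preferred else -diff
    # -log(sigmoid(s)) written as log(1 + exp(-s)).
    return math.log1p(math.exp(-s))

def equal_preference_loss(rewards_a, rewards_b):
    """Assumed penalty: squared gap between the two predicted segment returns."""
    return (sum(rewards_a) - sum(rewards_b)) ** 2

def multi_type_loss(pairs, equal_weight=1.0):
    """Sum both loss types over pairs labelled 'a', 'b', or 'equal'."""
    total = 0.0
    for rewards_a, rewards_b, label in pairs:
        if label == "equal":
            total += equal_weight * equal_preference_loss(rewards_a, rewards_b)
        else:
            total += explicit_preference_loss(rewards_a, rewards_b, label == "a")
    return total
```

In this sketch, a pair labelled "equal" contributes zero loss only when the predicted returns of both segments match, while explicit labels push the preferred segment's return above the other.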


ICLR Conference 2025 Conference Paper

Robustness Inspired Graph Backdoor Defense

Zhiwei Zhang 0028
Minhua Lin
Junjie Xu
Zongyu Wu 0001
Enyan Dai
Suhang Wang

Graph Neural Networks (GNNs) have achieved promising results in tasks such as node classification and graph classification. However, recent studies reveal that GNNs are vulnerable to backdoor attacks, posing a significant threat to their real-world adoption. Despite initial efforts to defend against specific graph backdoor attacks, there is no work on defending against various types of backdoor attacks where generated triggers have different properties. Hence, we first empirically verify that prediction variance under edge dropping is a crucial indicator for identifying poisoned nodes. With this observation, we propose using random edge dropping to detect backdoors and theoretically show that it can efficiently distinguish poisoned nodes from clean ones. Furthermore, we introduce a novel robust training strategy to efficiently counteract the impact of the triggers. Extensive experiments on real-world datasets show that our framework can effectively identify poisoned nodes, significantly degrade the attack success rate, and maintain clean accuracy when defending against various types of graph backdoor attacks with different properties. Our code is available at: https://github.com/zzwjames/RIGBD.
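The observation that prediction variance under random edge dropping separates poisoned nodes from clean ones can be shown on a toy graph. The sketch below is an assumption-laden illustration, not the RIGBD code: a one-step mean over neighbor features stands in for a trained GNN, node 4 plays the role of an attached trigger with an extreme feature value, and every name and parameter is hypothetical.

```python
import random
import statistics

def predict(node, edges, features):
    """Toy stand-in for a GNN: mean of the node's own and neighbors' features."""
    neighbors = [v for u, v in edges if u == node] + [u for u, v in edges if v == node]
    values = [features[node]] + [features[n] for n in neighbors]
    return sum(values) / len(values)

def prediction_variance(nodes, edges, features, drop_prob=0.5, trials=50, seed=0):
    """Variance of each node's prediction across random edge-dropping trials."""
    rng = random.Random(seed)
    predictions = {n: [] for n in nodes}
    for _ in range(trials):
        # Keep each edge independently with probability 1 - drop_prob.
        kept = [e for e in edges if rng.random() >= drop_prob]
        for n in nodes:
            predictions[n].append(predict(n, kept, features))
    return {n: statistics.pvariance(p) for n, p in predictions.items()}

# Node 0 is linked to a trigger-like node (4); node 5 has only clean neighbors.
features = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 100.0, 5: 1.0, 6: 1.0, 7: 1.0, 8: 1.0}
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (5, 6), (5, 7), (5, 8)]
variance = prediction_variance([0, 5], edges, features)
```

Here the prediction for node 0 swings whenever the trigger edge is dropped, so its variance is large, while node 5 stays constant. A detector in this spirit would flag nodes whose variance exceeds some threshold.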


IROS Conference 2023 Conference Paper

Magnetically Controlled Cell Robots with Immune-Enhancing Potential

Hongyan Sun
Yuguo Dai
Jiaying Zhang
Junjie Xu
Lina Jia
Chutian Wang
Luyao Wang
Chan Li

Magnetic microrobots show enormous potential for targeted drug delivery owing to remote wireless manipulation and minimal invasiveness in medical treatment. A high degree of freedom gives magnetically propelled robots extraordinary application prospects, since they can be controlled precisely when different magnetic field sources work cooperatively. However, the biocompatibility of microrobots has attracted sustained and general concern. Therefore, it is highly necessary to develop a promising carrier with high biocompatibility and to investigate the mechanism of drug loading and release triggered by the special microenvironment in the targeted region. In this paper, we propose magnetically controlled cell robots (MCRs) based on macrophages and propelled by a rotating magnetic field. The MCRs exhibit good biocompatibility and low toxicity when the concentration of polylysine-coated Fe nanoparticles (PLL@FeNPs) is optimized to 40 µg/mL. These MCRs are loaded with murine interleukin-12 (IL-12), murine chemokine (C-C motif) ligand 5 (CCL-5), and murine C-X-C motif chemokine ligand 10 (CXCL-10), which can stimulate T cell differentiation and recruitment of monocytes, respectively. The macrophages showed an obvious tendency toward M1 polarization, in which they phagocytose intracellular pathogens and resist the growth of tumor cells. Under the control of a magnetic propelling system composed of 3 pairs of Helmholtz coils, the cell robot can be propelled wirelessly and moved along a predefined path with high accuracy. Moreover, the MCRs could approach cancer cells and stop at places of interest in vitro. In conclusion, we have accomplished the preliminary construction of a targeted drug delivery system that displays great immune-enhancing potential.


NeurIPS Conference 2021 Conference Paper

Revisiting Time Series Outlier Detection: Definitions and Benchmarks

Kwei-Herng Lai
Daochen Zha
Junjie Xu
Yue Zhao
Guanchu Wang
Xia Hu

Time series outlier detection has been extensively studied with many advanced algorithms proposed in the past decade. Despite these efforts, very few studies have investigated how we should benchmark the existing algorithms. In particular, using synthetic datasets for evaluation has become a common practice in the literature, and thus it is crucial to have a general synthetic criterion to benchmark algorithms. This is a non-trivial task because the existing synthetic methods are very different in different applications and the outlier definitions are often ambiguous. To bridge this gap, we propose a behavior-driven taxonomy for time series outliers and categorize outliers into point- and pattern-wise outliers with clear context definitions. Following the new taxonomy, we then present a general synthetic criterion and generate 35 synthetic datasets accordingly. We further identify 4 multivariate real-world datasets from different domains and benchmark 9 algorithms on the synthetic and the real-world datasets. Surprisingly, we observe that some classical algorithms could outperform many recent deep learning approaches. The datasets, pre-processing and synthetic scripts, and the algorithm implementations are made publicly available at https://github.com/datamllab/tods/tree/benchmark
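The distinction between point-wise and pattern-wise outliers can be made concrete with a small synthetic generator. The sketch below is illustrative only and does not reproduce the paper's synthetic criterion: the sinusoidal base signal, the additive spike, and the changed-period segment are assumptions chosen for simplicity, and all function names and parameters are hypothetical.

```python
import math

def sine_series(length, period=20):
    """Clean base signal: a single sinusoid."""
    return [math.sin(2 * math.pi * t / period) for t in range(length)]

def inject_point_outlier(series, index, magnitude=5.0):
    """Point-wise outlier: one timestamp deviates sharply from the signal."""
    values = list(series)
    labels = [0] * len(series)
    values[index] += magnitude
    labels[index] = 1
    return values, labels

def inject_pattern_outlier(series, start, width, period=5):
    """Pattern-wise outlier: a whole segment follows a different period."""
    values = list(series)
    labels = [0] * len(series)
    for t in range(start, start + width):
        values[t] = math.sin(2 * math.pi * t / period)
        labels[t] = 1
    return values, labels

clean = sine_series(100)
point_values, point_labels = inject_point_outlier(clean, index=30)
pattern_values, pattern_labels = inject_pattern_outlier(clean, start=60, width=20)
```

Each value in the pattern segment stays within the normal range of the signal, so only the behavior of the subsequence as a whole marks it as anomalous, unlike the single spike at index 30.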


AAAI Conference 2021 System Paper

TODS: An Automated Time Series Outlier Detection System

Kwei-Herng Lai
Daochen Zha
Guanchu Wang
Junjie Xu
Yue Zhao
Devesh Kumar
Yile Chen
Purav Zumkhawaka

We present TODS, an automated Time Series Outlier Detection System for research and industrial applications. TODS is a highly modular system that supports easy pipeline construction. The basic building block of TODS is the primitive, which is an implementation of a function with hyperparameters. TODS currently supports 70 primitives, including data processing, time series processing, feature analysis, detection algorithms, and a reinforcement module. Users can freely construct a pipeline using these primitives and perform end-to-end outlier detection with the constructed pipeline. TODS provides a Graphical User Interface (GUI), where users can flexibly design a pipeline with drag-and-drop. Moreover, a data-driven searcher is provided to automatically discover the most suitable pipelines given a dataset. TODS is released under the Apache 2.0 license at https://github.com/datamllab/tods. A video is available on YouTube.
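The notion of a primitive as "a function with hyperparameters" that is chained into a pipeline can be sketched generically. The code below does not use the TODS API: the `Primitive` class, `run_pipeline`, and both example functions are hypothetical names invented for illustration.

```python
import statistics
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class Primitive:
    """Hypothetical building block: a function plus its hyperparameters."""
    name: str
    func: Callable[..., List[Any]]
    hyperparams: Dict[str, Any] = field(default_factory=dict)

    def run(self, data):
        return self.func(data, **self.hyperparams)

def standardize(data):
    """Processing step: convert values to z-scores."""
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)
    return [(x - mean) / std for x in data]

def threshold_detector(scores, threshold=2.0):
    """Detection step: flag values whose absolute score exceeds the threshold."""
    return [1 if abs(s) > threshold else 0 for s in scores]

def run_pipeline(primitives, data):
    """Feed the output of each primitive into the next one."""
    for primitive in primitives:
        data = primitive.run(data)
    return data

pipeline = [
    Primitive("standardize", standardize),
    Primitive("threshold", threshold_detector, {"threshold": 2.0}),
]
flags = run_pipeline(pipeline, [1.0, 1.1, 0.9, 1.0, 8.0, 1.0, 0.9, 1.1])
```

Because each step exposes only a function and a hyperparameter dictionary, an automated searcher of the kind the abstract mentions could, in principle, swap steps or tune hyperparameters without touching the pipeline runner.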
