Author name cluster

Yuke Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers

2 author rows

EAAI Journal 2025 Journal Article

A lightweight neural network search algorithm based on in-place distillation and performance prediction for hardware-aware optimization

Siyuan Kang
Yinghao Sun
Shuguang Li
Yaozong Xu
Yuke Li
Guangjie Chen
Fei Xue

Due to the limited computing resources of edge devices, traditional object detection algorithms struggle to meet the efficiency and accuracy requirements of autonomous driving. Consequently, designing a neural network model that balances hardware resource requirements, operating speed, and accuracy is crucial. To address this, by integrating the algorithm with hardware characteristics, we propose a lightweight neural network architecture search algorithm based on in-place distillation and performance predictor (LNIP). Initially, we focus on optimizing the operators of the you only look once version 8 nano (YOLOv8n) and dynamically adjust its network structure. Then, we trained a super-network using a progressive shrinking strategy, the sandwich rule, and in-place distillation. Subsequently, we employed a Gaussian process to model the relationship between network architecture and accuracy, utilizing encoding methods and a custom kernel function to develop a high-performance predictor. Finally, during the search process, we introduce a reward function based on Pareto optimality to balance the performance of the model with hardware constraints. Building upon this foundation, we design an efficient search algorithm based on the performance predictor to progressively explore the optimal network structure tailored to hardware characteristics. We compared our lightweight network with state-of-the-art methods on the BDD100K, COCO, and PASCAL VOC datasets and deployed it on the Black Sesame A1000 and NVIDIA Xavier for comprehensive evaluation. On the NVIDIA Xavier, the lightweight network achieves a latency of 11.81 ms and an edge precision of 46.1%. These experimental results demonstrate that our method outperforms existing methods in balancing hardware constraints and model performance.
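The abstract above mentions a reward based on Pareto optimality that trades accuracy against hardware cost. As a rough illustration only, the sketch below filters a set of candidate architectures down to those that are not dominated on latency and accuracy. It is not the paper's reward function: the candidate names, the numbers, and the helper names `dominates` and `pareto_front` are all hypothetical.

```python
from typing import List, Tuple

# (name, latency in ms, accuracy in %); all values are made up for illustration.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on both objectives
    (lower latency, higher accuracy) and strictly better on at least one."""
    no_worse = a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical candidate architectures, not results from the paper.
candidates: List[Candidate] = [
    ("arch_a", 12.0, 46.0),
    ("arch_b", 15.0, 45.0),  # slower and less accurate than arch_a
    ("arch_c", 9.0, 42.0),
    ("arch_d", 20.0, 48.0),
]

front = pareto_front(candidates)
print([name for name, _, _ in front])
```

A search procedure could then rank or reward candidates by whether they sit on this front; how the paper turns Pareto optimality into a scalar reward is not stated in the abstract.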


IROS Conference 2025 Conference Paper

Compact R-X-Y Stage and Dual-Finger Micromanipulator under Inverted Optical Microscope for Microassembly

Jichao Pang
Zhuo Chen 0006
Yan Chen
Yuke Li
Yunsheng Li
Qiang Huang 0002
Tatsuo Arai
Xiaoming Liu 0007

Microassembly plays an important role in fabricating complex structures with small basic components in industrial and biomedical fields. An inverted optical microscope can provide high-quality image feedback for microassembly with its continuously improving resolution. However, a compact stage capable of positioning and reorienting micro-objects while fitting within the limited space under an inverted optical microscope remains unavailable. This paper proposes a compact R-X-Y stage that can transport micro-objects over long distances in the X and Y directions, and reorient the objects by 360-degree continuous rotation. Additionally, unlike the common approach of putting the rotational stage on the X-Y stage, we mount the thin X-Y stage on a rotational stage. Thus, after aligning the centers of the visual field and rotational stage at the beginning, all the visible micro-objects will not move out of the visual field during the rotation. We further integrate the R-X-Y stage and the dual-finger micromanipulator, and then use them to assemble 2-D patterns and a complex 3-D micromachine. The obtained results and preliminary demonstration indicate that the proposed compact R-X-Y stage has great potential in assembling complex micromachines.


IROS Conference 2025 Conference Paper

Enhanced Rolling Motion of Magnetic Microparticles by Turning Interface Lubrication

Yuke Li
Xiyue Liang
Zhuo Chen 0053
Hongzhe Liao
Yue Zhao 0025
Masaru Kojima
Qiang Huang 0002
Tatsuo Arai

Micro-nano robots must break the symmetry of the flow field to generate net displacement in the low Reynolds number environment. The spherical micro-robots utilize the frictional forces generated through interaction with the surface. We designed a magnetic microroller robot powered by the rotating AC magnetic field. Here, we employed dual measurements of laser ranging and computer vision to demonstrate that a single 100 μm microroller maintains a lubrication film of 1 to 15 μm with the surface during normal motion. We found that the translational velocity of the microroller is correlated with the lubrication film thickness. Based on the robot's gravity, we controlled an additional downward gradient magnetic field to effectively increase the load on the robot and reduce the lubrication film thickness, thereby controllably increasing the translational velocity of the robot. For example, the gradient magnetic field generated by superimposing a 30 mA direct current input can reduce the lubrication film thickness from 8 μm to 4 μm in a 10 Hz rotating magnetic field, and increase the translational velocity from 230 μm/s to 460 μm/s. The enhancement of the robot's motion performance enables it to better control its movement in fluids. Finally, we validated the strategy for controllable acceleration of micro-scale particles rolling on surfaces, applied to control fluid motion in multiple arteries within blood vessels. These results offer deeper insights into the physical motion mechanism of surface robots and hold significant implications for future applications in biomedical engineering.


ICLR Conference 2025 Conference Paper

Identification of Intermittent Temporal Latent Process

Yuke Li
Yujia Zheng 0001
Guangyi Chen 0002
Kun Zhang 0001
Heng Huang 0001

Identifying time-delayed latent causal process is crucial for understanding temporal dynamics and enabling downstream reasoning. While recent methods have made progress in identifying latent time-delayed causal processes, they cannot address the dynamics in which the influence of some latent factors on both the subsequent latent states and the observed data can become inactive or irrelevant at different time steps. Therefore, we introduce intermittent temporal latent processes, where: (1) any subset of latent factors may be missing during nonlinear data generation at any time step, and (2) the active latent factors at each step are unknown. This framework encompasses both nonstationary and stationary transitions, accommodating changing or consistent active factors over time. Our work shows that under certain assumptions, the latent causal variables are block-wise identifiable. With further conditional independence assumption, each latent variable can even be recovered up to component-wise transformations. Using this identification theory, we propose an unsupervised approach, InterLatent, to reliably uncover the representations of the intermittent temporal latent process. The experimental findings on both synthetic and real-world datasets verify our theoretical claims.


IROS Conference 2025 Conference Paper

Magnetically Actuated Steerable Catheter with Redundant DoF for Cardiovascular Interventions

Hongzhe Liao
Han Jin
Jialong Du
Xiyue Liang
Yuke Li
Qiang Huang 0002
Tatsuo Arai
Xiaoming Liu 0007

A magnetically controlled catheter system is proposed to enhance the precision and safety of vascular interventions by reducing procedure time and radiation exposure. The system can also function as a support channel for guidewire deployment. A novel navigation approach is introduced, employing an external permanent magnet capable of controlled rotation to actuate a catheter with an embedded magnetic tip. Leveraging magnetic coupling and a redundant rotational DoF, the system achieves fine angular tip control with minimal spatial displacement, significantly enhancing maneuverability in constrained vascular environments. The magnetic field distribution and its influence on catheter response are characterized, and a kinematic model of the actuation mechanism is established. Experimental validation is conducted under varying magnetic field strengths and orientations, demonstrating reliable steering performance. Application-based experiments in simulated clinical environments further confirm precise navigation capability. The results highlight the advantages of rotational magnetic control in enhancing flexibility and accuracy. The proposed system presents a promising solution for automating catheter-based interventions, offering improved efficiency and power in minimally invasive procedures.


IROS Conference 2025 Conference Paper

On-Chip Dynamic Mechanical Characterization: from Cells to Nucleus

Jingjin Ge
Zhuo Chen 0053
Chenhao Bai
Fengyu Liu
Yuke Li
Masaru Kojima
Qiang Huang 0002
Tatsuo Arai

Traditional single-cell mechanical characterization techniques (e.g., atomic force microscopy) often face limitations in throughput, require invasive labeling, or fail to replicate physiological microenvironments, impeding their clinical utility for rapid cancer cell analysis. To address these limitations for automated characterization of cellular mechanical properties, this study proposes a novel method using microchannels with narrow geometric structures to measure cellular mechanical characteristics. A dynamic mechanical characterization technique with serially connected microchannels simulates malignant tumor cell deformation and migration in vivo, enabling precise identification of three malignant tumor cell lines and three normal cell lines through consecutive compressions. High-speed imaging combined with computer vision and image processing techniques facilitates rapid and accurate automated analysis for tumor cells. Furthermore, this study reveals that the mechanical properties of the cell nucleus determine the overall cellular mechanics, with the differences between tumor and normal cells attributed to variations in nucleus mechanics. This approach shows promise for early cancer diagnosis.


ICRA Conference 2024 Conference Paper

HPL-ViT: A Unified Perception Framework for Heterogeneous Parallel LiDARs in V2V

Yuhang Liu
Boyi Sun
Yuke Li
Yuzheng Hu
Fei-Yue Wang 0001

To develop the next generation of intelligent LiDARs, we propose a novel framework of parallel LiDARs and construct a hardware prototype in our experimental platform, DAWN (Digital Artificial World for Natural). It emphasizes the tight integration of physical and digital space in LiDAR systems, with networking being one of its supported core features. In the context of autonomous driving, V2V (Vehicle-to-Vehicle) technology enables efficient information sharing between different agents which significantly promotes the development of LiDAR networks. However, current research operates under an ideal situation where all vehicles are equipped with identical LiDAR, ignoring the diversity of LiDAR categories and operating frequencies. In this paper, we first utilize OpenCDA and RLS (Realistic LiDAR Simulation) to construct a novel heterogeneous LiDAR dataset named OPV2V-HPL. Additionally, we present HPL-ViT, a pioneering architecture designed for robust feature fusion in heterogeneous and dynamic scenarios. It uses a graph-attention Transformer to extract domain-specific features for each agent, coupled with a cross-attention mechanism for the final fusion. Extensive experiments on OPV2V-HPL demonstrate that HPL-ViT achieves SOTA (state-of-the-art) performance in all settings and exhibits outstanding generalization capabilities.


ICML Conference 2024 Conference Paper

Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition

Yuke Li
Guangyi Chen 0002
Ben Abramowitz
Stefano Anzellotti
Donglai Wei 0001

Few-shot action recognition aims at quickly adapting a pre-trained model to novel data with a distribution shift using only a limited number of samples. Key challenges include how to identify and leverage the transferable knowledge learned by the pre-trained model. We therefore propose CDTD, or Causal Domain-Invariant Temporal Dynamics for knowledge transfer. To identify the temporally invariant and variant representations, we employ causal representation learning methods for unsupervised pretraining, and then tune the classifier with supervision in the next stage. Specifically, we assume the domain information can be well estimated and the pre-trained temporal dynamic generation and transition models can be well transferred. During adaptation, we fix the transferable temporal dynamics and update the image encoder and domain estimator. The efficacy of our approach is revealed by the superior accuracy of CDTD over leading alternatives across standard few-shot action recognition datasets.


ICLR Conference 2024 Conference Paper

LLCP: Learning Latent Causal Processes for Reasoning-based Video Question Answer

Guangyi Chen 0002
Yuke Li
Xiao Liu
Zijian Li 0001
Eman Al Suradi
Donglai Wei 0001
Kun Zhang 0001

Current approaches to Video Question Answering (VideoQA) primarily focus on cross-modality matching, which is limited by the requirement for extensive data annotations and the insufficient capacity for causal reasoning (e.g., attributing accidents). To address these challenges, we introduce a causal framework for video reasoning, termed Learning Latent Causal Processes (LLCP). At the heart of LLCP lies a multivariate generative model designed to analyze the spatial-temporal dynamics of objects within events. Leveraging the inherent modularity of causal mechanisms, we train the model through self-supervised local auto-regression, eliminating the need for annotated question-answer pairs. During inference, the model is applied to answer two types of reasoning questions: accident attribution, which infers the cause from observed effects, and counterfactual prediction, which predicts the effects of counterfactual conditions given the factual evidence. In the first scenario, we identify variables that deviate from the distribution established by the learned model, signifying the root cause of accidents. In the second scenario, we replace embeddings of previous variables with counterfactual ones, enabling us to forecast potential developments. Once we have identified these cause/effect variables, natural language answers are derived through a combination of grammatical parsing and a pre-trained vision-language model. We assess the efficacy of LLCP on both synthetic and real-world data, demonstrating comparable performance to supervised methods despite our framework using no paired textual annotations.


IROS Conference 2023 Conference Paper

Programable On-Chip Fabrication of Magnetic Soft Micro-Robot

Yuke Li
Xiaoqing Tang
Xiaoming Liu 0007
Dan Liu 0009
Zhuo Chen 0053
Masaru Kojima
Qiang Huang 0002
Tatsuo Arai

In the last decade, researchers have been trying to develop many microrobots that mimic the extraordinary abilities of bionts in complex environments. How to fabricate a biomimetic microrobot with satisfactory deformability and a complex shape to realize the desired precise motion is the key issue. In this paper, we proposed an efficient programable fabrication method of the magnetic soft micro-robot through an on-chip photopolymerization system. The superparamagnetic nanoparticles were compiled according to the magnetic anisotropy and assembled in the micro-robot. Then these nanoparticles were immobilized by photopolymerization of the hydrogel polymer. With this fabrication method, a joint rotation mechanism was first fabricated to characterize the deformation performance under the magnetic field control. In addition, snake-like micro-robots were also fabricated, and the desired motions were achieved. The experimental results show that the proposed programable on-chip fabrication of magnetic soft micro-robot has the potential to facilitate the development of magnetic microrobots and their applications in the biomedical field.


IROS Conference 2022 Conference Paper

Controlled Fabrication of Micro-Chain Robot Using Magnetically Guided Arraying Microfluidic Devices

Xiaoqing Tang
Xiaoming Liu 0007
Yuyang Li 0003
Dan Liu 0009
Yuke Li
Masaru Kojima
Qiang Huang 0002
Tatsuo Arai

The magnetic microrobot has become a promising approach in many biomedical applications due to its small volume, flexible motion, and untethered micromachines. The micro-chain robot is one of the most popular magnetic microrobots. However, the uncontrollable magnetic moment direction and quantity of the magnetic beads contained in the existing self-assembled micro-chain robot limit their locomotion and applications. This paper proposed an on-chip micro-chain robot fabrication method to assemble the magnetic beads with controllable magnetic moment direction and quantity. The bead quantity can be controlled by the structure limits of the microchannel, and the direction of the magnetic moment can be adjusted by the integrated external magnetic field. The assembled magnetic beads are then glued by the hydrogel under UV exposure. The micro-chain robots with different quantities and magnetic moment directions of the magnetic beads were successfully fabricated and tested in experiments. Due to the array structure of the microfluidic device, batch manufacturing of low-cost magnetic robots was achieved in our method. The movement of dual-bead microrobots with two orthogonal magnetic moment directions was analyzed and compared. One of the dual-bead microrobots was applied in the transportation of the hydrogel module using pushing and pulling modes. It indicated that the proposed controllable on-chip fabrication of the magnetic micro-chain robots has the potential to enhance the microrobot ability in biomedical applications.


AAAI Conference 2022 Conference Paper

ELMA: Energy-Based Learning for Multi-Agent Activity Forecasting

Yuke Li
Pin Wang
Lixiong Chen
Zheng Wang
Ching-Yao Chan

This paper describes an energy-based learning method that predicts the activities of multiple agents simultaneously. It aims to forecast both upcoming actions and paths of all agents in a scene based on their past activities, which can be jointly formulated by a probabilistic model over time. Learning this model is challenging because: 1) it has a large number of time-dependent variables that must scale with the forecast horizon and the number of agents; 2) distribution functions have to contain multiple modes in order to capture the spatiotemporal complexities of each agent’s activities. To address these challenges, we put forth a novel Energy-based Learning approach for Multi-Agent activity forecasting (ELMA) to estimate this complex model via maximum log-likelihood estimation. Specifically, by sampling from a sequence of factorized marginalized multi-modal distributions, ELMA generates the possible future actions efficiently. Moreover, by graph-based representations, ELMA also explicitly resolves the spatio-temporal dependencies of all agents’ activities in a single pass. Our experiments on two large-scale datasets prove that ELMA outperforms recent leading studies by an obvious margin.
