Author name cluster

Ning Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

29 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

Elite Pattern Reinforcement for Vehicle Routing Problems

  • Ning Li
  • Peng Lin
  • Peng Zhang
  • Ruichen Tian

Machine learning methods have been increasingly applied to solve Vehicle Routing Problems (VRPs). A high-efficiency approach is to learn solution construction using deep neural networks. However, their tendency toward premature convergence is a critical barrier, severely hindering generalization across diverse distributions and scales. To overcome this, we introduce Elite-Pattern Reinforcement (EPR), a novel strategy designed to create a synergy between the diverse, exploratory nature of reinforcement learning and the high-quality, structured knowledge from classical heuristics. The strategy guides the learning process by reinforcing structural patterns from elite solutions, employing an elite-guided score modulation to integrate this external knowledge. The inherent symmetry of path patterns is also exploited to augment the structural information. This steers the policy away from premature convergence by enabling it to distinguish and favour elite path patterns over inferior ones. Integrating our strategy with four construction methods yields substantial performance improvements on the CVRPLIB and TSPLIB benchmarks. Furthermore, our approach outperforms state-of-the-art learning-based methods, demonstrating superior generalization across diverse distributions and scales.

AAAI Conference 2026 Conference Paper

MUTrack: A Memory-Aware Unified Representation Framework for Visual Tracking

  • Weijing Wu
  • Qihua Liang
  • Bineng Zhong
  • Xiaohu Tang
  • Yufei Tan
  • Ning Li
  • Yuanliang Xue

Building a unified target representation that simultaneously achieves short-term adaptability and long-term stability is crucial for robust visual tracking. However, existing trackers typically face an inherent trade-off. Methods primarily relying on short-term appearance and motion cues achieve rapid adaptation, but they often struggle with long-term identity consistency. Conversely, trackers that emphasize extensive temporal context provide strong robustness, yet this approach can compromise their short-term adaptability. To bridge this gap, we propose a novel tracker, MUTrack, which comprehensively integrates both long-term and short-term memories into a unified target representation for more robust tracking. Specifically, we design a unified memory bank that stores and manages long-term memory for maintaining long-term identity consistency, and short-term memory for adapting to instantaneous appearance changes. To fully leverage the complementary nature of both long-term and short-term temporal information, we introduce a perception interaction module that dynamically fuses these memory types through deep and bidirectional interactions, enabling mutual refinement where one guides the other. This ultimately generates a highly adaptive target representation, which effectively balances adaptability to instantaneous changes with robustness against long-term identity drift. Extensive experiments on GOT10k, TrackingNet, LaSOT, LaSOT_ext, NfS, and OTB100 consistently demonstrate that MUTrack achieves SOTA performance.

EAAI Journal 2025 Journal Article

A review on data-driven prognostics and health management for wind turbine systems

  • Mi Yan
  • Siu Cheung Hui
  • Na Jiang
  • Ning Li

The wind power industry has developed rapidly due to the transformation of the global energy matrix. Prognostics and health management have attracted significant attention from industries and academia to ensure the reliability and safety of wind turbines, reduce maintenance costs, and increase productivity. Currently, most wind farms use supervisory control and data acquisition systems to collect, record, and store wind turbine operating data. Data-driven prognostics and health management of wind turbine systems have become the most commonly used methods for real-time monitoring and fault alarm prediction. Although some reviews of studies on data-driven prognostics and health management for wind turbine systems are available, they are structured mainly based on the processing tasks or key components of wind turbine systems, and have overlooked the challenges of data quality behind these processes. Different from previous works, this paper reviews the current developments of data-driven prognostics and health management for wind turbine systems from the perspective of data challenges, which include data availability, data labeling, data scarcity, data imbalance, data inconsistency, dynamic data and fleet-based data. In this paper, we discuss some open datasets and current data challenges in data-driven prognostics and health management for wind turbine systems, and review the related methods proposed for addressing these data issues. We then provide practical applications to help engineers understand data-driven prognostics and health management better. Finally, future research directions are suggested for further work on data-driven prognostics and health management for wind turbine systems.

AAAI Conference 2025 Conference Paper

Decoupled Spatio-Temporal Consistency Learning for Self-Supervised Tracking

  • Yaozong Zheng
  • Bineng Zhong
  • Qihua Liang
  • Ning Li
  • Shuxiang Song

The success of visual tracking has been largely driven by datasets with manual box annotations. However, these box annotations require tremendous human effort, limiting the scale and diversity of existing tracking datasets. In this work, we present a novel Self-Supervised Tracking framework, named SSTrack, designed to eliminate the need for box annotations. Specifically, a decoupled spatio-temporal consistency training framework is proposed to learn rich target information across timestamps through global spatial localization and local temporal association. This allows for the simulation of appearance and motion variations of instances in real-world scenarios. Furthermore, an instance contrastive loss is designed to learn instance-level correspondences from a multi-view perspective, offering robust instance supervision without additional labels. This new design paradigm enables SSTrack to effectively learn generic tracking representations in a self-supervised manner, while reducing reliance on extensive box annotations. Extensive experiments on nine benchmark datasets demonstrate that SSTrack surpasses SOTA self-supervised tracking methods, achieving an improvement of more than 25.3%, 20.4%, and 14.8% in AUC (AO) score on the GOT10K, LaSOT, and TrackingNet datasets, respectively.

EAAI Journal 2025 Journal Article

Intelligent fault diagnosis of nonlinear uncertain industrial processes based on kernel local–global interval embedding algorithm

  • Ning Li
  • Hua Ding
  • Xiaochun Sun
  • Zeping Liu

With the rapid development of a new generation of Big Data and artificial intelligence, intelligent fault diagnosis of industrial processes, including chemical processes and coal mining equipment operations, has become increasingly important. The local–global interval embedding algorithm (LGIEA) has attracted significant attention for its capability to simultaneously extract local and global features from interval data. However, this method can only process linear interval data and performs poorly in terms of extracting strong nonlinear features. To solve this problem, this study proposes a new intelligent fault diagnosis method based on kernel LGIEA (KLGIEA), which extends the linear process monitoring model to the nonlinear case. First, the interval inner product estimation (IIPE) is transformed into the kernel IIPE by introducing a kernel function, which not only inherits LGIEA's ability to extract both global and local features of data simultaneously, but also applies better to nonlinear data in industrial processes. Second, the four defined statistics can effectively monitor faults of industrial equipment under strong interference such as noise, and the nonlinear reconstruction contribution (NRC) can effectively identify the fault variables, improving the fault diagnosis ability of KLGIEA. Finally, two cases, using Tennessee Eastman process (TEP) simulation data from Eastman Company and on-site shearer fault data obtained from Shaqu No. 2 coal mine, show that KLGIEA is significantly superior to complete information principal component analysis (PCA), midpoint-radius kernel PCA, and LGIEA in processing nonlinear interval data, improving the accuracy, applicability, and reliability of the algorithm.

EAAI Journal 2025 Journal Article

Key node propagation-based overlapping spammer group detection algorithm on e-commerce platforms

  • Chaoqun Wang
  • Ning Li
  • Shuang Chen
  • Xiaoqing Bu
  • Shujuan Ji

With the rapid growth of e-commerce platforms, spammer groups have increasingly used fake reviews to influence consumer decisions, posing significant challenges to platform governance. This issue has become even more pronounced with the widespread use of large language models, which have made fake reviews harder to detect. However, existing spammer group detection algorithms have certain limitations. For example, they often overlook the core–periphery structure within spammer groups, failing to adequately focus on the core reviewers who play a crucial role in group operations. Additionally, these algorithms struggle to detect spammers who are active across multiple groups. To address these challenges, we propose an overlapping spammer group detection algorithm based on key node propagation (KNP-OSG). First, we model the review data as a co-review graph and use the Deep Q-Network algorithm combined with an action filtering mechanism to identify key reviewers, or key spammers, who have a critical impact on spammer group detection. Subsequently, based on the structural relationships among pivotal spammers, an improved label propagation algorithm, copra-g, is proposed to further identify spammer groups. Experimental results show that the KNP-OSG algorithm outperforms existing methods on real-world datasets, demonstrating its effectiveness in detecting overlapping spammer groups.
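As a rough illustration of the first step described in this abstract, the sketch below builds a co-review graph from (reviewer, product) pairs by linking two reviewers whenever they reviewed the same product. The function name `build_co_review_graph`, the edge weight (number of shared products), and the sample data are assumptions made for illustration; the abstract does not say how KNP-OSG constructs or weights its graph, and the Deep Q-Network and copra-g steps are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def build_co_review_graph(reviews):
    """Return {(reviewer_a, reviewer_b): shared_product_count}.

    `reviews` is an iterable of (reviewer, product) pairs. Edge keys are
    sorted tuples, so each undirected edge appears exactly once.
    """
    reviewers_by_product = defaultdict(set)
    for reviewer, product in reviews:
        reviewers_by_product[product].add(reviewer)

    edges = defaultdict(int)
    for reviewers in reviewers_by_product.values():
        # Every pair of reviewers of the same product shares one more product.
        for a, b in combinations(sorted(reviewers), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical sample data, not taken from the paper.
reviews = [
    ("u1", "p1"), ("u2", "p1"), ("u3", "p1"),
    ("u1", "p2"), ("u2", "p2"),
    ("u4", "p3"),
]
graph = build_co_review_graph(reviews)
```

In this toy data, `u1` and `u2` share two products and so get the heaviest edge, while `u4` reviewed alone and has no edges.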

NeurIPS Conference 2025 Conference Paper

MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation

  • Ning Li
  • Xiangmou Qu
  • Jiamu Zhou
  • Muning Wen
  • Kounianhua Du
  • Xingyu Lou
  • Qiuying Peng
  • Jun Wang

Recent advances in Multimodal Large Language Models (MLLMs) have enabled the development of mobile agents that can understand visual inputs and follow user instructions, unlocking new possibilities for automating complex tasks on mobile devices. However, applying these models to real-world mobile scenarios remains a significant challenge due to the long-horizon task execution, difficulty in error recovery, and the cold-start problem in unfamiliar environments. To address these challenges, we propose MobileUse, a GUI agent designed for robust and adaptive mobile task execution. To improve resilience in long-horizon tasks and dynamic environments, we introduce a hierarchical reflection architecture that enables the agent to self-monitor, detect, and recover from errors across multiple temporal scales, ranging from individual actions to overall task completion, while maintaining efficiency through a Reflection-on-Demand strategy. To tackle cold-start issues, we further introduce a proactive exploration module, which enriches the agent’s understanding of the environment through self-planned exploration. Evaluations on the AndroidWorld and AndroidLab benchmarks demonstrate that MobileUse establishes new state-of-the-art performance, achieving success rates of 62.9% and 44.2%, respectively. To facilitate real-world applications, we release an out-of-the-box toolkit for automated task execution on physical mobile devices, which is available at https://github.com/MadeAgents/mobile-use.

EAAI Journal 2025 Journal Article

RCSD-UAV: An object detection dataset for unmanned aerial vehicles in realistic complex scenarios

  • WanXuan Geng
  • Junfan Yi
  • Ning Li
  • Chen Ji
  • Yu Cong
  • Liang Cheng

Unmanned Aerial Vehicle (UAV) detection based on visible light plays an important role in urban low-altitude defense, public safety, and other fields. However, current datasets are limited by single scenes, large objects, and other factors that deviate from actual application scenarios, making it difficult to meet the needs of sample-driven deep learning for UAV detection in optical images. Therefore, this paper proposes a novel UAV object dataset for realistic complex scenarios (RCSD-UAV) to provide training data for UAV detection models based on artificial intelligence technology. All data were obtained from ordinary cameras or mobile phones in the real world, covering various commonly used UAV types and natural scenes. The dataset is classified according to scene and object size, and we evaluate several models to provide benchmarks. The experimental results indicate that UAV detection is challenging because of small object sizes and complex backgrounds. The two-stage model detects well but has poor real-time performance, whereas the one-stage model better balances detection quality and real-time performance.

AAAI Conference 2025 Conference Paper

Robust Tracking via Mamba-based Context-aware Token Learning

  • Jinxia Xie
  • Bineng Zhong
  • Qihua Liang
  • Ning Li
  • Zhiyi Mo
  • Shuxiang Song

How to make a good trade-off between performance and computational cost is crucial for a tracker. However, current popular methods typically focus on complicated and time-consuming learning that combines temporal and appearance information by inputting more and more images (or features). Consequently, these methods not only increase the model's computational resource requirements and learning burden but also introduce much useless and potentially interfering information. To alleviate the above issues, we propose a simple yet robust tracker that separates temporal information learning from appearance modeling and extracts temporal relations from a set of representative tokens rather than several images (or features). Specifically, we introduce one track token for each frame to collect the target's appearance information in the backbone. Then, we design a Mamba-based Temporal Module that makes track tokens aware of context by letting them interact with other track tokens within a sliding window. This module consists of a Mamba layer with autoregressive characteristics and a cross-attention layer with strong global perception ability, ensuring sufficient interaction for track tokens to perceive the appearance changes and movement trends of the target. Finally, track tokens serve as guidance to adjust the appearance feature for the final prediction in the head. Experiments show that our method is effective and achieves competitive performance on multiple benchmarks at real-time speed.

EAAI Journal 2025 Journal Article

Train a real-world local path planner in one hour via partially decoupled reinforcement learning and vectorized diversity

  • Jinghao Xin
  • Jinwoo Kim
  • Zhi Li
  • Ning Li

Deep Reinforcement Learning (DRL) has exhibited efficacy in resolving the Local Path Planning (LPP) problem. However, its practical application remains significantly constrained due to its limited training efficiency and generalization capability. To address these challenges, we propose a solution termed Color, which includes an Actor-Sharer-Learner (ASL) training framework designed to improve efficiency, and a fast yet diverse simulator named Sparrow aimed at elevating both efficiency and generalization. Specifically, the ASL employs a Vectorized Data Collection (VDC) mode to enhance data collection, decouples the model optimization from data collection to expedite data consumption, and partially connects the two procedures with a Time Feedback Mechanism (TFM) to evade data underuse or overuse. Meanwhile, the Sparrow simulator utilizes a 2-Dimensional (2D) grid-based world, simplified kinematics, matrix operation, and conversion-free data flow to achieve a lightweight design. The lightness facilitates vectorized diversity, allowing for rapid and diversified simulation across numerous copies of the vectorized environments, thereby significantly enhancing both efficiency and generalization capacity. Comprehensive experiments demonstrate that with merely one hour of simulation training, Color achieves impressive arrival rates of 84% and 90% on 32 simulated and 42 real-world LPP scenarios, respectively. The code and video of this paper are accessible at https://github.com/XinJingHao/Color.

NeurIPS Conference 2024 Conference Paper

4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on RDBs

  • Minjie Wang
  • Quan Gan
  • David Wipf
  • Zhenkun Cai
  • Ning Li
  • Jianheng Tang
  • Yanlin Zhang
  • Zizhao Zhang

Given a relational database (RDB), how can we predict missing column values in some target table of interest? Although RDBs store vast amounts of rich, informative data spread across interconnected tables, the progress of predictive machine learning models as applied to such tasks arguably falls well behind advances in other domains such as computer vision or natural language processing. This deficit stems, at least in part, from the lack of established/public RDB benchmarks as needed for training and evaluation purposes. As a result, related model development thus far often defaults to tabular approaches trained on ubiquitous single-table benchmarks, or on the relational side, graph-based alternatives such as GNNs applied to a completely different set of graph datasets devoid of tabular characteristics. To more precisely target RDBs lying at the nexus of these two complementary regimes, we explore a broad class of baseline models predicated on: (i) converting multi-table datasets into graphs using various strategies equipped with efficient subsampling, while preserving tabular characteristics; and (ii) trainable models with well-matched inductive biases that output predictions based on these input subgraphs. Then, to address the dearth of suitable public benchmarks and reduce siloed comparisons, we assemble a diverse collection of (i) large-scale RDB datasets and (ii) coincident predictive tasks. From a delivery standpoint, we operationalize the above four dimensions (4D) of exploration within a unified, scalable open-source toolbox called 4DBInfer; please see https://github.com/awslabs/multi-table-benchmark.

EAAI Journal 2024 Journal Article

A self-decision ant colony clustering algorithm for electricity theft detection

  • Zhengqiang Yang
  • Linyue Liu
  • Ning Li
  • He Li

The load data features of some electricity-theft consumers during the theft period are similar to those of normal consumers, making these consumers outliers in the electricity-theft cluster. Current classification methods, which use the mean value to determine the cluster centers, are vulnerable to the influence of outliers. Therefore, this paper proposes a self-decision ant colony clustering algorithm for electricity theft detection that decides by itself which samples are used to update the cluster centers. The method constructs a dynamic weighting approach to determine the cluster centers based on the idea of backpropagation, and updates the weights of each sample in the clusters to reflect the differing importance of samples, thus reducing the influence of outlier samples. A new activation function, Odd, is proposed to enhance the ability of the proposed method to solve linearly inseparable problems. A self-decision dropout mechanism is also proposed, which turns the random deactivation of samples in clusters into a targeted, self-decided mechanism that deactivates redundant or non-active samples and increases the contribution of outlier samples with positive effects. The proposed method is tested on electricity consumption data provided by the State Grid Corporation of China (SGCC) and on the Smart* Data Set for Sustainability (SDSS) provided by the UMass Trace Repository. The experimental results show that the proposed method effectively solves the above problems with higher detection accuracy and has certain advantages over other current studies.
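To make the outlier argument in this abstract concrete, the following sketch compares a plain mean cluster center with a weighted center in which one distant sample is down-weighted. It is only an illustration under stated assumptions: the helper `weighted_center`, the sample points, and the hand-picked weights are hypothetical, and the paper's backpropagation-inspired weight update, Odd activation, and dropout mechanism are not reproduced.

```python
def weighted_center(points, weights):
    """Weighted average of equal-length point tuples."""
    total = sum(weights)
    dims = len(points[0])
    return tuple(
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dims)
    )


# Four samples near (2, 2) plus one distant outlier (hypothetical data).
points = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0), (12.0, 12.0)]

# Equal weights reproduce the ordinary mean, which the outlier drags away.
mean_center = weighted_center(points, [1.0] * len(points))

# Hand-picked weights (an assumption) that shrink the outlier's influence.
down_weighted = weighted_center(points, [1.0, 1.0, 1.0, 1.0, 0.1])
```

With equal weights the center lands at (4.0, 4.0), outside the main group; with the outlier's weight reduced, it moves back toward (2, 2).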

AAAI Conference 2024 Conference Paper

AACP: Aesthetics Assessment of Children’s Paintings Based on Self-Supervised Learning

  • Shiqi Jiang
  • Ning Li
  • Chen Shi
  • Liping Guo
  • Changbo Wang
  • Chenhui Li

The Aesthetics Assessment of Children's Paintings (AACP) is an important branch of the image aesthetics assessment (IAA), playing a significant role in children's education. This task presents unique challenges, such as limited available data and the requirement for evaluation metrics from multiple perspectives. However, previous approaches have relied on training large datasets and subsequently providing an aesthetics score to the image, which is not applicable to AACP. To solve this problem, we construct an aesthetics assessment dataset of children's paintings and a model based on self-supervised learning. 1) We build a novel dataset composed of two parts: the first part contains more than 20k unlabeled images of children's paintings; the second part contains 1.2k images of children's paintings, and each image contains eight attributes labeled by multiple design experts. 2) We design a pipeline that includes a feature extraction module, perception modules and a disentangled evaluation module. 3) We conduct both qualitative and quantitative experiments to compare our model's performance with five other methods using the AACP dataset. Our experiments reveal that our method can accurately capture aesthetic features and achieve state-of-the-art performance.

JBHI Journal 2024 Journal Article

An Efficient Multi-Task Synergetic Network for Polyp Segmentation and Classification

  • Miao Wang
  • Xingwei An
  • Zhengcun Pei
  • Ning Li
  • Li Zhang
  • Gang Liu
  • Dong Ming

Colonoscopy is considered the best diagnostic tool for early detection and resection of polyps, which can effectively prevent consequential colorectal cancer. In clinical practice, segmenting and classifying polyps from colonoscopic images have a great significance since they provide precious information for diagnosis and treatment. In this study, we propose an efficient multi-task synergetic network (EMTS-Net) for concurrent polyp segmentation and classification, and we introduce a polyp classification benchmark for exploring the potential correlations of the above-mentioned two tasks. This framework is composed of an enhanced multi-scale network (EMS-Net) for coarse-grained polyp segmentation, an EMTS-Net (Class) for accurate polyp classification, and an EMTS-Net (Seg) for fine-grained polyp segmentation. Specifically, we first obtain coarse segmentation masks by using EMS-Net. Then, we concatenate these rough masks with colonoscopic images to assist EMTS-Net (Class) in locating and classifying polyps precisely. To further enhance the segmentation performance of polyps, we propose a random multi-scale (RMS) training strategy to eliminate the interference caused by redundant information. In addition, we design an offline dynamic class activation mapping (OFLD CAM) generated by the combined effect of EMTS-Net (Class) and RMS strategy, which optimizes bottlenecks between multi-task networks efficiently and elegantly and helps EMTS-Net (Seg) to perform more accurate polyp segmentation. We evaluate the proposed EMTS-Net on the polyp segmentation and classification benchmarks, and it achieves an average mDice of 0.864 in polyp segmentation and an average AUC of 0.913 with an average accuracy of 0.924 in polyp classification. Quantitative and qualitative evaluations on the polyp segmentation and classification benchmarks demonstrate that our EMTS-Net achieves the best performance and outperforms previous state-of-the-art methods in terms of both efficiency and generalization.

EAAI Journal 2024 Journal Article

Combination prediction of underground mine rock drilling time based on seasonal and trend decomposition using Loess

  • Ning Li
  • Ding Liu
  • Liguan Wang
  • Haiwang Ye
  • Qizhou Wang
  • Dairong Yan
  • Shugang Zhao

The rock drilling process is a critical component of underground mining, and its operation time is a crucial factor in mine planning and production scheduling optimization; consequently, it is essential to make an accurate prediction of rock drilling operation time using historical time series data. This study proposes a combination prediction model for underground mine rock drilling time based on Seasonal and Trend decomposition using Loess (STL). The STL model decomposes the historical time series data into the trend, seasonal, and random components, uses the Deep Belief Network - Extreme Learning Machine (DBN-ELM) model to predict the trend component, the Support Vector Regression (SVR) model to predict the seasonal component, and the historical mean value to predict the random component, and then overlays and reconstructs the prediction results of the three components to obtain the predicted values of the final working hours of rock drilling operations. The experimental results indicate that the Mean Absolute Percentage Error (MAPE) of the prediction result of the trend component with the highest weight proportion is 0.0159%, whereas the MAPE of the prediction result of the three components superposition reconstruction model is 0.2135%, representing a significant improvement in prediction accuracy compared to each comparison model and a broad range of practical applications.
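The decompose, predict, and recombine pipeline described in this abstract can be sketched roughly as follows. This is not STL: a clamped moving average stands in for Loess smoothing, and naive forecasters stand in for DBN-ELM (trend) and SVR (seasonal); only the historical-mean forecast of the random component follows the abstract. The function names, the period, and the sample series are all hypothetical.

```python
from statistics import mean


def decompose(series, period):
    """Additive split into trend, seasonal, and residual components.

    Simplification: a moving average (window clamped at the edges) replaces
    the Loess smoothing used by real STL.
    """
    n = len(series)
    half = period // 2
    trend = [mean(series[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]
    detrended = [y - t for y, t in zip(series, trend)]
    # Average the detrended values at each position within the period.
    phase_means = [mean(detrended[p::period]) for p in range(period)]
    offset = mean(phase_means)  # center the seasonal pattern on zero
    seasonal = [phase_means[i % period] - offset for i in range(n)]
    residual = [y - t - s for y, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


def forecast_next(series, period):
    """One-step forecast: predict each component, then sum the predictions."""
    trend, seasonal, residual = decompose(series, period)
    trend_hat = trend[-1]                          # naive stand-in for DBN-ELM
    seasonal_hat = seasonal[len(series) % period]  # naive stand-in for SVR
    residual_hat = mean(residual)                  # historical mean, as in the abstract
    return trend_hat + seasonal_hat + residual_hat


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


# Hypothetical drilling times (minutes) with a repeating 3-step pattern.
series = [50.0, 62.0, 55.0, 52.0, 64.0, 57.0, 54.0, 66.0, 59.0]
trend, seasonal, residual = decompose(series, period=3)
next_value = forecast_next(series, period=3)
```

Because the split is additive, summing the three components recovers the original series, which is what makes recombining per-component forecasts meaningful.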

AAAI Conference 2024 Conference Paper

Explicit Visual Prompts for Visual Object Tracking

  • Liangtao Shi
  • Bineng Zhong
  • Qihua Liang
  • Ning Li
  • Shengping Zhang
  • Xianxian Li

How to effectively exploit spatio-temporal information is crucial to capture target appearance changes in visual tracking. However, most deep learning-based trackers mainly focus on designing a complicated appearance model or template updating strategy, while lacking the exploitation of context between consecutive frames and thus entailing the when-and-how-to-update dilemma. To address these issues, we propose a novel explicit visual prompts framework for visual tracking, dubbed EVPTrack. Specifically, we utilize spatio-temporal tokens to propagate information between consecutive frames without focusing on updating templates. As a result, we can not only alleviate the challenge of when-to-update, but also avoid the hyper-parameters associated with updating strategies. Then, we utilize the spatio-temporal tokens to generate explicit visual prompts that facilitate inference in the current frame. The prompts are fed into a transformer encoder together with the image tokens without additional processing. Consequently, the efficiency of our model is improved by avoiding how-to-update. In addition, we consider multi-scale information as explicit visual prompts, providing multi-scale template features to enhance EVPTrack's ability to handle target scale changes. Extensive experimental results on six benchmarks (i.e., LaSOT, LaSOT_ext, GOT-10k, UAV123, TrackingNet, and TNL2K) validate that our EVPTrack can achieve competitive performance at a real-time speed by effectively exploiting both spatio-temporal and multi-scale information. Code and models are available at https://github.com/GXNU-ZhongLab/EVPTrack.

EAAI Journal 2024 Journal Article

Flexible margins and multiple samples learning to enhance lexical semantic similarity

  • Jeng-Shyang Pan
  • Xiao Wang
  • Dongqiang Yang
  • Ning Li
  • Kevin Huang
  • Shu-Chuan Chu

The advancement of deep learning and neural networks has led to the widespread adoption of neural word embeddings as a prominent lexical representation method in natural language processing. Because the neural language model is trained on the contextual information of large-scale text, the neural word embeddings it produces capture more semantic correlation in the semantic space while ignoring semantic similarity. Training such a model also incurs high computational and time costs. To better inject semantic similarity into the distribution space and reduce time cost, we perform post-processing learning of neural word embeddings using deep metric learning. This paper proposes a lexical enhancement method based on flexible margins and multiple samples learning. In this method, we embed lexical entailment constraint relations into neural word embeddings by categorizing the set of lexical constraints, penalizing negative samples to different degrees according to the gap between categories, and allowing positive and negative samples to learn from each other in the distributed space. The proposed method significantly improves neural word embeddings: on lexical similarity evaluation of the neural word embeddings, benchmark accuracy improves to 75%. The method is also highly competitive in text similarity and text categorization tasks. These findings summarize research results and provide strong support for further applications.

EAAI Journal 2024 Journal Article

SDG: A global large-scale airport perception disparity cognition modeling method based on deep learning and geographic knowledge

  • Ning Li
  • Liang Cheng
  • Hui Chen
  • Yalu Zhang
  • Lei Wang
  • Chen Ji
  • Manchun Li

Global airport perception levels vary due to natural geographical factors and economic development disparities. Understanding these differences is crucial for assessing regional airport development and its correlation with geographical patterns. However, there are limited methods available to effectively comprehend these disparities. To address this issue, this paper proposes a Salience, Disturbance, and Geographic-knowledge (SDG) approach for the cognitive analysis of global large-scale airport perception differences. Salience is assessed using a two-class deep learning model to evaluate the prominence of known airports. Disturbance is evaluated using an object detection model to measure background interference in large-scale airport perception. Geographic-knowledge analysis considers the correlation between regional airports and their surrounding geographic environment. The results rank perception difficulties for 17 regions worldwide, with Tajikistan exhibiting the highest difficulty at 0.922, while the Jiangsu–Zhejiang–Shanghai region in China has the lowest at 0.102. We also performed correlation analyses to validate the effectiveness of our model. To our knowledge, this paper pioneers the cognitive analysis of target perception difficulty differences across multiple global regions.

ICRA Conference 2024 Conference Paper

Towards a Novel Soft Magnetic Laparoscope for Single Incision Laparoscopic Surgery

  • Hui Liu
  • Ning Li
  • Shuai Li
  • Gregory J. Mancini
  • Jindong Tan

In single-incision laparoscopic surgery (SILS), the magnetic anchoring and guidance system (MAGS) is a promising technique to prevent clutter in the surgical workspace and provide a larger vision field. Existing camera designs mainly rely on rigid structures, resulting in risks of losing magnetic coupling and impacting tissue during the insertion and coupling procedure. In this paper, we propose a wireless MAGS built on soft material and a soft structure design. The camera can bend at the exit of the trocar and maintain strong coupling with the external actuator. The operation principle and modeling were established to investigate the parameter design. An easier insertion procedure was introduced and demonstrated in the experiment. The bendability test showed that the camera could reach a bending angle of 20° and a displacement of 16.4 mm. The insertion and deployment took less than 2 minutes on average.

EAAI Journal 2023 Journal Article

Airport detection in remote sensing real-open world using deep learning

  • Ning Li
  • Liang Cheng
  • Chen Ji
  • Hui Chen
  • WanXuan Geng
  • WeiMing Yang

Remote sensing real-open world of large-scale areas brings a high false alarm rate to object detection because of highly complex backgrounds. In this study, we constructed a two-stage extraction framework, candidate region extraction (CRE)–multi-core binary analysis (MCBA) (CRE-MCBA), to improve the correct detection rate (DR) and reduce the error DR for airport extraction in large-scale remote sensing real-open areas. First, global sample labeling and large-scale runway CRE were conducted. Open-sourced data were applied to match the detection results spatially, and the MCBA was built for the issue of unbalanced positive and negative samples to mine potential airports. The minimum penalty term δ was also introduced into focal loss to improve detection ability in a remote sensing real-open world area. In the 219,041 km² study area at the Yangtze River Delta in China, the detection and error reduction rates were 100% and 97.3%, respectively. A total of 37 airports with prominent runway characteristics were detected, with 9 newly added airports. We also test the CRE-MCBA framework in Japan, the Korean Peninsula, and Madhya Pradesh of India. Compared with other detection methods, ours has more robust regional adaptability and generalization ability and realizes the practical mining of potential objects.

YNIMG Journal 2023 Journal Article

Neurochemical and functional reorganization of the cognitive-ear link underlies cognitive impairment in presbycusis

  • Ning Li
  • Wen Ma
  • Fuxin Ren
  • Xiao Li
  • Fuyan Li
  • Wei Zong
  • Lili Wu
  • Zongrui Dai

Recent studies suggest that the interaction between presbycusis and cognitive impairment may be partially explained by the cognitive-ear link. However, the underlying neurophysiological mechanisms remain largely unknown. In this study, we combined magnetic resonance spectroscopy (MRS) and resting-state functional magnetic resonance imaging (fMRI) to investigate auditory gamma-aminobutyric acid (GABA) and glutamate (Glu) levels, intra- and inter-network functional connectivity, and their relationships with auditory and cognitive function in 51 presbycusis patients and 51 well-matched healthy controls. Our results confirmed reorganization of the cognitive-ear link in presbycusis, including decreased auditory GABA and Glu levels and aberrant functional connectivity involving auditory networks (AN) and cognitive-related networks, which were associated with reduced speech perception or cognitive impairment. Moreover, mediation analyses revealed that decreased auditory GABA levels and dysconnectivity between the AN and default mode network (DMN) mediated the association between hearing loss and impaired information processing speed in presbycusis. These findings highlight the importance of AN-DMN dysconnectivity in cognitive-ear link reorganization leading to cognitive impairment, and hearing loss may drive reorganization via decreased auditory GABA levels. Modulation of GABA neurotransmission may lead to new treatment strategies for cognitive impairment in presbycusis patients.

EAAI Journal 2023 Journal Article

Randomization-based neural networks for image-based wind turbine fault diagnosis

  • Junda Wang
  • Yang Yang
  • Ning Li

With the development of the wind energy industry, the safe production of wind farms has become an urgent problem. To avoid serious faults and deterioration, building effective diagnostic models for wind turbines (WT) has attracted increasing attention in the wind-power industry. However, challenges such as big sensor data and model construction still exist. In this paper, to achieve better performance and a suitable framework, a three-channel broad learning system (3-BLS) is proposed for image-based fault diagnosis (FD) on the overall WT system. First, multiple sensor series are collected and converted into interpretable RGB images via a right-sized sliding window for broader information and grabbing relations. Next, features are extracted in the respective RGB channels, and a manual feature layer is added in the 3-BLS, where the structure is temporarily non-specific. Finally, with the help of an optimizer, the concrete 3-BLS is auto-built with its structure configured reasonably and the manual features binary-coded and enabled selectively. In addition, an inter-channel attention scheme is formed during the 3-BLS dynamic updating process, and several BLS prototypes with different projections are studied. In experiments, the optimized 3-BLS, with fewer parameters, achieved an accuracy gain of over 10% compared with an adjusted single BLS and over 98% fault detection on actual collected WT data.

EAAI Journal 2023 Journal Article

Underground mine truck travel time prediction based on stacking integrated learning

  • Ning Li
  • Yahui Wu
  • Qizhou Wang
  • Haiwang Ye
  • Liguan Wang
  • Mingtao Jia
  • Shugang Zhao

The travel time (TT) prediction of underground mine transport trucks provides essential information for the precise scheduling of mine intelligent dispatching systems. Given the operational requirements and transportation environment of underground mines, in this study, a TT prediction method for underground mine transportation trucks is proposed based on stacking integrated learning. First, depending on the position and status of the transport truck, the truck operation cycle process is broken down into three sections and six stages. The influencing factors of the trucks’ TT in each stage are determined from the perspectives of personnel, equipment, and environment. During the collection process of the influencing factors the road surface roughness data are collected through image processing as part of the influence factor data. The influencing factors’ data are used as input parameters for the stacking integrated learning prediction model. The prediction performance of the fusion model is compared with that of the single models and their pairwise combinations. The final prediction results show that the fusion model performs the best in the drifts, ramps, and ground road sections. The average absolute percentage errors of the predicted values in the three road sections are 2.3091%, 4.3906%, and 4.5583%, respectively, and the corresponding decision coefficients are 0.9890, 0.9801, and 0.9050. These results show that the prediction model based on the stacking integrated framework proposed in this paper has a high prediction accuracy and stability. This accurate model can meet the requirements of intelligent dispatching systems for underground mines.
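
The two metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes that "average absolute percentage error" means the usual mean absolute percentage error (MAPE) and that "decision coefficient" means the coefficient of determination (R²); the abstract does not give the paper's exact definitions. The function names and the travel-time values in the usage note are hypothetical and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumed definition)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

For example, with hypothetical travel times `actual = [100, 200, 400]` and `predicted = [110, 190, 400]` (in seconds), `mape(actual, predicted)` is about 5% and `r_squared(actual, predicted)` is close to 1, which is the sense in which lower percentage error and higher coefficient indicate a better fit.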

ICRA Conference 2021 Conference Paper

Real-time 3D-Lidar, MMW Radar and GPS/IMU fusion based vehicle detection and tracking in unstructured environment

  • Ning Li
  • Caixia Lu
  • XueWei Yu
  • Xueyan Liu 0010
  • Bo Su

To solve the problem of unmanned ground vehicle leader-follower formation transportation in unstructured environments, we propose a novel target detection and tracking method based on multi-sensor fusion perception. Combining 3D-Lidar, millimeter wave Radar and GPS/IMU, the proposed method can achieve stable target detection and continuous tracking of both static and dynamic vehicles. First, 3D-Lidar is used to detect the geometric model of the leader vehicle to complete the initialization of the tracking target, and it can also assist target tracking. Then, during the movement, the dynamic leader is mainly tracked through millimeter wave Radar, as this sensor can keep tracking the same target with a constant index and effectively distinguish a dynamic vehicle from other static obstacles according to relative speed estimation. In addition, by using GPS/IMU based integrated navigation, the movement trend of the leader can be derived according to the ego vehicle pose information and the relative position relationship. This helps to reduce the region of interest for target tracking and improve the real-time performance. In different unstructured environments, we perform leader-follower formation transportation experiments for hundreds of kilometers. In rough terrain, the maximum tracking speed can still reach 40 km/h and the maximum tracking distance can be up to 100 meters. Experiments show that the proposed method is suitable for vehicle target detection and tracking in unstructured environments. It has good robustness and high real-time performance with an average processing frame rate of 20 Hz. The proposed method can be used for the formation transportation of unmanned ground vehicles to reduce labor costs.

ICRA Conference 2021 Conference Paper

Recovering Stress Distribution on Deformable Tissue for a Magnetic Actuated Insertable Laparoscopic Surgical Camera

  • Ning Li
  • Gregory J. Mancini
  • Amy Chandler
  • Jindong Tan

Fully insertable laparoscopic cameras represent a promising future of minimally invasive surgery. The most characteristic technology adopted on these devices is transabdominal anchoring and actuation based on magnetic coupling. However, few have paid adequate attention to the safety concerns. As the camera is anchored against the interior abdominal wall without any force feedback, the patient is exposed to a high risk of being injured by inappropriate stress on the tissue. We have recovered the camera-tissue interaction force via a non-invasive approach in our previous work. Aiming to access the stress distribution, this paper presents a viscoelastic camera-tissue interaction model, which establishes explicit relations between the contact force and the stress distribution on the tissue. For the first time, a geometric constraint between the contact angle and the tissue indentation is introduced, which helps make the multivariable model solvable. Ex-vivo experiments on porcine abdomen tissue facilitated by non-invasive force measurement validate the effectiveness of the model. This work lays the foundation for improving control and surgical safety of using a magnetic actuated insertable laparoscopic surgical camera.

ICRA Conference 2019 Conference Paper

A Noninvasive Approach to Recovering the Lost Force Feedback for a Robotic-Assisted Insertable Laparoscopic Surgical Camera

  • Ning Li
  • Gregory J. Mancini
  • Jindong Tan

Fully insertable laparoscopic cameras feature more locomotive flexibility in a larger workspace compared to conventional trocar-based laparoscopes and thus represent a promising future of minimally invasive surgery. These cameras are principally anchored and actuated by transabdominal magnetic coupling. Although several proof-of-concept prototypes have shown the technical feasibility in terms of camera actuation and laparoscopic imaging, none of them are getting close to clinical practice due to concerns about safety. One common problem lies in that the interaction force between the camera and the abdominal wall tissue is completely unknown and not controlled. The camera is being manipulated in an open loop which exposes the patient to a high risk of being injured. In this paper, a noninvasive real-time camera-tissue interaction force measurement approach for an insertable laparoscopic camera is proposed, implemented, and validated. Ex-vivo experiments using a simulated abdominal cavity have demonstrated the effectiveness of this approach during anchoring, translation, and rotation camera behaviors. Potential surgical impacts enabled by the force feedback have also been exemplified by a robotic-assisted camera control experiment using shared autonomy.

IJCAI Conference 2018 Conference Paper

Deep Joint Semantic-Embedding Hashing

  • Ning Li
  • Chao Li
  • Cheng Deng
  • Xianglong Liu
  • Xinbo Gao

Hashing has been widely deployed for large-scale image retrieval due to its low storage cost and fast query speed. Almost all deep hashing methods do not sufficiently discover semantic correlation from label information, which makes the learned hash codes less discriminative. In this paper, we propose a novel Deep Joint Semantic-Embedding Hashing (DSEH) approach that contains LabNet and ImgNet. Specifically, LabNet is explored to capture abundant semantic correlation between sample pairs and supervise ImgNet at the semantic level and the hash code level, which is conducive to the generated hash codes being more discriminative and similarity-preserving. Extensive experiments on three benchmark datasets show that the proposed model outperforms the state-of-the-art methods.

IROS Conference 2017 Conference Paper

A novel laparoscopic camera robot with in-vivo lens cleaning and debris prevention modules

  • A. Reza Yazdanpanah
  • Xiaolong Liu 0002
  • Ning Li
  • Jindong Tan

Robotic systems have recently drawn attention in minimally invasive surgeries due to their increased dexterity. A major drawback of these systems is image blurring due to lens contamination, which causes imaging impairment during up to 40% of surgery time. This paper demonstrates a novel laparoscopic magnetically driven camera system with implemented in-vivo lens cleaning and debris prevention systems. This camera robot can cover a 150-degree field of view inside the abdominal cavity and provides an adjustable illumination system to improve the video quality. Design details for the different modules of this robot, such as anchoring, actuation, video capturing, illumination, debris prevention and lens cleaning, are provided and discussed. This camera robot can decrease the possibility of lens contamination by creating a CO2 gas barrier in front of the lens. In case of contamination, it can clean the lens in-vivo without removing the camera from the abdominal cavity. The lens cleaning module has been tested for water vapor and water droplets. The robot has been manufactured and each module has been validated by designed experiments.