Arrow Research search

Author name cluster

Yan Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

29 papers
2 author rows

Possible papers

29

TAAS Journal 2026 Journal Article

A Novel Physics-Informed Federated Learning Framework for Robust Bearing Fault Diagnosis

  • Jiaqi Chen
  • Jie Wang
  • Yongquan Jiang
  • ZhengHong Wang
  • Fan Zhang
  • Yan Yang

Rolling bearing failures are a primary cause of catastrophic machinery breakdowns, posing significant economic and safety risks. Effective fault diagnosis is frequently hindered by challenges inherent to modern industrial settings, including data privacy constraints, statistical heterogeneity across Non-Independent and Identically Distributed (Non-IID) datasets, and the prevalence of few-shot learning scenarios. To address these challenges, this paper introduces CARR-MgNet, a novel physics-informed federated learning framework. The framework utilizes a Multi-granularity fusion Network (MgNet) backbone, which enhances feature robustness by embedding physical fault characteristics directly into its convolutional kernels. To ensure stable federated training across heterogeneous clients, we then introduce a Class-Average Representation Regularization (CARR) mechanism to effectively mitigate client drift. Extensive experiments on four public industrial datasets validate the state-of-the-art performance of our proposed framework. Under challenging non-IID conditions, CARR-MgNet surpasses established baselines, including FedProx and MOON, by up to 8.2% in accuracy. Furthermore, it reduces the number of communication rounds required to reach 95% accuracy by 40% compared to FedAvg and reduces total communication overhead by 35%. These results demonstrate that our physics-informed federated approach provides a robust, communication-efficient, and privacy-preserving solution for real-world industrial fault diagnosis.
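The FedAvg-style aggregation that CARR-MgNet is compared against can be sketched in a few lines; the `carr_penalty` helper below is a hypothetical illustration of a class-average representation regularizer, not the paper's actual CARR mechanism.

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg-style aggregation)."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()                 # per-client mixing weights
    return coeffs @ np.stack(client_weights)     # (n_clients,) @ (n_clients, n_params)

def carr_penalty(local_class_means, global_class_means):
    """Hypothetical regularizer: mean squared distance between local and
    global per-class mean embeddings, penalizing client drift."""
    return float(np.mean((local_class_means - global_class_means) ** 2))
```

A client holding three times as much data as its peer contributes three quarters of the aggregated model under this weighting.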

AAAI Conference 2026 Conference Paper

ALERT: Adversarial Learning Enhanced Stability-aware Routing Transformer for Adaptive Depression Detection

  • Liangyi Kang
  • Wei Hua
  • Yan Yang
  • Jie Liu
  • DAN YE

Detecting depression through social media is a complex task, as noisy user-generated content creates significant interference between persistent depressive patterns and transient emotional expressions. Two main challenges arise: First, negative mood indicators are not exclusive to depressed individuals, making it difficult to distinguish between pathological symptoms and situational emotional variations. Second, existing static models fail to adapt to diverse user expression styles and effectively filter out confounding noise from posts by non-depressed individuals. This results in conventional approaches either overfitting to superficial emotional cues or overlooking subtle long-term symptom progression. To address these issues, we propose the Adversarial Learning Enhanced Stability-aware Routing Transformer for Adaptive Depression Detection (ALERT), a novel framework integrating adaptive attention routing and adversarial learning to enhance robustness against confounding mood signals. Specifically, ALERT employs a stability-aware dynamic routing mechanism to annotate user-specific mood valence trends, providing a structured representation of affective progression over time. An adversarial learning module then leverages these mood-based representations to distinguish between expressions indicative of persistent depressive mood and variations in situational mood states, ensuring adaptability to diverse user behaviors. Experimental results on public social media datasets demonstrate that ALERT outperforms state-of-the-art methods in depression detection, effectively reducing false alarms from transient mood states and improving classification accuracy.

AAAI Conference 2026 Conference Paper

Sparse4DGS: 4D Gaussian Splatting for Sparse-Frame Dynamic Scene Reconstruction

  • Changyue Shi
  • Chuxiao Yang
  • Xinyuan Hu
  • Minghao Chen
  • Wenwen Pan
  • Yan Yang
  • Jiajun Ding
  • Zhou Yu

Dynamic Gaussian Splatting approaches have achieved remarkable performance for 4D scene reconstruction. However, these approaches rely on dense-frame video sequences for photorealistic reconstruction. In real-world scenarios, due to equipment constraints, sometimes only sparse frames are accessible. In this paper, we propose Sparse4DGS, the first method for sparse-frame dynamic scene reconstruction. We observe that dynamic reconstruction methods fail in both canonical and deformed spaces under sparse-frame settings, especially in areas with high texture richness. Sparse4DGS tackles this challenge by focusing on texture-rich areas. For the deformation network, we propose Texture-Aware Deformation Regularization, which introduces a texture-based depth alignment loss to regulate Gaussian deformation. For the canonical Gaussian field, we introduce Texture-Aware Canonical Optimization, which incorporates texture-based noise into the gradient descent process of canonical Gaussians. Extensive experiments show that when taking sparse frames as inputs, our method outperforms existing dynamic or few-shot techniques on NeRF-Synthetic, HyperNeRF, NeRF-DS, and our iPhone-4D datasets.

JBHI Journal 2025 Journal Article

Adapter-Enhanced Hierarchical Cross-Modal Pre-Training for Lightweight Medical Report Generation

  • Ting Yu
  • Wangwen Lu
  • Yan Yang
  • Weidong Han
  • Qingming Huang
  • Jun Yu
  • Ke Zhang

Automatic medical report generation is an emerging field that aims to transform medical images into descriptive, clinically relevant narratives, potentially reducing the workload for radiologists significantly. Despite substantial progress, the increasing model parameter size and corresponding marginal performance gains have limited further development and application. To address this challenge, we introduce an Adapter-enhanced Hierarchical cross-modal Pre-training (AHP) strategy for lightweight medical report generation. This approach significantly reduces the pre-trained model's parameter size while maintaining superior report generation performance through our proposed spatial adapters. To further address the issue of inadequate representation of visual space details, we employ a convolutional stem combined with hierarchical injectors and extractors, fully integrating with traditional Vision Transformers to achieve more comprehensive visual representations. Additionally, our cross-modal pre-training model effectively handles the inherent complex visual-textual relationships in medical imaging. Extensive experiments on multiple datasets, including IU X-Ray, MIMIC-CXR, and bladder pathology, demonstrate our model's exceptional generalization and transfer performance in downstream medical report generation tasks, highlighting AHP's potential in significantly reducing model parameters while enhancing report generation accuracy and efficiency.

JBHI Journal 2025 Journal Article

Consistency Conditioned Memory Augmented Dynamic Diagnosis Model for Medical Visual Question Answering

  • Ting Yu
  • Binhui Ge
  • Shuhui Wang
  • Yan Yang
  • Qingming Huang
  • Jun Yu

Medical Visual Question Answering (Med-VQA) holds immense promise as an invaluable medical assistance aid, offering timely diagnostic outcomes based on medical images and accompanying questions, thereby supporting medical professionals in making accurate clinical decisions. However, Med-VQA is still in its infancy, with existing solutions falling short in imitating human diagnostic processes and ensuring result consistency. To address these challenges, we propose a Consistency Conditioned Memory augmented Dynamic diagnosis model (CoCoMeD), incorporating two core components: a dynamic memory diagnosis engine and a consistency-conditioned enforcer. The dynamic memory diagnosis engine enables intricate diagnostic interactions by retaining vital visual cues from medical images and iteratively updating pertinent memories. This dynamic reasoning capability mirrors the cognitive processes observed in skilled medical diagnosticians, thus effectively enhancing the model's ability to reason over diverse medical visual facts and patient-specific questions. Moreover, to strengthen diagnostic coherence, the consistency-conditioned enforcer imposes coherence constraints linking interrelated questions with identical medical facts, ensuring the credibility and reliability of its diagnostic outcomes. Additionally, we present C-SLAKE, an extended Med-VQA dataset encompassing diverse medical image types, and categorized diagnostic question-answer pairs for consistent Med-VQA evaluation on rich medical sources. Comprehensive experiments on DME and C-SLAKE showcase CoCoMeD's superior performance and potential to advance trustworthy multi-source medical question answering.

AAAI Conference 2025 Conference Paper

Exploring Temporal Event Cues for Dense Video Captioning in Cyclic Co-Learning

  • Zhuyang Xie
  • Yan Yang
  • Yankai Yu
  • Jie Wang
  • Yongquan Jiang
  • Xiao Wu

Dense video captioning aims to detect and describe all events in untrimmed videos. This paper presents a dense video captioning network called Multi-Concept Cyclic Learning (MCCL), which aims to: (1) detect multiple concepts at the frame level and leverage these concepts to provide temporal event cues; and (2) establish cyclic co-learning between the generator and the localizer within the captioning network to promote semantic perception and event localization. Specifically, weakly supervised concept detection is performed for each frame, and the detected concept embeddings are integrated into the video features to provide event cues. Additionally, video-level concept contrastive learning is introduced to produce more discriminative concept embeddings. In the captioning network, a cyclic co-learning strategy is proposed, where the generator guides the localizer for event localization through semantic matching, while the localizer enhances the generator’s event semantic perception through location matching, making semantic perception and event localization mutually beneficial. MCCL achieves state-of-the-art performance on the ActivityNet Captions and YouCook2 datasets. Extensive experiments demonstrate its effectiveness and interpretability.

ECAI Conference 2025 Conference Paper

PFL-IDGAN: Personalized Federated Learning Framework Based on Interactive Dual Generative Adversarial Networks

  • Zhigang Wang
  • Yan Yang
  • Xiaochi Hou
  • Junfeng Zhao 0005

Federated learning (FL) enables collaborative model training without direct data exchange, promoting privacy-preserving data utilization. To address performance degradation caused by non-independent and identically distributed (non-IID) data, Personalized Federated Learning (PFL) allows each client to learn a model tailored to its local distribution. However, real-world personalized scenarios often involve not only data heterogeneity but also model heterogeneity across clients. Existing PFL methods struggle under the coexistence of both, as parameter aggregation requires identical model structures, while knowledge distillation often relies on shared public data. To tackle these challenges, we propose a novel PFL framework called Personalized Federated Learning based on Interactive Dual Generative Adversarial Networks (PFL-IDGAN). This framework leverages Generative Adversarial Networks (GANs) to augment local datasets, effectively mitigating label discrepancies and non-IID data issues across clients. Moreover, it introduces a dual adversarial learning mechanism that enables fine-grained knowledge transfer and collaboration across clients, while supporting heterogeneous model architectures. Extensive experiments demonstrate that the proposed PFL-IDGAN framework significantly outperforms existing baseline methods, particularly in settings with pronounced disparities in client models and data distributions.

NeurIPS Conference 2025 Conference Paper

Storyboard-guided Alignment for Fine-grained Video Action Recognition

  • Enqi Liu
  • Liyuan Pan
  • Yan Yang
  • Yiran Zhong
  • Zhijing Wu
  • Xinxiao Wu
  • Liu Liu

Fine-grained video action recognition can be formulated as a video–text matching problem. Previous approaches primarily rely on global video semantics to consolidate video embeddings, often leading to misaligned video–text pairs due to inaccurate atomic-level action understanding. This inaccuracy arises because i) videos with distinct global semantics may share similar atomic actions or visual appearances, and ii) atomic actions can be momentary, gradual, or not directly aligned with overarching video semantics. Inspired by storyboarding, where a script is segmented into individual shots, we propose a multi-granularity framework, SFAR. SFAR generates fine-grained descriptions of common atomic actions for each global semantic using a large language model. Unlike existing works that refine global semantics with auxiliary video frames, SFAR introduces a filtering metric to ensure correspondence between the descriptions and the global semantics, eliminating the need for direct video involvement and thereby enabling more nuanced recognition of subtle actions. By leveraging both global semantics and fine-grained descriptions, our SFAR effectively identifies prominent frames within videos, thereby improving the accuracy of embedding aggregation. Extensive experiments on various video action recognition datasets demonstrate the competitive performance of our SFAR in supervised, few-shot, and zero-shot settings.

NeurIPS Conference 2025 Conference Paper

Towards Prospective Medical Image Reconstruction via Knowledge-Informed Dynamic Optimal Transport

  • Taoran Zheng
  • Yan Yang
  • Xing Li
  • Xiang Gu
  • Jian Sun
  • Zongben Xu

Medical image reconstruction from measurement data is a vital but challenging inverse problem. Deep learning approaches have achieved promising results, but they often require paired measurements and high-quality images, which are typically simulated through a forward model, i.e., retrospective reconstruction. However, training on simulated pairs commonly leads to performance degradation on real prospective data due to the retrospective-to-prospective gap caused by incomplete imaging knowledge in simulation. To address this challenge, this paper introduces imaging Knowledge-Informed Dynamic Optimal Transport (KIDOT), a novel dynamic optimal transport framework, with optimality in the sense of preserving consistency with imaging physics in transport, that conceptualizes reconstruction as finding a dynamic transport path. KIDOT learns from unpaired data by modeling reconstruction as a continuous evolution path from measurements to images, guided by an imaging knowledge-informed cost function and transport equation. This dynamic and knowledge-aware approach enhances robustness and better leverages unpaired data while respecting acquisition physics. Theoretically, we demonstrate that KIDOT naturally generalizes dynamic optimal transport, ensuring its mathematical rationale and solution existence. Extensive experiments on MRI and CT reconstruction demonstrate KIDOT's superior performance. Code is available at https://github.com/TaoranZheng717/KIDOT.

NeurIPS Conference 2025 Conference Paper

User-Instructed Disparity-aware Defocus Control

  • Yudong Han
  • Yan Yang
  • Hao Yang
  • Liyuan Pan

In photography, an All-in-Focus (AiF) image may not always effectively convey the creator’s intent. Professional photographers manipulate Depth of Field (DoF) to control which regions appear sharp or blurred, achieving compelling artistic effects. For general users, the ability to flexibly adjust DoF enhances creative expression and image quality. In this paper, we propose UiD, a User-Instructed DoF control framework that allows users to specify refocusing regions using text, box, or point prompts; UiD then automatically simulates in-focus and out-of-focus (OoF) regions in the given images. However, controlling defocus blur in a single-lens camera remains challenging due to the difficulty of estimating depth-aware aberrations and the suboptimal quality of reconstructed AiF images. To address this, we leverage dual-pixel (DP) sensors, commonly found in DSLR-style and mobile cameras. DP sensors provide a small-baseline stereo pair in a single snapshot, enabling depth-aware aberration estimation. Our approach first establishes an invertible mapping between OoF and AiF images to learn spatially varying defocus kernels and disparity features. These depth-aware kernels enable bidirectional image transformation, deblurring OoF images into AiF representations and conversely reblurring AiF images into OoF outputs, by switching between a kernel and its inverse form. For user-guided refocusing, we first generate masks based on user prompts using SAM, which modulates the disparity features in closed form, allowing dynamic kernel re-estimation for reblurring. This achieves user-controlled refocusing effects.
Extensive experiments on both common datasets and our self-collected dataset demonstrate that UiD offers superior flexibility and quality in DoF manipulation.

ICLR Conference 2024 Conference Paper

GTMGC: Using Graph Transformer to Predict Molecule's Ground-State Conformation

  • Guikun Xu
  • Yongquan Jiang
  • PengChuan Lei
  • Yan Yang
  • Jim Chen

The ground-state conformation of a molecule is often decisive for its properties. However, experimental or computational methods, such as density functional theory (DFT), are time-consuming and labor-intensive for obtaining this conformation. Deep learning (DL) based molecular representation learning (MRL) has made significant advancements in molecular modeling and has achieved remarkable results in various tasks. Consequently, it has emerged as a promising approach for directly predicting the ground-state conformation of molecules. In this regard, we introduce GTMGC, a novel network based on Graph-Transformer (GT) that seamlessly predicts the spatial configuration of molecules in 3D space from their 2D topological architecture in an end-to-end manner. Moreover, we propose a novel self-attention mechanism called Molecule Structural Residual Self-Attention (MSRSA) for molecular structure modeling. This mechanism not only guarantees high model performance and easy implementation but also lends itself well to other molecular modeling tasks. Our method has been evaluated on the Molecule3D benchmark dataset and the QM9 dataset. Experimental results demonstrate that our approach achieves remarkable performance and outperforms current state-of-the-art methods as well as the widely used open-source software RDKit.

IJCAI Conference 2024 Conference Paper

HeterGCL: Graph Contrastive Learning Framework on Heterophilic Graph

  • Chenhao Wang
  • Yong Liu
  • Yan Yang
  • Wei Li

Graph Contrastive Learning (GCL) has attracted significant research attention due to its self-supervised ability to learn robust node representations. Unfortunately, most methods primarily focus on homophilic graphs, rendering them less effective for heterophilic graphs. In addition, the complexity of node interactions in heterophilic graphs poses considerable challenges to augmentation schemes, encoding architectures, and contrastive designs for traditional GCL. In this work, we propose HeterGCL, a novel graph contrastive learning framework with structural and semantic learning to explore the true potential of GCL on heterophilic graphs. Specifically, we abandon the random augmentation scheme, which destroys the graph structure, and instead introduce an adaptive neighbor aggregation strategy (ANA) to extract topology-supervised signals from neighboring nodes at different distances, exploring the structural information with an adaptive local-to-global contrastive loss. In the semantic learning module, we jointly consider the original node features and the similarity between nodes in the latent feature space to explore hidden associations between nodes. Experimental results on homophilic and heterophilic graphs demonstrate that HeterGCL outperforms existing self-supervised and semi-supervised baselines across various downstream tasks.

AAAI Conference 2023 Conference Paper

Only a Few Classes Confusing: Pixel-Wise Candidate Labels Disambiguation for Foggy Scene Understanding

  • Liang Liao
  • Wenyi Chen
  • Zhen Zhang
  • Jing Xiao
  • Yan Yang
  • Chia-Wen Lin
  • Shin'ichi Satoh

Not all semantics become confusing when deploying a semantic segmentation model for real-world scene understanding of adverse weather. The true semantics of most pixels have a high likelihood of appearing in the few top classes according to confidence ranking. In this paper, we replace the one-hot pseudo label with a candidate label set (CLS) that consists of only a few ambiguous classes and exploit its effects on self-training-based unsupervised domain adaptation. Specifically, we formulate the problem as a coarse-to-fine process. In the coarse-level process, adaptive CLS selection is proposed to pick a minimal set of confusing candidate labels based on the reliability of label predictions. Then, representation learning and label rectification are iteratively performed to facilitate feature clustering in an embedding space and to disambiguate the confusing semantics. Experimentally, our method outperforms the state-of-the-art methods on three realistic foggy benchmarks.
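The coarse-level step, picking a minimal candidate label set from the confidence ranking, can be sketched as below; the cumulative-probability criterion and the `tau`/`k_max` parameters are assumptions for illustration, since the paper's adaptive reliability measure is not spelled out in the abstract.

```python
import numpy as np

def candidate_label_set(probs, tau=0.85, k_max=3):
    """Smallest set of top-confidence classes whose cumulative probability
    reaches tau, capped at k_max candidates (sketch of adaptive CLS selection)."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]          # classes by descending confidence
    cum = np.cumsum(probs[order])
    k = int(np.searchsorted(cum, tau) + 1)   # first prefix covering tau
    return order[:min(k, k_max)].tolist()
```

A confident pixel keeps a single candidate, while an ambiguous one keeps the few confusing classes for later disambiguation.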

AAAI Conference 2022 Conference Paper

Unsupervised Underwater Image Restoration: From a Homology Perspective

  • Zhenqi Fu
  • Huangxing Lin
  • Yan Yang
  • Shu Chai
  • Liyan Sun
  • Yue Huang
  • Xinghao Ding

Underwater images suffer from degradation due to light scattering and absorption. It remains challenging to restore such degraded images using deep neural networks since real-world paired data is scarcely available, while synthetic paired data cannot approximate real-world data perfectly. In this paper, we propose an UnSupervised Underwater Image Restoration method (USUIR) by leveraging the homology property between a raw underwater image and a re-degraded image. Specifically, USUIR first estimates three latent components of the raw underwater image, i.e., the global background light, the transmission map, and the scene radiance (the clean image). Then, a re-degraded image is generated by randomly mixing up the estimated scene radiance and the raw underwater image. We demonstrate that imposing a homology constraint between the raw underwater image and the re-degraded image is equivalent to minimizing the restoration error and hence can be used for unsupervised restoration. Extensive experiments show that USUIR achieves promising performance in both inference time and restoration quality.
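The three latent components named in the abstract correspond to the standard underwater/haze formation model, which can be written out directly; the convex mixing rule in `re_degrade` is only an assumed stand-in for USUIR's re-degradation step.

```python
def degrade(radiance, transmission, background):
    """Standard formation model: I = J*t + B*(1 - t), with scene radiance J,
    transmission map t, and global background light B."""
    return radiance * transmission + background * (1.0 - transmission)

def re_degrade(raw, radiance, alpha):
    """Mix the estimated scene radiance back into the raw image; the exact
    mixing rule used by USUIR is an assumption here."""
    return alpha * radiance + (1.0 - alpha) * raw
```

With half transmission, a bright scene point is pulled halfway toward the background light, which is the veiling effect the restoration must invert.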

NeurIPS Conference 2021 Conference Paper

Adversarial Reweighting for Partial Domain Adaptation

  • Xiang Gu
  • Xi Yu
  • Yan Yang
  • Jian Sun
  • Zongben Xu

Partial domain adaptation (PDA) has gained much attention due to its practical setting. The current PDA methods usually adapt the feature extractor by aligning the target and reweighted source domain distributions. In this paper, we experimentally find that the feature adaptation by the reweighted distribution alignment in some state-of-the-art PDA methods is not robust to the "noisy" weights of source domain data, leading to negative domain transfer on some challenging benchmarks. To tackle the challenge of negative domain transfer, we propose a novel Adversarial Reweighting (AR) approach that adversarially learns the weights of source domain data to align the source and target domain distributions, and the transferable deep recognition network is learned on the reweighted source domain data. Based on this idea, we propose a training algorithm that alternately updates the parameters of the network and optimizes the weights of source domain data. Extensive experiments show that our method achieves state-of-the-art results on the benchmarks of ImageNet-Caltech, Office-Home, VisDA-2017, and DomainNet. Ablation studies also confirm the effectiveness of our approach.

JBHI Journal 2021 Journal Article

CacheTrack-YOLO: Real-Time Detection and Tracking for Thyroid Nodules and Surrounding Tissues in Ultrasound Videos

  • Xiangqiong Wu
  • Guanghua Tan
  • Ningbo Zhu
  • Zhilun Chen
  • Yan Yang
  • Huaxuan Wen
  • Kenli Li

Accurately detecting and tracking thyroid nodules in video is a crucial step in thyroid screening for identifying benign and malignant nodules in computer-aided diagnosis (CAD) systems. Most existing methods perform well only on static frames selected manually from ultrasound videos, yet manual frame selection is labor-intensive. To make the thyroid screening process more natural and less labor-intensive, we develop a well-designed framework for thyroid nodule detection in ultrasound videos that is suitable for practical applications. In particular, to make full use of the characteristics of thyroid videos, we propose a novel post-processing approach, called Cache-Track, which exploits the contextual relations among video frames, propagating detection results into adjacent frames to refine them. Additionally, our method can not only detect and count thyroid nodules but also track and monitor surrounding tissues, greatly reducing manual work and enabling computer-aided diagnosis. Experimental results show that our method achieves a better balance of accuracy and speed.

IJCAI Conference 2019 Conference Paper

Spectral Perturbation Meets Incomplete Multi-view Data

  • Hao Wang
  • Linlin Zong
  • Bing Liu
  • Yan Yang
  • Wei Zhou

Beyond existing multi-view clustering, this paper studies a more realistic clustering scenario, referred to as incomplete multi-view clustering, where a number of data instances are missing in certain views. To tackle this problem, we explore spectral perturbation theory. In this work, we show a strong link between perturbation risk bounds and incomplete multi-view clustering. That is, as the similarity matrix fed into spectral clustering is a quantity bounded in magnitude O(1), we transfer the missing problem from data to similarity and tailor a matrix completion method for the incomplete similarity matrix. Moreover, we show that minimizing perturbation risk bounds among different views maximizes the final fusion result across all views. This provides a solid fusion criterion for multi-view data. We motivate and propose a Perturbation-oriented Incomplete multi-view Clustering (PIC) method. Experimental results demonstrate the effectiveness of the proposed method.
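The perturbation argument can be checked numerically: a small change to the similarity matrix rotates the spectral embedding only slightly when the eigen-gap is large. The sketch below is illustrative only and is not PIC's completion or fusion procedure.

```python
import numpy as np

def spectral_embedding(W, k):
    """Embedding from the k smallest eigenvectors of the normalized Laplacian."""
    d = W.sum(axis=1)
    L = np.eye(len(W)) - W / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    return vecs[:, :k]

def subspace_sin(U, V):
    """Sine of the largest principal angle between two k-dim subspaces."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.sqrt(max(0.0, 1.0 - s.min() ** 2)))
```

For a two-block similarity matrix, a perturbation of magnitude 1e-3 barely tilts the 2-dimensional cluster subspace, consistent with a Davis-Kahan-style bound of roughly (perturbation norm) / (eigen-gap).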

ICRA Conference 2017 Conference Paper

A distributed approach to automated manufacturing systems with complex structures using Petri nets

  • Yan Yang
  • Hesuan Hu
  • Yang Liu 0003

One of the major challenges, from both theoretical and practical perspectives, in effectively establishing unattended operation of automated manufacturing systems (AMSs) is deadlock resolution. Most existing methods for the deadlock problem focus on models with either flexible routes or assembly operations, whereas few investigate models with both. Furthermore, applying these methods to large-scale systems is usually nontrivial, primarily because they generally require the enumeration of all states or siphons. The work presented in this paper proposes a Petri net-based model that handles both features and develops an innovative distributed strategy providing an online, dynamic mechanism for deadlock resolution. Each step's execution depends on a search-based procedure that determines whether a feasible event sequence exists that brings the currently active process or sub-processes to the nearest global critical place while other processes stagnate. With our approach, the system benefits from minimal communication among processes, better tolerance of contingencies such as resource failures, and improved performance such as throughput.

NeurIPS Conference 2016 Conference Paper

Deep ADMM-Net for Compressive Sensing MRI

  • Yan Yang
  • Jian Sun
  • Huibin Li
  • Zongben Xu

Compressive Sensing (CS) is an effective approach for fast Magnetic Resonance Imaging (MRI). It aims at reconstructing MR images from a small number of under-sampled data in k-space, accelerating data acquisition in MRI. To improve the current MRI system in reconstruction accuracy and computational speed, in this paper, we propose a novel deep architecture, dubbed ADMM-Net. ADMM-Net is defined over a data flow graph, which is derived from the iterative procedures of the Alternating Direction Method of Multipliers (ADMM) algorithm for optimizing a CS-based MRI model. In the training phase, all parameters of the net, e.g., image transforms, shrinkage functions, etc., are discriminatively trained end-to-end using the L-BFGS algorithm. In the testing phase, it has computational overhead similar to ADMM but uses optimized parameters learned from the training data for the CS-based reconstruction task. Experiments on MRI image reconstruction under different sampling ratios in k-space demonstrate that it significantly improves the baseline ADMM algorithm and achieves high reconstruction accuracy with fast computational speed.
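The iterative procedure that ADMM-Net unrolls can be illustrated on the simpler l1-regularized least-squares problem below; ADMM-Net replaces the fixed transform, shrinkage curve, and penalty parameter with learned, layer-wise versions, so this hand-tuned loop is only a sketch of the underlying iteration.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrinkage operator: proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(A, y, lam=0.01, rho=1.0, iters=500):
    """Plain ADMM for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    n = A.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    P = np.linalg.inv(A.T @ A + rho * np.eye(n))   # cached x-update solve
    Aty = A.T @ y
    for _ in range(iters):
        x = P @ (Aty + rho * (z - u))              # quadratic x-update
        z = soft_threshold(x + u, lam / rho)       # l1 shrinkage z-update
        u = u + x - z                              # dual (multiplier) update
    return z
```

On a noiseless overdetermined toy problem, the loop recovers a sparse signal to within the small bias introduced by the l1 penalty.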

ICRA Conference 2016 Conference Paper

Distributed supervisor synthesis for automated manufacturing systems with flexible routes and assembly operations using Petri nets

  • Chen Chen 0009
  • Yan Yang
  • Hesuan Hu

Automated manufacturing systems (AMSs) are developing rapidly, with increasingly sophisticated operations and complex topologies. In this paper, we propose a new kind of AMS structure, namely AESMs, whose processes are expressed by flexible routes and embedded with marked graph blocks. Flexible routes and assembly operations are combined organically in this model, making it structurally complex and adaptable in application. Deadlock resolution in AMSs is a fundamental and prerequisite issue. The conventional approach is based on monolithic methodologies by means of siphons, i.e., special structural objects closely related to deadlocks. Analysis shows that the siphon-based mechanism is of limited applicability for controlling AESMs. Instead, we utilize a distributed approach in accordance with the philosophy of model predictive control, which dynamically creates a trajectory leading the entire system to its desired destination. In our strategy, the feasibility of each step's execution is determined by whether the concerned tokens can reach critical places or critical place unities. The control strategy is applied to processes locally so that they can proceed concurrently at the same pace. Global goals become attainable through local observation, control, and execution on local processes, without requiring external or extra information.

ICRA Conference 2015 Conference Paper

Supervisor design and simplification for Automated Manufacturing Systems using colored Petri nets

  • Hesuan Hu
  • Yan Yang
  • Yang Liu 0003
  • Chen Chen 0009

Colored Petri nets are widely used to model automated manufacturing systems thanks to their compactness in describing complex networked systems. Compared with general Petri nets, they allow many folding techniques that condense the system model. With them, many control synthesis problems reduce to matrix operations, so supervisory control techniques can be realized in a computationally efficient way. In this work, these matrices are further distinguished as independent or dependent. A method is proposed to identify the independent ones while removing the dependent ones. As a result, a supervisor can be significantly simplified. In practice, this implies a significant reduction in cost, an increase in reliability, and an improvement in performance. Examples are presented to illustrate the effectiveness and efficiency of this method.

ICRA Conference 2015 Conference Paper

Supervisors and their simplification in automated manufacturing systems via Petri nets

  • Hesuan Hu
  • Chen Chen 0009
  • Yang Liu 0003
  • Yan Yang

Most contemporary manufacturing systems appear as complex event-driven automation facilities. Supervisor synthesis and simplification are fundamental in automated manufacturing systems (AMSs). From the design and implementation standpoints, it is preferable to decrease supervisor scale so as to mitigate realization expense. Ordinary and general Petri nets are powerful formalisms for describing, analyzing, and controlling AMSs; the former are concise while the latter are compact. Considering both cases, our approach establishes properties that disclose their rationales with regard to liveness enforcement. Many control methodologies are developed in the context of ordinary nets; they are straightforward but may not adapt to general ones. This work develops new theory in this respect. A comparison and contrast is presented to clarify their common theoretical basis and distinct supervisory effects. Based on this thorough study, a novel simplification method is developed to unify both cases and facilitate their application. Experimental results show that the developed strategy is effective and efficient in simplifying supervisors.