Arrow Research search

Author name cluster

Bin Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers
2 author rows

Possible papers

EAAI Journal 2026 Journal Article

Meta-learning with variational inference for few-shot faults diagnosis of automotive transmission under variable operating conditions

  • Bin Sun
  • Hongkun Li
  • Nan Liu
  • Feifei Li
  • Zhenhui Ma

The automotive transmission is a critical component for regulating vehicle speed. However, in real industrial settings, the complexity and variability of operating conditions, along with a limited number of fault samples, make traditional deep learning methods inadequate for practical applications. To address these challenges, this paper presents a few-shot fault diagnosis method for automotive transmissions under variable conditions, based on Variational Agnostic Meta-Learning for Robust Inference (VAMPIRE). First, the vibration data collected from sensors is sliced and converted into two-dimensional grayscale images to create the dataset. Next, by integrating Bayesian theory with a meta-learning framework, we use variational inference to approximate the posterior distribution. This allows the learned meta-parameters to coherently explain the variability of the data, thereby enhancing the model's generalization ability across different operating conditions. Finally, this study utilized data from an industrial-grade gearbox test bench and real-road test data of an industrial truck gearbox to conduct comparative experiments under multiple variable working conditions, and compared the results with various methods. The experimental results show that regardless of the sample size or the complexity of working conditions, the proposed method performs excellently in terms of accuracy, stability, and generalizability. For example, in test scenarios involving multiple unknown working conditions, the proposed method achieved an average diagnostic accuracy of 96.52% for test bench data and 97.54% for real-vehicle data in 5-shot learning tasks. Even in the most challenging 1-shot learning tasks, its average accuracy remained at 93.88% and 94.82%, respectively, significantly outperforming the comparative methods.
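The slicing-and-imaging step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' preprocessing: the function name `signal_to_gray_images`, the non-overlapping windows, and the per-window min-max scaling to 8-bit gray levels are all assumptions made for the example.

```python
def signal_to_gray_images(signal, side):
    """Slice a 1-D signal into non-overlapping windows of side*side samples
    and map each window to a side x side grid of 8-bit gray levels.

    Hypothetical helper: window layout and scaling are assumptions.
    """
    window = side * side
    images = []
    # Trailing samples that do not fill a whole window are dropped.
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        lo, hi = min(chunk), max(chunk)
        span = (hi - lo) or 1.0  # avoid division by zero on a flat window
        # Min-max scale each window independently to the range 0..255.
        pixels = [round(255 * (x - lo) / span) for x in chunk]
        # Reshape the flat pixel list row by row into a square image.
        images.append([pixels[r * side:(r + 1) * side] for r in range(side)])
    return images
```

With a real vibration record, `side` would be chosen so that each window spans enough samples to show the fault signature; the abstract does not state the value used.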

EAAI Journal 2025 Journal Article

A digital twin-assisted algorithm for diagnosis of permanent magnet synchronous generator interturn short circuit fault and converter open circuit fault in wind power systems using Pearson correlation coefficient

  • Bin Sun
  • Ying Zhu
  • Zhinong Wei

Interturn short-circuit faults (ISCFs) in permanent magnet synchronous generators (PMSGs) and open-circuit faults (OCFs) in the machine-side converters represent two critical reliability challenges in wind power systems. Conventional fault diagnosis approaches typically rely on dedicated models for each fault type, leading to excessive system complexity and suboptimal computational efficiency. To overcome these limitations, this paper proposes a novel unified digital twin-assisted framework capable of simultaneous diagnosis of both PMSG ISCFs and converter OCFs within a single integrated architecture. A high-fidelity digital twin model based on one-dimensional convolutional neural networks is established to generate real-time reference values of the current space vector (SV) for online fault detection, while Pearson correlation coefficient analysis enables accurate differentiation between ISCF and OCF. For ISCFs, the fault severity assessment is performed based on the deviation between reference and measured current SV, with the faulty phase identified using phase current root mean square (RMS) values. In the case of converter OCFs, the proposed method introduces a dual-stage identification process: single and dual insulated-gate bipolar transistor (IGBT) open faults are differentiated through severity estimation analysis, and the faulty IGBT is identified by evaluating the effective current interval ratio (ECIR) and normalized current average (NCA). The experimental results validate the effectiveness of the proposed method, and comparative analysis further demonstrates its superior performance in terms of parameter dependency and diagnostic efficacy.
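For reference, the Pearson correlation coefficient named in the abstract is the covariance of two sequences divided by the product of their standard deviations. The sketch below computes it for two equal-length sample sequences. Which signals the paper correlates, and what thresholds separate ISCF from OCF, are not given in the abstract, so the function name `pearson` and any choice of inputs are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Unnormalized standard deviations; the 1/n factors cancel in the ratio.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)
```

A value near 1 means the two sequences move together linearly, and a value near -1 means they move in opposite directions.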

ICML Conference 2025 Conference Paper

Bridging Layout and RTL: Knowledge Distillation based Timing Prediction

  • Mingjun Wang
  • Yihan Wen
  • Bin Sun
  • Jianan Mu
  • Juan Li
  • Xiaoyi Wang
  • Jing Ye 0001
  • Bei Yu 0001

Accurate and efficient timing prediction at the register-transfer level (RTL) remains a fundamental challenge in electronic design automation (EDA), particularly in striking a balance between accuracy and computational efficiency. While static timing analysis (STA) provides high-fidelity results through comprehensive physical parameters, its computational overhead makes it impractical for rapid design iterations. Conversely, existing RTL-level approaches sacrifice accuracy due to the limited physical information available. We propose RTLDistil, a novel cross-stage knowledge distillation framework that bridges this gap by transferring precise physical characteristics from a layout-aware teacher model (Teacher GNN) to an efficient RTL-level student model (Student GNN), both implemented as graph neural networks (GNNs). RTLDistil efficiently predicts key timing metrics, such as arrival time (AT), and employs a multi-granularity distillation strategy that captures timing-critical features at node, subgraph, and global levels. Experimental results demonstrate that RTLDistil achieves significant improvement in RTL-level timing prediction error reduction, compared to state-of-the-art prediction models. This framework enables accurate early-stage timing prediction, advancing EDA’s “left-shift” paradigm while maintaining computational efficiency. Our code and dataset will be publicly available at https://github.com/sklp-eda-lab/RTLDistil.

AAMAS Conference 2025 Conference Paper

Computing Efficient Envy-Free Partial Allocations of Indivisible Goods

  • Robert Bredereck
  • Andrzej Kaczmarczyk
  • Junjie Luo
  • Bin Sun

Envy-freeness is one of the most prominent fairness concepts in the allocation of indivisible goods. Even though trivial envy-free allocations always exist, rich literature shows this is not true when one additionally requires some efficiency concept (e.g., completeness, Pareto-efficiency, or social welfare maximization). In fact, in such a case even deciding the existence of an efficient envy-free allocation is notoriously computationally hard. In this paper, we explore the limits of efficient computability by relaxing standard efficiency concepts and analyzing how this impacts the computational complexity of the respective problems. Specifically, we allow partial allocations (where not all goods are allocated) and impose only very mild efficiency constraints, such as ensuring each agent receives a bundle with positive utility. Surprisingly, even such seemingly weak efficiency requirements lead to a diverse computational complexity landscape. We identify several polynomial-time solvable or fixed-parameter tractable cases for binary utilities, yet we also find NP-hardness in very restricted scenarios involving ternary utilities.
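The two properties discussed in the abstract, envy-freeness and the mild requirement that every agent receives positive utility, are easy to verify for a given partial allocation even though finding one can be hard. The sketch below is a checker only, not an algorithm from the paper. It assumes additive utilities (an assumption not stated in the abstract), and the names `bundle_value`, `is_envy_free`, and `everyone_positive` are hypothetical.

```python
def bundle_value(utilities, agent, bundle):
    """Value of a bundle to an agent, assuming additive utilities."""
    return sum(utilities[agent][good] for good in bundle)

def is_envy_free(utilities, allocation):
    """True if no agent values another agent's bundle above their own.

    utilities[i][g] is agent i's utility for good g; allocation[i] is the
    set of goods given to agent i. Goods may be left unallocated.
    """
    agents = range(len(allocation))
    return all(
        bundle_value(utilities, i, allocation[i])
        >= bundle_value(utilities, i, allocation[j])
        for i in agents
        for j in agents
    )

def everyone_positive(utilities, allocation):
    """True if every agent gets a bundle with strictly positive utility."""
    return all(
        bundle_value(utilities, i, bundle) > 0
        for i, bundle in enumerate(allocation)
    )
```

The empty allocation passes `is_envy_free` but fails `everyone_positive`, which illustrates why the trivial envy-free allocation stops being a solution once even a mild efficiency constraint is added.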

AAMAS Conference 2025 Conference Paper

Different Models for Fair and Efficient Resource Allocation

  • Bin Sun

Computing fair and efficient allocations is a very important topic in the area of fair allocation of indivisible resources. There are different models of resource allocation, each applicable to distinct contexts. My research focuses on designing and analyzing various allocation models that are tailored to specific scenarios, as well as their fairness and efficiency. My current research interests include several areas: allocations with costs, allocations and groups, allocations and externalities, allocations allowing sharing, and allocations allowing selling.

JBHI Journal 2025 Journal Article

Enhancing Drug Synergy Combination: Integrating Graph Transformers and BiLSTM for Accurate Drug Synergy Prediction

  • Bin Sun
  • Haoze Du
  • Shumei Hou
  • Qingkai Hu
  • Xiaoxiao Pang
  • Dongqing Wei
  • Xianfang Wang

Combination therapy of drugs showed significant potential in treating complex diseases by overcoming drug resistance and improving therapeutic efficacy. However, due to the rapid increase in the number of available drugs, the cost and time required for experimentally screening synergistic drug combinations became increasingly burdensome. In this work, we proposed a novel drug synergy prediction model called GraphTranSynergy, which utilized graph transformer and BiLSTM to capture the molecular structure of drugs and gene expression features of cell lines. GraphTranSynergy extracted graphical features of drug pairs through the graph transformer module and integrated information from the BiLSTM module to extract useful features from gene expression profiles of cell lines. The final prediction of drug synergy was made through a fully connected neural network. Our model achieved AUC and PRAUC scores of 0.94, outperforming most existing models. Independent test results demonstrated that GraphTranSynergy exhibited superior generalization ability on the AstraZeneca dataset, particularly excelling in ACC and TPR metrics. Through a series of experiments and analyses, our model not only improved prediction accuracy but also demonstrated advantages in biological interpretability.

EAAI Journal 2025 Journal Article

Enhancing long-term load forecasting with convolutional informer-based hybrid model

  • Bin Sun
  • Xudong Chen
  • Tao Shen
  • Liyao Ma

Long-term load forecasting (LTLF) is essential for energy management but challenged by the complexity of non-stationary time series. The Informer model struggles to capture localized peak–valley patterns, while Variational Mode Decomposition (VMD) faces issues with feature complexity. This study proposes a hybrid framework integrating VMD, Informer, and a Convolutional Long Short-Term Memory (CNN-LSTM) module for accurate LTLF. VMD decomposes non-stationary load data into multi-scale intrinsic mode functions, refined through spectral and autocorrelation analyses to ensure robust feature extraction. The Informer employs sparse self-attention for efficient long-sequence modeling, with CNN-LSTM enhancing the decoder to capture localized temporal dynamics. Experiments on non-stationary load time series across multiple prediction horizons demonstrate that the proposed framework significantly improves forecasting accuracy and robustness compared to baseline models, including Informer and its derivatives. By excelling in complex load pattern prediction, the framework supports efficient grid scheduling and resource optimization in energy systems.

IJCAI Conference 2025 Conference Paper

Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance

  • Yufeng Wang
  • Jinwu Hu
  • Ziteng Huang
  • Kunyang Lin
  • Zitian Zhang
  • Peihao Chen
  • Yu Hu
  • Qianyue Wang

Open-domain dialogue systems aim to generate natural and engaging conversations, providing significant practical value in real applications such as social robotics and personal assistants. The advent of large language models (LLMs) has greatly advanced this field by improving context understanding and conversational fluency. However, existing LLM-based dialogue systems often fall short in proactively understanding the user's chatting preferences and guiding conversations toward user-centered topics. This lack of user-oriented proactivity can lead users to feel unappreciated, reducing their satisfaction and willingness to continue the conversation in human-computer interactions. To address this issue, we propose a User-oriented Proactive Chatbot (UPC) to enhance the user-oriented proactivity. Specifically, we first construct a critic to evaluate this proactivity inspired by the LLM-as-a-judge strategy. Given the scarcity of high-quality training data, we then employ the critic to guide dialogues between the chatbot and user agents, generating a corpus with enhanced user-oriented proactivity. To ensure the diversity of the user backgrounds, we introduce the ISCO-800, a diverse user background dataset for constructing user agents. Moreover, considering the communication difficulty varies among users, we propose an iterative curriculum learning method that trains the chatbot from easy-to-communicate users to more challenging ones, thereby gradually enhancing its performance. Experiments demonstrate that our proposed training method is applicable to different LLMs, improving user-oriented proactivity and attractiveness in open-domain dialogues. Code and appendix are available at github.com/wang678/LLM-UPC.

EAAI Journal 2024 Journal Article

Mechanism-driven and data-driven fusion prediction of seismic damage evolution of concrete structures based on cooperative multi-particle swarm optimization

  • Bin Sun
  • Tong Guo

Because of the many uncertainties arising from the seismic process, current mechanism-model-driven forward prediction methods and data-driven smart inversion methods struggle to reliably predict damage to a specific structure under a sudden earthquake. To overcome this difficulty, a mechanism-driven and data-driven fusion method is developed to predict the seismic damage evolution of concrete structures. In this method, the developed cooperative multi-particle swarm optimization (CMPSO) algorithm is combined with finite element computation to optimize the seismic damage, which reduces the number of optimization variables and makes the optimal solution easier to find. Additionally, a prior knowledge-based mutation procedure accounts for the physical mechanism that the damage index is proportional to the stress level, which provides physical guidance for the random search during the optimization process. Building on these two advantages, a representative numerical case of a concrete frame structure under seismic loading is used to support the effectiveness of the method. A comparison of the prediction results with experimental results verifies that the developed mechanism-driven and data-driven fusion method is capable of predicting seismic damage in concrete structures.

AAAI Conference 2024 Conference Paper

Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

  • Yiwei Li
  • Peiwen Yuan
  • Shaoxiong Feng
  • Boyuan Pan
  • Bin Sun
  • Xinglin Wang
  • Heda Wang
  • Kan Li

Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder wide application in practice. One promising way is distilling the reasoning ability from LLMs to small models by the generated chain-of-thought reasoning paths. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathematical problems. Previous studies only transfer knowledge from positive samples and drop the synthesized data with wrong answers. In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples besides positive ones. The framework consists of three progressive steps, covering from training to inference stages, to absorb knowledge from negative data. We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLM.

NeurIPS Conference 2023 Conference Paper

Better Correlation and Robustness: A Distribution-Balanced Self-Supervised Learning Framework for Automatic Dialogue Evaluation

  • Peiwen Yuan
  • Xinglin Wang
  • Jiayi Shi
  • Bin Sun
  • Yiwei Li

Turn-level dialogue evaluation models (TDEMs), using self-supervised learning (SSL) framework, have achieved state-of-the-art performance in open-domain dialogue evaluation. However, these models inevitably face two potential problems. First, they have low correlations with humans on medium coherence samples as the SSL framework often brings training data with unbalanced coherence distribution. Second, the SSL framework leads TDEM to nonuniform score distribution. There is a danger that the nonuniform score distribution will weaken the robustness of TDEM through our theoretical analysis. To tackle these problems, we propose Better Correlation and Robustness (BCR), a distribution-balanced self-supervised learning framework for TDEM. Given a dialogue dataset, BCR offers an effective training set reconstructing method to provide coherence-balanced training signals and further facilitate balanced evaluating abilities of TDEM. To get a uniform score distribution, a novel loss function is proposed, which can adjust adaptively according to the uniformity of score distribution estimated by kernel density estimation. Comprehensive experiments on 17 benchmark datasets show that vanilla BERT-base using BCR outperforms SOTA methods significantly by 11.3% on average. BCR also demonstrates strong generalization ability as it can lead multiple SOTA methods to attain better correlation and robustness.
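Kernel density estimation, which the abstract uses to estimate the score distribution, can be sketched as below with a Gaussian kernel. This shows only the density-estimation step. The paper's loss function is not given in the abstract and is not reproduced here, and the name `gaussian_kde` and the fixed `bandwidth` parameter are assumptions for illustration.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` at a point x,
    using a Gaussian kernel with a fixed, caller-chosen bandwidth."""
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    # Normalizing constant so the estimated density integrates to 1.
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump centered on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density
```

Evaluating the returned function on a grid of score values gives a smooth curve; a flat curve would indicate scores spread evenly, while peaks would indicate scores clustering at particular values.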

AAAI Conference 2023 Conference Paper

Heterogeneous-Branch Collaborative Learning for Dialogue Generation

  • Yiwei Li
  • Shaoxiong Feng
  • Bin Sun
  • Kan Li

With the development of deep learning, advanced dialogue generation methods usually require a greater amount of computational resources. One promising approach to obtaining a high-performance and lightweight model is knowledge distillation, which relies heavily on the pre-trained powerful teacher. Collaborative learning, also known as online knowledge distillation, is an effective way to conduct one-stage group distillation in the absence of a well-trained large teacher model. However, previous work has a severe branch homogeneity problem due to the same training objective and the independent identical training sets. To alleviate this problem, we consider the dialogue attributes in the training of network branches. Each branch learns the attribute-related features based on the selected subset. Furthermore, we propose a dual group-based knowledge distillation method, consisting of positive distillation and negative distillation, to further diversify the features of different branches in a steady and interpretable way. The proposed approach significantly improves branch heterogeneity and outperforms state-of-the-art collaborative learning methods on two widely used open-domain dialogue datasets.

AAAI Conference 2023 Conference Paper

Hybrid Pixel-Unshuffled Network for Lightweight Image Super-resolution

  • Bin Sun
  • Yulun Zhang
  • Songyao Jiang
  • Yun Fu

Convolutional neural network (CNN) has achieved great success on image super-resolution (SR). However, most deep CNN-based SR models take massive computations to obtain high performance. Downsampling features for multi-resolution fusion is an efficient and effective way to improve the performance of visual recognition. Still, it is counter-intuitive in the SR task, which needs to project a low-resolution input to high-resolution. In this paper, we propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task. The network contains pixel-unshuffled downsampling and Self-Residual Depthwise Separable Convolutions. Specifically, we utilize pixel-unshuffle operation to downsample the input features and use grouped convolution to reduce the channels. Besides, we enhance the depthwise convolution's performance by adding the input feature to its output. The comparison findings demonstrate that, with fewer parameters and computational costs, our HPUN achieves and surpasses the state-of-the-art performance on SISR. All results are provided in the github https://github.com/Sun1992/HPUN.
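The pixel-unshuffle operation mentioned in the abstract trades spatial resolution for channels: each r x r block of pixels is spread across r*r output channels, so no pixel values are discarded. The sketch below shows the rearrangement for a single-channel image stored as nested lists. It is not the HPUN implementation; the function name `pixel_unshuffle` and the channel ordering (row offset first, then column offset) are assumptions of this example.

```python
def pixel_unshuffle(image, r):
    """Rearrange an H x W image into r*r channels of size (H/r) x (W/r).

    Output channel i*r + j holds the pixels at row offset i and column
    offset j inside every r x r block of the input.
    """
    height, width = len(image), len(image[0])
    if height % r or width % r:
        raise ValueError("image height and width must be divisible by r")
    return [
        [
            [image[y * r + i][x * r + j] for x in range(width // r)]
            for y in range(height // r)
        ]
        for i in range(r)
        for j in range(r)
    ]
```

For a 4 x 4 input and r = 2, the result is four 2 x 2 channels, each holding one of the four positions within every 2 x 2 block.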

AAAI Conference 2023 Conference Paper

Towards Diverse, Relevant and Coherent Open-Domain Dialogue Generation via Hybrid Latent Variables

  • Bin Sun
  • Yitong Li
  • Fei Mi
  • Weichao Wang
  • Yiwei Li
  • Kan Li

Conditional variational models, using either continuous or discrete latent variables, are powerful for open-domain dialogue response generation. However, previous works show that continuous latent variables tend to reduce the coherence of generated responses. In this paper, we also found that discrete latent variables have difficulty capturing more diverse expressions. To tackle these problems, we combine the merits of both continuous and discrete latent variables and propose a Hybrid Latent Variable (HLV) method. Specifically, HLV constrains the global semantics of responses through discrete latent variables and enriches responses with continuous latent variables. Thus, we diversify the generated responses while maintaining relevance and coherence. In addition, we propose Conditional Hybrid Variational Transformer (CHVT) to construct and to utilize HLV with transformers for dialogue generation. Through fine-grained symbolic-level semantic information and additive Gaussian mixing, we construct the distribution of continuous variables, prompting the generation of diverse expressions. Meanwhile, to maintain the relevance and coherence, the discrete latent variable is optimized by self-separation training. Experimental results on two dialogue generation datasets (DailyDialog and Opensubtitles) show that CHVT is superior to traditional transformer-based variational mechanism w.r.t. diversity, relevance and coherence metrics. Moreover, we also demonstrate the benefit of applying HLV to fine-tuning two pre-trained dialogue models (PLATO and BART-base).

JBHI Journal 2021 Journal Article

Computer-Aided Intraoperative Toric Intraocular Lens Positioning and Alignment During Cataract Surgery

  • Yuxuan Zhai
  • Guanghua Zhang
  • Longsheng Zheng
  • Guangqian Yang
  • Ke Zhao
  • Yubin Gong
  • Zhe Zhang
  • Ximei Zhang

Cataract causes more than half of all blindness worldwide. The most effective treatment is surgery, where cataract is often replaced by intraocular lens (IOL). Beyond saving vision, toric IOL implantation is becoming increasingly popular to correct corneal astigmatism. It is important to precisely position and align the axis of IOL during surgery to achieve optimal post-operative astigmatism correction. Compared with conventional manual marking, automated markerless IOL alignment can be faster, more accurate and non-invasive. Here we propose a framework for computer-assisted intraoperative IOL positioning and alignment based on detection and tracking. Firstly, the iris boundary was segmented and the eye center was determined. A statistical sampling method was developed to segment iris and generate training labels, and both conventional algorithms and deep convolutional neural network (CNN) methods were evaluated. Then, regions of interest (ROIs) containing high density of scleral capillaries were used for tracking eye rotations. Both correlation filter and CNN methods were evaluated for tracking. Cumulative errors during long-term tracking were corrected using a reference image. Validation studies against manual labeling using 7 clinical cataract surgical videos demonstrated that the proposed algorithm achieved an average position error around 0.2 mm, an axis alignment error of $^{\circ}$, and a frame rate of > 25 FPS, and can be potentially used intraoperatively for markerless IOL positioning and alignment during cataract surgery.

AAAI Conference 2018 Conference Paper

Action Prediction From Videos via Memorizing Hard-to-Predict Samples

  • Yu Kong
  • Shangqian Gao
  • Bin Sun
  • Yun Fu

Action prediction based on video is an important problem in the computer vision field with many applications, such as preventing accidents and criminal activities. It's challenging to predict actions at the early stage because of the large variations between early observed videos and complete ones. Besides, intra-class variations confuse the predictors as well. In this paper, we propose a mem-LSTM model to predict actions in the early stage, in which a memory module is introduced to record several "hard-to-predict" samples and a variety of early observations. Our method uses Convolution Neural Network (CNN) and Long Short-Term Memory (LSTM) to model partially observed video input. We augment LSTM with a memory module to remember challenging video instances. With the memory module, our mem-LSTM model not only achieves impressive performance in the early stage but also makes predictions without the prior knowledge of observation ratio. Information in future frames is also utilized using a bi-directional layer of LSTM. Experiments on UCF-101 and Sports-1M datasets show that our method outperforms state-of-the-art methods.