Arrow Research search

Author name cluster

Feng Yan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

IROS 2025 · Conference Paper

A light-controlled micromixer using optoelectronic tweezers

  • Feng Yan
  • Peisen Liu
  • Lixiang Zheng
  • Yuzhao Zhang
  • Tao Yue
  • Na Liu

This work presents a flexible and effective micromixer based on optoelectronic tweezers (OET), which leverages both asymmetric induced-charge electro-osmosis (ICEO) and dielectrophoresis (DEP) phenomena on microscale anisotropic NdFeB particles. The asymmetric ICEO phenomenon is generated by symmetry breaking in the induced charge distributions of geometrically anisotropic NdFeB particles under AC electric field polarization. The DEP forces exerted on NdFeB particles are induced by the light-generated non-uniform electric field. Under the combined action of hydrodynamic forces from asymmetric ICEO vortices and positive DEP forces, NdFeB particles can be attracted into light-induced "virtual" electrodes and precisely track along light-defined trajectories. Experimental results demonstrate that the maximum motion speed of the NdFeB particles exceeds 300 μm/s, with the motion speed exhibiting a positive correlation with the applied voltage. Dynamically controlled virtual electrodes enable accurate capture and relocation of microparticles to arbitrary target positions. The stirring and mixing capability of the NdFeB particles is demonstrated by driving yeast cell motion.

ICLR 2025 · Conference Paper

CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets

  • Feng Yan
  • Weixin Luo
  • Yujie Zhong
  • Yiyang Gan
  • Lin Ma 0002

Existing end-to-end Multi-Object Tracking (e2e-MOT) methods have not surpassed non-end-to-end tracking-by-detection methods. One possible reason lies in the training label assignment strategy, which consistently binds tracked objects with tracking queries and assigns only a few newborn objects to detection queries. Such an assignment, with one-to-one bipartite matching, yields unbalanced training, _i.e._, scarce positive samples for detection queries, especially in enclosed scenes where most newborns appear at the beginning of videos. As a result, e2e-MOT tends to terminate tracks without renewal or re-initialization, unlike tracking-by-detection methods. To alleviate this problem, we propose **Co-MOT**, a simple yet effective method to facilitate e2e-MOT via a novel coopetition label assignment with a shadow concept. Specifically, we add tracked objects to the matching targets for detection queries when performing the label assignment for training the intermediate decoders. For query initialization, we expand each query into a set of shadow counterparts with limited disturbance to itself. With extensive ablation studies, Co-MOT achieves superior performance without extra costs, _e.g._, 69.4% HOTA on DanceTrack and 52.8% TETA on BDD100K. Impressively, Co-MOT requires only 38% of the FLOPs of MOTRv2 with comparable performance, resulting in 1.4× faster inference. Source code is publicly available at [GitHub](https://github.com/BingfengYan/CO-MOT).

AAAI 2025 · Conference Paper

Fed-DFA: Federated Distillation for Heterogeneous Model Fusion Through the Adversarial Lens

  • Zichen Wang
  • Feng Yan
  • Tianyi Wang
  • Cong Wang
  • Yuanchao Shu
  • Peng Cheng
  • Jiming Chen

Most federated learning techniques are limited to homogeneous model fusion. With the rapid growth of smart applications on resource-constrained edge devices, accommodating their heterogeneous computing power and memory becomes a real-world barrier. Federated distillation is a promising alternative that enables aggregation from heterogeneous models. However, the effectiveness of knowledge transfer remains elusive given the distinct representation power of heterogeneous models. In this paper, we take an adversarial perspective to characterize the decision boundaries during distillation. By leveraging K-step PGD attacks, we model the dynamics of the closest boundary points and establish a quantitative connection between predictive uncertainty and boundary margin. Based on these findings, we further propose a new loss function that makes the distillation attend to samples close to the decision boundaries, thus learning from more informed logit distributions. Extensive experiments on CIFAR-10/100 and Tiny-ImageNet demonstrate accuracy improvements of about 0.5-3.5% under different IID and non-IID settings, with only a small increase in computational overhead.
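As an illustration of the K-step PGD boundary probing the abstract describes, the sketch below estimates the distance from an input to the decision boundary of a toy linear classifier. The model, step size, and budget are illustrative assumptions, not Fed-DFA's actual configuration.

```python
import numpy as np

def predict_logit(w, b, x):
    """Logit of a toy linear binary classifier (an assumed stand-in model)."""
    return float(w @ x + b)

def pgd_boundary_margin(w, b, x, steps=20, alpha=0.1, eps=2.0):
    """Take K signed-gradient steps toward the decision boundary and return
    the perturbation norm at the first label flip (a margin estimate)."""
    y0 = predict_logit(w, b, x) > 0
    x_adv = x.copy()
    for _ in range(steps):
        grad = w  # d(logit)/dx of a linear model is simply w
        # step against the current class so the logit moves toward zero
        x_adv = x_adv + (-alpha if y0 else alpha) * np.sign(grad)
        # project back into the eps-ball around the original input
        x_adv = x + np.clip(x_adv - x, -eps, eps)
        if (predict_logit(w, b, x_adv) > 0) != y0:
            return float(np.linalg.norm(x_adv - x))
    return None  # boundary not reached within the step budget

w = np.array([1.0, -0.5])
x = np.array([2.0, 1.0])
margin = pgd_boundary_margin(w, -0.1, x)  # perturbation size at the label flip
```

Samples with a small margin sit close to the boundary; a distillation loss can then upweight exactly those samples, as the abstract suggests.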

AAAI 2021 · Conference Paper

Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning

  • Syed Zawad
  • Ahsan Ali
  • Pin-Yu Chen
  • Ali Anwar
  • Yi Zhou
  • Nathalie Baracaldo
  • Yuan Tian
  • Feng Yan

Data heterogeneity has been identified as one of the key features of federated learning but is often overlooked through the lens of robustness to adversarial attacks. This paper focuses on characterizing and understanding its impact on backdooring attacks in federated learning through comprehensive experiments using synthetic data and the LEAF benchmarks. The initial impression from our experimental results suggests that data heterogeneity is the dominant factor in attack effectiveness and may even be a redemption for defending against backdooring: it makes attacks less efficient, effective attack strategies harder to design, and attack outcomes less predictable. With further investigation, however, we found that data heterogeneity is more of a curse than a redemption, as attack effectiveness can be significantly boosted by simply adjusting the client-side backdooring timing. More importantly, data heterogeneity may cause overfitting in the local training of benign clients, which attackers can exploit to disguise themselves and fool skewed-feature-based defenses. In addition, effective attack strategies can be crafted by adjusting the attack data distribution. Finally, we discuss potential directions for defending against the curses brought by data heterogeneity. The results and lessons learned from our extensive experiments and analysis offer new insights for designing robust federated learning methods and systems.

AAAI 2021 · Conference Paper

NASGEM: Neural Architecture Search via Graph Embedding Method

  • Hsin-Pai Cheng
  • Tunhou Zhang
  • Yixing Zhang
  • Shiyu Li
  • Feng Liang
  • Feng Yan
  • Meng Li
  • Vikas Chandra

Neural Architecture Search (NAS) automates and prospers the design of neural networks. Estimator-based NAS has been proposed recently to model the relationship between architectures and their performance to enable scalable and flexible search. However, existing estimator-based methods encode the architecture into a latent space without considering graph similarity. Ignoring graph similarity in node-based search spaces may induce a large inconsistency between similar graphs and their distance in the continuous encoding space, leading to inaccurate encoding representation and/or reduced representation capacity that can yield sub-optimal search results. To preserve graph correlation information in the encoding, we propose NASGEM, which stands for Neural Architecture Search via Graph Embedding Method. NASGEM is driven by a novel graph embedding method equipped with similarity measures to capture the graph topology information. By precisely estimating the graph distance and using an auxiliary Weisfeiler-Lehman kernel to guide the encoding, NASGEM can utilize additional structural information to obtain a more accurate graph representation and improve search efficiency. GEMNet, a set of networks discovered by NASGEM, consistently outperforms networks crafted by existing search methods in classification tasks, i.e., with 0.4%-3.6% higher accuracy while having 11%-21% fewer Multiply-Accumulates. We further transfer GEMNet to COCO object detection. In both one-stage and two-stage detectors, our GEMNet surpasses its manually-crafted and automatically-searched counterparts.
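The auxiliary Weisfeiler-Lehman kernel mentioned in the abstract can be illustrated with a minimal sketch: iteratively relabel each node by its neighborhood, then compare the resulting label histograms. The toy graphs below are made-up examples, not NASGEM's actual search-space encodings.

```python
from collections import Counter

def wl_iteration(adj, labels):
    """One WL step: a node's new label is its old label plus the sorted
    multiset of its neighbors' labels."""
    return {node: (labels[node], tuple(sorted(labels[n] for n in neigh)))
            for node, neigh in adj.items()}

def wl_feature_counts(adj, labels, iterations=2):
    """Histogram of all labels seen across WL iterations; the WL kernel is
    the inner product of two such histograms."""
    counts = Counter(labels.values())
    for _ in range(iterations):
        labels = wl_iteration(adj, labels)
        counts.update(labels.values())
    return counts

def wl_kernel(c1, c2):
    """Inner product of two WL feature histograms."""
    return sum(c1[k] * c2[k] for k in c1.keys() & c2.keys())

# A 3-node path versus a triangle, all nodes identically labeled:
path = {0: [1], 1: [0, 2], 2: [1]}
tri = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
c_path = wl_feature_counts(path, {0: "a", 1: "a", 2: "a"})
c_tri = wl_feature_counts(tri, {0: "a", 1: "a", 2: "a"})
```

Even with identical initial labels, the relabeling separates the two topologies, so the path scores higher against itself than against the triangle; this kind of structural signal is what a WL-kernel-guided encoding can preserve.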

NeurIPS 2021 · Conference Paper

SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement

  • Heyang Qin
  • Samyam Rajbhandari
  • Olatunji Ruwase
  • Feng Yan
  • Lei Yang
  • Yuxiong He

Large-scale training requires massive parallelism to finish within a reasonable amount of time. To support massive parallelism, large-batch training is the key enabler, but often at the cost of generalization performance. Existing works explore adaptive batching or hand-tuned static large batching to strike a balance between computational efficiency and performance. However, these methods can provide only coarse-grained adaptation (e.g., at the epoch level) due to intrinsically expensive calculation or hand-tuning requirements. In this paper, we propose a fully automated and lightweight adaptive batching methodology to enable fine-grained batch size adaptation (e.g., at the mini-batch level) that can achieve state-of-the-art performance with record-breaking batch sizes. The core component of our method is a lightweight yet efficient representation of the critical gradient noise information. We open-source the proposed methodology as a plugin tool that supports mainstream machine learning frameworks. Extensive evaluations on popular benchmarks (e.g., CIFAR10, ImageNet, and BERT-Large) demonstrate that the proposed methodology outperforms state-of-the-art adaptive batching approaches and hand-tuned static strategies in both performance and batch size. In particular, we achieve a new state-of-the-art batch size of 78k in BERT-Large pretraining with a SQuAD score of 90.69, compared to 90.58 reported by the previous state-of-the-art with a 59k batch size.
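A cheap gradient-noise signal of the kind such adaptive batching relies on can be sketched by splitting a mini-batch in half and comparing the two half-batch gradients via cosine similarity: when the halves agree, the gradient estimate is low-noise at the current batch size. The linear-regression setup below is an illustrative assumption, not SimiGrad's exact procedure.

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of mean-squared error for a linear model (assumed example)."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def gradient_cosine(w, X, y):
    """Cosine similarity between the gradients of the two batch halves."""
    half = len(y) // 2
    g1 = grad_mse(w, X[:half], y[:half])
    g2 = grad_mse(w, X[half:], y[half:])
    return float(g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2)))

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=512)

# Far from the optimum, the two half-batch gradients agree closely:
sim = gradient_cosine(np.zeros(4), X, y)
```

Because the two half-batch gradients come for free from the existing backward pass, the similarity can be monitored every mini-batch, which is what makes fine-grained adaptation affordable.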

AAAI 2020 · Conference Paper

AutoShrink: A Topology-Aware NAS for Discovering Efficient Neural Architecture

  • Tunhou Zhang
  • Hsin-Pai Cheng
  • Zhenwen Li
  • Feng Yan
  • Chengyu Huang
  • Hai Li
  • Yiran Chen

Resources are an important constraint when deploying Deep Neural Networks (DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based search approach, which limits the flexibility of network patterns in learned cell structures. Moreover, due to the topology-agnostic nature of existing works, including both cell-based and node-based approaches, the search process is time-consuming and the performance of the found architectures may be sub-optimal. To address these problems, we propose AutoShrink, a topology-aware Neural Architecture Search (NAS) for searching efficient building blocks of neural architectures. Our method is node-based and can thus learn flexible network patterns in cell structures within a topological search space. Directed Acyclic Graphs (DAGs) are used to abstract DNN architectures and progressively optimize the cell structure through edge shrinking. As the search space intrinsically reduces while edges are progressively shrunk, AutoShrink explores a more flexible search space with even less search time. We evaluate AutoShrink on image classification and language tasks by crafting ShrinkCNN and ShrinkRNN models. ShrinkCNN achieves up to 48% parameter reduction and saves 34% Multiply-Accumulates (MACs) on ImageNet-1K with accuracy comparable to state-of-the-art (SOTA) models. Notably, both ShrinkCNN and ShrinkRNN are crafted within 1.5 GPU hours, which is 7.2× and 6.7× faster than the crafting time of SOTA CNN and RNN models, respectively.
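The edge-shrinking idea can be illustrated with a toy greedy loop: repeatedly drop the lowest-scoring edge of a DAG-encoded cell while the input node can still reach the output node. The edge scores here are arbitrary stand-ins for learned importance estimates, not AutoShrink's actual criterion.

```python
def reachable(edges, src, dst):
    """Depth-first reachability over a set of directed edges."""
    stack, seen = [src], set()
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        if u not in seen:
            seen.add(u)
            stack.extend(v for a, v in edges if a == u)
    return False

def shrink(edges, scores, src, dst, keep=2):
    """Greedily remove low-score edges while src can still reach dst."""
    edges = set(edges)
    for e in sorted(edges, key=scores.get):
        if len(edges) <= keep:
            break
        trial = edges - {e}
        if reachable(trial, src, dst):
            edges = trial  # the edge was safe to drop
    return edges

# A 4-node cell with hypothetical importance scores per edge:
cell = {(0, 1), (0, 2), (1, 3), (2, 3), (0, 3)}
scores = {(0, 1): 0.1, (0, 2): 0.5, (1, 3): 0.4, (2, 3): 0.6, (0, 3): 0.2}
shrunk = shrink(cell, scores, src=0, dst=3)
```

Each removal strictly shrinks the remaining search space, which mirrors why the search gets cheaper as shrinking progresses.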

AAAI 2020 · Conference Paper

HDK: Toward High-Performance Deep-Learning-Based Kirchhoff Analysis

  • Xinying Wang
  • Olamide Timothy Tawose
  • Feng Yan
  • Dongfang Zhao

Kirchhoff's law is one of the most widely used physical laws in many engineering disciplines, e.g., biomedical engineering, electrical engineering, and computer engineering. One challenge of applying Kirchhoff's law to real-world applications at scale lies in the high, if not prohibitive, computational cost of solving a large number of nonlinear equations. Despite recent advances in leveraging a convolutional neural network (CNN) to estimate the solutions of Kirchhoff equations, low performance still significantly hinders the broad adoption of CNN-based approaches. This paper proposes a high-performance deep-learning-based approach for Kirchhoff analysis, namely HDK. HDK employs two techniques to improve performance: (i) early pruning of unqualified input candidates and (ii) parallelization of forward labelling. To retain high accuracy, HDK also applies various optimizations to the data, such as randomized augmentation and dimension reduction. Collectively, the aforementioned techniques improve the analysis speed by 8× with accuracy as high as 99.6%.

NeurIPS 2017 · Conference Paper

TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning

  • Wei Wen
  • Cong Xu
  • Feng Yan
  • Chunpeng Wu
  • Yandan Wang
  • Yiran Chen
  • Hai Li

High network communication cost for synchronizing gradients and parameters is the well-known bottleneck of distributed training. In this work, we propose TernGrad that uses ternary gradients to accelerate distributed deep learning in data parallelism. Our approach requires only three numerical levels {-1, 0, 1}, which can aggressively reduce the communication time. We mathematically prove the convergence of TernGrad under the assumption of a bound on gradients. Guided by the bound, we propose layer-wise ternarizing and gradient clipping to improve its convergence. Our experiments show that applying TernGrad on AlexNet does not incur any accuracy loss and can even improve accuracy. The accuracy loss of GoogLeNet induced by TernGrad is less than 2% on average. Finally, a performance model is proposed to study the scalability of TernGrad. Experiments show significant speed gains for various deep neural networks. Our source code is available.
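The ternarization at the heart of this scheme can be sketched in a few lines: each gradient component is mapped to s·sign(g) with probability |g|/s (s being the layer-wise maximum magnitude) and to zero otherwise, which keeps the quantized gradient unbiased in expectation. This is a minimal illustration of the idea, not the paper's full training pipeline (which adds layer-wise ternarizing and gradient clipping).

```python
import numpy as np

def ternarize(g, rng):
    """Stochastically quantize a gradient tensor to the levels {-s, 0, +s}."""
    s = np.max(np.abs(g))  # layer-wise scaler
    if s == 0.0:
        return np.zeros_like(g)
    keep = rng.random(g.shape) < np.abs(g) / s  # Bernoulli(|g_i| / s)
    return s * np.sign(g) * keep

rng = np.random.default_rng(0)
g = np.array([0.3, -0.1, 0.05, -0.4])
# Unbiasedness: averaging many independent ternarizations recovers g.
avg = np.mean([ternarize(g, rng) for _ in range(20000)], axis=0)
```

Since each component now takes one of three values, a worker only needs to transmit the scaler s plus two bits per component, which is where the communication savings come from.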

NeurIPS 2011 · Conference Paper

EigenNet: A Bayesian hybrid of generative and conditional models for sparse learning

  • Feng Yan
  • Yuan Qi

For many real-world applications, we often need to select correlated variables---such as genetic variations and imaging features associated with Alzheimer's disease---in a high-dimensional space. The correlation between variables presents a challenge to classical variable selection methods. To address this challenge, the elastic net has been developed and successfully applied in many applications. Despite its great success, the elastic net does not exploit the correlation information embedded in the data to select correlated variables. To overcome this limitation, we present a novel hybrid model, EigenNet, that uses the eigenstructures of data to guide variable selection. Specifically, it integrates a sparse conditional classification model with a generative model capturing variable correlations in a principled Bayesian framework. We develop an efficient active-set algorithm to estimate the model via evidence maximization. Experiments on synthetic data and imaging genetics data demonstrate the superior predictive performance of EigenNet over the lasso, the elastic net, and automatic relevance determination.

AAAI 2011 · Conference Paper

Sparse Matrix-Variate t Process Blockmodels

  • Zenglin Xu
  • Feng Yan
  • Yuan Qi

We consider the problem of modeling network interactions and identifying latent groups of network nodes. This problem is challenging because i) network nodes are interdependent rather than independent, ii) network data are very noisy (e.g., missing edges), and iii) network interactions are often sparse. To address these challenges, we propose a Sparse Matrix-variate t process Blockmodel (SMTB). In particular, we generalize a matrix-variate t distribution to a t process on matrices with nonlinear covariance functions. Due to this generalization, our model can estimate latent memberships for individual network nodes. This separates our model from previous t-distribution-based relational models. We also introduce sparse prior distributions on the latent membership parameters to select group assignments for individual nodes. To learn the model efficiently from data, we develop a variational method. When compared with several state-of-the-art models, including predictive matrix-variate t models and mixed membership stochastic blockmodels, our model achieved improved prediction accuracy on real-world network datasets.

NeurIPS 2009 · Conference Paper

Parallel Inference for Latent Dirichlet Allocation on Graphics Processing Units

  • Feng Yan
  • Ningyi Xu
  • Yuan Qi

The recent emergence of Graphics Processing Units (GPUs) as general-purpose parallel computing devices provides us with new opportunities to develop scalable learning methods for massive data. In this work, we consider the problem of parallelizing two inference methods for Latent Dirichlet Allocation (LDA) models on GPUs: collapsed Gibbs sampling (CGS) and collapsed variational Bayesian inference (CVB). To address the limited memory constraints of GPUs, we propose a novel data partitioning scheme that effectively reduces the memory cost. Furthermore, the partitioning scheme balances the computational cost across multiprocessors and lets us easily avoid memory access conflicts. We also use data streaming to handle extremely large datasets. Extensive experiments showed that our parallel inference methods consistently produced LDA models with the same predictive power as sequential training methods, but with 26× speedup for CGS and 196× speedup for CVB on a GPU with 30 multiprocessors; the speedup scales almost linearly with the number of multiprocessors available. The proposed partitioning scheme and data streaming can be easily ported to many other machine learning models.
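A conflict-free schedule in the spirit of the partitioning described above can be sketched as follows: split documents into P row-blocks and the vocabulary into P column-blocks, and in sub-step j let processor p work on block (p, (p + j) mod P), so no two processors touch the same word-block at the same time. This is an illustrative reconstruction of the idea, not necessarily the paper's exact scheme.

```python
def partition_schedule(P):
    """For each of the P sub-steps, the (doc_block, word_block) pairs that
    run in parallel, one pair per processor."""
    return [[(p, (p + j) % P) for p in range(P)] for j in range(P)]

schedule = partition_schedule(3)
# Within every sub-step the word blocks are pairwise distinct, so per-word
# topic counts can be updated without locks, and across the P sub-steps
# every (doc_block, word_block) pair is visited exactly once.
```

Because each sub-step touches each word-block exactly once, the per-word count arrays never see concurrent writers, which is what removes the need for atomic updates.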