Author name cluster

Steve Drew

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

2 author rows

AAAI Conference 2025 Conference Paper

HVAdam: A Full-Dimension Adaptive Optimizer

Yiheng Zhang
Shaowu Wu
Yuanzhuo Xu
Jiajun Wu
Shang Xu
Steve Drew
Xiaoguang Niu

Adaptive optimizers such as Adam and RMSProp have gained traction in complex neural networks, including generative adversarial networks (GANs) and Transformers, thanks to their stable performance and fast convergence compared to non-adaptive optimizers. A frequently overlooked limitation of adaptive optimizers is that adjusting the learning rate of each dimension individually ignores knowledge of the whole loss landscape, resulting in slow parameter updates, invalidating the learning rate adjustment strategy, and eventually leading to widespread insufficient convergence of parameters. In this paper, we propose HVAdam, a novel optimizer that associates all dimensions of the parameters to find a new parameter update direction, leading to a refined parameter update strategy for an increased convergence rate. We validated HVAdam in extensive experiments, showing its faster convergence, higher accuracy, and more stable performance on image classification, image generation, and natural language processing tasks. In particular, HVAdam achieves a significant improvement on GANs compared with other state-of-the-art methods, especially in Wasserstein-GAN (WGAN) and its improved version with gradient penalty (WGAN-GP).
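
For context on the per-dimension behaviour the abstract describes, here is a minimal sketch of one standard Adam update in plain Python. It is the baseline being criticised, not HVAdam itself, whose update rule is not given on this page. The function name `adam_step` and the list-based parameter layout are assumptions made for illustration.

```python
import math

def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update (not HVAdam); t is the 1-based step count."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction for the zero-initialised averages.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Each dimension is rescaled using only its own statistics.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v

# Gradients of very different scale give nearly the same first step size.
params, m, v = adam_step([0.0, 0.0], [1.0, 100.0], [0.0, 0.0], [0.0, 0.0], t=1)
```

Because every coordinate is normalised independently, the step carries no information about how the dimensions relate to one another, which is the limitation the abstract points to.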


ECAI Conference 2025 Conference Paper

Owen Sampling Accelerates Contribution Estimation in Federated Learning

Hossein KhademSohi
Hadi Hemmati
Jiayu Zhou
Steve Drew

Federated Learning (FL) aggregates information from multiple clients to train a shared global model without exposing raw data. Accurately estimating each client’s contribution is essential not just for fair rewards, but for selecting the most useful clients so the global model converges faster. The Shapley value is the principled choice for this, yet exact computation scales exponentially with the number of clients, making it infeasible for real-world FL deployments with many participants. In this paper, we propose FedOwen, an efficient federated contribution evaluation framework adopting Owen sampling to approximate Shapley values under the same total evaluation budget as the existing methods, while keeping the approximation error below a small threshold. In addition, FedOwen applies an adaptive client selection strategy that balances exploiting high-value clients with exploring under-sampled ones, avoiding bias toward a narrow subset, and uncovering rare but informative data. Under a fixed valuation cost, FedOwen achieves up to 23% improvement in final model accuracy within the same number of communication rounds, compared to state-of-the-art baselines on non-IID benchmarks. Code: https://github.com/hoseinkhs/AdaptiveSelectionFL [17]
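
As a rough illustration of the Owen sampling idea named in the abstract, the sketch below approximates Shapley values for a generic cooperative game. It is not the FedOwen implementation: the function `owen_shapley`, the uniform grid over inclusion probabilities, and the toy additive value function are all assumptions chosen for illustration. In an FL setting, `value_fn` would stand for some utility of a client subset, such as validation accuracy.

```python
import random

def owen_shapley(n_players, value_fn, n_q=11, n_samples=200, seed=0):
    """Approximate Shapley values by Owen sampling.

    value_fn maps a frozenset of player indices to a float.
    """
    rng = random.Random(seed)
    # Inclusion probabilities q on a uniform grid over [0, 1].
    q_levels = [k / (n_q - 1) for k in range(n_q)]
    estimates = []
    for i in range(n_players):
        others = [j for j in range(n_players) if j != i]
        total = 0.0
        for q in q_levels:
            for _ in range(n_samples):
                # Each other player joins the coalition with probability q.
                coalition = frozenset(j for j in others if rng.random() < q)
                # Marginal contribution of player i to this coalition.
                total += value_fn(coalition | {i}) - value_fn(coalition)
        estimates.append(total / (n_q * n_samples))
    return estimates

# Toy additive game: a coalition's value is the sum of its members' weights.
weights = [1.0, 2.0, 3.0]
shapley = owen_shapley(3, lambda s: sum(weights[j] for j in s))
```

The cost is `n_players * n_q * n_samples` pairs of value evaluations, which grows linearly with the number of players rather than exponentially.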


AAAI Conference 2025 Conference Paper

Revisiting Interpolation for Noisy Label Correction

Yuanzhuo Xu
Xiaoguang Niu
Jie Yang
Ruiyi Su
Jian Zhang
Shubo Liu
Steve Drew

Label correction methods are popular for their simple architecture in learning with noisy labels. However, they suffer severely from false label correction and achieve subpar performance compared with state-of-the-art methods. In this paper, we revisit the label correction methods through theoretical analysis of gradient scaling and demonstrate that the sample-wise dynamic and class-wise uniformity of interpolation weight prevents memorization of the mislabeled samples. We then propose DULC, a simple yet effective label correction method that uses the normalized Jensen-Shannon divergence (JSD) metric as the interpolation weight to promote sample-wise dynamic and class-wise uniformity. Additionally, we provide theoretical evidence that sharpening predictions in label correction facilitates the memorization of true class, and we achieve it by employing the augmentation strategy along with the sharpening function. Extensive experiments on CIFAR-10, CIFAR-100, TinyImageNet, WebVision and Clothing1M datasets demonstrate substantial improvements over state-of-the-art methods.
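
To make the normalized Jensen-Shannon divergence weight concrete, here is a minimal sketch in plain Python. The JSD computation is standard; the blending rule in `interpolate_label`, including which term the weight multiplies, is an assumption for illustration and is not taken from the DULC paper.

```python
import math

def normalized_jsd(p, q):
    """Jensen-Shannon divergence between two distributions, scaled to [0, 1]."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL divergence.
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    # With natural logs the JSD is at most ln 2, so dividing normalises it.
    return (0.5 * kl(p, mix) + 0.5 * kl(q, mix)) / math.log(2)

def interpolate_label(label, pred):
    """Blend a given label with a prediction using a per-sample weight.

    Assumption: larger disagreement shifts more mass to the prediction.
    """
    w = normalized_jsd(label, pred)
    return [(1 - w) * y + w * p for y, p in zip(label, pred)]

# A one-hot label that disagrees with a confident prediction.
corrected = interpolate_label([1.0, 0.0, 0.0], [0.1, 0.8, 0.1])
```

The weight is computed per sample, so it changes with each prediction, and its range is the same `[0, 1]` interval whatever the class.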


TMLR Journal 2025 Journal Article

Trustworthy and Responsible AI for Human-Centric Autonomous Decision-Making Systems

Farzaneh Dehghani
Mahsa Dibaji
Fahim Anzum
Lily Dey
Alican Basdemir
Sayeh Bayat
Jean-Christophe Boucher
Steve Drew

Artificial Intelligence (AI) has paved the way for revolutionary decision-making processes, which, if harnessed appropriately, can contribute to advancements in various sectors, from healthcare to economics. However, its black box nature presents significant ethical challenges related to bias and transparency. AI applications are hugely impacted by biases, presenting inconsistent and unreliable findings, leading to significant costs and consequences, highlighting and perpetuating inequalities and unequal access to resources. Hence, developing safe, reliable, ethical, and Trustworthy AI systems is essential. Our interdisciplinary team of researchers focuses on Trustworthy and Responsible AI, including fairness, bias mitigation, reproducibility, generalization, interpretability, explainability, and authenticity. In this paper, we review and discuss the intricacies of AI biases, definitions, methods of detection and mitigation, and metrics for evaluating bias. We also discuss open challenges with regard to the trustworthiness and widespread application of AI across diverse domains of human-centric decision making, as well as guidelines to foster Responsible and Trustworthy AI models.


TMLR Journal 2024 Journal Article

SelfXit: An Unsupervised Early Exit Mechanism for Deep Neural Networks

Hossein KhademSohi
Mohammadamin Abedi
Yani Ioannou
Steve Drew
Pooyan Jamshidi
Hadi Hemmati

Deep Neural Networks (DNNs) have become an essential component in many application domains, including web-based services. A variety of these services require high throughput and (close to) real-time features, for instance, to respond or react to users' requests or to process a stream of incoming data on time. However, the trend in DNN design is towards larger models with many layers and parameters to achieve more accurate results. Although these models are often pre-trained, the computational complexity in such large models can still be relatively significant, hindering low inference latency. In this paper, we propose SelfXit, an end-to-end automated early exiting solution to improve the performance of DNN-based vision services in terms of computational complexity and inference latency. SelfXit adopts the ideas of self-distillation of DNN models and early exits specifically for vision applications. The proposed solution is an automated unsupervised early exiting mechanism that allows early exiting of a large model during inference time if the early exit model in one of the early exits is confident enough for final prediction. One of the main contributions of this paper is that we have implemented the idea as an unsupervised early exiting, meaning that the early exit models do not need access to training data and perform solely based on the incoming data at run-time, making it suitable for applications using pre-trained models. The results of our experiments on two vision tasks (image classification and object detection) show that, on average, early exiting can reduce the computational complexity of these services up to 58% (in terms of FLOP count) and improve their inference latency up to 46% with a low to zero reduction in accuracy. SelfXit also outperforms existing methods, particularly on complex models and larger datasets. 
It achieves a notable reduction in latency of 51.6% and 30.4% on CIFAR100/ResNet50, with an accompanying increase in accuracy of 2.31% and 0.72%, on average, compared to GATI and BranchyNet.
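
The confidence-gated early exit described in the abstract can be sketched as follows. This is a generic illustration, not the SelfXit code: the function `early_exit_predict`, the use of maximum softmax probability as the confidence score, and the threshold value are all assumptions, since the abstract does not state how confidence is measured.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_predict(x, stages, exit_heads, threshold=0.9):
    """Run backbone stages in order and stop at the first confident exit.

    stages: callables mapping features to features.
    exit_heads: one entry per stage, either a callable returning logits
    or None; the last entry must be a callable (the final classifier).
    Returns (predicted class, confidence, index of the exit used).
    """
    h = x
    last = len(stages) - 1
    for idx, (stage, head) in enumerate(zip(stages, exit_heads)):
        h = stage(h)
        if head is None:
            continue
        probs = softmax(head(h))
        conf = max(probs)
        # Exit early when confident; the final head always answers.
        if conf >= threshold or idx == last:
            return probs.index(conf), conf, idx
    raise ValueError("the last stage needs a classifier head")

# Hypothetical two-stage model with toy heads that ignore their input.
stages = [lambda h: h, lambda h: h]
heads = [lambda h: [5.0, 0.0], lambda h: [0.0, 1.0]]
label, conf, exit_idx = early_exit_predict(None, stages, heads)
```

Raising `threshold` sends more inputs through the later stages, trading latency for accuracy.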


AAAI Conference 2023 Conference Paper

USDNL: Uncertainty-Based Single Dropout in Noisy Label Learning

Yuanzhuo Xu
Xiaoguang Niu
Jie Yang
Steve Drew
Jiayu Zhou
Ruizhi Chen

Deep Neural Networks (DNNs) possess powerful prediction capability thanks to their over-parameterized design, although their large model complexity makes them suffer from noisy supervision. Recent approaches seek to eliminate the impact of noisy labels by excluding data points with large loss values, and they show promising performance. However, these approaches usually incur significant computational overhead and lack theoretical analysis. In this paper, we adopt a perspective that connects label noise with epistemic uncertainty. We design a simple, efficient, and provably robust algorithm named USDNL for DNNs with uncertainty-based Dropout. Specifically, we estimate the epistemic uncertainty of the network prediction after early training through single Dropout. The epistemic uncertainty is then combined with cross-entropy loss to select the clean samples during training. Finally, we theoretically show the equivalence of replacing selection loss with single cross-entropy loss. Compared to existing small-loss selection methods, USDNL is simple to use in practical scenarios because it only applies Dropout to a standard network, while still achieving high model accuracy. Extensive empirical results on both synthetic and real-world datasets show that USDNL outperforms other methods. Our code is available at https://github.com/kovelxyz/USDNL.
