Author name cluster

Shaofeng Cai

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

2 author rows

ICML Conference 2025 Conference Paper

In-Context Adaptation to Concept Drift for Learned Database Operations

Jiaqi Zhu 0002
Shaofeng Cai
Yanyan Shen
Gang Chen 0001
Fang Deng
Beng Chin Ooi

Machine learning has demonstrated transformative potential for database operations, such as query optimization and in-database data analytics. However, dynamic database environments, characterized by frequent updates and evolving data distributions, introduce concept drift, which leads to performance degradation for learned models and limits their practical applicability. Addressing this challenge requires efficient frameworks capable of adapting to shifting concepts while minimizing the overhead of retraining or fine-tuning. In this paper, we propose FLAIR, an online adaptation framework that introduces a new paradigm called in-context adaptation for learned database operations. FLAIR leverages the inherent property of data systems, i.e., immediate availability of execution results for predictions, to enable dynamic context construction. By formalizing adaptation as $f: (\mathbf{x} | \mathcal{C}_t) \to \mathbf{y}$, with $\mathcal{C}_t$ representing a dynamic context memory, FLAIR delivers predictions aligned with the current concept, eliminating the need for runtime parameter optimization. To achieve this, FLAIR integrates two key modules: a Task Featurization Module for encoding task-specific features into standardized representations, and a Dynamic Decision Engine, pre-trained via Bayesian meta-training, to adapt seamlessly using contextual information at runtime. Extensive experiments across key database tasks demonstrate that FLAIR outperforms state-of-the-art baselines, achieving up to $5.2\times$ faster adaptation and reducing error by 22.5% for cardinality estimation.
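The formulation $f: (\mathbf{x} | \mathcal{C}_t) \to \mathbf{y}$ can be illustrated with a toy sketch. This is not FLAIR's implementation: the names `ContextMemory` and `predict_in_context` are hypothetical, and a k-nearest-neighbour average stands in for the pre-trained Dynamic Decision Engine described in the abstract. It only shows the general idea that predictions can follow a drifting concept through an updated context while no model parameters change.

```python
from collections import deque
from math import dist


class ContextMemory:
    """Bounded FIFO of recent (features, observed result) pairs; a stand-in for C_t."""

    def __init__(self, capacity: int) -> None:
        # Oldest pairs are evicted automatically once capacity is reached.
        self.pairs = deque(maxlen=capacity)

    def add(self, x, y) -> None:
        # In a data system, y would be the execution result observed after a prediction.
        self.pairs.append((tuple(x), float(y)))


def predict_in_context(x, memory: ContextMemory, k: int = 3, default: float = 0.0) -> float:
    """Toy f(x | C_t): average the outcomes of the k nearest context entries.

    Assumption: kNN replaces the learned decision engine. No parameters are updated.
    """
    if not memory.pairs:
        return default
    nearest = sorted(memory.pairs, key=lambda pair: dist(pair[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


if __name__ == "__main__":
    memory = ContextMemory(capacity=3)
    for x in (1.0, 1.1, 0.9):
        memory.add((x,), 10.0)   # old concept
    print(predict_in_context((1.0,), memory))
    for x in (1.0, 1.1, 0.9):
        memory.add((x,), 100.0)  # drifted concept replaces the old context
    print(predict_in_context((1.0,), memory))
```

In this sketch, adaptation happens purely by replacing the contents of the memory, which mirrors the abstract's point about avoiding runtime parameter optimization.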


ICLR Conference 2025 Conference Paper

Investigating Pattern Neurons in Urban Time Series Forecasting

Chengxin Wang
Yiran Zhao
Shaofeng Cai
Gary Tan

Urban time series forecasting is crucial for smart city development and is key to sustainable urban management. Although urban time series models (UTSMs) are effective in general forecasting, they often overlook low-frequency events, such as holidays and extreme weather, leading to degraded performance in practical applications. In this paper, we first investigate how UTSMs handle these infrequent patterns from a neural perspective. Based on our findings, we propose $\textbf{P}$attern $\textbf{N}$euron guided $\textbf{Train}$ing ($\texttt{PN-Train}$), a novel training method that features (i) a $\textit{perturbation-based detector}$ to identify neurons responsible for low-frequency patterns in UTSMs, and (ii) a $\textit{fine-tuning mechanism}$ that enhances these neurons without compromising representation learning on high-frequency patterns. Empirical results demonstrate that $\texttt{PN-Train}$ considerably improves forecasting accuracy for low-frequency events while maintaining high performance for high-frequency events. The code is available at https://github.com/cwang-nus/PN-Train.


ICML Conference 2025 Conference Paper

NeuralCohort: Cohort-aware Neural Representation Learning for Healthcare Analytics

Changshuo Liu
Lingze Zeng
Kaiping Zheng
Shaofeng Cai
Beng Chin Ooi
James Wei Luen Yip

Electronic health records (EHR) aggregate extensive data critical for advancing patient care and refining intervention strategies. EHR data is essential for epidemiological study, more commonly referred to as cohort study, where patients with shared characteristics or similar diseases are analyzed over time. Unfortunately, existing studies on cohort modeling are limited, struggling to derive fine-grained cohorts or effectively utilize cohort information, which hinders their ability to uncover intrinsic relationships between cohorts. To this end, we propose NeuralCohort, a cohort-aware neural representation learning method that precisely segments patients into finer-grained cohorts via an innovative cohort contextualization mechanism and captures both intra- and inter-cohort information using a Biscale Cohort Learning Module. Designed as a plug-in, NeuralCohort integrates seamlessly with existing backbone models, enhancing their cohort analysis capabilities by infusing deep cohort insights into the representation learning processes. The effectiveness and generalizability of NeuralCohort are validated across extensive real-world EHR datasets. Experimental results demonstrate that NeuralCohort consistently improves the performance of various backbone models, achieving up to an 8.1% increase in AUROC.


ICLR Conference 2022 Conference Paper

NASI: Label- and Data-agnostic Neural Architecture Search at Initialization

Yao Shu
Shaofeng Cai
Zhongxiang Dai
Beng Chin Ooi
Bryan Kian Hsiang Low

Recent years have witnessed a surging interest in Neural Architecture Search (NAS). Various algorithms have been proposed to improve the search efficiency and effectiveness of NAS, i.e., to reduce the search cost and improve the generalization performance of the selected architectures, respectively. However, the search efficiency of these algorithms is severely limited by the need for model training during the search process. To overcome this limitation, we propose a novel NAS algorithm called NAS at Initialization (NASI) that exploits the capability of a Neural Tangent Kernel in being able to characterize the performance of candidate architectures at initialization, hence allowing model training to be completely avoided to boost the search efficiency. Besides the improved search efficiency, NASI also achieves competitive search effectiveness on various datasets like CIFAR-10/100 and ImageNet. Further, NASI is shown to be label- and data-agnostic under mild conditions, which guarantees the transferability of architectures selected by our NASI over different datasets.


AAAI Conference 2021 Conference Paper

Adaptive Knowledge Driven Regularization for Deep Neural Networks

Zhaojing Luo
Shaofeng Cai
Can Cui
Beng Chin Ooi
Yang Yang

In many real-world applications, the amount of data available for training is often limited, and thus inductive bias and auxiliary knowledge are much needed for regularizing model training. One popular regularization method is to impose prior distribution assumptions on model parameters, and many recent works also attempt to regularize training by integrating external knowledge into specific neurons. However, existing regularization methods fail to take account of the interaction between connected neuron pairs, which is invaluable internal knowledge for adaptive regularization for better representation learning as training progresses. In this paper, we explicitly take into account the interaction between connected neurons, and propose an adaptive internal knowledge driven regularization method, CORR-Reg. The key idea of CORR-Reg is to give a higher significance weight to connections of more correlated neuron pairs. The significance weights adaptively identify more important input neurons for each neuron. Instead of regularizing connection model parameters with a static strength such as weight decay, CORR-Reg imposes weaker regularization strength on more significant connections. As a consequence, neurons attend to more informative input features and thus learn more diversified and discriminative representation. We derive CORR-Reg with the Bayesian inference framework and propose a novel optimization algorithm with the Lagrange multiplier method and Stochastic Gradient Descent. Extensive evaluations on diverse benchmark datasets and neural network structures show that CORR-Reg achieves significant improvement over state-of-the-art regularization methods.
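The key idea stated in the abstract (weaker regularization on connections between more correlated neuron pairs) can be illustrated with a small heuristic sketch. This is not the CORR-Reg derivation: the abstract describes a Bayesian formulation optimized with Lagrange multipliers, whereas the code below simply assumes that significance is the absolute Pearson correlation of activations and that the per-connection L2 strength is `base_strength * (1 - significance)`. The function names and that scaling rule are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length activation lists; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def connection_strengths(in_acts, out_acts, base_strength=0.01):
    """Per-connection L2 strength: more correlated pairs get weaker regularization.

    in_acts[i] and out_acts[j] hold activations of input neuron i and output
    neuron j over a batch. The (1 - |corr|) rule is an illustrative assumption.
    """
    strengths = []
    for a in in_acts:
        row = []
        for b in out_acts:
            significance = min(1.0, abs(pearson(a, b)))  # clamp rounding error
            row.append(base_strength * (1.0 - significance))
        strengths.append(row)
    return strengths


def adaptive_l2_penalty(weights, strengths):
    """Sum of strength[i][j] * weights[i][j]**2 over all connections."""
    return sum(s * w * w
               for w_row, s_row in zip(weights, strengths)
               for w, s in zip(w_row, s_row))


if __name__ == "__main__":
    in_acts = [[1, 2, 3, 4], [1, -1, 1, -1]]   # two input neurons
    out_acts = [[2, 4, 6, 8]]                  # one output neuron
    strengths = connection_strengths(in_acts, out_acts)
    print(strengths)
    print(adaptive_l2_penalty([[1.0], [1.0]], strengths))
```

By contrast, standard weight decay corresponds to using the same strength for every connection regardless of how the connected neurons co-activate.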


ICLR Conference 2020 Conference Paper

Understanding Architectures Learnt by Cell-based Neural Architecture Search

Yao Shu
Wei Wang 0059
Shaofeng Cai

Neural architecture search (NAS) searches architectures automatically for given tasks, e.g., image classification and language modeling. Improving the search efficiency and effectiveness has attracted increasing attention in recent years. However, few efforts have been devoted to understanding the generated architectures. In this paper, we first reveal that existing NAS algorithms (e.g., DARTS, ENAS) tend to favor architectures with wide and shallow cell structures. These favorable architectures consistently achieve fast convergence and are consequently selected by NAS algorithms. Our empirical and theoretical study further confirms that their fast convergence derives from their smooth loss landscape and accurate gradient information. Nonetheless, these architectures may not necessarily lead to better generalization performance compared with other candidate architectures in the same search space, and therefore further improvement is possible by revising existing NAS algorithms.
