Author name cluster

Yifan Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

23 papers

2 author rows

AAAI Conference 2026 Conference Paper

Investigating Data Pruning for Pretraining Biological Foundation Models at Scale

Yifan Wu
Jiyue Jiang
Xichen Ye
Yiqi Wang
Chang Zhou
Yitao Xu
Jiayang Chen
He Hu

Biological foundation models (BioFMs), pretrained on large-scale biological sequences, have recently shown strong potential in providing meaningful representations for diverse downstream bioinformatics tasks. However, such models often rely on millions to billions of training sequences and billions of parameters, resulting in prohibitive computational costs and significant barriers to reproducibility and accessibility—particularly for academic labs. To address these challenges, we investigate the feasibility of data pruning for BioFM pretraining and propose a post-hoc influence-guided data pruning framework tailored to biological domains. Our approach first introduces a subset-based self-influence formulation that enables efficient estimation of sample importance at low computational cost. Built upon this, we propose two simple yet effective selection strategies: Top-k Influence (Top I) and Coverage-Centric Influence (CCI). Then, we empirically validate our method on two representative BioFMs: RNA-FM and ESM-C. For RNA, our framework consistently outperforms random selection baselines under an extreme pruning rate of over 99%, which displays our framework's effectiveness. Furthermore, we demonstrate the generalizability of our framework on protein-related tasks using ESM-C. Specifically, our coreset even outperforms random 10x subsets in both RNA and protein settings, revealing substantial redundancy in biological sequence datasets. These findings underscore the potential of influence-guided data pruning to substantially reduce the computational cost of BioFM pretraining, paving the way for more efficient, accessible, and sustainable biological AI research.


AAAI Conference 2026 Conference Paper

The Publication Choice Problem

Haichuan Wang
Yifan Wu
Haifeng Xu

Researchers strategically choose where to submit their work in order to maximize its impact, and these publication decisions in turn determine venues' impact factors. To analyze how individual publication choices both respond to and shape venue impact, we introduce a game-theoretic framework, coined the Publication Choice Problem, that captures this two-way interplay. We show the existence of a pure-strategy equilibrium in the Publication Choice Problem and its uniqueness under binary researcher types. Our characterizations of the equilibrium properties offer insights about what publication behaviors better indicate a researcher's impact level. Through equilibrium analysis, we further investigate how labeling papers with "spotlight" affects the impact factor of venues in the research community. Our analysis shows that competitive venues labeling top papers with "spotlight" may decrease the overall impact of other venues in the community, while less competitive venues with "spotlight" labeling have the opposite effect.


EAAI Journal 2025 Journal Article

A dual-objective contrastive learning approach with dynamic self-adaption for zero-shot fault diagnosis

Yifan Wu
Min Xia

Fault type classification and fault severity identification are two critical and complementary tasks in fault diagnosis of industrial machines, providing essential information for the maintenance and safety of the machines. However, variable operating conditions in industrial settings make it hard to collect comprehensive fault data covering all possible types and severities, thereby limiting diagnostic efficiency. To overcome these challenges, a novel multi-task network approach is proposed to detect fault type and severity simultaneously even with zero novel samples. Discriminative features are extracted through a contrastive network with task-specific projection heads, enabling the capture of distinct representations for fault type and severity. Two zero-shot mapping spaces are constructed to diagnose fault types and severity by aligning feature representations with the semantic information of fault types and severity. A dynamic self-adaptation optimization mechanism is introduced considering the dependency of fault severity on fault types. It enhances the identification of fault severity. The proposed method was evaluated on two bearing datasets. It achieved up to 89.4% accuracy for fault type and 83.42% for fault severity under zero-shot settings, outperforming baselines and demonstrating strong real-world applicability.


AAAI Conference 2025 Conference Paper

Optimized Gradient Clipping for Noisy Label Learning

Xichen Ye
Yifan Wu
Weizhong Zhang
Xiaoqiang Li
Yifan Chen
Cheng Jin

Previous research has shown that constraining the gradient of the loss function w.r.t. model-predicted probabilities can enhance model robustness against noisy labels. These methods typically specify a fixed optimal threshold for gradient clipping through validation data to obtain the desired robustness against noise. However, this common practice overlooks the dynamic distribution of gradients from both clean and noisy-labeled samples at different stages of training, significantly limiting the model's capability to adapt to the variable nature of gradients throughout the training process. To address this issue, we propose a simple yet effective approach called Optimized Gradient Clipping (OGC), which dynamically adjusts the clipping threshold based on the ratio of noise gradients to clean gradients after clipping, estimated by modeling the distributions of clean and noisy samples. This approach allows us to modify the clipping threshold at each training step, effectively controlling the influence of noise gradients. Additionally, we provide statistical analysis to certify the noise-tolerance ability of OGC. Our extensive experiments across various types of label noise, including symmetric, asymmetric, instance-dependent, and real-world noise, demonstrate the effectiveness of our approach.


NeurIPS Conference 2024 Conference Paper

A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis

Yue Yang
Mona Gandhi
Yufei Wang
Yifan Wu
Michael S. Yao
Chris Callison-Burch
James C. Gee
Mark Yatskar

While deep networks have achieved broad success in analyzing natural images, when applied to medical scans, they often fail in unexpected situations. We investigate this challenge and focus on model sensitivity to domain shifts, such as data sampled from different hospitals or data confounded by demographic variables such as sex, race, etc., in the context of chest X-rays and skin lesion images. A key finding we show empirically is that existing visual backbones lack an appropriate prior from the architecture for reliable generalization in these settings. Taking inspiration from medical training, we propose giving deep networks a prior grounded in explicit medical knowledge communicated in natural language. To this end, we introduce Knowledge-enhanced Bottlenecks (KnoBo), a class of concept bottleneck models that incorporates knowledge priors that constrain it to reason with clinically relevant factors found in medical textbooks or PubMed. KnoBo uses retrieval-augmented language models to design an appropriate concept space paired with an automatic training procedure for recognizing the concept. We evaluate different resources of knowledge and recognition architectures on a broad range of domain shifts across 20 datasets. In our comprehensive evaluation with two imaging modalities, KnoBo outperforms fine-tuned models on confounded datasets by 32.4% on average. Finally, evaluations reveal that PubMed is a promising resource for making medical models less sensitive to domain shift, outperforming other resources on both diversity of information and final prediction performance.


NeurIPS Conference 2024 Conference Paper

Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency

Zenan Li
Yifan Wu
Zhaoyu Li
Xinming Wei
Fan Yang
Xian Zhang
Xiaoxing Ma

Autoformalization, the task of automatically translating natural language descriptions into a formal language, poses a significant challenge across various domains, especially in mathematics. Recent advancements in large language models (LLMs) have unveiled their promising capabilities to formalize even competition-level math problems. However, we observe a considerable discrepancy between pass@1 and pass@k accuracies in LLM-generated formalizations. To address this gap, we introduce a novel framework that scores and selects the best result from k autoformalization candidates based on two complementary self-consistency methods: symbolic equivalence and semantic consistency. Elaborately, symbolic equivalence identifies the logical homogeneity among autoformalization candidates using automated theorem provers, and semantic consistency evaluates the preservation of the original meaning by informalizing the candidates and computing the similarity between the embeddings of the original and informalized texts. Our extensive experiments on the MATH and miniF2F datasets demonstrate that our approach significantly enhances autoformalization accuracy, achieving up to 0.22-1.35x relative improvements across various LLMs and baseline methods.


AAAI Conference 2024 Conference Paper

Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection

Songmin Dai
Yifan Wu
Xiaoqiang Li
Xiangyang Xue

Recent unsupervised anomaly detection methods often rely on feature extractors pretrained with auxiliary datasets or on well-crafted anomaly-simulated samples. However, this might limit their adaptability to an increasing set of anomaly detection tasks due to the priors in the selection of auxiliary datasets or the strategy of anomaly simulation. To tackle this challenge, we first introduce a prior-less anomaly generation paradigm and subsequently develop an innovative unsupervised anomaly detection framework named GRAD, grounded in this paradigm. GRAD comprises three essential components: (1) a diffusion model (PatchDiff) to generate contrastive patterns by preserving the local structures while disregarding the global structures present in normal images, (2) a self-supervised reweighting mechanism to handle the challenge of long-tailed and unlabeled contrastive patterns generated by PatchDiff, and (3) a lightweight patch-level detector to efficiently distinguish the normal patterns and reweighted contrastive patterns. The generation results of PatchDiff effectively expose various types of anomaly patterns, e.g. structural and logical anomaly patterns. In addition, extensive experiments on both MVTec AD and MVTec LOCO datasets also support the aforementioned observation and demonstrate that GRAD achieves competitive anomaly detection accuracy and superior inference speed.


YNICL Journal 2023 Journal Article

Altered gray matter volumes and plasma IL-6 level in major depressive disorder patients with suicidal ideation

Yingrui Guo
Xiaowei Jiang
Linna Jia
Yue Zhu
Xinyu Han
Yifan Wu
Wen Liu
Wenhui Zhao

BACKGROUND: Suicidal ideation (SI) is one of the most serious consequences of major depressive disorder (MDD). Understanding the unique mechanism of MDD with SI (MDD + S) is crucial for treatment development. While abundant research has studied MDD, past studies have not reached a consensus on the mechanism of MDD + S. The study aimed to investigate the abnormalities of the gray matter volumes (GMVs) and plasma IL-6 level in MDD + S to further reveal the mechanism of MDD + S. METHODS: We tested the plasma IL-6 level using Luminex multifactor assays and collected the Structural Magnetic Resonance Imaging (SMRI) data from 34 healthy controls (HCs), 36 MDD patients without SI (MDD - S) and 34 MDD + S patients. We performed a partial correlation between the GMVs of the brain regions with significant differences and plasma IL-6 level with age, sex, medication, scores of HAMD-17 and HAMA as the covariates. RESULTS: Compared with HCs and MDD - S, MDD + S had significantly decreased GMVs in the left cerebellum Crus I/II and significantly increased plasma IL-6 level; compared with HCs, both the MDD + S and MDD - S had significantly decreased GMVs in right precentral and postcentral gyri. No significant correlation was found between the GMVs and the plasma IL-6 level in the MDD + S and MDD - S, respectively, while the GMVs of the right precentral and postcentral gyri negatively correlated with the level of IL-6 in the whole MDD group (r = -0.28, P = 0.03). The GMVs of the left cerebellum Crus I/II (r = -0.47, P = 0.02), and the right precentral and postcentral gyri (r = -0.42, P = 0.04) negatively correlated with the level of IL-6 in HCs. CONCLUSION: The altered GMVs and the plasma IL-6 level may provide a scientific basis to understand the pathophysiological mechanisms of MDD + S.


YNICL Journal 2023 Journal Article

Gray matter volume reduction in orbitofrontal cortex correlated with plasma glial cell line-derived neurotrophic factor (GDNF) levels within major depressive disorder

Yifan Wu
Lingtao Kong
Anqi Yang
Kaiqi Xin
Yihui Lu
Xintong Yan
Wen Liu
Yue Zhu

BACKGROUND: Major depressive disorder (MDD) is a severe mental disorder characterized by reduced gray matter volume (GMV). To date, the pathogenesis of MDD remains unclear, but neurotrophic factors play an essential role in the pathophysiological alterations of MDD during disease development. In particular, plasma glial cell line-derived neurotrophic factor (GDNF) has been suggested as a potential biomarker that may be associated with disease activity and neurological progression in MDD. Our study investigated whether plasma GDNF levels in MDD patients and healthy controls (HCs) are correlated with GMV alterations. METHODS: We studied 54 MDD patients and 48 HCs. The effect of different diagnoses on whole-brain GMV was investigated using ANOVA (Analysis of Variance). The threshold of significance was p < 0.05, and Gaussian random-field (GRF) correction for error was used. All analyses were controlled for covariates such as ethnicity, handedness, age, and gender that could affect GMV. RESULT: Compared with the HC group, the GMV in the MDD group was significantly reduced in the right inferior orbitofrontal cortex (OFC), and plasma GDNF levels were significantly higher in the MDD group than in the HC group. In the right inferior OFC, the GDNF levels were positively correlated with GMV reduction in the MDD group, whereas in the HC group, a negative correlation was observed between GDNF levels and GMV reduction. CONCLUSION: Although increased production of GDNF in MDD may help repair neural damage in brain regions associated with brain disease, its repairing effects may be interfered with and hindered by underlying neuroinflammatory processes.


ICML Conference 2023 Conference Paper

Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model

Siyu Chen 0001
Jibang Wu
Yifan Wu
Zhuoran Yang

We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf. Such a problem is modeled as a Stackelberg game between the principal and the agent, where the principal announces a scoring rule that specifies the payment, and the agent then chooses an effort level that maximizes her own profit and reports the information. We study the online setting of such a problem from the principal’s perspective, i.e., designing the optimal scoring rule by repeatedly interacting with the strategic agent. We design a provably sample efficient algorithm that tailors the UCB algorithm (Auer et al., 2002) to our model, which achieves a $\mathcal{O}(K^2 \cdot T^{2/3})$ regret after $T$ iterations, where $K$ is the number of effort levels of the agent. Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent’s actions are incentivized. Furthermore, a key feature of our regret bound is that it is independent of the number of states of the environment.


IJCAI Conference 2023 Conference Paper

Singularformer: Learning to Decompose Self-Attention to Linearize the Complexity of Transformer

Yifan Wu
Shichao Kan
Min Zeng
Min Li

Transformers achieve excellent performance in a variety of domains since they can capture long-distance dependencies through the self-attention mechanism. However, self-attention is computationally costly due to its quadratic complexity and high memory consumption. In this paper, we propose a novel Transformer variant (Singularformer) that uses neural networks to learn the singular value decomposition process of the attention matrix to design a linear-complexity and memory-efficient global self-attention mechanism. Specifically, we decompose the attention matrix into the product of three matrix factors based on singular value decomposition and design neural networks to learn these matrix factors, then the associative law of matrix multiplication is used to linearize the calculation of self-attention. The above procedure allows us to compute self-attention as two-dimensional reduction processes in the first and second token dimensional spaces, followed by a multi-head self-attention computational process on the first dimensional reduced token features. Experimental results on 8 real-world datasets demonstrate that Singularformer performs favorably against the other Transformer variants with lower time and space complexity. Our source code is publicly available at https://github.com/CSUBioGroup/Singularformer.
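The role of associativity mentioned in this abstract can be illustrated with a minimal sketch. It is not the Singularformer implementation: fixed integer matrices `U`, `S`, and `Vt` stand in for the learned factors, the sizes `n`, `d`, and `r` are hypothetical, and the helper `matmul` is written here only for illustration. The point is that regrouping the product avoids ever forming the n x n matrix.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Hypothetical sizes: n tokens, feature width d, rank r (r much smaller than n in practice).
n, d, r = 6, 2, 2
U = [[(i + j) % 3 for j in range(r)] for i in range(n)]       # n x r stand-in factor
S = [[2, 0], [0, 1]]                                           # r x r diagonal stand-in
Vt = [[(i * j + 1) % 4 for j in range(n)] for i in range(r)]  # r x n stand-in factor
X = [[i + 1, 2 * i] for i in range(n)]                         # n x d token features

# Quadratic route: materialize the n x n matrix, then apply it to X.
A = matmul(matmul(U, S), Vt)
quadratic = matmul(A, X)

# Regrouped route: every intermediate result is only r x d.
linear = matmul(U, matmul(S, matmul(Vt, X)))

print(quadratic == linear)
```

Both groupings give the same n x d output, but the second never stores more than r x d intermediate values, which is where the linear dependence on the number of tokens comes from.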


JBHI Journal 2022 Journal Article

A Pseudo Label-Wise Attention Network for Automatic ICD Coding

Yifan Wu
Min Zeng
Ying Yu
Yaohang Li
Min Li

Automatic International Classification of Diseases (ICD) coding is defined as a kind of text multi-label classification problem, which is difficult because the number of labels is very large and the distribution of labels is unbalanced. The label-wise attention mechanism is widely used in automatic ICD coding because it can assign weights to every word in full Electronic Medical Records (EMR) for different ICD codes. However, the label-wise attention mechanism is redundant and costly in computing. In this paper, we propose a pseudo label-wise attention mechanism to tackle the problem. Instead of computing different attention modes for different ICD codes, the pseudo label-wise attention mechanism automatically merges similar ICD codes and computes only one attention mode for the similar ICD codes, which greatly compresses the number of attention modes and improves the predicted accuracy. In addition, we apply a more convenient and effective way to obtain the ICD vectors, and thus our model can predict new ICD codes by calculating the similarities between EMR vectors and ICD vectors. Our model demonstrates effectiveness in extensive computational experiments. On the public MIMIC-III dataset and private Xiangya dataset, our model achieves the best performance on micro F1 (0.583 and 0.806), micro AUC (0.986 and 0.994), P@8 (0.756 and 0.413), and uses much less GPU memory (about 26.1% of that of the models with label-wise attention). Furthermore, we verify the ability of our model in predicting new ICD codes. The interpretability analysis and case study show the effectiveness and reliability of the patterns obtained by the pseudo label-wise attention mechanism.


ICML Conference 2021 Conference Paper

Instabilities of Offline RL with Pre-Trained Neural Representation

Ruosong Wang
Yifan Wu
Ruslan Salakhutdinov
Sham M. Kakade

In offline reinforcement learning (RL), we seek to utilize offline data to evaluate (or learn) policies in scenarios where the data are collected from a distribution that substantially differs from that of the target policy to be evaluated. Recent theoretical advances have shown that such sample-efficient offline RL is indeed possible provided certain strong representational conditions hold, else there are lower bounds exhibiting exponential error amplification (in the problem horizon) unless the data collection distribution has only a mild distribution shift relative to the target policy. This work studies these issues from an empirical perspective to gauge how stable offline RL methods are. In particular, our methodology explores these ideas when using features from pre-trained neural networks, in the hope that these representations are powerful enough to permit sample efficient offline RL. Through extensive experiments on a range of tasks, we see that substantial error amplification does occur even when using such pre-trained representations (trained on the same task itself); we find offline RL is stable only under extremely mild distribution shift. The implications of these results, both from a theoretical and an empirical perspective, are that successful offline RL (where we seek to go beyond the low distribution shift regime) requires substantially stronger conditions beyond those which suffice for successful supervised learning.


NeurIPS Conference 2021 Conference Paper

Mixture Proportion Estimation and PU Learning: A Modern Approach

Saurabh Garg
Yifan Wu
Alexander J. Smola
Sivaraman Balakrishnan
Zachary Lipton

Given only positive examples and unlabeled examples (from both positive and negative classes), we might hope nevertheless to estimate an accurate positive-versus-negative classifier. Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE), determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning, that is, given such an estimate, learning the desired positive-versus-negative classifier. Unfortunately, classical methods for both problems break down in high-dimensional settings. Meanwhile, recently proposed heuristics lack theoretical coherence and depend precariously on hyperparameter tuning. In this paper, we propose two simple techniques: Best Bin Estimation (BBE) (for MPE); and Conditional Value Ignoring Risk (CVIR), a simple objective for PU-learning. Both methods dominate previous approaches empirically, and for BBE, we establish formal guarantees that hold whenever we can train a model to cleanly separate out a small subset of positive examples. Our final algorithm, (TED)$^n$, alternates between the two procedures, significantly improving both our mixture proportion estimator and classifier.


ICML Conference 2021 Conference Paper

On the Optimality of Batch Policy Optimization Algorithms

Chenjun Xiao
Yifan Wu
Jincheng Mei
Bo Dai 0001
Tor Lattimore
Lihong Li 0001
Csaba Szepesvári
Dale Schuurmans

Batch policy optimization considers leveraging existing data for policy construction before interacting with an environment. Although interest in this problem has grown significantly in recent years, its theoretical foundations remain under-developed. To advance the understanding of this problem, we provide three results that characterize the limits and possibilities of batch policy optimization in the finite-armed stochastic bandit setting. First, we introduce a class of confidence-adjusted index algorithms that unifies optimistic and pessimistic principles in a common framework, which enables a general analysis. For this family, we show that any confidence-adjusted index algorithm is minimax optimal, whether it be optimistic, pessimistic or neutral. Our analysis reveals that instance-dependent optimality, commonly used to establish optimality of on-line stochastic bandit algorithms, cannot be achieved by any algorithm in the batch setting. In particular, for any algorithm that performs optimally in some environment, there exists another environment where the same algorithm suffers arbitrarily larger regret. Therefore, to establish a framework for distinguishing algorithms, we introduce a new weighted-minimax criterion that considers the inherent difficulty of optimal value prediction. We demonstrate how this criterion can be used to justify commonly used pessimistic principles for batch policy optimization.
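A minimal sketch of what a confidence-adjusted index rule could look like, under stated assumptions: the index is taken to be the empirical mean plus `alpha` times a confidence width, and the Hoeffding-style width, the parameter names `alpha` and `delta`, and the function name are choices made here for illustration rather than the paper's definitions. A positive `alpha` gives an optimistic rule, a negative one a pessimistic rule, and zero a neutral (greedy) rule.

```python
import math

def confidence_adjusted_choice(means, counts, alpha, delta=0.05):
    """Return the arm that maximizes mean + alpha * width.

    The width sqrt(log(1/delta) / (2 * n)) is a Hoeffding-style form assumed here.
    """
    def index(i):
        width = math.sqrt(math.log(1.0 / delta) / (2.0 * counts[i]))
        return means[i] + alpha * width
    return max(range(len(means)), key=index)

# Toy batch data: arm 0 is well sampled, arm 1 looks slightly better but has few samples.
means = [0.50, 0.55]
counts = [1000, 5]

pessimistic = confidence_adjusted_choice(means, counts, alpha=-1.0)
neutral = confidence_adjusted_choice(means, counts, alpha=0.0)
optimistic = confidence_adjusted_choice(means, counts, alpha=1.0)
print(pessimistic, neutral, optimistic)
```

On this toy data the pessimistic rule prefers the well-sampled arm, while the neutral and optimistic rules prefer the arm with the higher but less certain estimate.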


NeurIPS Conference 2020 Conference Paper

A Unified View of Label Shift Estimation

Saurabh Garg
Yifan Wu
Sivaraman Balakrishnan
Zachary Lipton

Under label shift, the label distribution $p(y)$ might change but the class-conditional distributions $p(x|y)$ do not. There are two dominant approaches for estimating the label marginal. BBSE, a moment-matching approach based on confusion matrices, is provably consistent and provides interpretable error bounds. However, a maximum likelihood estimation approach, which we call MLLS, dominates empirically. In this paper, we present a unified view of the two methods and the first theoretical characterization of MLLS. Our contributions include (i) consistency conditions for MLLS, which include calibration of the classifier and a confusion matrix invertibility condition that BBSE also requires; (ii) a unified framework, casting BBSE as roughly equivalent to MLLS for a particular choice of calibration method; and (iii) a decomposition of MLLS's finite-sample error into terms reflecting miscalibration and estimation error. Our analysis attributes BBSE's statistical inefficiency to a loss of information due to coarse calibration. Experiments on synthetic data, MNIST, and CIFAR10 support our findings.
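The moment-matching idea behind BBSE can be sketched as follows, assuming its common formulation: with `C[i][j]` the source joint probability of predicting class `i` when the true class is `j`, and `mu[i]` the probability of predicting class `i` on target data, the importance weights `w` solve `C w = mu`. The helper names and toy numbers are hypothetical, and exact population quantities are used in place of finite-sample estimates.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def bbse_weights(joint_confusion, target_pred_dist):
    """Importance weights w[y] = q(y) / p(y), obtained from C w = mu."""
    return solve(joint_confusion, target_pred_dist)

# Toy setup (assumed numbers): cond[i][j] = P(pred = i | true = j), fixed under label shift.
cond = [[0.9, 0.2], [0.1, 0.8]]
p_src = [0.5, 0.5]   # source label marginal
q_tgt = [0.2, 0.8]   # target label marginal, unknown to the estimator

C = [[cond[i][j] * p_src[j] for j in range(2)] for i in range(2)]
mu = [sum(cond[i][j] * q_tgt[j] for j in range(2)) for i in range(2)]

w = bbse_weights(C, mu)
q_est = [w[j] * p_src[j] for j in range(2)]
print(q_est)
```

The solve step needs an invertible confusion matrix, which is the invertibility condition the abstract refers to.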


ICML Conference 2019 Conference Paper

Domain Adaptation with Asymmetrically-Relaxed Distribution Alignment

Yifan Wu
Ezra Winston
Divyansh Kaushik
Zachary C. Lipton

Domain adaptation addresses the common situation in which the target distribution generating our test data differs from the source distribution generating our training data. While domain adaptation is impossible absent assumptions, strict conditions, e.g., covariate or label shift, enable principled algorithms. Recently-proposed domain-adversarial approaches consist of aligning source and target encodings, an approach often motivated as minimizing two (of three) terms in a theoretical bound on target error. Unfortunately, this minimization can cause arbitrary increases in the third term, a problem guaranteed to arise under shifting label distributions. We propose asymmetrically-relaxed distribution alignment, a new approach that overcomes some limitations of standard domain-adversarial algorithms. Moreover, we characterize precise assumptions under which our algorithm is theoretically principled and demonstrate empirical benefits on both synthetic and real datasets.


NeurIPS Conference 2019 Conference Paper

Game Design for Eliciting Distinguishable Behavior

Fan Yang
Liu Leqi
Yifan Wu
Zachary Lipton
Pradeep Ravikumar
Tom Mitchell
William Cohen

The ability to infer latent psychological traits from human behavior is key to developing personalized human-interacting machine learning systems. Approaches to infer such traits range from surveys to manually-constructed experiments and games. However, these traditional games are limited because they are typically designed based on heuristics. In this paper, we formulate the task of designing behavior diagnostic games that elicit distinguishable behavior as a mutual information maximization problem, which can be solved by optimizing a variational lower bound. Our framework is instantiated by using prospect theory to model varying player traits, and Markov Decision Processes to parameterize the games. We validate our approach empirically, showing that our designed games can successfully distinguish among players with different traits, outperforming manually-designed ones by a large margin.


TIST Journal 2019 Journal Article

Predicting Academic Performance for College Students

Huaxiu Yao
Defu Lian
Yi Cao
Yifan Wu
Tao Zhou

Detecting abnormal behaviors of students in time and providing personalized intervention and guidance at the early stage is important in educational management. Academic performance prediction is an important building block to enabling this pre-intervention and guidance. Most of the previous studies are based on questionnaire surveys and self-reports, which suffer from small sample size and social desirability bias. In this article, we collect longitudinal behavioral data from the smart cards of 6,597 students and propose three major types of discriminative behavioral factors: diligence, orderliness, and sleep patterns. Empirical analysis demonstrates these behavioral factors are strongly correlated with academic performance. Furthermore, motivated by the social influence theory, we analyze the correlation between each student’s academic performance and that of his/her behaviorally similar students. Statistical tests indicate this correlation is significant. Based on these factors, we further build a multi-task predictive framework based on a learning-to-rank algorithm for academic performance prediction. This framework captures inter-semester correlation and inter-major correlation, and integrates student similarity to predict students’ academic performance. The experiments on a large-scale real-world dataset show the effectiveness of our methods for predicting academic performance and the effectiveness of the proposed behavioral factors.


ICRA Conference 2018 Conference Paper

Planar Object Tracking in the Wild: A Benchmark

Pengpeng Liang
Yifan Wu
Hu Lu
Liming Wang
Chunyuan Liao
Haibin Ling

Planar object tracking is an actively studied problem in vision-based robotic applications. While several benchmarks have been constructed for evaluating state-of-the-art algorithms, there is a lack of video sequences captured in the wild rather than in a constrained laboratory environment. In this paper, we present a carefully designed planar object tracking benchmark containing 210 videos of 30 planar objects sampled in the natural environment. In particular, for each object, we shoot seven videos involving various challenging factors, namely scale change, rotation, perspective distortion, motion blur, occlusion, out-of-view, and unconstrained. The ground truth is carefully annotated semi-manually to ensure quality. Moreover, eleven state-of-the-art algorithms are evaluated on the benchmark using two evaluation metrics, with detailed analysis provided for the evaluation results. We expect the proposed benchmark to benefit future studies on planar object tracking.


ICML Conference 2016 Conference Paper

Conservative Bandits

Yifan Wu
Roshan Shariff
Tor Lattimore
Csaba Szepesvári

We study a novel multi-armed bandit problem that models the challenge faced by a company wishing to explore new strategies to maximize revenue whilst simultaneously maintaining their revenue above a fixed baseline, uniformly over time. While previous work addressed the problem under the weaker requirement of maintaining the revenue constraint only at a given fixed time in the future, the design of those algorithms makes them unsuitable under the more stringent constraints. We consider both the stochastic and the adversarial settings, where we propose natural yet novel strategies and analyze the price for maintaining the constraints. Amongst other things, we prove both high probability and expectation bounds on the regret, while we also consider both the problem of maintaining the constraints with high probability or expectation. For the adversarial setting the price of maintaining the constraint appears to be higher, at least for the algorithm considered. A lower bound is given showing that the algorithm for the stochastic setting is almost optimal. Empirical results obtained in synthetic environments complement our theoretical findings.
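The difference between holding a revenue constraint uniformly over time and holding it only at a final time can be illustrated with a small check. This is a sketch under assumptions: the constraint is written as cumulative reward at every round `t` being at least `(1 - alpha) * t * baseline_mean`, where the slack factor `alpha`, the function name, and the toy reward sequences are introduced here for illustration and do not come from the abstract.

```python
def satisfies_conservative_constraint(rewards, baseline_mean, alpha):
    """Check that the sum of the first t rewards is at least
    (1 - alpha) * t * baseline_mean for every round t."""
    total = 0.0
    for t, reward in enumerate(rewards, start=1):
        total += reward
        if total < (1.0 - alpha) * t * baseline_mean:
            return False
    return True

# Two hypothetical reward sequences with the same total but a different order.
safe_first = [1, 1, 0, 1]     # explores only after building up a reward budget
explore_first = [0, 1, 1, 1]  # explores in the first round

ok_safe = satisfies_conservative_constraint(safe_first, baseline_mean=1.0, alpha=0.5)
ok_explore = satisfies_conservative_constraint(explore_first, baseline_mean=1.0, alpha=0.5)
print(ok_safe, ok_explore)
```

Both sequences meet the requirement at the final round, but only the first meets it at every round, which is the stricter notion the abstract describes.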


ICML Conference 2015 Conference Paper

On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments

Yifan Wu
András György 0001
Csaba Szepesvári

We consider the problem of identifying a good option out of a finite set of options under combinatorially structured, noisy feedback about the quality of the options in a sequential process: in each round, a subset of the options, from an available set of subsets, can be selected to receive noisy information about the quality of the options in the chosen subset. The goal is to identify the highest quality option, or a group of options of the highest quality, with a small error probability, while using the smallest number of measurements. The problem generalizes best-arm identification problems. By extending previous work, we design new algorithms that are shown to be able to exploit the combinatorial structure of the problem in a nontrivial fashion, while being unimprovable in special cases. The algorithms call a set multi-covering oracle, hence their performance and efficiency are strongly tied to whether the associated set multi-covering problem can be efficiently solved.


NeurIPS Conference 2015 Conference Paper

Online Learning with Gaussian Payoffs and Side Observations

Yifan Wu
András György
Csaba Szepesvari

We consider a sequential learning problem with Gaussian payoffs and side information: after selecting an action $i$, the learner receives information about the payoff of every action $j$ in the form of Gaussian observations whose mean is the same as the mean payoff, but the variance depends on the pair $(i, j)$ (and may be infinite). The setup allows a more refined information transfer from one action to another than previous partial monitoring setups, including the recently introduced graph-structured feedback case. For the first time in the literature, we provide non-asymptotic problem-dependent lower bounds on the regret of any algorithm, which recover existing asymptotic problem-dependent lower bounds and finite-time minimax lower bounds available in the literature. We also provide algorithms that achieve the problem-dependent lower bound (up to some universal constant factor) or the minimax lower bounds (up to logarithmic factors).
