Author name cluster

Yi He

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

28 papers

2 author rows

AAAI Conference 2026 Conference Paper

FinMathBench: A Formula-Driven Benchmark for Evaluating LLMs’ Math Reasoning Capabilities in Finance

Yi He
Ping Wang
Shiqiang Xiong
Chao Chen
Haixiang Hu

Many existing financial math reasoning benchmarks suffer from data contamination and high manual construction costs. To address this, we propose a novel formula-driven approach to dynamically construct math reasoning benchmarks in finance. Our two-stage approach: (1) generates single-formula questions by LLMs using a "Mask-for-Solve" paradigm for ground truth answers, and (2) synthesizes multi-formula questions through hierarchical tree-based DAGs. Our approach ensures novelty (via LLMs' creativity) and controllability of difficulty (via DAG structure). Based on a self-constructed financial formula bank, we utilize the proposed method to build FinMathBench, the first formula-driven and fully LLM-generated benchmark aimed at assessing LLMs' math reasoning abilities in finance, containing 946 questions across 4 complexity levels. Evaluation results on 40 LLMs demonstrate significant accuracy drops in multi-formula questions, e.g., 72.9% (1-Formula) to 14.0% (4-Formula) for GPT-4o under Chain-of-Thought prompting. Three critical flaws of LLMs are also observed: poor direct calculation performance, bias toward frequently solved variables in formulas, and erroneous "correction" of valid but extreme financial values. These findings highlight gaps in current LLMs' domain-specific reasoning and underscore FinMathBench's value for advancing robust financial LLMs.

AAAI Conference 2026 Conference Paper

IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech

Siyi Zhou
Yiquan Zhou
Yi He
Xun Zhou
Jinchao Wang
Wei Deng
Jingchen Shu

Existing autoregressive large-scale text-to-speech (TTS) models have advantages in speech naturalness, but their token-by-token generation mechanism makes it difficult to precisely control the duration of synthesized speech. This becomes a significant limitation in applications requiring strict audio-visual synchronization, such as video dubbing. This paper introduces IndexTTS2, which proposes a novel, general, and autoregressive model-friendly method for speech duration control. The method supports two generation modes: one explicitly specifies the number of generated tokens to precisely control speech duration; the other freely generates speech in an autoregressive manner without specifying the number of tokens, while faithfully reproducing the prosodic features of the input prompt. Furthermore, IndexTTS2 achieves disentanglement between emotional expression and speaker identity, enabling independent control over timbre and emotion. In the zero-shot setting, the model can accurately reconstruct the target timbre (from the timbre prompt) while perfectly reproducing the specified emotional tone (from the style prompt). To enhance speech clarity in highly emotional expressions, we incorporate GPT latent representations and design a novel three-stage training paradigm to improve the stability of the generated speech. Additionally, to lower the barrier for emotional control, we designed a soft instruction mechanism based on text descriptions by fine-tuning Qwen3, effectively guiding the generation of speech with the desired emotional orientation. Finally, experimental results on multiple datasets show that IndexTTS2 outperforms state-of-the-art zero-shot TTS models in terms of word error rate, speaker similarity, and emotional fidelity.

AAAI Conference 2026 Conference Paper

MARLIN: Multi-Agent Reinforcement Learning for Incremental DAG Discovery

Dong Li
Zhengzhang Chen
Xujiang Zhao
Linlin Yu
Zhong Chen
Yi He
Haifeng Chen
Chen Zhao

Uncovering causal structures from observational data is crucial for understanding complex systems and making informed decisions. While reinforcement learning (RL) has shown promise in identifying these structures in the form of a directed acyclic graph (DAG), existing methods often lack efficiency, making them unsuitable for online applications. In this paper, we propose MARLIN, an efficient multi-agent RL-based approach for incremental DAG learning. MARLIN uses a DAG generation policy that maps a continuous real-valued space to the DAG space as an intra-batch strategy, then incorporates two RL agents—state-specific and state-invariant—to uncover causal relationships and integrates these agents into an incremental learning framework. Furthermore, the framework leverages a factored action space to enhance parallelization efficiency. Extensive experiments on synthetic and real datasets demonstrate that MARLIN outperforms state-of-the-art methods in terms of both efficiency and effectiveness.

AAAI Conference 2026 Conference Paper

Relink: Constructing Query-Driven Evidence Graph On-the-Fly for GraphRAG

Manzong Huang
Chenyang Bu
Yi He
Xingrui Zhuo
Xindong Wu

Graph-based Retrieval-Augmented Generation (GraphRAG) mitigates hallucinations in Large Language Models (LLMs) by grounding them in structured knowledge. However, current GraphRAG methods are constrained by a prevailing build-then-reason paradigm, which relies on a static, pre-constructed Knowledge Graph (KG). This paradigm faces two critical challenges. First, the KG's inherent incompleteness often breaks reasoning paths. Second, the graph’s low signal-to-noise ratio introduces distractor facts, presenting query-relevant but misleading knowledge that disrupts the reasoning process. To address these challenges, we argue for a reason-and-construct paradigm and propose Relink, a framework that dynamically builds a query-specific evidence graph. To tackle incompleteness, Relink instantiates required facts from a latent relation pool derived from the original text corpus, repairing broken paths on the fly. To handle misleading or distractor facts, Relink employs a unified, query-aware evaluation strategy that jointly considers candidates from both the KG and latent relations, selecting those most useful for answering the query rather than relying on their pre-existence. This empowers Relink to actively discard distractor facts and construct the most faithful and precise evidence path for each query. Extensive experiments on five Open-Domain Question Answering benchmarks show that Relink achieves significant average improvements of 5.4% in EM and 5.2% in F1 over leading GraphRAG baselines, demonstrating the superiority of our proposed framework.

IJCAI Conference 2025 Conference Paper

A Novel Sparse Active Online Learning Framework for Fast and Accurate Streaming Anomaly Detection Over Data Streams

Zhong Chen
Yi He
Di Wu
Chen Zhao
Meikang Qiu

Online Anomaly Detection (OAD) is critical for identifying rare yet important data points in large, dynamic, and complex data streams. A key challenge lies in achieving accurate and consistent detection of anomalies while maintaining computational and memory efficiency. Conventional OAD approaches, which depend on distributional deviations and static thresholds, struggle with model update delays and catastrophic forgetting, leading to missed detections and high false positive rates. To address these limitations, we propose a novel Streaming Anomaly Detection (SAD) method, grounded in a sparse active online learning framework. Our approach uniquely integrates ℓ1,2-norm sparse online learning with CUR decomposition-based active learning, enabling simultaneous fast feature selection and dynamic instance selection. The efficient CUR decomposition further supports real-time residual analysis for anomaly scoring, eliminating the need for manual threshold settings about temporal data distributions. Extensive experiments on diverse streaming datasets demonstrate SAD's superiority, achieving a 14.06% reduction in detection error rates compared to five state-of-the-art competitors.

ICML Conference 2025 Conference Paper

Chaos Meets Attention: Transformers for Large-Scale Dynamical Prediction

Yi He
Yiming Yang
Xiaoyuan Cheng
Hai Wang
Xiao Xue
Boli Chen
Yukun Hu

Generating long-term trajectories of dissipative chaotic systems autoregressively is a highly challenging task. The inherent positive Lyapunov exponents amplify prediction errors over time. Many chaotic systems possess a crucial property — ergodicity on their attractors, which makes long-term prediction possible. State-of-the-art methods address ergodicity by preserving statistical properties using optimal transport techniques. However, these methods face scalability challenges due to the curse of dimensionality when matching distributions. To overcome this bottleneck, we propose a scalable transformer-based framework capable of stably generating long-term high-dimensional and high-resolution chaotic dynamics while preserving ergodicity. Our method is grounded in a physical perspective, revisiting the Von Neumann mean ergodic theorem to ensure the preservation of long-term statistics in the $\mathcal{L}^2$ space. We introduce novel modifications to the attention mechanism, making the transformer architecture well-suited for learning large-scale chaotic systems. Compared to operator-based and transformer-based methods, our model achieves better performances across five metrics, from short-term prediction accuracy to long-term statistics. In addition to our methodological contributions, we introduce new chaotic system benchmarks: a machine learning dataset of 140$k$ snapshots of turbulent channel flow and a processed high-dimensional Kolmogorov Flow dataset, along with various evaluation metrics for both short- and long-term performances. Both are well-suited for machine learning research on chaotic systems.

EAAI Journal 2025 Journal Article

Density ball-based neighborhood rough set model for attribute reduction and classification in uncertain data

Yi He
Yabin Shao
Xueqin Zhu
Yunlong Cheng
Youlin Hua
Laquan Li

The neighborhood rough set model is widely used to handle uncertain and fuzzy data, especially for tasks such as attribute reduction and classification. In this model, the neighborhood radius is considered crucial, as it directly affects the accuracy of approximation and the stability of decision results. Granular ball generation is an information granulation technique that divides a dataset into adaptive spherical regions. It is commonly used to automatically determine the neighborhood radius, enabling efficient and scalable neighborhood construction for attribute reduction and classification tasks. However, many existing granular ball generation methods rely on randomly selected initial centers, which introduces considerable randomness and instability into the granulation results and subsequent tasks. To enhance the stability and robustness of the granulation structure, a density ball generation (DBG) algorithm is proposed. This algorithm follows a “coarse-to-fine” strategy, in which coarse-grained balls are first generated and then refined. In addition, density-based clustering is integrated to identify noise points and avoid generating balls that contain only a single object. On this basis, the DBG method is integrated into the neighborhood rough set framework to construct a density ball-based neighborhood rough set (DBNRS) model. Furthermore, a weighted DBG-based granular ball k-nearest neighbors (W-DBkNN) algorithm is proposed. Experiments are conducted on twenty public datasets from the UCI Machine Learning Repository, which are widely used due to their diversity and representativeness. The proposed method performs well in both attribute reduction and classification tasks, with an average improvement of 14.3% in classification accuracy after reduction and 12.1% in classifier performance comparison. These results indicate that the method demonstrates strong effectiveness and robustness across various scenarios.

IJCAI Conference 2025 Conference Paper

HGEN: Heterogeneous Graph Ensemble Networks

Jiajun Shen
Yufei Jin
Kaibu Feng
Yi He
Xingquan Zhu

This paper presents HGEN, which pioneers ensemble learning for heterogeneous graphs. We argue that the heterogeneity in node types, nodal features, and local neighborhood topology poses significant challenges for ensemble learning, particularly in accommodating diverse graph learners. Our HGEN framework ensembles multiple learners through a meta-path and transformation-based optimization pipeline to uplift classification accuracy. Specifically, HGEN uses meta-path combined with random dropping to create Allele Graph Neural Networks (GNNs), whereby the base graph learners are trained and aligned for later ensembling. To ensure effective ensemble learning, HGEN presents two key components: 1) a residual-attention mechanism to calibrate allele GNNs of different meta-paths, thereby enforcing node embeddings to focus on more informative graphs to improve base learner accuracy, and 2) a correlation-regularization term to enlarge the disparity among embedding matrices generated from different meta-paths, thereby enriching base learner diversity. We analyze the convergence of HGEN and attest its higher regularization magnitude over simple voting. Experiments on five heterogeneous networks validate that HGEN consistently outperforms its state-of-the-art competitors by a substantial margin. Codes are available at https://github.com/Chrisshen12/HGEN.

IJCAI Conference 2025 Conference Paper

How to Mitigate Information Loss in Knowledge Graphs for GraphRAG: Leveraging Triple Context Restoration and Query-Driven Feedback

Manzong Huang
Chenyang Bu
Yi He
Xindong Wu

Knowledge Graph (KG)-augmented Large Language Models (LLMs) have recently propelled significant advances in complex reasoning tasks, thanks to their broad domain knowledge and contextual awareness. Unfortunately, current methods often assume KGs to be complete, which is impractical given the inherent limitations of KG construction and the potential loss of contextual cues when converting unstructured text into entity-relation triples. In response, this paper proposes the Triple Context Restoration and Query-driven Feedback (TCR-QF) framework, which reconstructs the textual context underlying each triple to mitigate information loss, while dynamically refining the KG structure by iteratively incorporating query-relevant missing knowledge. Experiments on five benchmark question-answering datasets substantiate the effectiveness of TCR-QF in KG and LLM integration, where it achieves a 29.1% improvement in Exact Match and a 15.5% improvement in F1 over its state-of-the-art GraphRAG competitors. The code is publicly available at https://github.com/HFUT-DMiC-Lab/TCR-QF.git.

ICLR Conference 2025 Conference Paper

Learning Chaos In A Linear Way

Xiaoyuan Cheng
Yi He
Yiming Yang
Xiao Xue
Sibo Cheng
Daniel Giles
Xiaohang Tang
Yukun Hu

Learning long-term behaviors in chaotic dynamical systems, such as turbulent flows and climate modelling, is challenging due to their inherent instability and unpredictability. These systems exhibit positive Lyapunov exponents, which significantly hinder accurate long-term forecasting. As a result, understanding long-term statistical behavior is far more valuable than focusing on short-term accuracy. While autoregressive deep sequence models have been applied to capture long-term behavior, they often lead to exponentially increasing errors in learned dynamics. To address this, we shift the focus from simple prediction errors to preserving an invariant measure in dissipative chaotic systems. These systems have attractors, where trajectories settle, and the invariant measure is the probability distribution on attractors that remains unchanged under dynamics. Existing methods generate long trajectories of dissipative chaotic systems by aligning invariant measures, but it is not always possible to obtain invariant measures for arbitrary datasets. We propose the Poincaré Flow Neural Network (PFNN), a novel operator learning framework designed to capture behaviors of chaotic systems without any explicit knowledge of the invariant measure. PFNN employs an auto-encoder to map the chaotic system to a finite-dimensional feature space, effectively linearizing the chaotic evolution. It then learns the linear evolution operators to match the physical dynamics by addressing two critical properties in dissipative chaotic systems: (1) contraction, the system’s convergence toward its attractors, and (2) measure invariance, trajectories on the attractors following a probability distribution invariant to the dynamics. 
Our experiments on a variety of chaotic systems, including Lorenz systems, Kuramoto-Sivashinsky equation and Navier–Stokes equation, demonstrate that PFNN has more accurate predictions and physical statistics compared to competitive baselines including the Fourier Neural Operator and the Markov Neural Operator.

AAAI Conference 2025 Conference Paper

Metric-Agnostic Continual Learning for Sustainable Group Fairness

Heng Lian
Chen Zhao
Zhong Chen
Xingquan Zhu
My T. Thai
Yi He

Group Fairness-aware Continual Learning (GFCL) aims to eradicate discriminatory predictions against certain demographic groups in a sequence of diverse learning tasks. This paper explores an even more challenging GFCL problem – how to sustain a fair classifier across a sequence of tasks with covariate shifts and unlabeled data. We propose the MacFRL solution, with its key idea to optimize the sequence of learning tasks. We hypothesize that high-confident learning can be enabled in the optimized task sequence, where the classifier learns from a set of prioritized tasks to glean knowledge, thereby becoming more capable to handle the tasks with substantial distribution shifts that were originally deferred. Theoretical and empirical studies substantiate that MacFRL excels among its GFCL competitors in terms of prediction accuracy and group fairness metrics.

TMLR Journal 2025 Journal Article

Multi-Modal Foundation Models for Computational Pathology: A Survey

Dong Li
Guihong Wan
Xintao Wu
Xinyu Wu
Xiaohui Chen
Yi He
Zhong Chen
Peter K Sorger

Foundation models have emerged as a powerful paradigm in computational pathology (CPath), enabling scalable and generalizable analysis of histopathological images. While early developments centered on uni-modal models trained solely on visual data, recent advances have highlighted the promise of multi-modal foundation models that integrate heterogeneous data sources such as textual reports, structured domain knowledge, and molecular profiles. In this survey, we provide a comprehensive and up-to-date review of multi-modal foundation models in CPath, with a particular focus on models built upon hematoxylin and eosin (H&E) stained whole slide images (WSIs) and tile-level representations. We categorize 34 state-of-the-art multi-modal foundation models into three major paradigms: vision-language, vision-knowledge graph, and vision-gene expression. We further divide vision-language models into non-LLM-based and LLM-based approaches. Additionally, we analyze 30 available multi-modal datasets tailored for pathology, grouped into image-text pairs, instruction datasets, and image-other modality pairs. Our survey also presents a taxonomy of downstream tasks, highlights training and evaluation strategies, and identifies key challenges and future directions. We aim for this survey to serve as a valuable resource for researchers and practitioners working at the intersection of pathology and AI.

NeurIPS Conference 2025 Conference Paper

On the Sample Complexity of Differentially Private Policy Optimization

Yi He
Xingyu Zhou

Policy optimization (PO) is a cornerstone of modern reinforcement learning (RL), with diverse applications spanning robotics, healthcare, and large language model training. The increasing deployment of PO in sensitive domains, however, raises significant privacy concerns. In this paper, we initiate a theoretical study of differentially private policy optimization, focusing explicitly on its sample complexity. We first formalize an appropriate definition of differential privacy (DP) tailored to PO, addressing the inherent challenges arising from on-policy learning dynamics and the subtlety involved in defining the unit of privacy. We then systematically analyze the sample complexity of widely-used PO algorithms, including policy gradient (PG), natural policy gradient (NPG) and more, under DP constraints and various settings, via a unified framework. Our theoretical results demonstrate that privacy costs can often manifest as lower-order terms in the sample complexity, while also highlighting subtle yet important observations in private PO settings. These offer valuable practical insights for privacy-preserving PO algorithms.

ICML Conference 2025 Conference Paper

Tensor-Var: Efficient Four-Dimensional Variational Data Assimilation

Yiming Yang
Xiaoyuan Cheng
Daniel Giles
Sibo Cheng
Yi He
Xiao Xue
Boli Chen
Yukun Hu

Variational data assimilation estimates the dynamical system states by minimizing a cost function that fits the numerical models with the observational data. Although four-dimensional variational assimilation (4D-Var) is widely used, it faces high computational costs in complex nonlinear systems and depends on imperfect state-observation mappings. Deep learning (DL) offers more expressive approximators, while integrating DL models into 4D-Var is challenging due to their nonlinearities and lack of theoretical guarantees in assimilation results. In this paper, we propose Tensor-Var, a novel framework that integrates kernel conditional mean embedding (CME) with 4D-Var to linearize nonlinear dynamics, achieving convex optimization in a learned feature space. Moreover, our method provides a new perspective for solving 4D-Var in a linear way, offering theoretical guarantees of consistent assimilation results between the original and feature spaces. To handle large-scale problems, we propose a method to learn deep features (DFs) using neural networks within the Tensor-Var framework. Experiments on chaotic systems and global weather prediction with real-time observations show that Tensor-Var outperforms conventional and DL hybrid 4D-Var baselines in accuracy while achieving a 10- to 20-fold speed improvement.

EAAI Journal 2025 Journal Article

Vehicle real-time collision risk prediction: A multi-modal learning approach for diverse urban road scenarios based on a large-scale near-crash event dataset

Jipu Li
Yi He
Ye Li
Helai Huang
Dan Wu
Jieling Jin

The effectiveness of vehicle collision avoidance systems depends on the precision of collision risk prediction models. However, current models often neglect the driver's condition, resulting in their poor ability to predict near-crash events triggered by aggressive, fatigued, or distracted driving. Additionally, current models overlook the differences in modality, type, and variability of multi-source data, leading to insufficient feature extraction from input data, which in turn limits the model's prediction accuracy. To address these issues, we developed an end-to-end pre-trained deep framework (PM-Transformer) with a Transformer, consisting of multi-module recurrent convolutional neural networks. The framework includes four modules: (1) pre-trained time series module that extracts spatiotemporal information from traffic time series using one-dimensional convolutional neural network - long short-term memory; (2) pre-trained spectral module that learns visual temporal representations from traffic spectrograms using two-dimensional convolutional neural network - long short-term memory; (3) metadata module for vectorizing traffic metadata; (4) fusion module that semantically integrates features from the three modules using a Transformer. Results show that the proposed model can achieve the same prediction accuracy as other models using only 5 % of their training sample size. Compared to other traditional models, our model improves the accuracy of risk prediction by 6 %, 9 %, and 4 %, respectively, with small sample sizes (0.5 s, 1 s, and 2 s in advance), while also maintaining the best performance with larger sample sizes. Findings of this study hold significant potential for improving the effectiveness of vehicle collision avoidance systems.

AAAI Conference 2024 Conference Paper

GLDL: Graph Label Distribution Learning

Yufei Jin
Richard Gao
Yi He
Xingquan Zhu

Label Distribution Learning (LDL), as a more general learning setting than generic single-label and multi-label learning, has been commonly used in computer vision and many other applications. To date, existing LDL approaches are designed and applied to data without considering the interdependence between instances. In this paper, we propose a Graph Label Distribution Learning (GLDL) framework, which explicitly models three types of relationships: instance-instance, label-label, and instance-label, to learn the label distribution for networked data. A label-label network is learned to capture label-to-label correlation, through which GLDL can accurately learn label distributions for nodes. Dual graph convolution network (GCN) Co-training with heterogeneous message passing ensures two GCNs, one focusing on instance-instance relationship and the other one targeting label-label correlation, are jointly trained such that instance-instance relationship can help induce label-label correlation and vice versa. Our theoretical study derives the error bound of GLDL. For verification, four benchmark datasets with label distributions for nodes are created using common graph benchmarks. The experiments show that considering dependency helps learn better label distributions for networked data, compared to state-of-the-art LDL baseline. In addition, GLDL not only outperforms simple GCN and graph attention networks (GAT) using distribution loss but is also superior to its variant considering label-label relationship as a static network. GLDL and its benchmarks are the first research endeavors to address LDL for graphs. Code and benchmark data are released for public access.

NeurIPS Conference 2024 Conference Paper

HGDL: Heterogeneous Graph Label Distribution Learning

Yufei Jin
Heng Lian
Yi He
Xingquan Zhu

Label Distribution Learning (LDL) has been extensively studied in IID data applications such as computer vision, thanks to its more generic setting over single-label and multi-label classification. This paper advances LDL into graph domains and aims to tackle a novel and fundamental heterogeneous graph label distribution learning (HGDL) problem. We argue that the graph heterogeneity reflected on node types, node attributes, and neighborhood structures can impose significant challenges for generalizing LDL onto graphs. To address the challenges, we propose a new learning framework with two key components: 1) proactive graph topology homogenization, and 2) topology and content consistency-aware graph transformer. Specifically, the former learns optimal information aggregation between meta-paths, so that the node heterogeneity can be proactively addressed prior to the succeeding embedding learning; the latter leverages an attention mechanism to learn consistency between meta-path and node attributes, allowing network topology and nodal attributes to be equally emphasized during the label distribution learning. By using KL-divergence and additional constraints, HGDL delivers an end-to-end solution for learning and predicting label distribution for nodes. Both theoretical and empirical studies substantiate the effectiveness of our HGDL approach. Our code and datasets are available at https://github.com/Listener-Watcher/HGDL.

AAAI Conference 2024 Conference Paper

MKG-FENN: A Multimodal Knowledge Graph Fused End-to-End Neural Network for Accurate Drug–Drug Interaction Prediction

Di Wu
Wu Sun
Yi He
Zhong Chen
Xin Luo

Taking multiple incompatible drugs together may cause adverse interactions and side effects on the body. Accurate prediction of drug-drug interaction (DDI) events is essential for avoiding this issue. Recently, various artificial intelligence-based approaches have been proposed for predicting DDI events. However, DDI events are associated with complex relationships and mechanisms among drugs, targets, enzymes, transporters, molecular structures, etc. Existing approaches either partially or loosely consider these relationships and mechanisms by a non-end-to-end learning framework, resulting in sub-optimal feature extractions and fusions for prediction. Different from them, this paper proposes a Multimodal Knowledge Graph Fused End-to-end Neural Network (MKG-FENN) that consists of two main parts: multimodal knowledge graph (MKG) and fused end-to-end neural network (FENN). First, MKG is constructed by comprehensively exploiting DDI events-associated relationships and mechanisms from four knowledge graphs of drugs-chemical entities, drug-substructures, drugs-drugs, and molecular structures. Correspondingly, a four-channel graph neural network is designed to extract high-order and semantic features from MKG. Second, FENN designs a multi-layer perceptron to fuse the extracted features by end-to-end learning. With such designs, the feature extractions and fusions of DDI events are guaranteed to be comprehensive and optimal for prediction. Through extensive experiments on real drug datasets, we demonstrate that MKG-FENN exhibits high accuracy and significantly outperforms state-of-the-art models in predicting DDI events. The source code and supplementary file of this article are available on: https://github.com/wudi1989/MKG-FENN.

AAAI Conference 2023 Short Paper

Exploring Hypergraph of Earnings Call for Risk Prediction (Student Abstract)

Yi He
Wenxin Tai
Fan Zhou
Yi Yang

In financial economics, studies have shown that the textual content in the earnings conference call transcript has predictive power for a firm's future risk. However, the conference call transcript is very long and contains diverse non-relevant content, which poses challenges for the text-based risk forecast. This study investigates the structural dependency within a conference call transcript by explicitly modeling the dialogue between managers and analysts. Specifically, we utilize TextRank to extract information and exploit the semantic correlation within a discussion using hypergraph learning. This novel design can improve the transcript representation performance and reduce the risk of forecast errors. Experimental results on a large-scale dataset show that our approach can significantly improve prediction performance compared to state-of-the-art text-based models.

AAAI Conference 2023 Conference Paper

Online Random Feature Forests for Learning in Varying Feature Spaces

Christian Schreckenberger
Yi He
Stefan Lüdtke
Christian Bartelt
Heiner Stuckenschmidt

In this paper, we propose a new online learning algorithm tailored for data streams described by varying feature spaces (VFS), wherein new features constantly emerge and old features may stop to be observed over various time spans. Our proposed algorithm, named Online Random Feature Forests for Feature space Variabilities (ORF3V), provides a strategy to respect such feature dynamics by generating, updating, pruning, as well as online re-weighing an ensemble of what we call feature forests, which are generated and updated based on a compressed and storage efficient representation for each observed feature. We benchmark our algorithm on 12 datasets, including one novel real-world dataset of government COVID-19 responses collected through a crowd-sensing program in Spain. The empirical results substantiate the viability and effectiveness of our ORF3V algorithm and its superior accuracy performance over the state-of-the-art rival models.

AAAI Conference 2023 Conference Paper

Online Semi-supervised Learning with Mix-Typed Streaming Features

Di Wu
Shengda Zhuo
Yu Wang
Zhong Chen
Yi He

Online learning with feature spaces that are not fixed but can vary over time offers a seemingly flexible learning paradigm and has thus drawn much attention. Unfortunately, two restrictions prohibit a ubiquitous application of this learning paradigm in practice. First, whereas prior studies mainly assume a homogeneous feature type, data streams generated from real applications can be heterogeneous, with Boolean, ordinal, and continuous features co-existing. Existing methods that prescribe parametric distributions such as Gaussians would not suffice to model the correlation among such mix-typed features. Second, while full supervision seems to be a default setup, providing labels to all arriving data instances over a long time span is onerous, laborious, and economically unsustainable. Alas, a semi-supervised online learner that can deal with mix-typed, varying feature spaces is still missing. To fill the gap, this paper explores a novel problem, named Online Semi-supervised Learning with Mix-typed streaming Features (OSLMF), which strives to relax the restrictions on the feature type and supervision information. Our key idea to solve the new problem is to leverage a copula model to align the data instances with different feature spaces so as to make their distance measurable. A geometric structure underlying data instances is then established in an online fashion based on their distances, through which the limited labeling information is propagated from the scarce labeled instances to their close neighbors. Experimental results are documented to evidence the viability and effectiveness of our proposed approach. Code is released at https://github.com/wudi1989/OSLMF.

IJCAI Conference 2023 Conference Paper

Towards Utilitarian Online Learning -- A Review of Online Algorithms in Open Feature Space

Yi He
Christian Schreckenberger
Heiner Stuckenschmidt
Xindong Wu

Human intelligence comes from the capability to describe and make sense of the world surrounding us, often in a lifelong manner. Online Learning (OL) allows a model to simulate this capability, which involves processing data in sequence, making predictions, and learning from predictive errors. However, traditional OL assumes a fixed set of features to describe data, which can be restrictive. In reality, new features may emerge and old features may vanish or become obsolete, leading to an open feature space. This dynamism can be caused by more advanced or outdated technology for sensing the world, or it can be a natural process of evolution. This paper reviews recent breakthroughs that strived to enable OL in open feature spaces, referred to as Utilitarian Online Learning (UOL). We taxonomize existing UOL models into three categories, analyze their pros and cons, and discuss their application scenarios. We also benchmark the performance of representative UOL models, highlighting open problems, challenges, and potential future directions of this emerging topic.

YNICL Journal 2021 Journal Article

In vivo tensor-valued diffusion MRI of focal demyelination in white and deep grey matter of rodents

Yi He
Susana Aznar
Hartwig R. Siebner
Tim B. Dyrby

BACKGROUND: Multiple sclerosis (MS) is a chronic inflammatory demyelinating disease leading to damage of white matter (WM) and grey matter (GM). Magnetic resonance imaging (MRI) is the modality of choice to assess brain damage in MS, but there is an unmet need in MRI for achieving higher sensitivity and specificity to MS-related microstructural alterations in WM and GM. OBJECTIVE: To explore whether tensor-valued diffusion MRI (dMRI) can yield sensitive microstructural read-outs for focal demyelination in cerebral WM and deep GM (DGM). METHODS: Eight rats underwent L-α-Lysophosphatidylcholine (LPC) injections in the WM and striatum to introduce focal demyelination. Multimodal MRI was performed at 7 Tesla after 7 days. Tensor-valued dMRI was complemented by diffusion tensor imaging, quantitative MRI and proton magnetic resonance spectroscopy (MRS). RESULTS: ) in DGM (P < 0.0001). CONCLUSION: increase may be particularly sensitive in detecting DGM lesions of MS.

IJCAI Conference 2021 Conference Paper

Interpretable Minority Synthesis for Imbalanced Classification

Yi He
Fudong Lin
Xu Yuan
Nian-Feng Tzeng

This paper proposes a novel oversampling approach that strives to balance the class priors with a considerably imbalanced data distribution of high dimensionality. The crux of our approach lies in learning interpretable latent representations that can model the synthetic mechanism of the minority samples by using a generative adversarial network (GAN). A Bayesian regularizer is imposed to guide the GAN to extract a set of salient features that are either disentangled or intensionally entangled, with their interplay controlled by a prescribed structure, defined with human-in-the-loop. As such, our GAN enjoys an improved sample complexity, being able to synthesize high-quality minority samples even if the sizes of minority classes are extremely small during training. Empirical studies substantiate that our approach can empower simple classifiers to achieve superior imbalanced classification performance over the state-of-the-art competitors and is robust across various imbalance settings. Code is released at github.com/fudonglin/IMSIC.

AAAI Conference 2021 Conference Paper

Online Learning in Variable Feature Spaces under Incomplete Supervision

Yi He
Xu Yuan
Sheng Chen
Xindong Wu

This paper explores a new online learning problem where the input sequence lives in a time-varying feature space and the ground-truth label of any input point is given only occasionally, making online learners less restrictive and more applicable. The crux in this setting lies in how to exploit the very limited labels to efficiently update the online learners. Plausible ideas such as propagating labels from labeled points to their neighbors through uncovering the point-wise geometric relations face two challenges: (1) distance measurement fails to work as different points may be described by disparate sets of features and (2) storing the geometric shape, which is formed by all arrived points, is unrealistic in an online setting. To address these challenges, we first construct a universal feature space that accumulates all observed features, making distance measurement feasible. Then, we use manifolds to represent the geometric shapes and approximate them sparsely, making manifolds computationally and memory tractable in online learning. We frame these two building blocks into a regularized risk minimization algorithm. Theoretical analysis and empirical evidence substantiate the viability and effectiveness of our proposal.
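
As a rough illustration of the universal feature space idea in the abstract above, the following Python sketch accumulates every feature name seen in a stream so that two points described by different features can be compared. It is not the paper's algorithm: the class and function names are hypothetical, and filling unobserved features with a constant (here 0.0) is an assumption made only for this example.

```python
import math


class UniversalFeatureSpace:
    """Hypothetical helper: accumulates every feature name seen so far."""

    def __init__(self):
        self.index = {}  # feature name -> position in the universal space

    def observe(self, instance):
        # Register any feature names not seen before.
        for name in instance:
            if name not in self.index:
                self.index[name] = len(self.index)

    def embed(self, instance, fill=0.0):
        # Assumption: unobserved features are filled with a constant.
        vec = [fill] * len(self.index)
        for name, value in instance.items():
            vec[self.index[name]] = value
        return vec


def distance(space, a, b):
    # Euclidean distance measured in the shared universal space.
    va, vb = space.embed(a), space.embed(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))


space = UniversalFeatureSpace()
x1 = {"temp": 1.0, "humidity": 2.0}      # early instance
x2 = {"humidity": 2.0, "pressure": 3.0}  # later instance with a new feature
for x in (x1, x2):
    space.observe(x)

d = distance(space, x1, x2)
```

Once both instances are embedded in the same space, any distance-based step (such as propagating labels to neighbors) becomes well defined, which is the point the abstract makes.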

JBHI Journal 2021 Journal Article

Reinventing 2D Convolutions for 3D Images

Jiancheng Yang
Xiaoyang Huang
Yi He
Jingwei Xu
Canqian Yang
Guozheng Xu
Bingbing Ni

There have been considerable debates over 2D and 3D representation learning on 3D medical images. 2D approaches could benefit from large-scale 2D pretraining, whereas they are generally weak in capturing large 3D contexts. 3D approaches are natively strong in 3D contexts; however, few publicly available 3D medical datasets are large and diverse enough for universal 3D pretraining. Even for hybrid (2D + 3D) approaches, the intrinsic disadvantages within the 2D/3D parts still exist. In this study, we bridge the gap between 2D and 3D convolutions by reinventing the 2D convolutions. We propose ACS (axial-coronal-sagittal) convolutions to perform natively 3D representation learning, while utilizing the pretrained weights on 2D datasets. In ACS convolutions, 2D convolution kernels are split by channel into three parts, and convolved separately on the three views (axial, coronal and sagittal) of 3D representations. Theoretically, ANY 2D CNN (ResNet, DenseNet, or DeepLab) is able to be converted into a 3D ACS CNN, with pretrained weights of the same parameter size. Extensive experiments validate the consistent superiority of the pretrained ACS CNNs over the 2D/3D CNN counterparts with/without pretraining. Even without pretraining, the ACS convolution can be used as a plug-and-play replacement for standard 3D convolution, with smaller model size and less computation.
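
The channel split described in the abstract above can be sketched with simple shape bookkeeping. The Python snippet below is an illustration under assumptions, not the authors' implementation: the function name is hypothetical, the split is assumed to be along output channels and as even as possible, and the mapping of each view name to a particular singleton axis is chosen arbitrarily for the example.

```python
def acs_kernel_shapes(c_out, c_in, k):
    """Hypothetical helper: 3D kernel shapes obtained from one 2D kernel.

    A 2D kernel of shape (c_out, c_in, k, k) is split along the output
    channels into three groups; each group is viewed as a 3D kernel with
    one singleton spatial axis.
    """
    base, rem = divmod(c_out, 3)
    # Assumption: distribute any remainder to the first groups.
    parts = [base + (1 if i < rem else 0) for i in range(3)]
    # Assumption: which view gets which singleton axis is illustrative only.
    views = {
        "axial": (k, k, 1),
        "coronal": (k, 1, k),
        "sagittal": (1, k, k),
    }
    return {
        name: (n, c_in) + shape
        for (name, shape), n in zip(views.items(), parts)
    }


def num_params(shape):
    total = 1
    for s in shape:
        total *= s
    return total


shapes = acs_kernel_shapes(c_out=64, c_in=3, k=3)
total_3d = sum(num_params(s) for s in shapes.values())
total_2d = 64 * 3 * 3 * 3  # parameters in the original 2D kernel
```

Because each group only gains a singleton axis, the total parameter count matches the 2D kernel, which is consistent with the abstract's claim that pretrained 2D weights of the same parameter size can be reused.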

IJCAI Conference 2019 Conference Paper

Online Learning from Capricious Data Streams: A Generative Approach

Yi He
Baijun Wu
Di Wu
Ege Beyazit
Sheng Chen
Xindong Wu

Learning with streaming data has received extensive attention during the past few years. Existing approaches assume the feature space is fixed or changes by following explicit regularities, limiting their applicability in dynamic environments where the data streams are described by an arbitrarily varying feature space. To handle such capricious data streams, in this paper we develop a novel algorithm, named OCDS (Online learning from Capricious Data Streams), which does not make any assumption on feature space dynamics. OCDS trains a learner on a universal feature space that establishes relationships between old and new features, so that the patterns learned in the old feature space can be used in the new feature space. Specifically, the universal feature space is constructed by leveraging the relatedness among features. We propose a generative graphical model to model the construction process, and show with theoretical analysis that learning from the universal feature space can effectively improve performance. The experimental results demonstrate that OCDS achieves conspicuous performance on synthetic and real datasets.

IJCAI Conference 2018 Conference Paper

Toward Designing Convergent Deep Operator Splitting Methods for Task-specific Nonconvex Optimization

Risheng Liu
Shichao Cheng
Yi He
Xin Fan
Zhongxuan Luo

Operator splitting methods have been successfully used in computational sciences, statistics, learning, and vision to reduce complex problems into a series of simpler subproblems. However, prevalent splitting schemes are mostly established based only on the mathematical properties of some general optimization models. Consequently, obtaining practical and task-specific optimal solutions is a laborious process that often requires many iterations of ideation and validation, especially for nonconvex problems in real-world scenarios. To break through the above limits, we introduce a new algorithmic framework, called Learnable Bregman Splitting (LBS), to perform deep-architecture-based operator splitting for nonconvex optimization based on a specific task model. Thanks to its data-dependent (i.e., learnable) nature, our LBS can not only speed up the convergence, but also avoid unwanted trivial solutions for real-world tasks. Even with inexact deep iterations, we can still establish the global convergence and estimate the asymptotic convergence rate of LBS by enforcing only some fairly loose assumptions. Extensive experiments on different applications (e.g., image completion and deblurring) verify our theoretical results and show the superiority of LBS against existing methods.
