Arrow Research search

Author name cluster

Yao Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers
2 author rows

Possible papers


AAAI Conference 2026 Conference Paper

Let’s Think with Images Efficiently! An Interleaved-Modal Chain-of-Thought Reasoning Framework with Dynamic and Precise Visual Thoughts

  • Xu Liu
  • Yongheng Zhang
  • Qiguang Chen
  • Yao Li
  • Sheng Wang
  • Libo Qin

Recently, Interleaved-modal Chain-of-Thought (ICoT) reasoning has achieved remarkable success by leveraging both multimodal inputs and outputs, attracting increasing attention. While achieving promising performance, current ICoT methods still suffer from two major limitations: (1) Static Visual Thought Positioning, which statically inserts visual information at fixed steps, resulting in inefficient and inflexible reasoning; and (2) Broken Visual Thought Representation, which involves discontinuous and semantically incoherent visual tokens. To address these limitations, we introduce Interleaved-modal Chain-of-Thought reasoning with Dynamic and Precise Visual Thoughts (DaP-ICoT), which incorporates two key components: (1) Dynamic Visual Thought Integration, which adaptively introduces visual inputs based on reasoning needs, reducing redundancy and improving efficiency; and (2) Precise Visual Thought Guidance, which ensures semantically coherent and contextually aligned visual representations. Experiments across multiple benchmarks and models demonstrate that DaP-ICoT achieves state-of-the-art performance. In addition, DaP-ICoT significantly reduces the number of inserted images, leading to a 72.6% decrease in token consumption and enabling more efficient ICoT reasoning.

AAAI Conference 2026 Conference Paper

Quantum Lipschitz Bandits

  • Bongsoo Yi
  • Yue Kang
  • Yao Li

The Lipschitz bandit is a key variant of stochastic bandit problems where the expected reward function satisfies a Lipschitz condition with respect to an arm metric space. With its wide-ranging practical applications, various Lipschitz bandit algorithms have been developed, achieving the optimal regret performance in the classical setting. Motivated by recent advancements in quantum computing and the demonstrated success of quantum Monte Carlo in simpler bandit settings, we introduce the first quantum Lipschitz bandit algorithms to address the challenges of continuous action spaces and non-linear reward functions. Specifically, we first leverage the elimination-based framework to propose an efficient quantum Lipschitz bandit algorithm named Q-LAE. Next, we present novel modifications to the classical Zooming algorithm, which results in a simple quantum Lipschitz bandit method, Q-Zooming. Both algorithms exploit the computational power of quantum methods to obtain a provably improved regret bound over classical Lipschitz bandit algorithms. Comprehensive experiments further validate our improved theoretical findings, demonstrating superior empirical performance compared to existing Lipschitz bandit methods.

EAAI Journal 2026 Journal Article

Spatial dependency learning for image-based anomaly detection in engine combustion

  • Luyun Miao
  • Dazhi Zhang
  • Zhen Cao
  • Zhichang Guo
  • Yao Li
  • Xun Yuan
  • Jangbo Peng
  • Chaobo Yang

Traditional scramjet anomaly detection methods are constrained by delayed pressure responses and handcrafted features that depend on expert experience. To address this issue, this paper proposes an intelligent situational awareness algorithm for engine anomaly detection based on chemiluminescence imaging of combustion processes. The model learns the spatial dependencies of local features in stable flame images, using a self-supervised learning framework to characterize the feature distribution of normal image patches and identify anomalies as deviations from this distribution. Experimental results demonstrate that the proposed method achieves 100.0% accuracy and 100.0% area under the receiver operating characteristic curve (AUROC) at the image level, and 90.9% accuracy and 94.8% AUROC at the pixel level. The algorithm is trained solely on normal images and is capable of simultaneously detecting both abnormal states and abnormal regions.

IROS Conference 2025 Conference Paper

Autonomous Suturing Method for Robot-Assisted Minimally Invasive Surgery

  • Mei Feng
  • Haoju Li
  • Yao Li
  • Kun Yang
  • Dong He
  • Xiuquan Lu

Robot-assisted minimally invasive surgery is widely used because of its superior postoperative recovery outcomes. However, the workload for surgeons remains high. The development of autonomous suturing capabilities in surgical robots is poised to significantly reduce surgeon workload. In this study, we present a novel method for autonomous suturing using a minimally invasive surgical robot. We quantify the surgical suturing requirements and propose corresponding metrics for evaluating the suturing effect. We also use dynamic adjustment of the stitch position to optimize the robot's autonomous suturing scheme. Furthermore, we employ particle swarm algorithms to enhance the grasping posture of surgical instruments, enabling the robot to achieve optimal suture needle clamping. On the suturing parameter indices, our method matches the level of an expert operator when suturing two types of wounds: gauze and egg membrane. The experimental results show that the stitching effect of our proposed autonomous suturing method is already close to that of surgeons using the same robot, and it maintains good consistency across multiple sets of experiments. The autonomous suturing method proposed in this study is currently deployed on our own surgical robot, and it can be generalized to other surgical robots, laying the foundation for surgical robots to achieve fully autonomous surgery.

EAAI Journal 2025 Journal Article

Candidate ratio guided proximal policy optimization

  • Yao Li
  • Xiaoyang Tan

Proximal Policy Optimization (PPO) is a well-studied policy search method that improves policies monotonically by enforcing the probability ratio to remain within a clipping range. The probability ratios used to measure the policy distance change dynamically during policy optimization. However, the clipping range remains fixed throughout the policy training stage. A fixed clipping range may lead to sample inefficiency or aggressive policy updates, negatively affecting policy performance. In this paper, we propose a candidate-ratio-guided Proximal Policy Optimization method with self-adaptive clipping ratios to design the clipping range for improved sample efficiency and monotonic policy improvement. The clipping ratio is adjusted based on the average candidate ratios derived from actions sampled around the policy suggested actions. Increasing the clipping ratio allows more collected data to be used for policy optimization, while decreasing it effectively enforces probability ratios within the clipping range for monotonic policy improvement. Experimental results on MuJoCo tasks show that our method achieves more stable performance compared to existing baselines.
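The trade-off described in this abstract can be illustrated with the standard PPO clipped surrogate. The sketch below is not the authors' candidate-ratio method; it only shows, on made-up ratios and advantages, how a wider clipping range leaves more samples with a non-zero gradient. The function names and the sample values are assumptions chosen for illustration.

```python
def clipped_surrogate(ratios, advantages, eps):
    """Standard PPO clipped objective, averaged over samples."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped_r = min(max(r, 1.0 - eps), 1.0 + eps)
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


def fraction_with_gradient(ratios, advantages, eps):
    """Share of samples whose gradient is NOT zeroed out by clipping.

    A sample stops contributing when its ratio has already moved past the
    clip boundary in the direction its advantage favours.
    """
    blocked = 0
    for r, a in zip(ratios, advantages):
        if (a > 0 and r > 1.0 + eps) or (a < 0 and r < 1.0 - eps):
            blocked += 1
    return 1.0 - blocked / len(ratios)


# Hypothetical batch: probability ratios and advantage estimates.
ratios = [0.75, 0.9, 1.0, 1.15, 1.4]
advantages = [-1.0, 1.0, 0.5, 1.0, 2.0]

narrow = fraction_with_gradient(ratios, advantages, eps=0.1)
wide = fraction_with_gradient(ratios, advantages, eps=0.3)
print(narrow, wide)
```

With these sample values, the narrow range blocks three of the five samples and the wide range blocks only one, which is the sample-efficiency side of the trade-off; the stability side (large ratios being allowed through) is what a smaller range guards against.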

IROS Conference 2025 Conference Paper

Cockroach's Turning Strategy Enhanced Hexapod Robot with Flexible Torso

  • Yiming Li
  • Xingyu Li
  • Jie Zhou
  • Chenfeng Xie
  • Yao Li
  • Bing Li

The design and control of hexapod robots have become an active research field due to the ability to achieve adaptive and stable multi-terrain locomotion. However, existing hexapod robots focus on the integration of flexible pitch joints to enhance their obstacle-crossing and slope-climbing abilities, and few biological observations have been made to gain insight into the agile steering mechanisms of hexapod insects. Herein, we observed the steering movements of Madagascar cockroaches. Observations showed that cockroaches exhibited specific phase relationships in addition to the regular tripod gait pattern during steering. Moreover, we found that a smaller steering radius resulted in a larger lateral bending angle of the thoracic segments. Inspired by this, a hexapod robot with a flexible torso (F-RHex) was designed and fabricated. Bio-inspired gait patterns were abstracted and simplified into two steering strategies: gait-based and mix-based. Compared to the purely gait-based strategy, the F-RHex testing results demonstrated a ~27.4% reduction in turning radius and a ~40% enhancement in steering velocity, implying that the mix-based strategy offers superior steering capability.

IJCAI Conference 2025 Conference Paper

Counterfactual Knowledge Maintenance for Unsupervised Domain Adaptation

  • Yao Li
  • Yong Zhou
  • Jiaqi Zhao
  • Wen-Liang Du
  • Rui Yao
  • Bing Liu

Traditional unsupervised domain adaptation (UDA) struggles to extract rich semantics due to backbone limitations. Recent large-scale pre-trained visual-language models (VLMs) have shown strong zero-shot learning capabilities in UDA tasks. However, directly using VLMs results in a mixture of semantic and domain-specific information, complicating knowledge transfer. Complex scenes with subtle semantic differences are prone to misclassification, which in turn can result in the loss of features that are crucial for distinguishing between classes. To address these challenges, we propose a novel counterfactual knowledge maintenance UDA framework. Specifically, we employ counterfactual disentanglement to separate the representation of semantic information from domain features, thereby reducing domain bias. Furthermore, to clarify ambiguous visual information specific to classes, we maintain the discriminative knowledge of both visual and textual information. This approach synergistically leverages multimodal information to preserve modality-specific distinguishable features. We conducted extensive experimental evaluations on several public datasets to demonstrate the effectiveness of our method. The source code is available at https://github.com/LiYaolab/CMKUDA

YNIMG Journal 2025 Journal Article

Self-organizing dynamic research based on phase coherence graph autoencoders: Analysis of brain metastable states across the lifespan

  • Hao Guo
  • Yu-Xuan Liu
  • Yao Li
  • Qi-Li Guo
  • Zhi-Peng Hao
  • Yan-Li Yang
  • Jing Wei

The development of the human brain is a complex, lifelong process during which collective behaviors of neurons exhibit self-organizing dynamics. Metastable states play a crucial role in understanding the complex dynamical mechanisms of the brain, and analyzing them helps to reveal the mechanisms of functional changes in the brain throughout development and aging. Specifically, the global metastable state provides an overall perspective on the flexibility of brain reorganization, while the evolution trajectories of transient functional patterns capture detailed changes in brain activity. The leading eigenvector dynamics analysis (LEiDA) method significantly reduces the dimensionality of data and is widely used to capture the temporal trajectory characteristics of transient functional patterns, i.e., metastable brain states. However, LEiDA's linear dimensionality reduction of high-dimensional raw brain data may overlook non-linear information and lose some relationships between features. We developed a framework based on the Phase Coherence Graph Autoencoder (PCGAE) that employs graph autoencoders (GAE) for non-linear dimensionality reduction of phase coherence matrices. The reduced representations are then clustered to identify more distinct metastable brain states, and the approach is applied to the analysis of resting-state functional magnetic resonance imaging (rs-fMRI) data across the human lifespan. This paper investigates age-related differences and continuity changes from different perspectives: metastable state indicators and state trajectory indicators (occurrence probability, lifetime, and state transition metrics). The global metastable state shows a linear decline with age, while both linear and quadratic effects of age-related changes are observed in detailed state metastable and state trajectory indicators. Finally, the proposed feature extraction scheme demonstrates good classification performance for categorizing brain age groups. These findings can help us understand the self-organizing reorganization characteristics associated with aging and their complex dynamic changes, providing new insights into brain development throughout the entire lifespan.

TMLR Journal 2024 Journal Article

Biased Dueling Bandits with Stochastic Delayed Feedback

  • Bongsoo Yi
  • Yue Kang
  • Yao Li

The dueling bandit problem, an essential variation of the traditional multi-armed bandit problem, has become increasingly prominent recently due to its broad applications in online advertising, recommendation systems, information retrieval, and more. However, in many real-world applications, the feedback for actions is often subject to unavoidable delays and is not immediately available to the agent. This partially observable issue poses a significant challenge to existing dueling bandit literature, as it significantly affects how quickly and accurately the agent can update its policy on the fly. In this paper, we introduce and examine the biased dueling bandit problem with stochastic delayed feedback, a new practical problem that captures a more realistic and intriguing scenario involving a preference bias between the selections. We present two algorithms designed to handle situations involving delay. Our first algorithm, requiring complete delay distribution information, achieves the optimal regret bound for the dueling bandit problem when there is no delay. The second algorithm is tailored for situations where the distribution is unknown and only the expected value of the delay is available. We provide a comprehensive regret analysis for the two proposed algorithms and then evaluate their empirical performance on both synthetic and real datasets.

NeurIPS Conference 2024 Conference Paper

DALD: Improving Logits-based Detector without Logits from Black-box LLMs

  • Cong Zeng
  • Shengkun Tang
  • Xianjun Yang
  • Yuanzhou Chen
  • Yiyou Sun
  • Zhiqiang Xu
  • Yao Li
  • Haifeng Chen

The advent of Large Language Models (LLMs) has revolutionized text generation, producing outputs that closely mimic human writing. This blurring of lines between machine- and human-written text presents new challenges in distinguishing one from the other – a task further complicated by the frequent updates and closed nature of leading proprietary LLMs. Traditional logits-based detection methods leverage surrogate models for identifying LLM-generated content when the exact logits are unavailable from black-box LLMs. However, these methods grapple with the misalignment between the distributions of the surrogate and the often undisclosed target models, leading to performance degradation, particularly with the introduction of new, closed-source models. Furthermore, while current methodologies are generally effective when the source model is identified, they falter in scenarios where the model version remains unknown, or the test set comprises outputs from various source models. To address these limitations, we present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection even without logits from source LLMs. DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations with minimal training investment. By leveraging corpus samples from publicly accessible outputs of advanced models such as ChatGPT, GPT-4 and Claude-3, DALD fine-tunes surrogate models to synchronize with unknown source model distributions effectively. Our approach achieves SOTA performance in black-box settings on different advanced closed-source and open-source models. The versatility of our method enriches widely adopted zero-shot detection frameworks (DetectGPT, DNA-GPT, Fast-DetectGPT) with a 'plug-and-play' enhancement feature. Extensive experiments validate that our methodology reliably secures high detection precision for LLM-generated text and effectively detects text from diverse model origins through a singular detector. Our method is also robust under the revised text attack and non-English texts.

NeurIPS Conference 2024 Conference Paper

SciCode: A Research Coding Benchmark Curated by Scientists

  • Minyang Tian
  • Luyu Gao
  • Shizhuo D. Zhang
  • Xinan Chen
  • Cunwei Fan
  • Xuefei Guo
  • Roland Haas
  • Pan Ji

Since language models (LMs) now outperform average humans on many challenging tasks, it is becoming increasingly difficult to develop challenging, high-quality, and realistic evaluations. We address this by examining LM capabilities to generate code for solving real scientific research problems. Incorporating input from scientists and AI researchers in 16 diverse natural science sub-fields, including mathematics, physics, chemistry, biology, and materials science, we create a scientist-curated coding benchmark, SciCode. The problems naturally factorize into multiple subproblems, each involving knowledge recall, reasoning, and code synthesis. In total, SciCode contains 338 subproblems decomposed from 80 challenging main problems, and it offers optional descriptions specifying useful scientific background information and scientist-annotated gold-standard solutions and test cases for evaluation. OpenAI o1-preview, the best-performing model among those tested, can solve only 7.7% of the problems in the most realistic setting. We believe that SciCode both demonstrates contemporary LMs' progress towards realizing helpful scientific assistants and sheds light on the building and evaluation of scientific AI in the future.

TMLR Journal 2024 Journal Article

Trusted Aggregation (TAG): Backdoor Defense in Federated Learning

  • Joseph Lavond
  • Minhao Cheng
  • Yao Li

Federated learning is a framework for training machine learning models from clients with multiple local data sets without access to the data in its aggregate. Instead, a shared model is jointly learned through an interactive process in which a centralized server combines locally learned model gradients or weights from the clients. However, the lack of data transparency naturally raises concerns about model security. Recently, several state-of-the-art backdoor attacks have been proposed, which achieve high attack success rates while simultaneously being difficult to detect, leading to compromised federated learning models. In this paper, motivated by differences in the logits of models trained with and without the presence of backdoor attacks, we propose a defense method that can prevent backdoor attacks from influencing the model while maintaining the accuracy of the original classification task. TAG leverages a small validation data set to estimate the largest change a benign client's local training can make to the shared model, which can be used to filter clients from updating the shared model. Experimental results on multiple data sets show that TAG defends against backdoor attacks even when 40 percent of user submissions to update the shared model are malicious.

YNIMG Journal 2023 Journal Article

Assessment of vibration modulated regional cerebral blood flow with MRI

  • Linghan Kong
  • Suhao Qiu
  • Yu Chen
  • Zhao He
  • Peiyu Huang
  • Qiang He
  • Ru-Yuan Zhang
  • Xi-Qiao Feng

The human brain experiences vibration of certain magnitude and frequency during various physical activities such as vehicle transportation and machine operation, which may cause traumatic brain injury or other brain diseases. However, the mechanisms of brain pathogenesis due to vibration are not fully elucidated due to the lack of techniques to study brain functions while applying vibration to the brain at a specific magnitude and frequency. Here, this study reports a custom-built head-worn electromagnetic actuator that applied vibration to the brain in vivo at an accurate frequency inside a magnetic resonance imaging scanner while cerebral blood flow (CBF) was acquired. Using this technique, CBF values from 45 healthy volunteers were quantitatively measured immediately following vibration at 20, 30, and 40 Hz, respectively. Results showed increasingly reduced CBF with increasing frequency at multiple regions of the brain, while the size of the regions expanded. Importantly, the vibration-induced CBF reduction regions largely fell inside the brain's default mode network (DMN), with about 58% or 46% overlap at 30 or 40 Hz, respectively. These findings demonstrate that vibration as a mechanical stimulus can change strain conditions, which may induce CBF reduction in the brain with regional differences in a frequency-dependent manner. Furthermore, the overlap between vibration-induced CBF reduction regions and DMN suggested a potential relationship between external mechanical stimuli and cognitive functions.

NeurIPS Conference 2023 Conference Paper

CluB: Cluster Meets BEV for LiDAR-Based 3D Object Detection

  • Yingjie Wang
  • Jiajun Deng
  • Yuenan Hou
  • Yao Li
  • Yu Zhang
  • Jianmin Ji
  • Wanli Ouyang
  • Yanyong Zhang

Currently, LiDAR-based 3D detectors are broadly categorized into two groups, namely, BEV-based detectors and cluster-based detectors. BEV-based detectors capture the contextual information from the Bird's Eye View (BEV) and fill their center voxels via feature diffusion with a stack of convolution layers, which, however, weakens the capability of presenting an object with the center point. On the other hand, cluster-based detectors exploit the voting mechanism and aggregate the foreground points into object-centric clusters for further prediction. In this paper, we explore how to effectively combine these two complementary representations into a unified framework. Specifically, we propose a new 3D object detection framework, referred to as CluB, which incorporates an auxiliary cluster-based branch into the BEV-based detector by enriching the object representation at both feature and query levels. Technically, CluB is comprised of two steps. First, we construct a cluster feature diffusion module to establish the association between cluster features and BEV features in a subtle and adaptive fashion. Based on that, an imitation loss is introduced to distill object-centric knowledge from the cluster features to the BEV features. Second, we design a cluster query generation module to leverage the voting centers directly from the cluster branch, thus enriching the diversity of object queries. Meanwhile, a direction loss is employed to encourage a more accurate voting center for each cluster. Extensive experiments are conducted on Waymo and nuScenes datasets, and our CluB achieves state-of-the-art performance on both benchmarks.

NeurIPS Conference 2023 Conference Paper

DSR: Dynamical Surface Representation as Implicit Neural Networks for Protein

  • Daiwen Sun
  • He Huang
  • Yao Li
  • Xinqi Gong
  • Qiwei Ye

We propose a novel neural network-based approach to modeling protein dynamics using an implicit representation of a protein’s surface in 3D and time. Our method utilizes the zero-level set of signed distance functions (SDFs) to represent protein surfaces, enabling temporally and spatially continuous representations of protein dynamics. Our experimental results demonstrate that our model accurately captures protein dynamic trajectories and can interpolate and extrapolate in 3D and time. Importantly, this is the first study to introduce this method and successfully model large-scale protein dynamics. This approach offers a promising alternative to current methods, overcoming the limitations of first-principles-based and deep learning methods, and provides a more scalable and efficient approach to modeling protein dynamics. Additionally, our surface representation approach simplifies calculations and allows identifying movement trends and amplitudes of protein domains, making it a useful tool for protein dynamics research. Codes are available at https://github.com/Sundw-818/DSR, and we have a project webpage that shows some video results, https://sundw-818.github.io/DSR/.

NeurIPS Conference 2023 Conference Paper

Perturbation Towards Easy Samples Improves Targeted Adversarial Transferability

  • Junqi Gao
  • Biqing Qi
  • Yao Li
  • Zhichang Guo
  • Dong Li
  • Yuming Xing
  • Dazhi Zhang

The transferability of adversarial perturbations provides an effective shortcut for black-box attacks. Targeted perturbations have greater practicality but are more difficult to transfer between models. In this paper, we experimentally and theoretically demonstrate that neural networks trained on the same dataset have more consistent performance in High-Sample-Density-Regions (HSDR) of each class than in low sample density regions. Therefore, in the targeted setting, adding perturbations towards HSDR of the target class is more effective in improving transferability. However, density estimation is challenging in high-dimensional scenarios. Further theoretical and experimental verification demonstrates that easy samples with low loss are more likely to be located in HSDR. Perturbations towards such easy samples in the target class can avoid density estimation for HSDR location. Based on the above facts, we verified that adding perturbations towards easy samples in the target class improves targeted adversarial transferability of existing attack methods. A generative targeted attack strategy named Easy Sample Matching Attack (ESMA) is proposed, which has a higher success rate for targeted attacks and outperforms the SOTA generative method. Moreover, ESMA requires only 5% of the storage space and much less computation time compared to the current SOTA, as ESMA attacks all classes with only one model instead of separate models for each class. Our code is available at https://github.com/gjq100/ESMA

YNICL Journal 2022 Journal Article

Disrupted coupling between salience network segregation and glucose metabolism is associated with cognitive decline in Alzheimer's disease – A simultaneous resting-state FDG-PET/fMRI study

  • Miao Zhang
  • Ziyun Guan
  • Yaoyu Zhang
  • Wanqing Sun
  • Wenli Li
  • Jialin Hu
  • Binyin Li
  • Guanyu Ye

The aberrant organization and functioning of three core neurocognitive networks (NCNs), i.e., default-mode network (DMN), central executive network (CEN), and salience network (SN), are among the prominent features in Alzheimer's disease (AD). The dysregulation of both intra- and inter-network functional connectivities (FCs) of the three NCNs contributed to AD-related cognitive and behavioral abnormalities. Brain functional network segregation, integrating intra- and inter-network FCs, is essential for maintaining the energetic efficiency of brain metabolism. The association of brain functional network segregation, together with glucose metabolism, with age-related cognitive decline was recently shown. Yet how these joint functional-metabolic biomarkers relate to cognitive decline along with mild cognitive impairment (MCI) and AD remains to be elucidated. In this study, under the framework of the triple-network model, we performed a hybrid FDG-PET/fMRI study to evaluate the concurrent changes of resting-state brain intrinsic FCs and glucose metabolism of the three NCNs across cognitively normal (CN) (N = 24), MCI (N = 21), and AD (N = 21) groups. Lower network segregation and glucose metabolism were observed in all three NCNs in patients with AD. More interestingly, in the SN, the coupled relationship between network segregation and glucose metabolism existed in the CN group (r = 0.523, p = 0.013) and diminished in patients with MCI (r = 0.431, p = 0.065) and AD (r = 0.079, p = 0.748). Finally, the glucose metabolism of the DMN (r = 0.380, p = 0.017) and the network segregation of the SN (r = 0.363, p = 0.023) were significantly correlated with the general cognitive status of the patients. Our findings suggest that the impaired SN segregation and its uncoupled relationship with glucose metabolism contribute to the cognitive decline in AD.

YNICL Journal 2022 Journal Article

Neurometabolic and functional changes of default-mode network relate to clinical recovery in first-episode psychosis patients: A longitudinal 1H-MRS and fMRI study

  • Wenli Li
  • Jiale Xu
  • Qiong Xiang
  • Kaiming Zhuo
  • Yaoyu Zhang
  • Dengtang Liu
  • Yao Li

BACKGROUND: Antipsychotic treatment has improved the disrupted functional connectivity (FC) and neurometabolite levels of the default mode network (DMN) in schizophrenia patients, but a direct relationship between FC change, neurometabolic level alteration, and symptom improvement has not been established. This study examined the association between the alterations in DMN FC, the changes of neurometabolite levels in the medial prefrontal cortex (MPFC), and the improvements in psychopathology in a longitudinal study of drug-naïve first-episode psychosis (FEP) patients. METHODS: Thirty-two drug-naïve FEP patients and 30 matched healthy controls underwent repeated assessments with the Positive and Negative Syndrome Scale (PANSS) and 3T proton magnetic resonance spectroscopy as well as resting-state functional magnetic resonance imaging. The levels of γ-aminobutyric acid, glutamate, and N-acetyl-aspartate in MPFC, and the FC of DMN were measured. After 8-week antipsychotic treatment, 24 patients were re-examined. RESULTS: After treatment, the changes in γ-aminobutyric acid were correlated with the alterations of FC between the MPFC and DMN, while the changes in N-acetyl-aspartate were associated with the alterations of FC between the posterior cingulate cortex/precuneus and DMN. The FC changes of both regions were correlated with patients' PANSS positive score reductions. The structural equation modeling analyses revealed that the changes of DMN FC mediated the relationship between the changes of neurometabolites and the symptom improvements of the patients. CONCLUSIONS: The derived neurometabolic-functional changes underlying the clinical recovery provide insights into the prognosis of FEP patients. It is noteworthy that this is an exploratory study, and future work with a larger sample size is needed to validate our findings.

YNICL Journal 2021 Journal Article

Disrupted functional connectivity of precuneus subregions in obsessive-compulsive disorder

  • Qianqian Ye
  • Zongfeng Zhang
  • Wanqing Sun
  • Qing Fan
  • Yao Li

Obsessive-compulsive disorder (OCD) is a chronic and disabling psychiatric disorder with high lifetime prevalence, yet the underlying pathogenesis is still not fully understood. Increasing neuroimaging evidence has shown that the disrupted activity of brain functional hubs might contribute to the pathophysiology of OCD. The precuneus is an important brain hub which showed structural and functional abnormalities in OCD patients. However, the functional heterogeneity of the precuneus subregions has not been considered and its relation to OCD symptomatology remains to be elucidated. In this paper, a total of 73 unmedicated OCD patients and 79 matched healthy subjects were recruited and the heterogeneous functional connectivities (FCs) of the precuneus subregions were investigated using resting-state functional magnetic resonance imaging. The FC-based subdivision of the precuneus was performed using the K-means clustering algorithm, which led to a tripartite functional parcellation of the precuneus. For each subregion, the distinct connectivity pattern with the whole brain was shown, using both voxel-wise and module-wise analysis. Decreased FC between dorsal posterior precuneus and vermis (corrected p < 0.01) was shown in the patient group, which was negatively correlated with patient compulsions score (ρ = -0.393, p = 0.001), indicating its contribution to the compulsive behavior inhibition of OCD. Our work might provide new insights into the understanding of precuneus subregion function and the importance of dorsal precuneus-cerebellum functional connectivity in OCD pathophysiology.

NeurIPS Conference 2021 Conference Paper

ErrorCompensatedX: error compensation for variance reduced algorithms

  • Hanlin Tang
  • Yao Li
  • Ji Liu
  • Ming Yan

Communication cost is one major bottleneck for the scalability of distributed learning. One approach to reduce the communication cost is to compress the gradient during communication. However, directly compressing the gradient decelerates the convergence speed, and the resulting algorithm may diverge for biased compression. Recent work addressed this problem for stochastic gradient descent by adding back the compression error from the previous step. This idea was further extended to one class of variance reduced algorithms, where the variance of the stochastic gradient is reduced by taking a moving average over all history gradients. However, our analysis shows that just adding the previous step's compression error, as done in existing work, does not fully compensate for the compression error. So, we propose ErrorCompensatedX, which uses the compression error from the previous two steps. We show that ErrorCompensatedX can achieve the same asymptotic convergence rate as training without compression. Moreover, we provide a unified theoretical analysis framework for this class of variance reduced algorithms, with or without error compensation.
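The one-step error compensation that the abstract attributes to earlier work on stochastic gradient descent can be sketched as follows. This is not ErrorCompensatedX itself, which per the abstract uses the errors from the previous two steps. The top-k compressor, the function names, and the toy quadratic objective are assumptions chosen for illustration.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries and zero the rest (a biased compressor)."""
    keep = set(sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def compress_with_feedback(grad, err, k):
    """Add back the previous compression error, compress, and record the new error."""
    corrected = [g + e for g, e in zip(grad, err)]
    sent = top_k(corrected, k)                      # what would be communicated
    new_err = [c - s for c, s in zip(corrected, sent)]  # what was left behind
    return sent, new_err


def compressed_sgd(grad_fn, x0, lr, k, steps):
    """Gradient descent that only ever applies the compressed, error-corrected update."""
    x = list(x0)
    err = [0.0] * len(x)
    for _ in range(steps):
        sent, err = compress_with_feedback(grad_fn(x), err, k)
        x = [xi - lr * si for xi, si in zip(x, sent)]
    return x


def quad_grad(x):
    # Gradient of the toy objective 0.5 * (1*x0^2 + 2*x1^2 + 3*x2^2).
    return [a * xi for a, xi in zip((1.0, 2.0, 3.0), x)]


x_final = compressed_sgd(quad_grad, [1.0, 1.0, 1.0], lr=0.1, k=1, steps=200)
```

The property to notice is that `sent` and `new_err` always sum to the corrected gradient, so nothing dropped by the compressor is lost; it is only delayed to a later step.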

NeurIPS Conference 2018 Conference Paper

Learning from Group Comparisons: Exploiting Higher Order Interactions

  • Yao Li
  • Minhao Cheng
  • Kevin Fujii
  • Fushing Hsieh
  • Cho-Jui Hsieh

We study the problem of learning from group comparisons, with applications in predicting outcomes of sports and online games. Most of the previous works in this area focus on learning individual effects: they assume each player has an underlying score, and the "ability" of the team is modeled by the sum of team members' scores. Therefore, all the current approaches cannot model deeper interaction between team members: some players perform much better if they play together, and some players perform poorly together. In this paper, we propose a new model that takes the player-interaction effects into consideration. However, under certain circumstances, the total number of individuals can be very large, and the number of player interactions grows quadratically, which makes learning intractable. In this case, we propose a latent factor model, and show that the sample complexity of our model is bounded under mild assumptions. Finally, we show that our proposed models have much better prediction power on several E-sports datasets, and furthermore can be used to reveal interesting patterns that cannot be discovered by previous methods.
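One way to picture the difference between a sum-of-scores model and a model with pairwise player interactions is the sketch below. It is an illustration under assumptions, not the paper's formulation: the names `team_score` and `win_probability`, the inner-product form of the latent-factor interaction, and the logistic link are all choices made for the example.

```python
import math
from itertools import combinations


def team_score(team, w, v):
    """Individual scores plus pairwise interactions.

    w[p] is player p's individual score; v[p] is a latent vector, and the
    interaction between players i and j is assumed to be the inner product
    of v[i] and v[j].
    """
    individual = sum(w[p] for p in team)
    pairwise = sum(
        sum(a * b for a, b in zip(v[i], v[j]))
        for i, j in combinations(team, 2)
    )
    return individual + pairwise


def win_probability(team_a, team_b, w, v):
    """Logistic link on the score difference (an assumed choice of link)."""
    diff = team_score(team_a, w, v) - team_score(team_b, w, v)
    return 1.0 / (1.0 + math.exp(-diff))


# Made-up players: both teams have the same total individual score,
# but "a" and "b" play well together while "c" and "d" do not.
w = {"a": 1.0, "b": 0.5, "c": 1.0, "d": 0.5}
v = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [1.0, 0.0], "d": [-1.0, 0.0]}
p = win_probability(["a", "b"], ["c", "d"], w, v)
```

A pure sum-of-scores model would rate these two teams as equal; here the interaction term alone separates them. Storing one short vector per player instead of one parameter per pair keeps the parameter count linear in the number of players.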

IJCAI Conference 2017 Conference Paper

Deep Descriptor Transforming for Image Co-Localization

  • Xiu-Shen Wei
  • Chen-Lin Zhang
  • Yao Li
  • Chen-Wei Xie
  • Jianxin Wu
  • Chunhua Shen
  • Zhi-Hua Zhou

Reusable model design becomes desirable with the rapid expansion of machine learning applications. In this paper, we focus on the reusability of pre-trained deep convolutional models. Specifically, different from treating pre-trained models as feature extractors, we reveal more treasures beneath convolutional layers, i.e., the convolutional activations could act as a detector for the common object in the image co-localization problem. We propose a simple but effective method, named Deep Descriptor Transforming (DDT), for evaluating the correlations of descriptors and then obtaining the category-consistent regions, which can accurately locate the common object in a set of images. Empirical studies validate the effectiveness of the proposed DDT method. On benchmark image co-localization datasets, DDT consistently outperforms existing state-of-the-art methods by a large margin. Moreover, DDT also demonstrates good generalization ability for unseen categories and robustness for dealing with noisy data.

NeurIPS Conference 2017 Conference Paper

Scalable Demand-Aware Recommendation

  • Jinfeng Yi
  • Cho-Jui Hsieh
  • Kush Varshney
  • Lijun Zhang
  • Yao Li

Recommendation for e-commerce with a mix of durable and nondurable goods has characteristics that distinguish it from the well-studied media recommendation problem. The demand for items is a combined effect of form utility and time utility, i.e., a product must both be intrinsically appealing to a consumer and the time must be right for purchase. In particular, for durable goods, time utility is a function of inter-purchase duration within product category because consumers are unlikely to purchase two items in the same category in close temporal succession. Moreover, purchase data, in contrast to rating data, is implicit with non-purchases not necessarily indicating dislike. Together, these issues give rise to the positive-unlabeled demand-aware recommendation problem that we pose via joint low-rank tensor completion and product category inter-purchase duration vector estimation. We further relax this problem and propose a highly scalable alternating minimization approach with which we can solve problems with millions of users and millions of items in a single thread. We also show superior prediction accuracies on multiple real-world datasets.

NeurIPS Conference 2014 Conference Paper

Iterative Neural Autoregressive Distribution Estimator NADE-k

  • Tapani Raiko
  • Yao Li
  • Kyunghyun Cho
  • Yoshua Bengio

Training of the neural autoregressive density estimator (NADE) can be viewed as doing one step of probabilistic inference on missing values in data. We propose a new model that extends this inference scheme to multiple steps, arguing that it is easier to learn to improve a reconstruction in $k$ steps rather than to learn to reconstruct in a single inference step. The proposed model is an unsupervised building block for deep learning that combines the desirable properties of NADE and multi-predictive training: (1) its test likelihood can be computed analytically, (2) it is easy to generate independent samples from it, and (3) it uses an inference engine that is a superset of variational inference for Boltzmann machines. The proposed NADE-k is competitive with the state-of-the-art in density estimation on the two datasets tested.