Arrow Research search

Author name cluster

Pan Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

45 papers
1 author row

Possible papers

45

AAAI Conference 2026 Conference Paper

AR-Nav Benchmark: Augmented Reality Navigation with Vision and Language

  • Liqi Yan
  • Yihao Wu
  • Chenyi Xu
  • Chao Yang
  • Jianhui Zhang
  • Pan Li

Augmented Reality (AR) navigation has emerged as a transformative tool for spatial intelligence, enabling users to interactively explore complex environments through wearable and mobile AR devices. However, current AR navigation systems struggle with low indoor localization accuracy, weak semantic understanding, and limited long-term memory, which severely limits their adaptability in dynamic, multi-floor, and large-scale real-world settings. To address these challenges, we present the AR-Nav benchmark, a novel dataset with a corresponding system suite that leverages vision and language for AR navigation. First, to construct this benchmark, we propose an Augmented Reality Visual-Language Memory Model (AR‑VLM²), which generates structured, semantically rich, and temporally indexed representations for long-term AR navigation. Second, we design a lightweight navigation-intent recommendation module with hierarchical topological reasoning and language-grounded path planning, called ARN‑Pilot, enabling low-latency and personalized route selection. Third, we introduce a closed-loop AR interaction module that supports real-time multi-modal feedback, dynamic memory updates, and human-in-the-loop query refinement. Extensive experiments in indoor multi-floor and outdoor parking scenarios show that the AR-Nav suite significantly outperforms state-of-the-art AR navigation methods.

TIST Journal 2026 Journal Article

Autonomous Domain Adaptation Self-Optimization Approach for Cross-Domain Industrial Agents

  • Tian-Yu Zuo
  • Kai Di
  • Pan Li
  • Yichuan Jiang

In the heterogeneous and dynamically evolving Industrial Internet, industrial agents are required to possess cross-domain adaptability and self-learning capabilities to facilitate task generalization and scalable deployment across diverse operational contexts. However, existing domain adaptation approaches predominantly rely on static feature alignment or domain-invariant assumptions, lacking a systematic consideration of working condition variability and the interplay between self-learning and adaptation. This oversight hampers their effectiveness in real-world industrial scenarios, where agents must operate under complex conditions with limited target domain knowledge. Consequently, these methods often suffer from knowledge shift and insufficient policy generalization. To address these limitations, this article introduces the instance weighting-based domain-adaptive optimization (IW-DAO) framework. IW-DAO combines an instance weighting-based knowledge alignment mechanism with a Bayesian optimization strategy, forming a dynamic self-learning loop tailored for cross-domain adaptation. Specifically, the framework constructs an adaptive knowledge representation in a high-dimensional invariant feature space and formulates a cross-domain performance evaluation estimator to guide the unsupervised learning of knowledge transfer and adaptive optimization via Bayesian iterative search. Extensive experiments on industrial asset management tasks as well as a real-world industrial flow process dataset with various operating conditions demonstrate the effectiveness of IW-DAO. The proposed framework enables industrial agents to evolve autonomously and be deployed efficiently across diverse domains. IW-DAO consistently outperforms baseline and expert-tuned methods, demonstrating strong generalization and adaptability in both industrial asset management and complex flow process scenarios.

TIST Journal 2026 Journal Article

Chain Disruption Risk-Oriented Task Migration in Multiplex Networked Industrial Chains

  • Kai Di
  • Tian-Yu Zuo
  • Pan Li
  • Jiuchuan Jiang
  • Yichuan Jiang

In industrial production processes, disruptions within the industrial chain can severely affect the collaborative capabilities of production agents. A notable example occurred during the COVID-19 pandemic, when many agents faced interruption risks and were unable to participate in coordinated production. Ensuring continuity under such conditions requires migrating tasks from disrupted agents to viable alternatives. Designing effective task migration strategies, however, must account for the emergent multiplex nature of modern industrial chains. In these multiplex networked industrial chains, disruption risk in one layer can propagate to others, generating cascading failures across the system. This introduces two key challenges: (1) disruption risk creates mismatches not only between product agents and tasks but also across network layers, enlarging the problem dimensionality; and (2) simultaneous disruptions across multiple agents and layers increase the volume of tasks needing migration, greatly expanding the solution space. To address these challenges, we introduce the notion of a multiplex potential field, which captures cross-layer interdependencies and system-level dynamics in multiplex industrial chains. Building on this concept, we develop a hierarchical contextual task migration algorithm that exploits the multiplex potential field to guide both inter-layer and intra-layer task reallocations. Extensive experiments show that our approach consistently achieves superior utility, markedly improves task completion ratios, and reduces execution costs compared to benchmark algorithms. Furthermore, it attains solution quality comparable to that of the optimal CPLEX solver while requiring substantially less computation time. Finally, a case study on the FAO international food trade network demonstrates that the proposed framework is not only theoretically robust but also practically effective when deployed on large-scale real-world multiplex systems.

AAAI Conference 2026 Conference Paper

Chain-of-Search: Parameter-Efficient Reasoning for Zero-Shot Object Navigation

  • Hanrui Chen
  • Liqi Yan
  • Qifan Wang
  • Jianhui Zhang
  • Fangli Guan
  • Pan Li

Zero-shot object navigation tasks agents with locating target objects in unseen environments—a core capability of embodied intelligence. While recent vision-language navigation methods leverage Large Language Models (LLMs) for multimodal reasoning, they suffer from two key limitations: (1) semantic misalignment between language-grounded maps and real-world layouts, and (2) inefficiency due to LLMs’ lack of specialization for navigation-specific tasks. To address these challenges, we propose Chain-of-Search (CoS), a novel parameter-efficient framework that enables human-like decision-making via iterative semantic reasoning. First, CoS replaces traditional global maps with an optimal-benefit multi-map construction that continuously balances expected gain and cost throughout the navigation process. Second, we introduce a Parameter-Efficient Intent Aligner (PEIA), trained via a prompt-guided paradigm to align directional decisions with navigation intent. PEIA injects semantic cues into benefit-aware maps, enabling more rational and goal-consistent exploration. Finally, a Reflection-Guided Destination Verifier (RDV) confirms whether the target is reached via language-driven reasoning and corrects potential errors through self-reflection. CoS achieves state-of-the-art performance on HM3D (+2.8% SR) and MP3D (+1.2% SR) without relying on LLMs, demonstrating the effectiveness of lightweight, reasoning-centered navigation.

NeurIPS Conference 2025 Conference Paper

Differentially Private Relational Learning with Entity-level Privacy Guarantees

  • Yinan Huang
  • Haoteng Yin
  • Eli Chien
  • Rongzhe Wei
  • Pan Li

Learning with relational and network-structured data is increasingly vital in sensitive domains where protecting the privacy of individual entities is paramount. Differential Privacy (DP) offers a principled approach for quantifying privacy risks, with DP-SGD emerging as a standard mechanism for private model training. However, directly applying DP-SGD to relational learning is challenging due to two key factors: (i) entities often participate in multiple relations, resulting in high and difficult-to-control sensitivity; and (ii) relational learning typically involves multi-stage, potentially coupled (interdependent) sampling procedures that make standard privacy amplification analyses inapplicable. This work presents a principled framework for relational learning with formal entity-level DP guarantees. We provide a rigorous sensitivity analysis and introduce an adaptive gradient clipping scheme that modulates clipping thresholds based on entity occurrence frequency. We also extend the privacy amplification results to a tractable subclass of coupled sampling, where the dependence arises only through sample sizes. These contributions lead to a tailored DP-SGD variant for relational data with provable privacy guarantees. Experiments on fine-tuning text encoders over text-attributed network-structured relational data demonstrate the strong utility-privacy trade-offs of our approach.
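
The entity-frequency-aware clipping idea lends itself to a short sketch. Below is a minimal, hypothetical rendition of one aggregation step: per-example gradients are clipped with a threshold that shrinks for frequently occurring entities before Gaussian noise is added. The schedule c0/sqrt(count) and all names are illustrative assumptions, not the paper's exact rule.

```python
import numpy as np

def entity_adaptive_dp_step(per_example_grads, entity_counts, c0=1.0, sigma=1.0, rng=None):
    """One DP-SGD-style aggregation step with entity-frequency-aware clipping.

    per_example_grads: (n, d) array of per-example gradients.
    entity_counts: (n,) occurrence count of each example's entity in the batch.
    A tighter clip for frequent entities keeps the total sensitivity
    contributed by any single entity bounded (hypothetical schedule).
    """
    rng = np.random.default_rng() if rng is None else rng
    clipped = []
    for g, k in zip(per_example_grads, entity_counts):
        c = c0 / np.sqrt(k)                       # assumed frequency-adaptive threshold
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, c / norm) if norm > 0 else g)
    total = np.sum(clipped, axis=0)
    return total + rng.normal(0.0, sigma * c0, size=total.shape)  # Gaussian mechanism
```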

NeurIPS Conference 2025 Conference Paper

Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness

  • Rongzhe Wei
  • Peizhi Niu
  • Hans Hao-Hsun Hsu
  • Ruihan Wu
  • Haoteng Yin
  • Mohsen Ghassemi
  • Yifan Li
  • Vamsi Potluru
  • Pan Li

Machine unlearning techniques aim to mitigate unintended memorization in large language models (LLMs). However, existing approaches predominantly focus on the explicit removal of isolated facts, often overlooking latent inferential dependencies and the non-deterministic nature of knowledge within LLMs. Consequently, facts presumed forgotten may persist implicitly through correlated information. To address these challenges, we propose a knowledge unlearning evaluation framework that more accurately captures the implicit structure of real-world knowledge by representing relevant factual contexts as knowledge graphs with associated confidence scores. We further develop an inference-based evaluation protocol leveraging powerful LLMs as judges; these judges reason over the extracted knowledge subgraph to determine unlearning success. Our LLM judges utilize carefully designed prompts and are calibrated against human evaluations to ensure their trustworthiness and stability. Extensive experiments on our newly constructed benchmark demonstrate that our framework provides a more realistic and rigorous assessment of unlearning performance. Moreover, our findings reveal that current evaluation strategies tend to overestimate unlearning effectiveness.

NeurIPS Conference 2025 Conference Paper

Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models

  • Haoyu Wang
  • Peihao Wang
  • Mufei Li
  • Shikun Liu
  • Siqi Miao
  • Zhangyang "Atlas" Wang
  • Pan Li

Modern large language models (LLMs) are inherently auto-regressive, requiring input to be serialized into flat sequences regardless of their structural dependencies. This serialization hinders the model’s ability to leverage structural inductive biases, especially in tasks such as retrieval-augmented generation (RAG) and reasoning on data with native graph structures, where inter-segment dependencies are crucial. We introduce Graph-KV with the potential to overcome this limitation. Graph-KV leverages the KV-cache of text segments as condensed representations and governs their interaction through structural inductive biases. In this framework, ''target'' segments selectively attend only to the KV-caches of their designated ''source'' segments, rather than all preceding segments in a serialized sequence. This approach induces a graph-structured block mask, sparsifying attention and enabling a message-passing-like step within the LLM. Furthermore, strategically allocated positional encodings for source and target segments reduce positional bias and context window consumption. We evaluate Graph-KV across three scenarios: (1) seven RAG benchmarks spanning direct inference, multi-hop reasoning, and long-document understanding; (2) Arxiv-QA, a novel academic paper QA task with full-text scientific papers structured as citation ego-graphs; and (3) paper topic classification within a citation network. By effectively reducing positional bias and harnessing structural inductive biases, Graph-KV substantially outperforms baselines, including standard costly sequential encoding, across various settings.
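
The ''source''/''target'' interaction rule above amounts to a segment-level attention mask. A minimal sketch, assuming segments are numbered 0..n-1 and edges point from source to target; token-level masking, KV-cache handling, and positional-encoding allocation are omitted.

```python
import numpy as np

def segment_attention_mask(num_segments, edges):
    """Graph-structured block mask in the spirit of Graph-KV.

    edges: iterable of (source, target) pairs. mask[t, s] == True means
    segment t may attend to segment s's KV-cache; each target sees only
    itself and its designated sources, not all preceding segments.
    """
    mask = np.eye(num_segments, dtype=bool)   # every segment attends to itself
    for s, t in edges:
        mask[t, s] = True                     # target attends to its source
    return mask

# Example: segment 3 cites segments 0, 1, and 2 in a citation ego-graph.
print(segment_attention_mask(4, [(0, 3), (1, 3), (2, 3)]))
```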

JMLR Journal 2025 Journal Article

Improving Graph Neural Networks on Multi-node Tasks with the Labeling Trick

  • Xiyuan Wang
  • Pan Li
  • Muhan Zhang

In this paper, we study using graph neural networks (GNNs) for multi-node representation learning, where a representation for a set of more than one node (such as a link) is to be learned. Existing GNNs are mainly designed to learn single-node representations. When used for multi-node representation learning, a common practice is to directly aggregate the single-node representations obtained by a GNN. In this paper, we show a fundamental limitation of such an approach, namely the inability to capture the dependence among multiple nodes in the node set. A straightforward solution is to distinguish target nodes from others. Formalizing this idea, we propose the labeling trick, which first labels nodes in the graph according to their relationships with the target node set before applying a GNN, and then aggregates the node representations obtained in the labeled graph into multi-node representations. Besides node sets in graphs, we also extend labeling tricks to posets, subsets, and hypergraphs. Experiments verify that the labeling trick technique can boost GNNs on various tasks, including undirected link prediction, directed link prediction, hyperedge prediction, and subgraph prediction. Our work explains the superior performance of previous node-labeling-based methods and establishes a theoretical foundation for using GNNs for multi-node representation learning.
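
The simplest instance, zero-one labeling for a target node set, fits in a few lines. This sketch only appends a membership indicator as an extra feature column; richer instantiations such as SEAL's distance-based labels follow the same pattern.

```python
import numpy as np

def zero_one_labeling(node_features, target_set):
    """Zero-one labeling trick: mark target nodes before the GNN runs.

    Returns an (n, d + 1) matrix whose last column is 1 for nodes in the
    target set. A GNN on the labeled graph can now model the dependence
    among the target nodes rather than embedding each one in isolation.
    """
    label = np.zeros((node_features.shape[0], 1))
    label[list(target_set)] = 1.0
    return np.hstack([node_features, label])

# Link prediction for candidate link (2, 5): label, run a GNN, pool nodes 2 and 5.
x_labeled = zero_one_labeling(np.random.randn(8, 4), {2, 5})
```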

IJCAI Conference 2025 Conference Paper

Optimal Distributed Training With Co-Adaptive Data Parallelism in Heterogeneous Environments

  • Lifang Chen
  • Zhichao Chen
  • Liqi Yan
  • Yanyu Cheng
  • Fangli Guan
  • Pan Li

The computational power required for training deep learning models has been skyrocketing in the past decade as they scale with big data, and has become a very expensive and scarce resource. Therefore, distributed training, which can leverage distributed available computational power, is vital for efficient large-scale model training. However, most previous distributed training frameworks like DDP and DeepSpeed are primarily designed for co-located clusters under homogeneous computing and communication conditions, and hence cannot account for geo-distributed clusters with both computing and communication heterogeneity. To address this challenge, we develop a new data-parallel distributed training framework called Co-Adaptive Data Parallelism (C-ADP). First, we consider a data owner and parameter server that distributes data to and coordinates the collaborative learning across all the computing devices. We employ local training and delayed parameter synchronization to reduce communication costs. Second, we formulate a data-parallel scheduling optimization problem to minimize the training time by optimizing data distribution. Third, we devise an efficient algorithm to solve this scheduling problem, and formally prove that the obtained solution is optimal in the asymptotic sense. Experiments on the ImageNet100 dataset demonstrate that C-ADP achieves fast convergence in heterogeneous distributed training environments. Compared to Distributed Data Parallel (DDP) and DeepSpeed, C-ADP achieves 21.6x and 26.3x improvements in FLOPS, respectively, and reduces training time by about 72% and 47%, respectively.

IJCAI Conference 2025 Conference Paper

Risk-Aware Task Migration for Multiplex Unmanned Swarm Networks in Adversarial Environments

  • Kai Di
  • Tienyu Zuo
  • Pan Li
  • Yuanshuang Jiang
  • Fulin Chen
  • Yichuan Jiang

With the rapid development and deep integration of artificial intelligence and automation technologies, autonomous unmanned swarms dynamically organize into multiplex network structures based on diverse task requirements in adversarial environments. Frequent task variations lead to load imbalances among agents and between network layers, significantly increasing the risk of enemy detection and destruction. Existing approaches typically simplify multiplex networks into single-layer structures for task scheduling, failing to address these load imbalance issues. Moreover, the coupling between task dynamics and network multiplexity dramatically increases the complexity of designing task migration strategies, and it is proven NP-hard to achieve such load balancing. To address these challenges, this paper proposes a risk-aware task migration method that achieves dynamic load balancing by matching task requirements with both intra-layer agent capabilities and inter-layer swarm capabilities. Simulation results demonstrate that our approach significantly outperforms benchmark algorithms in task completion cost, task completion proportion, and system robustness. In particular, the algorithm achieves solutions statistically indistinguishable from the optimal solutions computed by the CPLEX solver, while exhibiting significantly reduced computational overhead.

NeurIPS Conference 2025 Conference Paper

RoFt-Mol: Benchmarking Robust Fine-tuning with Molecular Graph Foundation Models

  • Shikun Liu
  • Deyu Zou
  • Nima Shoghi
  • Victor Fung
  • Kai Liu
  • Pan Li

In the era of foundation models, fine-tuning pre-trained models for specific downstream tasks has become crucial. This drives the need for robust fine-tuning methods to address challenges such as model overfitting and sparse labeling. Molecular graph foundation models (MGFMs) face unique difficulties that complicate fine-tuning. These models are limited by smaller pre-training datasets and more severe data scarcity for downstream tasks, both of which require enhanced model generalization. Moreover, MGFMs must accommodate diverse objectives, including both regression and classification tasks. To better understand and improve fine-tuning techniques under these conditions, we classify eight fine-tuning methods into three mechanisms: weight-based, representation-based, and partial fine-tuning. We benchmark these methods on downstream regression and classification tasks across supervised and self-supervised pre-trained models in diverse labeling settings. This extensive evaluation provides valuable insights and informs the design of a refined robust fine-tuning method, RoFt-Mol. This approach combines the strengths of simple post-hoc weight interpolation with more complex weight-ensemble fine-tuning methods, delivering improved performance across both task types while maintaining the ease of use inherent in post-hoc weight interpolation.
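
Post-hoc weight interpolation, the simple ingredient RoFt-Mol builds on, is just a linear blend of two checkpoints. A minimal sketch, with the interpolation coefficient left as a free parameter to be chosen on validation data; parameter names and the dict-of-arrays format are assumptions.

```python
import numpy as np

def interpolate_weights(pretrained, finetuned, alpha=0.5):
    """Blend a pre-trained and a fine-tuned checkpoint parameter-wise.

    alpha=0 recovers the pre-trained model and alpha=1 the fine-tuned
    one; intermediate values often trade a little task accuracy for
    robustness, which is the behavior ensemble-style methods refine.
    """
    return {name: (1 - alpha) * pretrained[name] + alpha * finetuned[name]
            for name in pretrained}
```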

AAAI Conference 2024 Conference Paper

A Non-parametric Graph Clustering Framework for Multi-View Data

  • Shengju Yu
  • Siwei Wang
  • Zhibin Dong
  • Wenxuan Tu
  • Suyuan Liu
  • Zhao Lv
  • Pan Li
  • Miao Wang

Multi-view graph clustering (MVGC) derives encouraging grouping results by seamlessly integrating the abundant information inside heterogeneous data, and has captured surging attention recently. Nevertheless, the majority of current MVGC works involve at least one hyper-parameter, which not only requires additional effort for tuning but also leads to a complicated solving procedure, largely harming the flexibility and scalability of the corresponding algorithms. To this end, in this article we are devoted to getting rid of hyper-parameters and devise a non-parametric graph clustering (NpGC) framework to partition multi-view data more practically. Specifically, we hold that hyper-parameters play a role in balancing the error term and the regularization term so as to form high-quality clustering representations. Therefore, without the assistance of hyper-parameters, how to acquire high-quality representations becomes the key. Inspired by this, we adopt two types of anchors, view-related and view-unrelated, to concurrently mine exclusive characteristics and common characteristics among views. Then, all anchors' information is gathered together via a consensus bipartite graph. In this way, NpGC extracts both complementary and consistent multi-view features, thereby obtaining superior clustering results. Also, its linear complexity enables it to handle datasets with over 120,000 samples. Numerous experiments reveal NpGC's strengths compared to many classical approaches.

NeurIPS Conference 2024 Conference Paper

Certified Machine Unlearning via Noisy Stochastic Gradient Descent

  • Eli Chien
  • Haoyu Wang
  • Ziang Chen
  • Pan Li

"The right to be forgotten", ensured by laws for user data privacy, has become increasingly important. Machine unlearning aims to efficiently remove the effect of certain data points on the trained model parameters so that the result is approximately the same as if one had retrained the model from scratch. We propose to leverage projected noisy stochastic gradient descent for unlearning and establish its first approximate unlearning guarantee under the convexity assumption. Our approach exhibits several benefits, including provable complexity savings compared to retraining, and support for sequential and batch unlearning. Both of these benefits are closely related to our new results on infinite Wasserstein distance tracking of the adjacent (un)learning processes. Extensive experiments show that our approach achieves similar utility under the same privacy constraint while using $2\%$ and $10\%$ of the gradient computations compared with the state-of-the-art gradient-based approximate unlearning methods for mini-batch and full-batch settings, respectively.
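
The core primitive, a projected noisy SGD step, is easy to sketch. The learning rate, noise scale, and projection radius below are placeholders; their calibration to the paper's unlearning guarantees is the substance of the analysis and is not reproduced here.

```python
import numpy as np

def projected_noisy_sgd_step(w, grad, lr=0.1, sigma=0.05, radius=1.0, rng=None):
    """One step: gradient descent + Gaussian noise + projection onto a ball.

    The injected noise smooths the distribution over iterates, so after a
    deletion request a few further noisy steps bring the model close (in
    distribution) to retraining from scratch.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = w - lr * grad + rng.normal(0.0, sigma, size=w.shape)
    norm = np.linalg.norm(w)
    if norm > radius:                 # projection keeps iterates in the feasible set
        w = w * (radius / norm)
    return w
```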

NeurIPS Conference 2024 Conference Paper

Differentially Private Graph Diffusion with Applications in Personalized PageRanks

  • Rongzhe Wei
  • Eli Chien
  • Pan Li

Graph diffusion, which iteratively propagates real-valued substances over a graph, is used in numerous graph- and network-based applications. However, releasing diffusion vectors may reveal sensitive linking information in the data, such as transaction records in financial networks, and protecting the privacy of graph data is challenging due to its interconnected nature. This work proposes a novel graph diffusion framework with edge-level differential privacy guarantees, achieved by using noisy diffusion iterates. The algorithm injects Laplace noise per diffusion iteration and adopts a degree-based thresholding function to mitigate the high sensitivity induced by low-degree nodes. Our privacy loss analysis is based on Privacy Amplification by Iteration (PABI), which, to the best of our knowledge, is the first effort that analyzes PABI with Laplace noise and provides relevant applications. We also introduce a novel $\infty$-Wasserstein distance tracking method, which tightens the analysis of privacy leakage and makes PABI more applicable in practice. We evaluate this framework by applying it to Personalized PageRank computation for ranking tasks. Experiments on real-world network data demonstrate the superiority of our method under stringent privacy conditions.
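
The noisy-iterate construction translates into a short loop: one propagation, one degree-based clip, and one Laplace-noise injection per step. The clipping rule (cap node mass at tau times degree) and the noise calibration below are guesses at the spirit of the method, not its exact form.

```python
import numpy as np

def noisy_ppr_diffusion(A, seed, alpha=0.15, steps=20, eps_per_step=0.5, tau=0.01, rng=None):
    """Personalized-PageRank-style diffusion with per-iteration Laplace noise.

    Each iterate mixes a restart to the seed with degree-normalized
    propagation, clips each node's mass by a degree-based threshold (to
    tame the sensitivity induced by low-degree nodes), then adds noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]
    deg = np.maximum(A.sum(axis=1), 1.0)
    P = A / deg[:, None]                       # row-stochastic transition matrix
    s = np.zeros(n); s[seed] = 1.0
    x = s.copy()
    for _ in range(steps):
        x = alpha * s + (1 - alpha) * (P.T @ x)
        x = np.minimum(x, tau * deg)           # degree-based thresholding
        x += rng.laplace(0.0, tau / eps_per_step, size=n)  # assumed calibration
    return x
```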

TMLR Journal 2024 Journal Article

DIG-MILP: a Deep Instance Generator for Mixed-Integer Linear Programming with Feasibility Guarantee

  • Haoyu Peter Wang
  • Jialin Liu
  • Xiaohan Chen
  • Xinshang Wang
  • Pan Li
  • Wotao Yin

Mixed-integer linear programming (MILP) stands as a notable NP-hard problem pivotal to numerous crucial industrial applications. The development of effective algorithms, the tuning of solvers, and the training of machine learning models for MILP resolution all hinge on access to extensive, diverse, and representative data. Yet compared to the abundant naturally occurring data in the image and text realms, MILP is markedly data-deficient, underscoring the vital role of synthetic MILP generation. We present DIG-MILP, a deep generative framework adept at extracting deep-level structural features from highly limited MILP data and producing instances that closely mirror the target data. Notably, by leveraging MILP duality, DIG-MILP guarantees a correct and complete generation space and ensures the boundedness and feasibility of the generated instances. Our empirical study highlights the novelty and quality of the instances generated by DIG-MILP through two distinct downstream tasks: (S1) data sharing, where solver solution times correlate strongly and positively between original and DIG-MILP-generated instances, allowing data sharing for solver tuning without publishing the original data; and (S2) data augmentation, wherein the DIG-MILP-generated instances bolster the generalization performance of machine learning models tasked with resolving MILP problems.

NeurIPS Conference 2024 Conference Paper

GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shifts

  • Deyu Zou
  • Shikun Liu
  • Siqi Miao
  • Victor Fung
  • Shiyu Chang
  • Pan Li

Geometric deep learning (GDL) has gained significant attention in scientific fields for its proficiency in modeling data with intricate geometric structures. Yet very few works have delved into its capability of tackling the distribution shift problem, a prevalent challenge in many applications. To bridge this gap, we propose GeSS, a comprehensive benchmark designed for evaluating the performance of GDL models in scientific scenarios with distribution shifts. Our evaluation datasets cover diverse scientific domains, from particle physics and materials science to biochemistry, and encapsulate a broad spectrum of distribution shifts, including conditional, covariate, and concept shifts. Furthermore, we study three levels of information access from the out-of-distribution (OOD) test data: no OOD information, only unlabeled OOD data, and OOD data with a few labels. Overall, our benchmark comprises 30 different experiment settings and evaluates 3 GDL backbones and 11 learning algorithms in each setting. A thorough analysis of the evaluation results is provided, poised to illuminate insights for GDL researchers and domain practitioners who aim to use GDL in their applications.

TMLR Journal 2024 Journal Article

GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?

  • Mufei Li
  • Eleonora Kreacic
  • Vamsi K. Potluru
  • Pan Li

Large-scale graphs with node attributes are increasingly common in various real-world applications. Creating synthetic, attribute-rich graphs that mirror real-world examples is crucial, especially for sharing graph data for analysis and for developing learning models when the original data cannot be shared. Traditional graph generation methods are limited in their capacity to handle these complex structures. Recent advances in diffusion models have shown potential in generating graph structures without attributes and in generating smaller molecular graphs. However, these models face challenges in generating large attributed graphs due to the complex attribute-structure correlations and the large size of these graphs. This paper introduces a novel diffusion model, GraphMaker, specifically designed for generating large attributed graphs. We explore various combinations of node attribute and graph structure generation processes, finding that an asynchronous approach more effectively captures the intricate attribute-structure correlations. We also address scalability issues through edge mini-batching generation. To demonstrate the practicality of our approach in graph data dissemination, we introduce a new evaluation pipeline. The evaluation demonstrates that synthetic graphs generated by GraphMaker can be used to develop competitive graph machine learning models for the tasks defined over the original graphs without actually accessing these graphs, while many leading graph generation methods fall short in this evaluation.

NeurIPS Conference 2024 Conference Paper

Langevin Unlearning: A New Perspective of Noisy Gradient Descent for Machine Unlearning

  • Eli Chien
  • Haoyu Wang
  • Ziang Chen
  • Pan Li

Machine unlearning has raised significant interest with the adoption of laws ensuring the "right to be forgotten". Researchers have provided a probabilistic notion of approximate unlearning under a definition similar to Differential Privacy (DP), where privacy is defined as statistical indistinguishability from retraining from scratch. We propose Langevin unlearning, an unlearning framework based on noisy gradient descent with privacy guarantees for approximate unlearning problems. Langevin unlearning unifies the DP learning process and the privacy-certified unlearning process, with many algorithmic benefits: approximate certified unlearning for non-convex problems, complexity savings compared to retraining, and sequential and batch unlearning for multiple unlearning requests.

NeurIPS Conference 2024 Conference Paper

MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training

  • Bo Chen
  • Zhilei Bei
  • Xingyi Cheng
  • Pan Li
  • Jie Tang
  • Le Song

Multiple Sequence Alignment (MSA) plays a pivotal role in unveiling the evolutionary trajectories of protein families. The accuracy of protein structure predictions is often compromised for protein sequences that lack sufficient homologous information to construct high-quality MSA. Although various methods have been proposed to generate high-quality MSA under these conditions, they fall short in comprehensively capturing the intricate co-evolutionary patterns within MSA or require guidance from external oracle models. Here we introduce MSAGPT, a novel approach to prompt protein structure predictions via MSA generative pre-training in the low-MSA regime. MSAGPT employs a simple yet effective 2D evolutionary positional encoding scheme to model the complex evolutionary patterns. Building on this, its flexible 1D MSA decoding framework facilitates zero- and few-shot learning. Moreover, we demonstrate that leveraging feedback from AlphaFold2 (AF2) can further enhance the model's capacity via Rejective Fine-tuning (RFT) and Reinforcement Learning from AF2 Feedback (RLAF). Extensive experiments confirm the efficacy of MSAGPT in generating faithful and informative MSA (up to +8.5% TM-Score in few-shot scenarios). The transfer learning experiments further demonstrate its great potential for the wide range of tasks that rely on MSA quality.

TMLR Journal 2024 Journal Article

On the Inherent Privacy Properties of Discrete Denoising Diffusion Models

  • Rongzhe Wei
  • Eleonora Kreacic
  • Haoyu Peter Wang
  • Haoteng Yin
  • Eli Chien
  • Vamsi K. Potluru
  • Pan Li

Privacy concerns have led to a surge in the creation of synthetic datasets, with diffusion models emerging as a promising avenue. Although prior studies have performed empirical evaluations on these models, there has been a gap in providing a mathematical characterization of their privacy-preserving capabilities. To address this, we present the pioneering theoretical exploration of the privacy preservation inherent in \emph{discrete diffusion models} (DDMs) for discrete dataset generation. Focusing on per-instance differential privacy (pDP), our framework elucidates the potential privacy leakage for each data point in a given training dataset, offering insights into how the privacy loss of each point correlates with the dataset's distribution. Our bounds also show that training with $s$-sized data points leads to a surge in privacy leakage from $(\epsilon, \mathcal{O}(\frac{1}{s^2\epsilon}))$-pDP to $(\epsilon, \mathcal{O}(\frac{1}{s\epsilon}))$-pDP of the DDM during the transition from the pure noise to the synthetic clean data phase, and a faster decay in diffusion coefficients amplifies the privacy guarantee. Finally, we empirically verify our theoretical findings on both synthetic and real-world datasets.

NeurIPS Conference 2024 Conference Paper

Training Compute-Optimal Protein Language Models

  • Xingyi Cheng
  • Bo Chen
  • Pan Li
  • Jing Gong
  • Jie Tang
  • Le Song

We explore optimally training protein language models, an area of significant interest in biological research where guidance on best practices is limited. Most models are trained with extensive compute resources until performance gains plateau, focusing primarily on increasing model sizes rather than optimizing the efficient compute frontier that balances performance and compute budgets. Our investigation is grounded in a massive dataset consisting of 939 million protein sequences. We trained over 300 models ranging from 3.5 million to 10.7 billion parameters on 5 to 200 billion unique tokens, to investigate the relations between model sizes, training token numbers, and objectives. First, we observed the effect of diminishing returns for the Causal Language Model (CLM) and that of overfitting for the Masked Language Model (MLM) when repeating the commonly used Uniref database. To address this, we included metagenomic protein sequences in the training set to increase the diversity and avoid the plateau or overfitting effects. Second, we obtained the scaling laws of CLM and MLM on Transformer, tailored to the specific characteristics of protein sequence data. Third, we observe a transfer scaling phenomenon from CLM to MLM, further demonstrating the effectiveness of transfer through scaling behaviors based on estimated Effectively Transferred Tokens. Finally, to validate our scaling laws, we compare the large-scale versions of ESM-2 and PROGEN2 on downstream tasks, encompassing evaluations of protein generation as well as structure- and function-related tasks, all within less or equivalent pre-training compute budgets.

NeurIPS Conference 2023 Conference Paper

Differentially Private Decoupled Graph Convolutions for Multigranular Topology Protection

  • Eli Chien
  • Wei-Ning Chen
  • Chao Pan
  • Pan Li
  • Ayfer Ozgur
  • Olgica Milenkovic

Graph Neural Networks (GNNs) have proven to be highly effective in solving real-world learning problems that involve graph-structured data. However, GNNs can also inadvertently expose sensitive user information and interactions through their model predictions. To address these privacy concerns, Differential Privacy (DP) protocols are employed to control the trade-off between provable privacy protection and model utility. Applying standard DP approaches to GNNs directly is not advisable due to two main reasons. First, the prediction of node labels, which relies on neighboring node attributes through graph convolutions, can lead to privacy leakage. Second, in practical applications, the privacy requirements for node attributes and graph topology may differ. In the latter setting, existing DP-GNN models fail to provide multigranular trade-offs between graph topology privacy, node attribute privacy, and GNN utility. To address both limitations, we propose a new framework termed Graph Differential Privacy (GDP), specifically tailored to graph learning. GDP ensures both provably private model parameters and private predictions. Additionally, we describe a novel unified notion of graph dataset adjacency to analyze the properties of GDP for different levels of graph topology privacy. Our findings reveal that DP-GNNs, which rely on graph convolutions, not only fail to meet the requirements for multigranular graph topology privacy but also necessitate the injection of DP noise that scales at least linearly with the maximum node degree. In contrast, our proposed Differentially Private Decoupled Graph Convolutions (DPDGCs) represent a more flexible and efficient alternative to graph convolutions that still provides the necessary guarantees of GDP. To validate our approach, we conducted extensive experiments on seven node-classification benchmark datasets and illustrative synthetic datasets. The results demonstrate that DPDGCs significantly outperform existing DP-GNNs in terms of privacy-utility trade-offs.

TIST Journal 2022 Journal Article

Adversarial Learning for Cross Domain Recommendations

  • Pan Li
  • Brian Brost
  • Alexander Tuzhilin

Existing cross domain recommender systems typically assume homogeneous user preferences across multiple domains to capture similarities of user-item interactions and to provide cross domain recommendations accordingly. Meanwhile, the heterogeneity of user behaviors is usually not well studied and captured during the recommendation process, where users might have vastly different interests in different domains. In addition, previous models focus primarily on recommendation tasks between domain pairs, and cannot be naturally extended to serve for multiple domain recommendation applications. To address these challenges, we propose to utilize the idea of adversarial learning to intelligently incorporate global user preferences and domain-specific user preferences for providing satisfying cross domain recommendations. In particular, our proposed Adversarial Cross Domain Recommendation (ACDR) model first obtains the latent representations of global user preferences from their explicit feature information, and then transforms them into domain-specific user embeddings, where we take into account user behaviors and their heterogeneous preferences among different domains. By doing so, we address the differences among user representations in the domain-specific latent space while also preserving global user preferences, as we effectively segment the distributions of domain-specific user embeddings in the shared latent space. The convergence of our proposed model is theoretically guaranteed. The proposed ACDR model leads to significant and consistent improvements in cross domain recommendation performance over the state-of-the-art baseline models, which we demonstrate through extensive experiments on three real-world datasets. In addition, we show that the improvements are greater on those datasets that are smaller and more sparse, on those users that have fewer interaction records in the dataset, and when user interactions from more product domains are included in the cross domain recommendation model.

NeurIPS Conference 2022 Conference Paper

Understanding Non-linearity in Graph Neural Networks from the Bayesian-Inference Perspective

  • Rongzhe Wei
  • Haoteng Yin
  • Junteng Jia
  • Austin R. Benson
  • Pan Li

Graph neural networks (GNNs) have shown superiority in many prediction tasks over graphs due to their impressive capability of capturing nonlinear relations in graph-structured data. However, for node classification tasks, often only marginal improvement of GNNs over their linear counterparts has been observed in practice. Previous works provide very little understanding of this phenomenon. In this work, we resort to Bayesian learning to give an in-depth investigation of the functions of non-linearity in GNNs for node classification tasks. Given a graph generated from the contextual stochastic block model (CSBM), we observe that the maximum-a-posteriori estimation of a node label given its own and its neighbors' attributes consists of two types of non-linearity: the transformation of node attributes and a ReLU-activated feature aggregation from neighbors. The latter surprisingly matches the type of non-linearity used in many GNN models. By further imposing a Gaussian assumption on node attributes, we prove that the superiority of those ReLU activations is significant only when the node attributes are far more informative than the graph structure, which nicely explains previous empirical observations. A similar argument is derived when there is a distribution shift of node attributes between the training and testing datasets. Finally, we verify our theory on both synthetic and real-world networks. Our code is available at https://github.com/Graph-COM/Bayesian_inference_based_GNN.git.

NeurIPS Conference 2022 Conference Paper

Unsupervised Learning for Combinatorial Optimization with Principled Objective Relaxation

  • Haoyu Peter Wang
  • Nan Wu
  • Hang Yang
  • Cong Hao
  • Pan Li

Using machine learning to solve combinatorial optimization (CO) problems is challenging, especially when the data is unlabeled. This work proposes an unsupervised learning framework for CO problems. Our framework follows the standard relaxation-plus-rounding approach and adopts neural networks to parameterize the relaxed solutions so that simple back-propagation can train them end-to-end. Our key contribution is the observation that if the relaxed objective satisfies entry-wise concavity, a low optimization loss guarantees the quality of the obtained integral solutions. This observation significantly generalizes the applicability of the previous framework inspired by Erdos' probabilistic method (Karalias & Loukas, 2020). Our framework is particularly suitable for guiding the design of objective models in applications where the objectives are not given explicitly and must first be modeled and learned. We evaluate our framework by solving a synthetic graph optimization problem and two real-world applications: resource allocation in circuit design and approximate computing. Our framework largely outperforms baselines based on reinforcement learning and Gumbel-softmax tricks.
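
A toy rendition of the relax-then-round recipe: optimize a relaxed solution in [0,1]^n, then fix coordinates one at a time to the better of {0, 1}. Under the paper's entry-wise concavity condition this rounding never increases the loss; the covering loss below is only an illustrative stand-in and need not satisfy that condition.

```python
import numpy as np

def sequential_rounding(relaxed_x, loss):
    """Round a relaxed solution in [0,1]^n coordinate by coordinate.

    Each coordinate is fixed to whichever of {0, 1} gives the lower loss
    with the other coordinates held fixed; entry-wise concavity of the
    relaxed objective guarantees no loss increase at any step.
    """
    x = relaxed_x.copy()
    for i in range(len(x)):
        lo, hi = x.copy(), x.copy()
        lo[i], hi[i] = 0.0, 1.0
        x = lo if loss(lo) <= loss(hi) else hi
    return x

# Illustrative set-cover-style loss: item count plus a penalty for uncovered elements.
rng = np.random.default_rng(0)
cover = rng.integers(0, 2, size=(6, 10)).astype(float)     # item-by-element incidence
loss = lambda x: x.sum() + 5.0 * np.sum(np.maximum(0.0, 1.0 - cover.T @ x))
print(sequential_rounding(np.full(6, 0.5), loss))          # 0.5 stands in for a trained relaxation
```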

NeurIPS Conference 2021 Conference Paper

Adversarial Graph Augmentation to Improve Graph Contrastive Learning

  • Susheel Suresh
  • Pan Li
  • Cong Hao
  • Jennifer Neville

Self-supervised learning of graph neural networks (GNNs) is in great need because of the widespread label scarcity issue in real-world graph/network data. Graph contrastive learning (GCL), by training GNNs to maximize the correspondence between the representations of the same graph in its different augmented forms, may yield robust and transferable GNNs even without using labels. However, GNNs trained by traditional GCL often risk capturing redundant graph features and thus may be brittle and provide sub-par performance in downstream tasks. Here, we propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during training by optimizing adversarial graph augmentation strategies used in GCL. We pair AD-GCL with theoretical explanations and design a practical instantiation based on trainable edge-dropping graph augmentation. We experimentally validate AD-GCL by comparing it with state-of-the-art GCL methods and achieve performance gains of up to 14% in unsupervised, 6% in transfer, and 3% in semi-supervised learning settings, overall across 18 different benchmark datasets for the tasks of molecule property regression and classification and social network classification.
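
The trainable edge-dropping augmentation can be sketched as a relaxed Bernoulli keep-probability per edge, so the augmentation itself is differentiable and can be optimized adversarially against the contrastive objective. The logits below are stand-ins for the learned per-edge scores.

```python
import numpy as np

def sample_edge_mask(edge_logits, temperature=0.5, rng=None):
    """Relaxed Bernoulli (binary concrete) edge-keep mask.

    edge_logits: (m,) learnable scores, one per edge. In AD-GCL-style
    training these are pushed adversarially to drop redundant edges
    while the encoder is trained contrastively against the augmented view.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(1e-6, 1 - 1e-6, size=edge_logits.shape)
    logistic = np.log(u) - np.log1p(-u)       # noise for the concrete relaxation
    keep = 1.0 / (1.0 + np.exp(-(edge_logits + logistic) / temperature))
    return keep                               # in (0, 1); near-binary for small temperature
```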

NeurIPS Conference 2021 Conference Paper

Generic Neural Architecture Search via Regression

  • Yuhong Li
  • Cong Hao
  • Pan Li
  • Jinjun Xiong
  • Deming Chen

Most existing neural architecture search (NAS) algorithms are dedicated to and evaluated by downstream tasks, e.g., image classification in computer vision. However, extensive experiments have shown that prominent neural architectures, such as ResNet in computer vision and LSTM in natural language processing, are generally good at extracting patterns from the input data and perform well on different downstream tasks. In this paper, we attempt to answer two fundamental questions related to NAS. (1) Is it necessary to use the performance of specific downstream tasks to evaluate and search for good neural architectures? (2) Can we perform NAS effectively and efficiently while being agnostic to the downstream tasks? To answer these questions, we propose a novel and generic NAS framework, termed Generic NAS (GenNAS). GenNAS does not use task-specific labels but instead adopts regression on a set of manually designed synthetic signal bases for architecture evaluation. Such a self-supervised regression task can effectively evaluate the intrinsic power of an architecture to capture and transform the input signal patterns, and allows more sufficient usage of training samples. Extensive experiments across 13 CNN search spaces and one NLP space demonstrate the remarkable efficiency of GenNAS using regression, in terms of both evaluating neural architectures (quantified by the ranking correlation, Spearman's rho, between the approximated performances and the downstream task performances) and the convergence speed for training (within a few seconds). For example, on NAS-Bench-101, GenNAS achieves 0.85 rho while existing efficient methods only achieve 0.38. We then propose an automatic task search to optimize the combination of synthetic signals using limited downstream-task-specific labels, further improving the performance of GenNAS. We also thoroughly evaluate GenNAS's generality and end-to-end NAS performance on all search spaces, where it outperforms almost all existing works with significant speedup. For example, on NAS-Bench-201, GenNAS can find near-optimal architectures within 0.3 GPU hours.

NeurIPS Conference 2021 Conference Paper

Labeling Trick: A Theory of Using Graph Neural Networks for Multi-Node Representation Learning

  • Muhan Zhang
  • Pan Li
  • Yinglong Xia
  • Kai Wang
  • Long Jin

In this paper, we provide a theory of using graph neural networks (GNNs) for multi-node representation learning (where we are interested in learning a representation for a set of more than one node, such as a link). GNNs are designed to learn single-node representations. When we want to learn a node set representation involving multiple nodes, a common practice in previous works is to directly aggregate the single-node representations obtained by a GNN into a joint node set representation. In this paper, we show a fundamental constraint of such an approach, namely the inability to capture the dependence between nodes in the node set, and argue that directly aggregating individual node representations does not lead to an effective joint representation for multiple nodes. We then notice that a few previous successful works for multi-node representation learning, including SEAL, Distance Encoding, and ID-GNN, all used node labeling. These methods first label nodes in the graph according to their relationships with the target node set before applying a GNN. Then, the node representations obtained in the labeled graph are aggregated into a node set representation. By investigating their inner mechanisms, we unify these node labeling techniques into a single and most general form: the labeling trick. We prove that with the labeling trick a sufficiently expressive GNN learns the most expressive node set representations, and thus in principle solves any joint learning task over node sets. Experiments on one important two-node representation learning task, link prediction, verify our theory. Our work explains the superior performance of previous node-labeling-based methods and establishes a theoretical foundation for using GNNs for multi-node representation learning.

NeurIPS Conference 2021 Conference Paper

Local Hyper-Flow Diffusion

  • Kimon Fountoulakis
  • Pan Li
  • Shenghao Yang

Recently, hypergraphs have attracted a lot of attention due to their ability to capture complex relations among entities. The surge of hypergraph data has resulted in datasets of increasing size and complexity that exhibit interesting small-scale and local structure, e.g., small-scale communities and localized node rankings around a given set of seed nodes. Popular and principled ways to capture this local structure are the local hypergraph clustering problem and the related seed set expansion problem. In this work, we propose the first local diffusion method that achieves an edge-size-independent Cheeger-type guarantee for the problem of local hypergraph clustering while applying to a rich class of higher-order relations that covers a number of previously studied special cases. Our method is based on a primal-dual optimization formulation where the primal problem has a natural network flow interpretation, and the dual problem has a cut-based interpretation using the $\ell_2$-norm penalty on associated cut-costs. We demonstrate that the new technique is significantly better than state-of-the-art methods on both synthetic and real-world data.

NeurIPS Conference 2021 Conference Paper

Nested Graph Neural Networks

  • Muhan Zhang
  • Pan Li

A graph neural network (GNN)'s success in graph classification is closely related to the Weisfeiler-Lehman (1-WL) algorithm. By iteratively aggregating neighboring node features to a center node, both 1-WL and GNNs obtain a node representation that encodes a rooted subtree around the center node. These rooted subtree representations are then pooled into a single representation for the whole graph. However, rooted subtrees have limited expressiveness for representing a non-tree graph. To address this, we propose Nested Graph Neural Networks (NGNNs). NGNN represents a graph with rooted subgraphs instead of rooted subtrees, so that two graphs sharing many identical subgraphs (rather than subtrees) tend to have similar representations. The key is to make each node representation encode a subgraph around it rather than a subtree. To achieve this, NGNN extracts a local subgraph around each node and applies a base GNN to each subgraph to learn a subgraph representation. The whole-graph representation is then obtained by pooling these subgraph representations. We provide a rigorous theoretical analysis showing that NGNN is strictly more powerful than 1-WL. In particular, we prove that NGNN can discriminate almost all r-regular graphs, on which 1-WL always fails. Moreover, unlike other more powerful GNNs, NGNN introduces only a constant-factor higher time complexity than standard GNNs. NGNN is a plug-and-play framework that can be combined with various base GNNs. We test NGNN with different base GNNs on several benchmark datasets. NGNN uniformly improves their performance and shows highly competitive performance on all datasets.
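
The subgraph extraction at the heart of NGNN is a plain breadth-first search. The sketch below returns the h-hop rooted subgraph's node set; encoding it with a base GNN and pooling the per-root representations are left abstract.

```python
from collections import deque

def rooted_subgraph_nodes(adj_list, root, hops=2):
    """Node set of the h-hop rooted subgraph around `root` (BFS).

    NGNN applies a base GNN to this subgraph (not just a rooted subtree)
    to obtain the root's representation, then pools all roots'
    representations into the whole-graph representation.
    """
    seen, frontier = {root}, deque([(root, 0)])
    while frontier:
        u, d = frontier.popleft()
        if d == hops:
            continue
        for v in adj_list[u]:
            if v not in seen:
                seen.add(v)
                frontier.append((v, d + 1))
    return sorted(seen)

# Example: on a 6-cycle, the 2-hop subgraph around node 0 misses only node 3.
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(rooted_subgraph_nodes(adj, 0, hops=2))   # [0, 1, 2, 4, 5]
```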

IJCAI Conference 2021 Conference Paper

Regularising Knowledge Transfer by Meta Functional Learning

  • Pan Li
  • Yanwei Fu
  • Shaogang Gong

Machine learning classifiers' capability is largely dependent on the scale of available training data and is limited by model overfitting in data-scarce learning tasks. To address this problem, this work proposes a novel Meta Functional Learning (MFL) approach that meta-learns a generalisable functional model from data-rich tasks whilst simultaneously regularising knowledge transfer to data-scarce tasks. MFL computes meta-knowledge on functional regularisation that generalises to different learning tasks, by which functional training on limited labelled data promotes the learning of more discriminative functions. Moreover, we adopt an Iterative Update strategy on MFL (MFL-IU). This improves the knowledge transfer regularisation of MFL by progressively learning the functional regularisation in knowledge transfer. Experiments on three Few-Shot Learning (FSL) benchmarks (miniImageNet, CIFAR-FS and CUB) show that meta functional learning for regularising knowledge transfer benefits FSL classifiers.

NeurIPS Conference 2020 Conference Paper

Distance Encoding: Design Provably More Powerful Neural Networks for Graph Representation Learning

  • Pan Li
  • Yanbang Wang
  • Hongwei Wang
  • Jure Leskovec

Learning representations of sets of nodes in a graph is crucial for applications ranging from node-role discovery to link prediction and molecule classification. Graph Neural Networks (GNNs) have achieved great success in graph representation learning. However, the expressive power of GNNs is limited by the 1-Weisfeiler-Lehman (WL) test, and thus GNNs generate identical representations for graph substructures that may in fact be very different. More powerful GNNs, proposed recently by mimicking higher-order WL tests, focus only on representing entire graphs, and they are computationally inefficient as they cannot utilize the sparsity of the underlying graph. Here we propose and mathematically analyze a general class of structure-related features, termed Distance Encoding (DE). DE assists GNNs in representing any set of nodes, while providing strictly more expressive power than the 1-WL test. DE captures the distance between the node set whose representation is to be learned and each node in the graph. To capture the distance, DE can apply various graph-distance measures such as shortest path distance or generalized PageRank scores. We propose two ways for GNNs to use DEs: (1) as extra node features, and (2) as controllers of message aggregation in GNNs. Both approaches can utilize the sparse structure of the underlying graph, which leads to computational efficiency and scalability. We also prove that DE can distinguish node sets embedded in almost all regular graphs, where traditional GNNs always fail. We evaluate DE on three tasks over six real networks: structural role prediction, link prediction, and triangle prediction. Results show that our models outperform GNNs without DE by up to 15% in accuracy and AUROC. Furthermore, our models also significantly outperform other state-of-the-art methods especially designed for these tasks.
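
With shortest-path distance as the measure, the extra-node-feature variant of DE reduces to one BFS per target node. A minimal sketch, assuming an unweighted graph given as an adjacency list; the cap on distances is an implementation convenience, not part of the definition.

```python
from collections import deque
import numpy as np

def distance_encoding(adj_list, target_set, n, cap=10.0):
    """Distance Encoding as extra node features: one BFS per target node.

    Returns an (n, |target_set|) matrix whose column j holds each node's
    shortest-path distance to the j-th target node (capped; unreachable
    nodes keep the cap). These columns are appended to the node features
    before running the GNN.
    """
    de = np.full((n, len(target_set)), np.inf)
    for j, t in enumerate(sorted(target_set)):
        de[t, j] = 0.0
        q = deque([t])
        while q:
            u = q.popleft()
            for v in adj_list[u]:
                if np.isinf(de[v, j]):
                    de[v, j] = de[u, j] + 1.0
                    q.append(v)
    return np.minimum(de, cap)   # cap long and unreachable distances
```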

NeurIPS Conference 2020 Conference Paper

Graph Information Bottleneck

  • Tailin Wu
  • Hongyu Ren
  • Pan Li
  • Jure Leskovec

Representation learning of graph-structured data is challenging because both graph structure and node features carry important information. Graph Neural Networks (GNNs) provide an expressive way to fuse information from network structure and node features. However, GNNs are prone to adversarial attacks. Here we introduce Graph Information Bottleneck (GIB), an information-theoretic principle that optimally balances expressiveness and robustness of the learned representation of graph-structured data. Inheriting from the general Information Bottleneck (IB), GIB aims to learn the minimal sufficient representation for a given task by maximizing the mutual information between the representation and the target, and simultaneously constraining the mutual information between the representation and the input data. Different from the general IB, GIB regularizes the structural as well as the feature information. We design two sampling algorithms for structural regularization and instantiate the GIB principle with two new models: GIB-Cat and GIB-Bern, and demonstrate the benefits by evaluating the resilience to adversarial attacks. We show that our proposed models are more robust than state-of-the-art graph defense models. GIB-based models empirically achieve up to 31% improvement with adversarial perturbation of the graph structure as well as node features.
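
In symbols, GIB inherits the standard IB trade-off. Writing $Z$ for the learned representation, $Y$ for the prediction target, $\mathcal{D} = (A, X)$ for the graph structure and node features, and $\beta$ for the trade-off coefficient, the principle reads (a standard IB-style rendering, not a verbatim quote of the paper's formulation):

$$\max_{P(Z \mid \mathcal{D})} \; I(Z; Y) - \beta \, I(Z; \mathcal{D})$$

The structural and feature regularization mentioned above correspond to constraining the $I(Z; \mathcal{D})$ term through both $A$ and $X$.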

TIST Journal 2020 Journal Article

Latent Unexpected Recommendations

  • Pan Li
  • Alexander Tuzhilin

Unexpected recommender systems constitute an important tool for tackling the problems of filter bubbles and user boredom; they aim to provide recommendations that are both unexpected and satisfying to target users. Previous unexpected recommendation methods focus only on the straightforward relations between current recommendations and user expectations by modeling unexpectedness in the feature space, thus sacrificing accuracy measures to improve unexpectedness performance. In contrast to these prior models, we propose to model unexpectedness in the latent space of user and item embeddings, which allows us to capture hidden and complex relations between new recommendations and historic purchases. In addition, we develop a novel Latent Closure (LC) method to construct a hybrid utility function and provide unexpected recommendations based on the proposed model. Extensive experiments on three real-world datasets illustrate the superiority of our proposed approach over state-of-the-art unexpected recommendation models, yielding a significant increase in the unexpectedness measure without sacrificing any accuracy metric under all experimental settings in this article.

JMLR Journal 2020 Journal Article

Quadratic Decomposable Submodular Function Minimization: Theory and Practice

  • Pan Li
  • Niao He
  • Olgica Milenkovic

We introduce a new convex optimization problem, termed quadratic decomposable submodular function minimization (QDSFM), which allows us to model a number of learning tasks on graphs and hypergraphs. The problem exhibits close ties to decomposable submodular function minimization (DSFM), yet is much more challenging to solve. We approach the problem via a new dual strategy and formulate an objective that can be optimized through a number of double-loop algorithms. The outer loop uses either random coordinate descent (RCD) or alternating projection (AP) methods, for both of which we prove linear convergence rates. The inner loop computes projections onto cones generated by base polytopes of the submodular functions via the modified min-norm-point or Frank-Wolfe algorithms. We also describe two new applications of QDSFM: hypergraph-adapted PageRank and semi-supervised learning. The proposed hypergraph-based PageRank algorithm can be used for local hypergraph partitioning and comes with provable performance guarantees. For hypergraph-adapted semi-supervised learning, we provide numerical experiments demonstrating the efficiency of our QDSFM solvers and their significant improvements in prediction accuracy when compared to state-of-the-art methods.

NeurIPS Conference 2019 Conference Paper

Conditional Structure Generation through Graph Variational Generative Adversarial Nets

  • Carl Yang
  • Peiye Zhuang
  • Wenhan Shi
  • Alan Luu
  • Pan Li

Graph embedding has been intensively studied recently, thanks to the advance of various neural network models. Theoretical analyses and empirical studies have pushed forward the translation of discrete graph structures into distributed representation vectors, but have seldom considered the reverse direction, i.e., generation of graphs from given related context spaces. In particular, since graphs often become more meaningful when associated with semantic contexts (e.g., social networks of certain communities, gene networks of certain diseases), the ability to infer graph structures according to given semantic conditions could be of great value. While existing graph generative models only consider graph structures without semantic contexts, we formulate the novel problem of conditional structure generation and propose a novel unified model of graph variational generative adversarial nets (CondGen) to handle the intrinsic challenges of flexible context-structure conditioning and permutation-invariant generation. Extensive experiments on two deliberately created benchmark datasets of real-world context-enriched networks demonstrate the superior effectiveness and generalizability of CondGen.
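
A minimal sketch of the conditioning mechanism only (not the authors' CondGen architecture; all module names here are illustrative): a graph VAE can be conditioned on a semantic context vector c by feeding c to both the encoder and the decoder.

    import torch
    import torch.nn as nn

    class CondGraphVAE(nn.Module):
        """Toy conditional graph VAE: the condition c enters encoder and decoder."""
        def __init__(self, n_nodes, cond_dim, hidden_dim=64, latent_dim=16):
            super().__init__()
            in_dim = n_nodes * n_nodes + cond_dim        # flattened adjacency + condition
            self.encode = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
            self.mu = nn.Linear(hidden_dim, latent_dim)
            self.logvar = nn.Linear(hidden_dim, latent_dim)
            self.decode = nn.Sequential(
                nn.Linear(latent_dim + cond_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, n_nodes * n_nodes))
            self.n = n_nodes

        def forward(self, adj, c):
            h = self.encode(torch.cat([adj.flatten(1), c], dim=1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
            logits = self.decode(torch.cat([z, c], dim=1)).view(-1, self.n, self.n)
            return 0.5 * (logits + logits.transpose(1, 2)), mu, logvar  # symmetrized edge logits

Note that a flattened-adjacency MLP like this is not permutation-invariant, which is exactly one of the intrinsic challenges the paper's unified VAE-GAN model is designed to address.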

IJCAI Conference 2019 Conference Paper

Learning to Learn Gradient Aggregation by Gradient Descent

  • Jinlong Ji
  • Xuhui Chen
  • Qianlong Wang
  • Lixing Yu
  • Pan Li

In the big data era, distributed machine learning has emerged as an important paradigm for mining large volumes of data by taking advantage of distributed computing resources. In this work, motivated by learning to learn, we propose a meta-learning approach to coordinate the learning process in master-slave distributed systems. Specifically, we use a recurrent neural network (RNN) in the parameter server (the master) to learn to aggregate the gradients from the workers (the slaves). We design a coordinatewise preprocessing and postprocessing method to make the neural-network-based aggregator more robust. Moreover, to address fault tolerance, and especially Byzantine attacks, in distributed machine learning systems, we propose an RNN aggregator with additional loss information (ARNN) to improve system resilience. We conduct extensive experiments to demonstrate the effectiveness of the RNN aggregator, and show that it can be easily generalized, achieving remarkable performance when transferred to other distributed systems. Moreover, under majoritarian Byzantine attacks, the ARNN aggregator outperforms Krum, the state-of-the-art fault-tolerant aggregation method, by 43.14%. In addition, our RNN aggregator enables the server to aggregate gradients from differing local models, which significantly improves the scalability of distributed learning.
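
A minimal sketch of the core idea, assuming the coordinatewise treatment described above: every parameter coordinate is a batch element sharing one set of RNN weights (the paper's ARNN variant additionally feeds loss information, which is omitted here).

    import torch
    import torch.nn as nn

    class RNNAggregator(nn.Module):
        """Maps the per-worker gradients of each coordinate to one aggregated value."""
        def __init__(self, n_workers, hidden_size=20):
            super().__init__()
            self.cell = nn.LSTMCell(n_workers, hidden_size)
            self.out = nn.Linear(hidden_size, 1)

        def forward(self, worker_grads, state=None):
            # worker_grads: (n_params, n_workers); coordinates are processed
            # independently with shared weights, so the aggregator scales to
            # models of any size.
            h, c = self.cell(worker_grads, state)
            return self.out(h).squeeze(-1), (h, c)   # aggregated gradient, RNN state

The server would apply the returned vector as the update direction and train the aggregator itself by unrolling such steps, in the learning-to-learn fashion.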

NeurIPS Conference 2019 Conference Paper

Optimizing Generalized PageRank Methods for Seed-Expansion Community Detection

  • Pan Li
  • I Chien
  • Olgica Milenkovic

Landing probabilities (LP) of random walks (RW) over graphs encode rich information regarding graph topology. Generalized PageRanks (GPR), which represent weighted sums of LPs of RWs, utilize the discriminative power of LP features to enable many graph-based learning studies. Previous work in the area has mostly focused on evaluating suitable weights for GPRs, and only a few studies so far have attempted to derive the optimal weights of GPRs for a given application. We take a fundamental step forward in this direction by using random graph models to better our understanding of the behavior of GPRs. In this context, we provide a rigorous non-asymptotic analysis for the convergence of LPs and GPRs to their mean-field values on edge-independent random graphs. Although our theoretical results apply to many problem settings, we focus on the task of seed-expansion community detection over stochastic block models. There, we find that the predictive power of LPs decreases significantly slower than previously reported based on asymptotic findings. Given this result, we propose a new GPR, termed Inverse PR (IPR), with LP weights that increase for the initial few steps of the walks. Extensive experiments on both synthetic and real, large-scale networks illustrate the superiority of IPR compared to other GPRs for seeded community detection.
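
Concretely, a GPR is \sum_k \gamma_k p_k, with p_k the k-step landing-probability vector of the walk. A hedged sketch of the computation follows; the weight schedule is left to the caller, while IPR as proposed uses weights that grow over the initial few steps.

    import numpy as np

    def generalized_pagerank(A, seed, weights):
        """A: (n, n) adjacency matrix; seed: (n,) seed distribution;
        weights: gamma_k for walk lengths k = 0, 1, ..., K."""
        W = A / A.sum(axis=1, keepdims=True)   # row-stochastic random-walk matrix
        p = seed.astype(float)
        gpr = np.zeros_like(p)
        for gamma in weights:
            gpr += gamma * p                   # accumulate weighted landing probabilities
            p = p @ W                          # advance the walk by one step
        return gpr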

AAAI Conference 2018 Conference Paper

Measuring the Popularity of Job Skills in Recruitment Market: A Multi-Criteria Approach

  • Tong Xu
  • Hengshu Zhu
  • Chen Zhu
  • Pan Li
  • Hui Xiong

To cope with the accelerating pace of technological change, professionals are urged to add and refresh their skills in order to stay in active and gainful employment. This raises a natural question: what are the right skills to learn? Indeed, measuring the popularity of job skills is a nontrivial task due to the diversified criteria of jobs and the complicated connections among job skills. To that end, in this paper, we propose a data-driven approach for modeling the popularity of job skills based on the analysis of large-scale recruitment data. Specifically, we first build a job skill network by exploring a large corpus of job postings. Then, we develop a novel Skill Popularity based Topic Model (SPTM) for modeling the generation of the skill network. In particular, SPTM can integrate different criteria of jobs (e.g., salary levels, company size) as well as the latent connections among skills, so we can effectively rank job skills by their multi-faceted popularity. Extensive experiments on real-world recruitment data validate the effectiveness of SPTM for measuring the popularity of job skills, and also reveal some interesting patterns, such as which popular job skills lead to high-paying employment.
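
A hedged sketch of the first step only, building a skill co-occurrence network from postings (the records below are toy examples; the paper mines large-scale recruitment data and then fits SPTM on top of such a network):

    from collections import Counter
    from itertools import combinations

    postings = [
        {"skills": ["python", "sql", "spark"], "salary": "high"},   # toy records
        {"skills": ["java", "sql"], "salary": "mid"},
    ]

    edge_weights = Counter()
    for post in postings:
        for a, b in combinations(sorted(set(post["skills"])), 2):
            edge_weights[(a, b)] += 1     # skills co-listed in one posting

    print(edge_weights.most_common())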

NeurIPS Conference 2018 Conference Paper

Quadratic Decomposable Submodular Function Minimization

  • Pan Li
  • Niao He
  • Olgica Milenkovic

We introduce a new convex optimization problem, termed quadratic decomposable submodular function minimization. The problem is closely related to decomposable submodular function minimization and arises in many learning settings on graphs and hypergraphs, such as graph-based semi-supervised learning and PageRank. We approach the problem via a new dual strategy and describe an objective that may be optimized via random coordinate descent (RCD) methods and projections onto cones. We also establish the linear convergence rate of the RCD algorithm and develop efficient projection algorithms with provable performance guarantees. Numerical experiments in semi-supervised learning on hypergraphs confirm the efficiency of the proposed algorithm and demonstrate significant improvements in prediction accuracy with respect to state-of-the-art methods.

NeurIPS Conference 2018 Conference Paper

Revisiting Decomposable Submodular Function Minimization with Incidence Relations

  • Pan Li
  • Olgica Milenkovic

We introduce a new approach to decomposable submodular function minimization (DSFM) that exploits incidence relations. Incidence relations describe which variables effectively influence the component functions, and, when properly utilized, they allow for improving the convergence rates of DSFM solvers. Our main results include a precise parametrization of the DSFM problem based on incidence relations, the development of new scalable alternating projection and parallel coordinate descent methods, and an accompanying rigorous analysis of their convergence rates.
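
In symbols (a sketch in standard DSFM notation, not necessarily the paper's): with ground set [N] and objective F(x) = \sum_{r=1}^{R} F_r(x), the incidence relation of component r is the set S_r \subseteq [N] of variables it actually depends on, i.e. F_r(x) = F_r(x_{S_r}). Solvers that project only onto the coordinates in each S_r can therefore converge faster than ones that treat every component as depending on all N variables.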

NeurIPS Conference 2017 Conference Paper

Inhomogeneous Hypergraph Clustering with Applications

  • Pan Li
  • Olgica Milenkovic

Hypergraph partitioning is an important problem in machine learning, computer vision and network analytics. A widely used method for hypergraph partitioning relies on minimizing a normalized sum of the costs of partitioning hyperedges across clusters. Algorithmic solutions based on this approach assume that different partitions of a hyperedge incur the same cost. However, this assumption fails to leverage the fact that different subsets of vertices within the same hyperedge may have different structural importance. We hence propose a new hypergraph clustering technique, termed inhomogeneous hypergraph partitioning, which assigns different costs to different hyperedge cuts. We prove that inhomogeneous partitioning produces a quadratic approximation to the optimal solution if the inhomogeneous costs satisfy submodularity constraints. Moreover, we demonstrate that inhomogeneous partitioning offers significant performance improvements in applications such as structure learning of rankings, subspace segmentation, and motif clustering.
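
Schematically, where homogeneous partitioning charges a single weight for any cut of a hyperedge e, the inhomogeneous cut cost lets each hyperedge carry its own set function (notation is a sketch, hedged against the paper's exact statement):

    \mathrm{cut}(S) \;=\; \sum_{e \in E} w_e(S \cap e), \qquad w_e : 2^{e} \to \mathbb{R}_{\ge 0},

and the quadratic approximation guarantee above applies when each w_e is submodular.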

JBHI Journal 2014 Journal Article

Cloud-Assisted Mobile-Access of Health Data With Privacy and Auditability

  • Yue Tong
  • Jinyuan Sun
  • Sherman S. M. Chow
  • Pan Li

Motivated by the privacy issues curbing the adoption of electronic healthcare systems, as well as by the wild success of cloud service models, we propose to build privacy into mobile healthcare systems with the help of a private cloud. Our system offers salient features including efficient key management, privacy-preserving data storage and retrieval (especially retrieval in emergencies), and auditability against misuse of health data. Specifically, we propose to derive key management from a pseudorandom number generator for unlinkability, to employ a secure indexing method for privacy-preserving keyword search that hides both search and access patterns based on redundancy, and to integrate the concept of attribute-based encryption with threshold signing, providing role-based access control with auditability to prevent potential misbehavior in both normal and emergency cases.
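
A hedged illustration of the unlinkability idea only (not the paper's exact construction): per-record keys derived from one master secret via a keyed pseudorandom function look independent, so records cannot be linked without that secret.

    import hmac
    import hashlib
    import os

    master_secret = os.urandom(32)   # long-term secret held by the data owner

    def record_key(record_index: int) -> bytes:
        # HMAC-SHA256 plays the role of the pseudorandom function: each index
        # yields a key that is unlinkable to the others without the secret.
        msg = record_index.to_bytes(8, "big")
        return hmac.new(master_secret, msg, hashlib.sha256).digest()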