Arrow Research search

Author name cluster

Bo Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

67 papers
2 author rows

Possible papers

67

AAAI Conference 2026 Conference Paper

MSME: A Multi-Stage Multi-Expert Framework for Zero-Shot Stance Detection

  • Yuanshuo Zhang
  • Aohua Li
  • Bo Chen
  • Jingbo Sun
  • Xiaobing Zhao

LLM-based approaches have recently achieved impressive results in zero-shot stance detection. However, they still struggle in complex real-world scenarios, where stance understanding requires dynamic background knowledge, target definitions involve compound entities or events that must be explicitly linked to stance labels, and rhetorical devices such as irony often obscure the author’s actual intent. To address these challenges, we propose MSME, a Multi-Stage, Multi-Expert framework for zero-shot stance detection. MSME consists of three stages: (1) Knowledge Preparation, where relevant background knowledge is retrieved and stance labels are clarified; (2) Expert Reasoning, involving three specialized modules—Knowledge Expert distills salient facts and reasons from a knowledge perspective, Label Expert refines stance labels and reasons accordingly, and Pragmatic Expert detects rhetorical cues such as irony to infer intent from a pragmatic angle; (3) Decision Aggregation, where a Meta-Judge integrates all expert analyses to produce the final stance prediction. Experiments on three public datasets show that MSME achieves state-of-the-art performance across the board.

ICLR Conference 2025 Conference Paper

Advancing Graph Generation through Beta Diffusion

  • Xinyang Liu
  • Yilin He
  • Bo Chen
  • Mingyuan Zhou

Diffusion models have excelled in generating natural images and are now being adapted to a variety of data types, including graphs. However, conventional models often rely on Gaussian or categorical diffusion processes, which can struggle to accommodate the mixed discrete and continuous components characteristic of graph data. Graphs typically feature discrete structures and continuous node attributes that often exhibit rich statistical patterns, including sparsity, bounded ranges, skewed distributions, and long-tailed behavior. To address these challenges, we introduce Graph Beta Diffusion (GBD), a generative model specifically designed to handle the diverse nature of graph data. GBD leverages a beta diffusion process, effectively modeling both continuous and discrete elements. Additionally, we propose a modulation technique that enhances the realism of generated graphs by stabilizing critical graph topology while maintaining flexibility for other components. GBD competes strongly with existing models across multiple general and biochemical graph benchmarks, showcasing its ability to capture the intricate balance between discrete and continuous features inherent in real-world graph data. Our PyTorch code is available at https://github.com/xinyangATK/GraphBetaDiffusion.

NeurIPS Conference 2025 Conference Paper

Channel Matters: Estimating Channel Influence for Multivariate Time Series

  • Muyao Wang
  • Zeke Xie
  • Bo Chen
  • Hongwei Liu
  • James Kwok

The influence function serves as an efficient post-hoc interpretability tool that quantifies the impact of training data modifications on model parameters, enabling enhanced model performance, improved generalization, and interpretability insights without the need for expensive retraining. Recently, Multivariate Time Series (MTS) analysis has become an important yet challenging task, attracting significant attention. Although channels matter greatly in MTS tasks, channel-centric methods remain largely under-explored; in particular, no previous work has studied the channel information of MTS to expose counterfactual effects between individual channels and model performance. To fill this gap, we propose a novel Channel-wise Influence (ChInf) method, the first to estimate the influence of different channels in MTS. Based on ChInf, we naturally derive two channel-wise algorithms by incorporating ChInf into classic MTS tasks. Extensive experiments demonstrate the effectiveness of ChInf and ChInf-based methods on critical MTS analysis tasks, such as MTS anomaly detection and MTS data pruning. Specifically, our ChInf-based methods rank first among all compared methods, while previous influence functions perform poorly on MTS anomaly detection and MTS data pruning. This fully supports both the superiority and the necessity of ChInf.
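
The counterfactual idea behind channel-wise influence can be illustrated with a toy leave-one-channel-out ablation on a linear forecaster. This is a generic sketch under our own assumptions (the model, data, and the `channel_influence` helper are all illustrative), not the paper's ChInf estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
T, C = 200, 4                              # time steps, channels
X = rng.normal(size=(T, C))                # toy multivariate series
w_true = np.array([2.0, 0.0, -1.0, 0.0])   # channels 0 and 2 actually matter
y = X @ w_true + 0.1 * rng.normal(size=T)

# Fit a ridge-regression forecaster on all channels.
w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(C), X.T @ y)
base_loss = np.mean((X @ w - y) ** 2)

def channel_influence(c):
    """Counterfactual loss increase when channel c is zeroed at test time."""
    Xa = X.copy()
    Xa[:, c] = 0.0
    return np.mean((Xa @ w - y) ** 2) - base_loss

scores = np.array([channel_influence(c) for c in range(C)])
ranking = np.argsort(-scores)              # most influential channels first
```

Channels whose ablation raises the loss most are ranked as most influential; such a ranking is the kind of signal a data-pruning or anomaly-detection pipeline could consume.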

NeurIPS Conference 2025 Conference Paper

FedEL: Federated Elastic Learning for Heterogeneous Devices

  • Letian Zhang
  • Bo Chen
  • Jieming Bian
  • Lei Wang
  • Jie Xu

Federated learning (FL) enables distributed devices to collaboratively train machine learning (ML) models while maintaining data privacy. However, the heterogeneous hardware capabilities of participating devices often result in significant training delays, as straggler clients with limited resources prolong the aggregation process. Existing solutions such as client selection, asynchronous FL, and partial training partially address these challenges but encounter issues such as reduced accuracy, stale updates, and compromised model performance due to inconsistent training contributions. To overcome these limitations, we propose FedEL, a federated elastic learning framework that enhances training efficiency while maintaining model accuracy. FedEL introduces a novel window-based training process, sliding the window to locate the trainable part of the model and dynamically selecting important tensors for training within a coordinated runtime budget. This approach ensures progressive and balanced training across all clients, including stragglers. Additionally, FedEL employs a tensor importance adjustment module, harmonizing local and global tensor importance to mitigate biases caused by data heterogeneity. Experimental results show that FedEL achieves up to a 3.87× improvement in time-to-accuracy compared to baselines while maintaining or exceeding final test accuracy.

AAAI Conference 2025 Conference Paper

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

  • Xuan Shen
  • Zhao Song
  • Yufa Zhou
  • Bo Chen
  • Yanyu Li
  • Yifan Gong
  • Kai Zhang
  • Hao Tan

Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance and efficacy across various applications. The promising results come at the cost of slow inference, as each denoising step requires running the whole transformer model with a large number of parameters. In this paper, we show that performing the full computation of the model at each diffusion step is unnecessary, as some computations can be skipped by lazily reusing the results of previous steps. Furthermore, we show that the lower bound of similarity between outputs at consecutive steps is notably high, and this similarity can be linearly approximated using the inputs. To verify these observations, we propose LazyDiT, a lazy learning framework that efficiently leverages cached results from earlier steps to skip redundant computations. Specifically, we incorporate lazy learning layers into the model, effectively trained to maximize laziness, enabling dynamic skipping of redundant computations. Experimental results show that LazyDiT outperforms the DDIM sampler across multiple diffusion transformer models at various resolutions. Furthermore, we implement our method on mobile devices, achieving better performance than DDIM with similar latency.
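
The reuse idea can be sketched as a similarity-gated cache around an expensive block: recompute only when the input has drifted, otherwise return the cached output. This is an illustrative sketch (the fixed cosine threshold and the `LazyBlock` class are our assumptions), not LazyDiT's learned skipping mechanism:

```python
import numpy as np

def expensive_block(x):
    """Stand-in for a heavy transformer block."""
    return np.tanh(x) * 2.0

class LazyBlock:
    """Reuse the last output while the input stays close (by cosine
    similarity) to the input that produced it."""
    def __init__(self, threshold=0.999):
        self.threshold = threshold
        self.cached_in = None
        self.cached_out = None
        self.skips = 0

    def __call__(self, x):
        if self.cached_in is not None:
            cos = float(x @ self.cached_in /
                        (np.linalg.norm(x) * np.linalg.norm(self.cached_in)))
            if cos >= self.threshold:    # input barely moved: skip the block
                self.skips += 1
                return self.cached_out
        self.cached_in, self.cached_out = x, expensive_block(x)
        return self.cached_out

block = LazyBlock()
outs = [block(np.ones(8) + 1e-4 * t) for t in range(10)]  # slow input drift
```

In the paper's setting the skip decision is learned per layer; here it is a hand-set similarity threshold.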

NeurIPS Conference 2025 Conference Paper

NegoCollab: A Common Representation Negotiation Approach for Heterogeneous Collaborative Perception

  • Congzhang Shao
  • Quan Yuan
  • Guiyang Luo
  • Yue Hu
  • Danni Wang
  • Liu Yilin
  • Rui Pan
  • Bo Chen

Collaborative perception improves task performance by expanding the perception range through information sharing among agents. Immutable heterogeneity poses a significant challenge in collaborative perception, as participating agents may employ different and fixed perception models. This leads to domain gaps in the intermediate features shared among agents, consequently degrading collaborative performance. Aligning the features of all agents to a common representation can eliminate domain gaps at low training cost. However, in existing methods the common representation is designated as the representation of a specific agent, making it difficult for agents with significant domain discrepancies from that specific agent to achieve proper alignment. This paper proposes NegoCollab, a heterogeneous collaboration method based on a negotiated common representation. It introduces a negotiator during training to derive the common representation from the local representations of each modality's agent, effectively reducing the inherent domain gaps with respect to the various local representations. In NegoCollab, the mutual transformation of features between the local representation space and the common representation space is achieved by a pair of sender and receiver. To better align local representations to the common representation containing multimodal information, we introduce a structural alignment loss and a pragmatic alignment loss in addition to the distribution alignment loss to supervise training. This enables the knowledge in the common representation to be fully distilled into the sender. The experimental results demonstrate that NegoCollab significantly outperforms existing common-representation-based collaboration methods. The mechanism of obtaining common representations through negotiation provides a more reliable and flexible option for common representations in heterogeneous collaborative perception.

AAAI Conference 2025 Conference Paper

Numerical Pruning for Efficient Autoregressive Models

  • Xuan Shen
  • Zhao Song
  • Yufa Zhou
  • Bo Chen
  • Jing Liu
  • Ruiyi Zhang
  • Ryan A. Rossi
  • Hao Tan

Transformers have emerged as the leading architecture in deep learning, proving to be versatile and highly effective across diverse domains beyond language and image processing. However, their impressive performance often incurs high computational costs due to their substantial model size. This paper focuses on compressing decoder-only transformer-based autoregressive models through structural weight pruning to improve model efficiency while preserving performance for both language and image generation tasks. Specifically, we propose a training-free pruning method that calculates a numerical score with Newton's method for the Attention and MLP modules, respectively. We further propose a compensation algorithm to recover the performance of the pruned model. To verify the effectiveness of our method, we provide both theoretical support and extensive experiments. Our experiments show that our method achieves state-of-the-art performance with reduced memory usage and faster generation speeds on GPUs.

ECAI Conference 2025 Conference Paper

Towards Mitigation of Hallucination for LLM-Empowered Agents: Progressive Generalization Bound Exploration and Watchdog Monitor

  • Siyuan Liu
  • Wenjing Liu
  • Zhiwei Xu
  • Xin Wang
  • Bo Chen
  • Tao Li

Empowered by large language models (LLMs), intelligent agents have become a popular paradigm for interacting with open environments to facilitate AI deployment. However, hallucinations generated by LLMs, where outputs are inconsistent with facts, pose a significant challenge, undermining the credibility of intelligent agents. Only if hallucinations are mitigated can intelligent agents be used in the real world without catastrophic risk. Therefore, effective detection and mitigation of hallucinations are crucial to ensure the dependability of agents. Unfortunately, related approaches either depend on white-box access to LLMs or fail to accurately identify hallucinations. To address the challenge posed by hallucinations of intelligent agents, we present HalMit, a novel black-box watchdog framework that models the generalization bound of LLM-empowered agents and thus detects hallucinations without requiring internal knowledge of the LLM's architecture. Specifically, a probabilistic fractal sampling technique is proposed to generate a sufficient number of queries in parallel to trigger unreliable responses, efficiently identifying the generalization bound of the target agent. Experimental evaluations demonstrate that HalMit significantly outperforms existing approaches in hallucination monitoring. Its black-box nature and superior performance make HalMit a promising solution for enhancing the dependability of LLM-powered systems.

IJCAI Conference 2025 Conference Paper

WenyanGPT: A Large Language Model for Classical Chinese Tasks

  • Xinyu Yao
  • Mengdi Wang
  • Bo Chen
  • Xiaobing Zhao

Classical Chinese, as the core carrier of Chinese culture, plays a crucial role in the inheritance and study of ancient literature. However, existing natural language processing models are primarily optimized for Modern Chinese, resulting in inadequate performance on Classical Chinese. This paper presents a comprehensive solution for Classical Chinese language processing. Through continued pre-training and instruction fine-tuning of the LLaMA3-8B-Chinese model, we construct WenyanGPT, a large language model specifically designed for Classical Chinese tasks. Additionally, we develop an evaluation benchmark dataset, WenyanBENCH. Experimental results on WenyanBENCH demonstrate that WenyanGPT significantly outperforms current advanced LLMs on various Classical Chinese tasks. We make the model's training data, instruction fine-tuning data, and evaluation benchmark dataset publicly available to promote further research and development in the field of Classical Chinese processing.

AAAI Conference 2024 Conference Paper

Considering Nonstationary within Multivariate Time Series with Variational Hierarchical Transformer for Forecasting

  • Muyao Wang
  • Wenchao Chen
  • Bo Chen

The forecasting of Multivariate Time Series (MTS) has long been an important but challenging task. Because of the non-stationarity across long-distance time steps, previous studies primarily adopt stationarization methods to attenuate the non-stationarity of the original series for better predictability. However, existing methods operate on the stationarized series, ignoring the inherent non-stationarity, and have difficulty modeling MTS with complex distributions due to the lack of stochasticity. To tackle these problems, we first develop a powerful hierarchical probabilistic generative module to capture the non-stationary and stochastic characteristics within MTS, and then combine it with a transformer to form a well-defined variational generative dynamic model named Hierarchical Time series Variational Transformer (HTV-Trans), which recovers the intrinsic non-stationary information in temporal dependencies. As a powerful probabilistic model, HTV-Trans is used to learn expressive representations of MTS and applied to forecasting tasks. Extensive experiments on diverse datasets show the effectiveness of HTV-Trans on MTS forecasting tasks.

IROS Conference 2024 Conference Paper

MQE: Unleashing the Power of Interaction with Multi-agent Quadruped Environment

  • Ziyan Xiong
  • Bo Chen
  • Shiyu Huang 0001
  • Wei-Wei Tu
  • Zhaofeng He 0001
  • Yang Gao

The advent of deep reinforcement learning (DRL) has significantly advanced the field of robotics, particularly in the control and coordination of quadruped robots. However, the complexity of real-world tasks often necessitates the deployment of multi-robot systems capable of sophisticated interaction and collaboration. To address this need, we introduce the Multi-agent Quadruped Environment (MQE), a novel platform designed to facilitate the development and evaluation of multi-agent reinforcement learning (MARL) algorithms in realistic and dynamic scenarios. MQE emphasizes complex interactions between robots and objects, hierarchical policy structures, and challenging evaluation scenarios that reflect real-world applications. We present a series of collaborative and competitive tasks within MQE, ranging from simple coordination to complex adversarial interactions, and benchmark state-of-the-art MARL algorithms. Our findings indicate that hierarchical reinforcement learning can simplify task learning, but also highlight the need for advanced algorithms capable of handling the intricate dynamics of multi-agent interactions. MQE serves as a stepping stone towards bridging the gap between simulation and practical deployment, offering a rich environment for future research in multi-agent systems and robot learning. For open-sourced code and more details of MQE, please refer to https://ziyanx02.github.io/multiagent-quadruped-environment/.

NeurIPS Conference 2024 Conference Paper

MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training

  • Bo Chen
  • Zhilei Bei
  • Xingyi Cheng
  • Pan Li
  • Jie Tang
  • Le Song

Multiple Sequence Alignment (MSA) plays a pivotal role in unveiling the evolutionary trajectories of protein families. The accuracy of protein structure predictions is often compromised for protein sequences that lack sufficient homologous information to construct high-quality MSA. Although various methods have been proposed to generate high-quality MSA under these conditions, they fall short in comprehensively capturing the intricate co-evolutionary patterns within MSA or require guidance from external oracle models. Here we introduce MSAGPT, a novel approach to prompt protein structure predictions via MSA generative pre-training in the low-MSA regime. MSAGPT employs a simple yet effective 2D evolutionary positional encoding scheme to model the complex evolutionary patterns. Built on this, its flexible 1D MSA decoding framework facilitates zero- and few-shot learning. Moreover, we demonstrate that leveraging feedback from AlphaFold2 (AF2) can further enhance the model's capacity via Rejective Fine-tuning (RFT) and Reinforcement Learning from AF2 Feedback (RLAF). Extensive experiments confirm the efficacy of MSAGPT in generating faithful and informative MSA (up to +8.5% TM-Score in few-shot scenarios). Transfer learning also demonstrates its great potential for the wide range of tasks that depend on MSA quality.

NeurIPS Conference 2024 Conference Paper

Training Compute-Optimal Protein Language Models

  • Xingyi Cheng
  • Bo Chen
  • Pan Li
  • Jing Gong
  • Jie Tang
  • Le Song

We explore optimal training of protein language models, an area of significant interest in biological research where guidance on best practices is limited. Most models are trained with extensive compute resources until performance gains plateau, focusing primarily on increasing model sizes rather than optimizing the efficient compute frontier that balances performance and compute budgets. Our investigation is grounded in a massive dataset consisting of 939 million protein sequences. We trained over 300 models ranging from 3.5 million to 10.7 billion parameters on 5 to 200 billion unique tokens to investigate the relations between model sizes, training token numbers, and objectives. First, we observed the effect of diminishing returns for the Causal Language Model (CLM) and that of overfitting for the Masked Language Model (MLM) when repeating the commonly used Uniref database. To address this, we included metagenomic protein sequences in the training set to increase diversity and avoid the plateau or overfitting effects. Second, we obtained the scaling laws of CLM and MLM on the Transformer, tailored to the specific characteristics of protein sequence data. Third, we observed a transfer scaling phenomenon from CLM to MLM, further demonstrating the effectiveness of transfer through scaling behaviors based on estimated Effectively Transferred Tokens. Finally, to validate our scaling laws, we compared large-scale versions of ESM-2 and PROGEN2 on downstream tasks, encompassing evaluations of protein generation as well as structure- and function-related tasks, all within equal or smaller pre-training compute budgets.

ICLR Conference 2024 Conference Paper

Weaker MVI Condition: Extragradient Methods with Multi-Step Exploration

  • Yifeng Fan
  • Yongqiang Li
  • Bo Chen

This paper proposes a new framework of algorithms extended from the celebrated extragradient algorithm. The min-max problem has attracted increasing attention because of its applications in machine learning tasks such as generative adversarial network (GAN) training. While the convex-concave setting has been researched exhaustively, the nonconvex-nonconcave setting faces many challenges, such as convergence to limit cycles. Given that general min-max optimization has been found to be intractable, recent research efforts have shifted towards tackling structured problems. One such line of work follows the weak Minty variational inequality (weak MVI), which is motivated by relaxing the Minty variational inequality (MVI) without compromising the convergence guarantee of the extragradient algorithm. Existing extragradient-type algorithms involve one exploration step and one update step per iteration. We analyze algorithms with multiple exploration steps and show that the current assumption can be further relaxed when more exploration is introduced. Furthermore, we design an adaptive algorithm that explores until the optimal improvement is achieved. This process exploits information from the whole trajectory and effectively tackles cyclic behaviors.
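
For reference, one extragradient iteration takes an exploration step at the current point and then updates using the operator evaluated at the explored point. A minimal sketch on the classic bilinear saddle problem min_x max_y xy, where plain gradient descent-ascent spirals outward but extragradient converges (the step size and iteration count here are illustrative):

```python
import numpy as np

def F(z):
    """Saddle operator for f(x, y) = x * y: (df/dx, -df/dy)."""
    x, y = z
    return np.array([y, -x])

gamma = 0.1
z = np.array([1.0, 1.0])   # extragradient iterate
w = np.array([1.0, 1.0])   # plain gradient descent-ascent, for contrast
for _ in range(2000):
    z_half = z - gamma * F(z)      # exploration step
    z = z - gamma * F(z_half)      # update with the explored gradient
    w = w - gamma * F(w)           # no exploration: diverges on this problem
```

The paper generalizes this template to multiple exploration steps per iteration; the single-exploration version above is the textbook baseline.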

NeurIPS Conference 2023 Conference Paper

Context-guided Embedding Adaptation for Effective Topic Modeling in Low-Resource Regimes

  • Yishi Xu
  • Jianqiao Sun
  • Yudi Su
  • Xinyang Liu
  • Zhibin Duan
  • Bo Chen
  • Mingyuan Zhou

Embedding-based neural topic models have turned out to be a superior option for low-resourced topic modeling. However, current approaches consider static word embeddings learnt from source tasks as general knowledge that can be transferred directly to the target task, discounting the dynamically changing nature of word meanings in different contexts, thus typically leading to sub-optimal results when adapting to new tasks with unfamiliar contexts. To settle this issue, we provide an effective method that centers on adaptively generating semantically tailored word embeddings for each task by fully exploiting contextual information. Specifically, we first condense the contextual syntactic dependencies of words into a semantic graph for each task, which is then modeled by a Variational Graph Auto-Encoder to produce task-specific word representations. On this basis, we further impose a learnable Gaussian mixture prior on the latent space of words to efficiently learn topic representations from a clustering perspective, which contributes to diverse topic discovery and fast adaptation to novel tasks. We have conducted a wealth of quantitative and qualitative experiments, and the results show that our approach comprehensively outperforms established topic models.

AAAI Conference 2023 Conference Paper

Dialogue Rewriting via Skeleton-Guided Generation

  • Chunlei Xin
  • Hongyu Lin
  • Shan Wu
  • Xianpei Han
  • Bo Chen
  • Wen Dai
  • Shuai Chen
  • Bin Wang

Dialogue rewriting aims to transform multi-turn, context-dependent dialogues into well-formed, context-independent text for most NLP systems. Previous dialogue rewriting benchmarks and systems assume a fluent and informative utterance to rewrite. Unfortunately, dialogue utterances from real-world systems are frequently noisy and contain various kinds of errors that can make them almost uninformative. In this paper, we first present the Real-world Dialogue Rewriting Corpus (RealDia), a new benchmark to evaluate how well current dialogue rewriting systems can deal with real-world noisy and uninformative dialogue utterances. RealDia contains annotated multi-turn dialogues from real scenes with ASR errors, spelling errors, redundancies and other noise ignored by previous dialogue rewriting benchmarks. We show that previous dialogue rewriting approaches are neither effective nor data-efficient at resolving RealDia. This paper then presents Skeleton-Guided Rewriter (SGR), which resolves the task of dialogue rewriting via a skeleton-guided generation paradigm. Experiments show that RealDia is a much more challenging benchmark for real-world dialogue rewriting, and that SGR can effectively resolve the task and outperform previous approaches by a large margin.

UAI Conference 2023 Conference Paper

Differential Privacy in Cooperative Multiagent Planning

  • Bo Chen
  • Calvin Hawkins
  • Mustafa O. Karabag
  • Cyrus Neary
  • Matthew T. Hale
  • Ufuk Topcu

Privacy-aware multiagent systems must protect agents’ sensitive data while simultaneously ensuring that agents accomplish their shared objectives. Towards this goal, we propose a framework to privatize inter-agent communications in cooperative multiagent decision-making problems. We study sequential decision-making problems formulated as cooperative Markov games with reach-avoid objectives. We apply a differential privacy mechanism to privatize agents’ communicated symbolic state trajectories, and analyze tradeoffs between the strength of privacy and the team’s performance. For a given level of privacy, this tradeoff is shown to depend critically upon the total correlation among agents’ state-action processes. We synthesize policies that are robust to privacy by reducing the value of the total correlation. Numerical experiments demonstrate that the team’s performance under these policies decreases by only 6 percent when comparing private versus non-private implementations of communication. By contrast, the team’s performance decreases by 88 percent when using baseline policies that ignore total correlation and only optimize team performance.
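
A standard way to privatize a single symbolic state, in the spirit of the mechanism described above, is k-ary randomized response. This sketch is a generic epsilon-differentially-private channel with illustrative alphabet and epsilon, not the paper's exact trajectory mechanism:

```python
import math
import random

def randomized_response(state, alphabet, eps):
    """Report the true symbolic state with probability e^eps / (e^eps + k - 1),
    otherwise a uniformly random other state; a k-ary eps-DP channel."""
    k = len(alphabet)
    p_true = math.exp(eps) / (math.exp(eps) + k - 1)
    if random.random() < p_true:
        return state
    return random.choice([s for s in alphabet if s != state])

random.seed(0)
alphabet = ["s0", "s1", "s2", "s3"]
noisy = [randomized_response("s0", alphabet, eps=5.0) for _ in range(1000)]
frac_true = noisy.count("s0") / 1000
```

Larger eps reports the true state more often (weaker privacy, better utility), which is exactly the privacy-performance tradeoff the abstract analyzes.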

AAMAS Conference 2023 Conference Paper

Equitability and Welfare Maximization for Allocating Indivisible Items

  • Ankang Sun
  • Bo Chen
  • Xuan Vinh Doan

We study fair allocations of indivisible goods and chores in conjunction with system efficiency, measured by two social welfare functions, namely utilitarian and egalitarian welfare. To model preference, each agent is associated with a cardinal and additive valuation function. The fairness criteria we are concerned with are equitability up to any item (EQX) and equitability up to one item (EQ1). For the trade-off between fairness and efficiency, we investigate efficiency loss under these fairness constraints and establish the price of fairness. From the computational perspective, we provide an almost complete picture of the computational complexity of (i) deciding the existence of an EQX/EQ1 and welfare-maximizing allocation; (ii) computing a welfare maximizer among all EQX/EQ1 allocations.
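
EQ1 for goods with additive valuations can be checked directly from its definition: for every ordered pair of agents, removing some single item from the other agent's bundle must close the equitability gap. A small illustrative checker (the `is_eq1` helper and the example valuations are ours, not from the paper):

```python
def is_eq1(alloc, u):
    """alloc[i]: list of item ids held by agent i; u[i][g]: agent i's value
    for item g (additive goods). EQ1 holds iff for every ordered pair (i, j)
    some single item can be removed from j's bundle so that agent i's value
    of her own bundle is at least agent j's value of the reduced bundle."""
    def val(a, bundle):
        return sum(u[a][g] for g in bundle)
    n = len(alloc)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            best_drop = max((u[j][g] for g in alloc[j]), default=0)
            if val(i, alloc[i]) < val(j, alloc[j]) - best_drop:
                return False
    return True

# Two agents with identical valuations over three goods.
u = [{0: 3, 1: 1, 2: 1}, {0: 3, 1: 1, 2: 1}]
```

An EQX checker would instead require the gap to close after removing *any* item, i.e., it would use the minimum-value item of the bundle rather than the maximum.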

JAAMAS Journal 2023 Journal Article

Fairness criteria for allocating indivisible chores: connections and efficiencies

  • Ankang Sun
  • Bo Chen
  • Xuan Vinh Doan

We study several fairness notions in allocating indivisible chores (i.e., items with disutilities) to agents who have additive and submodular cost functions. The fairness criteria we are concerned with are envy-free up to any item, envy-free up to one item, maximin share (MMS), and pairwise maximin share (PMMS), which are proposed as relaxations of envy-freeness in the setting of additive cost functions. For allocations under each fairness criterion, we establish their approximation guarantee for other fairness criteria. Under the additive setting, our results show strong connections between these fairness criteria and, at the same time, reveal intrinsic differences between goods allocation and chores allocation. However, such strong relationships cannot be inherited by the submodular setting, under which PMMS and MMS are no longer relaxations of envy-freeness and, even worse, few non-trivial guarantees exist. We also investigate efficiency loss under these fairness constraints and establish their prices of fairness.

NeurIPS Conference 2023 Conference Paper

Few-shot Generation via Recalling Brain-Inspired Episodic-Semantic Memory

  • Zhibin Duan
  • Zhiyi Lv
  • Chaojie Wang
  • Bo Chen
  • Bo An
  • Mingyuan Zhou

Few-shot generation, which adapts a generative model to a novel generation task from only a few data samples, is crucial for many real-world applications with limited data, e.g., artistic domains. Instead of training from scratch, recent works tend to leverage the prior knowledge stored in previous datasets, which is quite similar to the memory mechanism of human intelligence, but few of these works directly imitate the memory-recall mechanism that humans make good use of in accomplishing creative tasks, e.g., painting and writing. Inspired by the memory mechanism of the human brain, in this work we carefully design a variational structured memory module (VSM), which can simultaneously store both episodic and semantic memories to help existing generative models efficiently recall these memories during sample generation. Meanwhile, we introduce a bionic memory updating strategy for the conversion between episodic and semantic memories, which can also model the uncertainty during conversion. We then combine the developed VSM with various generative models under the Bayesian framework and evaluate these memory-augmented generative models on few-shot generation tasks, demonstrating the effectiveness of our methods.

IJCAI Conference 2023 Conference Paper

GPLight: Grouped Multi-agent Reinforcement Learning for Large-scale Traffic Signal Control

  • Yilin Liu
  • Guiyang Luo
  • Quan Yuan
  • Jinglin Li
  • Lei Jin
  • Bo Chen
  • Rui Pan

The use of multi-agent reinforcement learning (MARL) methods in coordinating traffic lights (CTL) has become increasingly popular, treating each intersection as an agent. However, existing MARL approaches either treat agents as absolutely homogeneous, i.e., the same network and parameters for every agent, or as completely heterogeneous, i.e., different networks and parameters for each agent. This creates a difficult balance between accuracy and complexity, especially in large-scale CTL. To address this challenge, we propose a grouped MARL method named GPLight. We first mine the similarity between agents' environments, considering both real-time traffic flow and static fine-grained road topology. We then propose two loss functions to maintain a learnable and dynamic clustering: one uses mutual information estimation for better stability, and the other maximizes separability between groups. Finally, GPLight enforces that agents in a group share the same network and parameters. This approach reduces complexity by promoting cooperation within the same group of agents while reflecting differences between groups to ensure accuracy. To verify the effectiveness of our method, we conduct experiments on both synthetic and real-world datasets with up to 1,089 intersections. Compared with state-of-the-art methods, experimental results demonstrate the superiority of our proposed method, especially in large-scale CTL.

NeurIPS Conference 2023 Conference Paper

Hierarchical Vector Quantized Transformer for Multi-class Unsupervised Anomaly Detection

  • Ruiying Lu
  • YuJie Wu
  • Long Tian
  • Dongsheng Wang
  • Bo Chen
  • Xiyang Liu
  • Ruimin Hu

Unsupervised image Anomaly Detection (UAD) aims to learn robust and discriminative representations of normal samples. While separate solutions per class incur expensive computation and limited generalizability, this paper focuses on building a unified framework for multiple classes. Under such a challenging setting, popular reconstruction-based networks with a continuous latent representation assumption always suffer from the "identical shortcut" issue, where both normal and abnormal samples can be well recovered and are difficult to distinguish. To address this pivotal issue, we propose a hierarchical vector quantized prototype-oriented Transformer under a probabilistic framework. First, instead of learning continuous representations, we preserve the typical normal patterns as discrete iconic prototypes, and confirm the importance of Vector Quantization in preventing the model from falling into the shortcut. The vector quantized iconic prototypes are integrated into the Transformer for reconstruction, such that an abnormal data point is flipped to a normal data point. Second, we investigate an exquisite hierarchical framework to relieve the codebook collapse issue and replenish frail normal patterns. Third, a prototype-oriented optimal transport method is proposed to better regulate the prototypes and hierarchically evaluate the abnormal score. Evaluated on the MVTec-AD and VisA datasets, our model surpasses state-of-the-art alternatives and possesses good interpretability. The code is available at https://github.com/RuiyingLu/HVQ-Trans.
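
The vector-quantization step at the heart of such prototype-based models is a nearest-prototype lookup: each feature vector is replaced by its closest codebook entry. A minimal sketch (the toy codebook and inputs are illustrative, not the paper's learned prototypes or hierarchical scheme):

```python
import numpy as np

def quantize(x, codebook):
    """Replace each row of x with its nearest codebook prototype (L2)."""
    d = np.linalg.norm(x[:, None, :] - codebook[None, :, :], axis=-1)
    idx = d.argmin(axis=1)                 # index of nearest prototype
    return codebook[idx], idx

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])        # two toy prototypes
x = np.array([[0.1, -0.1], [0.9, 1.2], [2.0, 2.0]])  # features to quantize
quantized, idx = quantize(x, codebook)
```

Because every input snaps to a stored normal prototype, an anomalous feature is reconstructed toward the nearest normal pattern, which is what makes the reconstruction error a usable anomaly signal.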

IROS Conference 2023 Conference Paper

Magnetically Controlled Cell Robots with Immune-Enhancing Potential

  • Hongyan Sun
  • Yuguo Dai
  • Jiaying Zhang
  • Junjie Xu
  • Lina Jia
  • Chutian Wang
  • Luyao Wang
  • Chan Li

Magnetic microrobots exhibit enormous potential in targeted drug delivery owing to remote wireless manipulation and minimal invasiveness in medical treatment. A high degree of freedom offers magnetically propelled robots extraordinary application prospects, since they can be controlled precisely when different magnetic field sources work cooperatively. However, the biocompatibility of microrobots has attracted sustained and general concern. Therefore, it is highly necessary to develop a promising carrier with high biocompatibility and to investigate the mechanism of drug loading and release triggered by the special microenvironment in the targeted region. In this paper, we propose magnetically controlled cell robots (MCRs) based on macrophages propelled by a rotating magnetic field. The innovative MCRs exhibit good biocompatibility and low toxicity when the concentration of polylysine-coated Fe nanoparticles (PLL@FeNPs) is optimized to 40 µg/mL. These MCRs are loaded with murine interleukin-12 (IL-12), murine chemokine (C-C motif) ligand 5 (CCL-5), and murine C-X-C motif chemokine ligand 10 (CXCL-10), which stimulate T cell differentiation and the recruitment of monocytes. The macrophages showed an obvious M1-polarization tendency to phagocytose intracellular pathogens and resist the growth of tumor cells. Under the control of a magnetic propelling system composed of three pairs of Helmholtz coils, the cell robots can be propelled wirelessly and moved along a predefined path with high accuracy. Moreover, the MCRs could approach cancer cells and stop at places of interest in vitro. In conclusion, we have accomplished the preliminary construction of a targeted drug delivery system that displays great immune-enhancing potential for targeted drug delivery.

NeurIPS Conference 2023 Conference Paper

Tuning Multi-mode Token-level Prompt Alignment across Modalities

  • Dongsheng Wang
  • Miaoge Li
  • Xinyang Liu
  • MingSheng Xu
  • Bo Chen
  • Hanwang Zhang

Advancements in prompt tuning of vision-language models have underscored their potential in enhancing open-world visual concept comprehension. However, prior works primarily focus on single-mode (only one prompt for each modality) and holistic-level (image or sentence) semantic alignment, which fails to capture sample diversity, leading to sub-optimal prompt discovery. To address this limitation, we propose a multi-mode token-level tuning framework that leverages optimal transport to learn and align a set of prompt tokens across modalities. Specifically, we rely on two essential factors: 1) multi-mode prompt discovery, which guarantees diverse semantic representations, and 2) token-level alignment, which helps explore fine-grained similarity. Consequently, the similarity can be calculated as a hierarchical transportation problem between the modality-specific sets. Extensive experiments on popular image recognition benchmarks show the superior generalization and few-shot abilities of our approach. The qualitative analysis demonstrates that the learned prompt tokens have the ability to capture diverse visual concepts.
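A minimal sketch of how such a token-level transportation problem is typically solved, using generic entropy-regularized Sinkhorn iterations over two uniform token sets (the `reg` and `iters` values are illustrative assumptions, not the paper's settings):

```python
import math

def sinkhorn(cost, reg=0.1, iters=200):
    """Entropy-regularized OT plan between two uniform token sets."""
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-c / reg) for c in row] for row in cost]  # Gibbs kernel
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):  # alternate scaling to match uniform marginals
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # transport plan: P[i][j] = u_i * K_ij * v_j
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

The resulting plan couples each prompt token with each token on the other side; summing `cost[i][j] * P[i][j]` then yields the transport cost that serves as a cross-modal similarity score.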

NeurIPS Conference 2022 Conference Paper

A Variational Edge Partition Model for Supervised Graph Representation Learning

  • Yilin He
  • Chaojie Wang
  • Hao Zhang
  • Bo Chen
  • Mingyuan Zhou

Graph neural networks (GNNs), which propagate the node features through the edges and learn how to transform the aggregated features under label supervision, have achieved great success in supervised feature extraction for both node-level and graph-level classification tasks. However, GNNs typically treat the graph structure as given and ignore how the edges are formed. This paper introduces a graph generative process to model how the observed edges are generated by aggregating the node interactions over a set of overlapping node communities, each of which contributes to the edges via a logical OR mechanism. Based on this generative model, we partition each edge into the summation of multiple community-specific weighted edges and use them to define community-specific GNNs. A variational inference framework is proposed to jointly learn a GNN-based inference network that partitions the edges into different communities, these community-specific GNNs, and a GNN-based predictor that combines community-specific GNNs for the end classification task. Extensive evaluations on real-world graph datasets have verified the effectiveness of the proposed method in learning discriminative representations for both node-level and graph-level classification tasks.

NeurIPS Conference 2022 Conference Paper

Alleviating "Posterior Collapse" in Deep Topic Models via Policy Gradient

  • Yewen Li
  • Chaojie Wang
  • Zhibin Duan
  • Dongsheng Wang
  • Bo Chen
  • Bo An
  • Mingyuan Zhou

Deep topic models have proven to be a promising way to extract hierarchical latent representations from documents represented as high-dimensional bag-of-words vectors. However, the representation capability of existing deep topic models is still limited by the phenomenon of "posterior collapse", which has been widely criticized in deep generative models and results in higher-level latent representations exhibiting similar or meaningless patterns. To this end, in this paper, we first develop a novel deep-coupling generative process for existing deep topic models, which incorporates skip connections into the generation of documents, enforcing strong links between the document and its multi-layer latent representations. After that, utilizing data augmentation techniques, we reformulate the deep-coupling generative process as a Markov decision process and develop a corresponding Policy Gradient (PG) based training algorithm, which can further alleviate the information reduction at higher layers. Extensive experiments demonstrate that our developed methods can effectively alleviate "posterior collapse" in deep topic models, contributing to providing higher-quality latent document representations.

AAAI Conference 2022 Conference Paper

CODE: Contrastive Pre-training with Adversarial Fine-Tuning for Zero-Shot Expert Linking

  • Bo Chen
  • Jing Zhang
  • Xiaokang Zhang
  • Xiaobin Tang
  • Lingfan Cai
  • Hong Chen
  • Cuiping Li
  • Peng Zhang

Expert finding, a popular service provided by many online websites such as Expertise Finder, LinkedIn, and AMiner, is beneficial to seeking candidate qualifications, consultants, and collaborators. However, its quality suffers from a lack of ample sources of expert information. This paper employs AMiner as the basis, with the aim of linking any external experts to their counterparts on AMiner. As it is infeasible to acquire sufficient linkages from arbitrary external sources, we explore the problem of zero-shot expert linking. In this paper, we propose CODE, which first pre-trains an expert linking model by contrastive learning on AMiner so that it can capture the representation and matching patterns of experts without supervised signals; the model is then fine-tuned between AMiner and external sources in an adversarial manner to enhance its transferability. For evaluation, we first design two intrinsic tasks, author identification and paper clustering, to validate the representation and matching capability endowed by contrastive learning. The final external expert linking performance on two genres of external sources also implies the superiority of the adversarial fine-tuning method. Additionally, we show the online deployment of CODE, and continuously improve its online performance via active learning.

JAAMAS Journal 2022 Journal Article

Equitability and welfare maximization for allocating indivisible items

  • Ankang Sun
  • Bo Chen
  • Xuan Vinh Doan

We study fair allocations of indivisible goods and chores in conjunction with system efficiency, measured by two social welfare functions, namely utilitarian and egalitarian welfare. To model preference, each agent is associated with a cardinal and additive valuation function. The fairness criteria we are concerned with are equitability up to any item (EQX) and equitability up to one item (EQ1). For the trade-off between fairness and efficiency, we investigate efficiency loss under these fairness constraints and establish the price of fairness. From the computational perspective, we provide a complete picture of the computational complexity of (i) deciding the existence of an EQX/EQ1 and welfare-maximizing allocation; (ii) computing a welfare maximizer among all EQX/EQ1 allocations.
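As a minimal illustration of the EQ1 criterion for goods (the definition follows the standard fairness literature; the function name and example values are ours, not the paper's):

```python
def is_eq1(valuations, allocation):
    """Check equitability up to one item (EQ1) for additive goods.

    valuations[i][g] is agent i's value for item g;
    allocation[i] is the list of items assigned to agent i.
    """
    own = [sum(valuations[i][g] for g in bundle)
           for i, bundle in enumerate(allocation)]
    for i in range(len(allocation)):
        for j in range(len(allocation)):
            if own[i] >= own[j]:
                continue  # agent i is not worse off than agent j
            # EQ1: removing *some* item from j's bundle must close the gap
            if not any(own[i] >= own[j] - valuations[j][g] for g in allocation[j]):
                return False
    return True
```

EQX strengthens the `any` to an `all` over the positively valued items in `allocation[j]`.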

NeurIPS Conference 2022 Conference Paper

HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding

  • Yishi Xu
  • Dongsheng Wang
  • Bo Chen
  • Ruiying Lu
  • Zhibin Duan
  • Mingyuan Zhou

Embedded topic models are able to learn interpretable topics even with large and heavy-tailed vocabularies. However, they generally hold the Euclidean embedding space assumption, leading to a basic limitation in capturing hierarchical relations. To this end, we present a novel framework that introduces hyperbolic embeddings to represent words and topics. With the tree-likeness property of hyperbolic space, the underlying semantic hierarchy among words and topics can be better exploited to mine more interpretable topics. Furthermore, due to the superiority of hyperbolic geometry in representing hierarchical data, tree-structure knowledge can also be naturally injected to guide the learning of a topic hierarchy. Therefore, we further develop a regularization term based on the idea of contrastive learning to inject prior structural knowledge efficiently. Experiments on both topic taxonomy discovery and document representation demonstrate that the proposed framework achieves improved performance against existing embedded topic models.

NeurIPS Conference 2022 Conference Paper

Knowledge-Aware Bayesian Deep Topic Model

  • Dongsheng Wang
  • Yishi Xu
  • Miaoge Li
  • Zhibin Duan
  • Chaojie Wang
  • Bo Chen
  • Mingyuan Zhou

We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling. Although embedded topic models (ETMs) and their variants have gained promising performance in text analysis, they mainly focus on mining word co-occurrence patterns, ignoring potentially easy-to-obtain prior topic hierarchies that could help enhance topic coherence. While several knowledge-based topic models have recently been proposed, they are either only applicable to shallow hierarchies or sensitive to the quality of the provided prior knowledge. To this end, we develop a novel deep ETM that jointly models the documents and the given prior knowledge by embedding the words and topics into the same space. Guided by the provided domain knowledge, the proposed model tends to discover topic hierarchies that are organized into interpretable taxonomies. Moreover, with a technique for adapting a given graph, our extended version allows the structure of the prior knowledge to be fine-tuned to match the target corpus. Extensive experiments show that our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.

IJCAI Conference 2022 Conference Paper

Neural Re-ranking in Multi-stage Recommender Systems: A Review

  • Weiwen Liu
  • Yunjia Xi
  • Jiarui Qin
  • Fei Sun
  • Bo Chen
  • Weinan Zhang
  • Rui Zhang
  • Ruiming Tang

As the final stage of the multi-stage recommender system (MRS), re-ranking directly affects users’ experience and satisfaction by rearranging the input ranking lists, and thereby plays a critical role in MRS. With the advances in deep learning, neural re-ranking has become a trending topic and has been widely adopted in industrial applications. This review aims at integrating re-ranking algorithms into a broader picture, and paving the way for more comprehensive solutions for future research. For this purpose, we first present a taxonomy of current methods on neural re-ranking. Then we describe these methods along with their historic development according to their objectives. The network structure, personalization, and complexity are also discussed and compared. Next, we provide a benchmark for the major neural re-ranking models and quantitatively analyze their re-ranking performance. Finally, the review concludes with a discussion on future prospects of this field. A list of papers discussed in this review, the benchmark datasets, our re-ranking library LibRerank, and detailed parameter settings are publicly available at https://github.com/LibRerank-Community/LibRerank.

NeurIPS Conference 2021 Conference Paper

A Prototype-Oriented Framework for Unsupervised Domain Adaptation

  • Korawat Tanwisuth
  • Xinjie Fan
  • Huangjie Zheng
  • Shujian Zhang
  • Hao Zhang
  • Bo Chen
  • Mingyuan Zhou

Existing methods for unsupervised domain adaptation often rely on minimizing some statistical distance between the source and target samples in the latent space. To avoid the sampling variability, class imbalance, and data-privacy concerns that often plague these methods, we instead provide a memory and computation-efficient probabilistic framework to extract class prototypes and align the target features with them. We demonstrate the general applicability of our method on a wide range of scenarios, including single-source, multi-source, class-imbalance, and source-private domain adaptation. Requiring no additional model parameters and having a moderate increase in computation over the source model alone, the proposed method achieves competitive performance with state-of-the-art methods.

AAAI Conference 2021 Conference Paper

Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation

  • Ning Bian
  • Xianpei Han
  • Bo Chen
  • Le Sun

A fundamental ability of humans is to utilize commonsense knowledge in language understanding and question answering. In recent years, many knowledge-enhanced Commonsense Question Answering (CQA) approaches have been proposed. However, it remains unclear: (1) How far can we get by exploiting external knowledge for CQA? (2) How much of the potential of knowledge has been exploited in current CQA models? (3) Which are the most promising directions for future CQA? To answer these questions, we benchmark knowledge-enhanced CQA by conducting extensive experiments on multiple standard CQA datasets using a simple and effective knowledge-to-text transformation framework. Experiments show that: (1) Our knowledge-to-text framework is effective and achieves state-of-the-art performance on the CommonsenseQA dataset, providing a simple and strong knowledge-enhanced baseline for CQA; (2) The potential of knowledge is still far from being fully exploited in CQA — there is a significant performance gap from current models to our models with golden knowledge; and (3) Context-sensitive knowledge selection, heterogeneous knowledge exploitation, and commonsense-rich language models are promising CQA directions.

AAMAS Conference 2021 Conference Paper

Connections between Fairness Criteria and Efficiency for Allocating Indivisible Chores

  • Ankang Sun
  • Bo Chen
  • Xuan Vinh Doan

We study several fairness notions in allocating indivisible chores (i.e., items with non-positive values): envy-freeness and its relaxations. For allocations under each fairness criterion, we establish their approximation guarantees for other fairness criteria. Under the setting of additive cost functions, our results show strong connections between these fairness criteria and, at the same time, reveal intrinsic differences between goods allocation and chores allocation. Furthermore, we investigate the efficiency loss under these fairness constraints and establish their prices of fairness.

JBHI Journal 2021 Journal Article

Quantifying Axial Spine Images Using Object-Specific Bi-Path Network

  • Liyan Lin
  • Xi Tao
  • Wei Yang
  • Shumao Pang
  • Zhihai Su
  • Hai Lu
  • Shuo Li
  • Qianjin Feng

Automatic estimation of indices from medical images is the main goal of computer-aided quantification (CADq), which speeds up diagnosis and lightens the workload of radiologists. Deep learning techniques are a good choice for implementing CADq. Usually, to acquire high-accuracy quantification, a specific network architecture needs to be designed for a given CADq task. In this study, considering that the target organs are the intervertebral disc and the dural sac, we propose an object-specific bi-path network (OSBP-Net) for axial spine image quantification. Each path of the OSBP-Net comprises a shallow feature extraction layer (SFE) and a deep feature extraction sub-network (DFE). The SFEs use different convolution strides because the two target organs have different anatomical sizes. The DFEs use average pooling for downsampling, based on the observation that the target organs have lower intensity than the background. In addition, an inter-path dissimilarity constraint is proposed and applied to the output of the SFEs, taking into account that the activated regions in the feature maps of the two paths should, in theory, be different. An inter-index correlation regularization is introduced and applied to the output of the DFEs, based on the observation that the diameter and area of the same object exhibit an approximately linear relation. The prediction results of OSBP-Net are compared to several state-of-the-art machine learning-based CADq methods. The comparison reveals that the proposed method substantially outperforms the competing methods, indicating its great potential for spine CADq.

IROS Conference 2021 Conference Paper

Simultaneous Prediction of Pedestrian Trajectory and Actions based on Context Information Iterative Reasoning

  • Bo Chen
  • Decai Li
  • Yuqing He

Predicting pedestrian trajectories and actions in complex environments is challenging due to the complexity of human behavior and the variety of internal and external stimuli. Much work has gone towards predicting trajectories and actions separately, without mining the coupling relationships between them, which is important information that we humans use to reason and predict. Inspired by this, we propose an end-to-end joint context information iterative reasoning network (CIR-Net). Specifically, a novel heterogeneous spatiotemporal graph module (HST-Graph) is proposed to encode and aggregate multiple types of context information about the motion pattern and the scene. An action-trajectory hybrid guidance module is proposed to enhance the ability of long-time prediction by utilizing the internal coupling between actions and trajectories. Moreover, an iterative reasoning structure is designed to iteratively correct the trajectory and action prediction errors. Experimental results on the ETH&UCY and VIRAT datasets demonstrate the favorable performance of the framework.

NeurIPS Conference 2021 Conference Paper

STEP: Segmenting and Tracking Every Pixel

  • Mark Weber
  • Jun Xie
  • Yukun Zhu
  • Paul Voigtlaender
  • Bo Chen
  • Bradley Green
  • Andreas Geiger
  • Bastian Leibe

The task of assigning semantic classes and track identities to every pixel in a video is called video panoptic segmentation. Our work is the first that targets this task in a real-world setting requiring dense interpretation in both spatial and temporal domains. As the ground-truth for this task is difficult and expensive to obtain, existing datasets are either constructed synthetically or only sparsely annotated within short video clips. To overcome this, we introduce a new benchmark encompassing two datasets, KITTI-STEP, and MOTChallenge-STEP. The datasets contain long video sequences, providing challenging examples and a test-bed for studying long-term pixel-precise segmentation and tracking under real-world conditions. We further propose a novel evaluation metric Segmentation and Tracking Quality (STQ) that fairly balances semantic and tracking aspects of this task and is more appropriate for evaluating sequences of arbitrary length. Finally, we provide several baselines to evaluate the status of existing methods on this new challenging dataset. We have made our datasets, metric, benchmark servers, and baselines publicly available, and hope this will inspire future research.

NeurIPS Conference 2021 Conference Paper

TopicNet: Semantic Graph-Guided Topic Discovery

  • Zhibin Duan
  • Yishi Xu
  • Bo Chen
  • Dongsheng Wang
  • Chaojie Wang
  • Mingyuan Zhou

Existing deep hierarchical topic models are able to extract semantically meaningful topics from a text corpus in an unsupervised manner and automatically organize them into a topic hierarchy. However, it is unclear how to incorporate prior beliefs, such as knowledge graphs, to guide the learning of the topic hierarchy. To address this issue, we introduce TopicNet as a deep hierarchical topic model that can inject prior structural knowledge as an inductive bias to influence the learning. TopicNet represents each topic as a Gaussian-distributed embedding vector, projects the topics of all layers into a shared embedding space, and explores both the symmetric and asymmetric similarities between Gaussian embedding vectors to incorporate prior semantic hierarchies. With a variational auto-encoding inference network, the model parameters are optimized by minimizing the evidence lower bound and a supervised loss via stochastic gradient descent. Experiments on widely used benchmarks show that TopicNet outperforms related deep topic models in discovering deeper interpretable topics and mining better document representations.

IJCAI Conference 2020 Conference Paper

Alleviate Dataset Shift Problem in Fine-grained Entity Typing with Virtual Adversarial Training

  • Haochen Shi
  • Siliang Tang
  • Xiaotao Gu
  • Bo Chen
  • Zhigang Chen
  • Jian Shao
  • Xiang Ren

The recent success of Distant Supervision (DS) brings abundant labeled data for the task of fine-grained entity typing (FET) without human annotation. However, the heuristically generated labels inevitably bring a significant distribution gap, namely dataset shift, between the distantly labeled training set and the manually curated test set. Considerable efforts have been made to alleviate this problem from the label perspective by either intelligently denoising the training labels, or designing noise-aware loss functions. Despite their progress, the dataset shift can hardly be eliminated completely. In this work, complementary to the label perspective, we reconsider this problem from the model perspective: Can we learn a more robust typing model with the existence of dataset shift? To this end, we propose a novel regularization module based on virtual adversarial training (VAT). The proposed approach first uses a self-paced sample selection function to select suitable samples for VAT, then constructs virtual adversarial perturbations based on the selected samples, and finally regularizes the model to be robust to such perturbations. Experiments on two benchmarks demonstrate the effectiveness of the proposed method, with an average 3.8%, 2.5%, and 3.2% improvement in accuracy, Macro F1 and Micro F1 respectively compared to the next best method.

NeurIPS Conference 2020 Conference Paper

Bayesian Attention Modules

  • Xinjie Fan
  • Shujian Zhang
  • Bo Chen
  • Mingyuan Zhou

Attention modules, as simple and effective tools, have not only enabled deep neural networks to achieve state-of-the-art results in many domains, but also enhanced their interpretability. Most current models use deterministic attention modules due to their simplicity and ease of optimization. Stochastic counterparts, on the other hand, are less popular despite their potential benefits. The main reason is that stochastic attention often introduces optimization issues or requires significant model changes. In this paper, we propose a scalable stochastic version of attention that is easy to implement and optimize. We construct simplex-constrained attention distributions by normalizing reparameterizable distributions, making the training process differentiable. We learn their parameters in a Bayesian framework where a data-dependent prior is introduced for regularization. We apply the proposed stochastic attention modules to various attention-based models, with applications to graph node classification, visual question answering, image captioning, machine translation, and language understanding. Our experiments show the proposed method brings consistent improvements over the corresponding baselines.
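A minimal sketch of the core idea, sampling simplex-constrained attention weights by normalizing reparameterizable draws (here a Weibull reparameterization with an illustrative shape `k`; the actual distributions and priors in the paper may differ):

```python
import math, random

def bayesian_attention_row(scores, k=5.0):
    """Sample one simplex-constrained attention vector by normalizing
    reparameterized Weibull draws with scale exp(score) and shape k."""
    lam = [math.exp(s) for s in scores]                 # positive scales from logits
    # Inverse-CDF (reparameterized) Weibull sample: lam * (-ln u)^(1/k), u ~ U(0, 1)
    draws = [l * (-math.log(random.random() or 1e-12)) ** (1.0 / k) for l in lam]
    total = sum(draws)
    return [d / total for d in draws]                   # normalize onto the simplex
```

Because each draw is a deterministic function of the uniform noise, gradients can flow through the sampling step in a tensor implementation, which is what makes training with stochastic attention differentiable.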

IJCAI Conference 2020 Conference Paper

BERT-INT: A BERT-based Interaction Model For Knowledge Graph Alignment

  • Xiaobin Tang
  • Jing Zhang
  • Bo Chen
  • Yang Yang
  • Hong Chen
  • Cuiping Li

Knowledge graph alignment aims to link equivalent entities across different knowledge graphs. To utilize both the graph structures and side information such as names, descriptions and attributes, most works propagate the side information, especially names, through linked entities by graph neural networks. However, due to the heterogeneity of different knowledge graphs, alignment accuracy suffers from aggregating different neighbors. This work presents an interaction model that leverages only the side information. Instead of aggregating neighbors, we compute the interactions between neighbors, which can capture fine-grained matches of neighbors. Similarly, the interactions of attributes are also modeled. Experimental results show that our model significantly outperforms the best state-of-the-art methods by 1.9-9.7% in terms of HitRatio@1 on the DBP15K dataset.

NeurIPS Conference 2020 Conference Paper

Bidirectional Convolutional Poisson Gamma Dynamical Systems

  • Wenchao Chen
  • Chaojie Wang
  • Bo Chen
  • Yicheng Liu
  • Hao Zhang
  • Mingyuan Zhou

Incorporating the natural document-sentence-word structure into hierarchical Bayesian modeling, we propose convolutional Poisson gamma dynamical systems (PGDS) that introduce not only word-level probabilistic convolutions, but also sentence-level stochastic temporal transitions. With word-level convolutions capturing phrase-level topics and sentence-level transitions capturing how the topic usages evolve over consecutive sentences, we aggregate the topic proportions of all sentences of a document as its feature representation. To consider not only forward but also backward sentence-level information transmissions, we further develop a bidirectional convolutional PGDS to incorporate the full contextual information to represent each sentence. For efficient inference, we construct a convolutional-recurrent inference network, which provides both sentence-level and document-level representations, and introduce a hybrid Bayesian inference scheme combining stochastic-gradient MCMC and amortized variational inference. Experimental results on a variety of document corpora demonstrate that the proposed models can extract expressive multi-level latent representations, including interpretable phrase-level topics and sentence-level temporal transitions as well as discriminative document-level features, achieving state-of-the-art document categorization performance while being memory and computation efficient.

NeurIPS Conference 2020 Conference Paper

Deep Relational Topic Modeling via Graph Poisson Gamma Belief Network

  • Chaojie Wang
  • Hao Zhang
  • Bo Chen
  • Dongsheng Wang
  • Zhengjue Wang
  • Mingyuan Zhou

To analyze a collection of interconnected documents, relational topic models (RTMs) have been developed to describe both the link structure and document content, exploring their underlying relationships via a single-layer latent representation with limited expressive capability. To better utilize the document network, we first propose graph Poisson factor analysis (GPFA) that constructs a probabilistic model for interconnected documents and also provides closed-form Gibbs sampling update equations, moving beyond sophisticated approximate assumptions of existing RTMs. Extending GPFA, we develop a novel hierarchical RTM named graph Poisson gamma belief network (GPGBN), and further introduce two different Weibull distribution based variational graph auto-encoders for efficient model inference and effective network information aggregation. Experimental results demonstrate that our models extract high-quality hierarchical latent document representations, leading to improved performance over baselines on various graph analytic tasks.

AAAI Conference 2020 Conference Paper

Learning to Map Frequent Phrases to Sub-Structures of Meaning Representation for Neural Semantic Parsing

  • Bo Chen
  • Xianpei Han
  • Ben He
  • Le Sun

Neural semantic parsers usually generate meaning representation tokens from natural language tokens via an encoder-decoder model. However, there is often a vocabulary-mismatch problem between natural language utterances and logical forms. That is, one word maps to several atomic logical tokens, which need to be handled as a whole rather than as individual logical tokens over multiple steps. In this paper, we propose that the vocabulary-mismatch problem can be effectively resolved by leveraging appropriate logical tokens. Specifically, we exploit macro actions, which are of the same granularity as words/phrases, and allow the model to learn mappings from frequent phrases to corresponding sub-structures of the meaning representation. Furthermore, macro actions are compact, and therefore utilizing them can significantly reduce the search space, which brings a great benefit to weakly supervised semantic parsing. Experiments show that our method leads to substantial performance improvements on three benchmarks, in both supervised and weakly supervised settings.

JBHI Journal 2020 Journal Article

Multiple Axial Spine Indices Estimation via Dense Enhancing Network With Cross-Space Distance-Preserving Regularization

  • Liyan Lin
  • Xi Tao
  • Shumao Pang
  • Zhihai Su
  • Hai Lu
  • Shuo Li
  • Qianjin Feng
  • Bo Chen

Automatic estimation of axial spine indices is clinically desired for various spine computer aided procedures, such as disease diagnosis, therapeutic evaluation, pathophysiological understanding, risk assessment, and biomechanical modeling. Currently, the spine indices are manually measured by physicians, which is time-consuming and laborious. Even worse, the tedious manual procedure might result in inaccurate measurement. To deal with this problem, in this paper, we aim at developing an automatic method to estimate multiple indices from axial spine images. Inspired by the success of deep learning for regression problems and the densely connected network for image classification, we propose a dense enhancing network (DE-Net) which uses the dense enhancing blocks (DEBs) as its main body, where a feature enhancing layer is added to each of the bypass in a dense block. The DEB is designed to enhance discriminative feature embedding from the intervertebral disc and the dural sac areas. In addition, the cross-space distance-preserving regularization (CSDPR), which enforces consistent inter-sample distances between the output and the label spaces, is proposed to regularize the loss function of the DE-Net. To train and validate the proposed method, we collected 895 axial spine MRI images from 143 subjects and manually measured the indices as the ground truth. The results show that all deep learning models obtain very small prediction errors, and the proposed DE-Net with CSDPR acquires the smallest error among all methods, indicating that our method has great potential for spine computer aided procedures.

AAAI Conference 2020 Conference Paper

PEIA: Personality and Emotion Integrated Attentive Model for Music Recommendation on Social Media Platforms

  • Tiancheng Shen
  • Jia Jia
  • Yan Li
  • Yihui Ma
  • Yaohua Bu
  • Hanjie Wang
  • Bo Chen
  • Tat-Seng Chua

With the rapid expansion of digital music formats, it is essential to recommend users their favorite music. For music recommendation, users' personality and emotion greatly affect their music preference, in a long-term and a short-term manner respectively, while rich social media data provides effective feedback on this information. In this paper, aiming at music recommendation on social media platforms, we propose a Personality and Emotion Integrated Attentive model (PEIA), which fully utilizes social media data to comprehensively model users' long-term taste (personality) and short-term preference (emotion). Specifically, it takes full advantage of personality-oriented user features, emotion-oriented user features and music features of multi-faceted attributes. Hierarchical attention is employed to distinguish the important factors when incorporating the latent representations of users' personality and emotion. Extensive experiments on a large real-world dataset of 171,254 users demonstrate the effectiveness of our PEIA model, which achieves an NDCG of 0.5369, outperforming the state-of-the-art methods. We also perform detailed parameter analysis and feature contribution analysis, which further verify our scheme and demonstrate the significance of co-modeling user personality and emotion in music recommendation.

IJCAI Conference 2020 Conference Paper

Switching Poisson Gamma Dynamical Systems

  • Wenchao Chen
  • Bo Chen
  • Yicheng Liu
  • Qianru Zhao
  • Mingyuan Zhou

We propose Switching Poisson gamma dynamical systems (SPGDS) to model sequentially observed multivariate count data. Different from previous models, SPGDS assigns its latent variables to a mixture of gamma-distributed parameters to model complex sequences, describe the nonlinear dynamics, and capture various temporal dependencies. For efficient inference, we develop a hybrid stochastic gradient-MCMC and switching recurrent autoencoding variational inference scheme, which is scalable to large-scale sequences and fast in out-of-sample prediction. Experiments on both unsupervised and supervised tasks demonstrate that the proposed model not only has excellent fitting and prediction performance on complex dynamic sequences, but also separates different dynamical patterns within them.

IJCAI Conference 2019 Conference Paper

CFM: Convolutional Factorization Machines for Context-Aware Recommendation

  • Xin Xin
  • Bo Chen
  • Xiangnan He
  • Dong Wang
  • Yue Ding
  • Joemon Jose

Factorization Machine (FM) is an effective solution for context-aware recommender systems (CARS) which models second-order feature interactions by inner product. However, it is insufficient to capture high-order and nonlinear interaction signals. While several recent efforts have enhanced FM with neural networks, they assume the embedding dimensions are independent of each other and model high-order interactions in a rather implicit manner. In this paper, we propose Convolutional Factorization Machine (CFM) to address the above limitations. Specifically, CFM models second-order interactions with the outer product, resulting in ''images'' which capture correlations between embedding dimensions. All generated ''images'' are then stacked, forming an interaction cube, and 3D convolution is applied to it to learn high-order interaction signals in an explicit manner. Besides, we also leverage a self-attention mechanism to perform the pooling of features to reduce time complexity. We conduct extensive experiments on three real-world datasets, demonstrating significant improvement of CFM over competing methods for context-aware top-k recommendation.
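The interaction-cube construction described in this abstract — one outer-product "image" per feature pair, stacked along a third axis for 3D convolution — can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name and shapes are assumptions.

```python
import numpy as np

def interaction_cube(embeddings):
    """Stack pairwise outer products of feature embeddings into a 3D cube.

    embeddings: (m, d) array of m feature-field embeddings of dimension d.
    Returns a (p, d, d) cube with one d x d "image" per feature pair,
    where p = m * (m - 1) / 2. A 3D convolution would then run over
    this cube to extract high-order interaction signals.
    """
    m, _ = embeddings.shape
    maps = [np.outer(embeddings[i], embeddings[j])
            for i in range(m) for j in range(i + 1, m)]
    return np.stack(maps)

# toy example: 3 feature fields, embedding size 4
rng = np.random.default_rng(0)
emb = rng.standard_normal((3, 4))
cube = interaction_cube(emb)  # shape (3, 4, 4)
```

Each slice `cube[k]` captures correlations between every pair of embedding dimensions for one feature pair, which is what lets the subsequent convolution model interactions explicitly.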

IJCAI Conference 2019 Conference Paper

End-to-End Multi-Perspective Matching for Entity Resolution

  • Cheng Fu
  • Xianpei Han
  • Le Sun
  • Bo Chen
  • Wei Zhang
  • Suhui Wu
  • Hao Kong

Entity resolution (ER) aims to identify data records referring to the same real-world entity. Due to the heterogeneity of entity attributes and the diversity of similarity measures, one main challenge of ER is how to select appropriate similarity measures for different attributes. Previous ER methods usually employ heuristic similarity selection algorithms, which are highly specialized to specific ER problems and are hard to generalize to other situations. Furthermore, previous studies usually perform similarity learning and similarity selection independently, which often results in error propagation and is hard to optimize globally. To resolve the above problems, this paper proposes an end-to-end multi-perspective entity matching model, which can adaptively select optimal similarity measures for heterogeneous attributes by jointly learning and selecting similarity measures in an end-to-end way. Experiments on two real-world datasets show that our method significantly outperforms previous ER methods.

AAAI Conference 2019 Conference Paper

Hierarchical Reinforcement Learning for Course Recommendation in MOOCs

  • Jing Zhang
  • Bowen Hao
  • Bo Chen
  • Cuiping Li
  • Hong Chen
  • Jimeng Sun

The proliferation of massive open online courses (MOOCs) demands an effective way of personalized course recommendation. The recent attention-based recommendation models can distinguish the effects of different historical courses when recommending different target courses. However, when a user has interests in many different courses, the attention mechanism will perform poorly as the effects of the contributing courses are diluted by diverse historical courses. To address such a challenge, we propose a hierarchical reinforcement learning algorithm to revise the user profiles and tune the course recommendation model on the revised profiles. Systematically, we evaluate the proposed model on a real dataset consisting of 1,302 courses, 82,535 users and 458,454 user enrollment behaviors, which were collected from XuetangX—one of the largest MOOC platforms in China. Experimental results show that the proposed model significantly outperforms the state-of-the-art recommendation models (improving 5.02% to 18.95% in terms of HR@10).

AAAI Conference 2018 Conference Paper

Conversational Model Adaptation via KL Divergence Regularization

  • Juncen Li
  • Ping Luo
  • Fen Lin
  • Bo Chen

In this study we formulate the problem of conversational model adaptation, where we aim to build a generative conversational model for a target domain based on a limited amount of dialogue data from this target domain and some existing dialogue models from related source domains. This model facilitates the fast building of a chatbot platform, where a new vertical chatbot with only a small amount of conversation data can be supported by other related mature chatbots. Previous studies on model adaptation and transfer learning mostly focus on classification and recommendation problems; however, how these models work for conversation generation is still unexplored. To this end, we leverage a KL divergence (KLD) regularization to adapt the existing conversational models. Specifically, it employs the KLD to measure the distance between the source and target domains. Adding the KLD as a regularization to the objective function allows the proposed method to utilize the information from source domains effectively. We also evaluate the performance of this adaptation model for online chatbots on the WeChat public accounts platform, using both the BLEU metric and human judgement. The experiments empirically show that the proposed method visibly improves these evaluation metrics.
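The adaptation objective described above — the target-domain loss plus a KLD penalty tying the adapted model to a source-domain model — can be sketched minimally. This is a generic illustration of KLD regularization under assumed names, not the paper's actual objective or code.

```python
import numpy as np

def kld(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions over the same vocabulary."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def adapted_loss(nll_target, p_target, p_source, lam=0.1):
    """Target-domain negative log-likelihood plus a KLD penalty that keeps
    the adapted model's output distribution close to the source model's.

    lam trades off fitting the small target corpus against retaining
    knowledge from the mature source-domain model.
    """
    return nll_target + lam * kld(p_target, p_source)
```

A larger `lam` pulls the adapted model toward the source domain, which helps when the target-domain dialogue data is scarce.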

NeurIPS Conference 2018 Conference Paper

Deep Poisson gamma dynamical systems

  • DanDan Guo
  • Bo Chen
  • Hao Zhang
  • Mingyuan Zhou

We develop deep Poisson-gamma dynamical systems (DPGDS) to model sequentially observed multivariate count data, improving previously proposed models by not only mining deep hierarchical latent structure from the data, but also capturing both first-order and long-range temporal dependencies. Using sophisticated but simple-to-implement data augmentation techniques, we derive closed-form Gibbs sampling update equations by first backward and upward propagating auxiliary latent counts, and then forward and downward sampling latent variables. Moreover, we develop stochastic gradient MCMC inference that is scalable to very long multivariate count time series. Experiments on both synthetic and a variety of real-world data demonstrate that the proposed model not only has excellent predictive performance, but also provides highly interpretable multilayer latent structure to represent hierarchical and temporal information propagation.

AAAI Conference 2018 Conference Paper

Elastic Responding Machine for Dialog Generation with Dynamically Mechanism Selecting

  • Ganbin Zhou
  • Ping Luo
  • Yijun Xiao
  • Fen Lin
  • Bo Chen
  • Qing He

Neural models aiming at generating meaningful and diverse responses are attracting increasing attention in recent years. For a given post, the conventional encoder-decoder models tend to learn high-frequency but trivial responses, or are difficult to determine which speaking styles are suitable to generate responses. To address this issue, we propose the elastic responding machine (ERM), which is based on a proposed encoder-diverter-filter-decoder framework. ERM models the multiple responding mechanisms to not only generate acceptable responses for a given post but also improve the diversity of responses. Here, the mechanisms could be regarded as latent variables, and for a given post different responses may be generated by different mechanisms. The experiments demonstrate the quality and diversity of the generated responses, intuitively show how the learned model controls the response mechanism when responding, and reveal some underlying relationship between mechanism and language style.

AAAI Conference 2018 Conference Paper

Multimodal Poisson Gamma Belief Network

  • Chaojie Wang
  • Bo Chen
  • Mingyuan Zhou

To learn a deep generative model of multimodal data, we propose a multimodal Poisson gamma belief network (mPGBN) that tightly couples the data of different modalities at multiple hidden layers. The mPGBN unsupervisedly extracts a nonnegative latent representation using an upward-downward Gibbs sampler. It imposes sparse connections between different layers, making it simple to visualize the generative process and the relationships between the latent features of different modalities. Our experimental results on bi-modal data consisting of images and tags show that the mPGBN can easily impute a missing modality and hence is useful for both image annotation and retrieval. We further demonstrate that the mPGBN achieves state-of-the-art results on unsupervisedly extracting latent features from multimodal data.

AAAI Conference 2018 Conference Paper

Personalized Privacy-Preserving Social Recommendation

  • Xuying Meng
  • Suhang Wang
  • Kai Shu
  • Jundong Li
  • Bo Chen
  • Huan Liu
  • Yujun Zhang

Privacy leakage is an important issue for social recommendation. Existing privacy preserving social recommendation approaches usually allow the recommender to fully control users' information. This may be problematic since the recommender itself may be untrusted, leading to serious privacy leakage. Besides, building social relationships requires sharing interests as well as other private information, which may lead to more privacy leakage. Although sometimes users are allowed to hide their sensitive private data using privacy settings, the data being shared can still be abused by adversaries to infer sensitive private information. Supporting social recommendation with the least privacy leakage to an untrusted recommender and other users (i.e., friends) is an important yet challenging problem. In this paper, we aim to address the problem of achieving privacy-preserving social recommendation under personalized privacy settings. We propose PrivSR, a novel framework for privacy-preserving social recommendation, in which users can model ratings and social relationships privately. Meanwhile, by allocating different noise magnitudes to personalized sensitive and non-sensitive ratings, we can protect users' privacy against the untrusted recommender and friends. Theoretical analysis and experimental evaluation on real-world datasets demonstrate that our framework can protect users' privacy while being able to retain the effectiveness of the underlying recommender system.
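The "different noise magnitudes for sensitive and non-sensitive ratings" idea can be illustrated with a standard Laplace-mechanism sketch: a smaller privacy budget (more noise) for ratings the user marks as sensitive. The function name, parameters, and the use of the Laplace mechanism here are illustrative assumptions, not PrivSR's actual specification.

```python
import numpy as np

def perturb_ratings(ratings, sensitive_mask,
                    eps_sensitive=0.5, eps_plain=2.0,
                    sensitivity=1.0, rng=None):
    """Add Laplace noise to ratings before sharing them.

    sensitive_mask: boolean array marking the user's sensitive ratings.
    Sensitive ratings get a smaller epsilon, hence a larger Laplace
    scale (sensitivity / epsilon) and stronger perturbation.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    scale = np.where(sensitive_mask,
                     sensitivity / eps_sensitive,
                     sensitivity / eps_plain)
    return ratings + rng.laplace(0.0, scale)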

AAAI Conference 2018 Conference Paper

Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation

  • Ganbin Zhou
  • Ping Luo
  • Rongyu Cao
  • Yijun Xiao
  • Fen Lin
  • Bo Chen
  • Qing He

Different from other sequential data, sentences in natural language are structured by linguistic grammars. Previous generative conversational models with a chain-structured decoder ignore this structure in human language and might generate plausible responses with less satisfactory relevance and fluency. In this study, we aim to incorporate the results from linguistic analysis into the process of sentence generation for high-quality conversation generation. Specifically, we use a dependency parser to transform each response sentence into a dependency tree and construct a training corpus of sentence-tree pairs. A tree-structured decoder is developed to learn the mapping from a sentence to its tree, where different types of hidden states are used to depict the local dependencies from an internal tree node to its children. For training acceleration, we propose a tree canonicalization method, which transforms trees into equivalent ternary trees. Then, with a proposed tree-structured search method, the model is able to generate the most probable responses in the form of dependency trees, which are finally flattened into sequences as the system output. Experimental results demonstrate that the proposed X2TREE framework outperforms baseline methods with an 11.15% increase in acceptance ratio.

AAAI Conference 2017 Conference Paper

Mechanism-Aware Neural Machine for Dialogue Response Generation

  • Ganbin Zhou
  • Ping Luo
  • Rongyu Cao
  • Fen Lin
  • Bo Chen
  • Qing He

To the same utterance, people's responses in everyday dialogue may differ widely in terms of content semantics, speaking styles, communication intentions and so on. Previous generative conversational models ignore these 1-to-n relationships between a post and its diverse responses, and tend to return high-frequency but meaningless responses. In this study we propose a mechanism-aware neural machine for dialogue response generation. It assumes that there exist some latent responding mechanisms, each of which can generate different responses for a single input post. With this assumption we model different responding mechanisms as latent embeddings, and develop an encoder-diverter-decoder framework to train its modules in an end-to-end fashion. With the learned latent mechanisms, for the first time these decomposed modules can be used to encode the input into mechanism-aware context, and decode the responses with controlled generation styles and topics. Finally, the experiments with human judgements, intuitive examples and detailed discussions demonstrate the quality and diversity of the generated responses, with a 9.80% increase in acceptance ratio over the best of six baseline methods.

JMLR Journal 2016 Journal Article

Augmentable Gamma Belief Networks

  • Mingyuan Zhou
  • Yulai Cong
  • Bo Chen

To infer multilayer deep representations of high-dimensional discrete and nonnegative real vectors, we propose an augmentable gamma belief network (GBN) that factorizes each of its hidden layers into the product of a sparse connection weight matrix and the nonnegative real hidden units of the next layer. The GBN's hidden layers are jointly trained with an upward-downward Gibbs sampler that solves each layer with the same subroutine. The gamma-negative binomial process combined with a layer-wise training strategy allows inferring the width of each layer given a fixed budget on the width of the first layer. Example results illustrate interesting relationships between the width of the first layer and the inferred network structure, and demonstrate that the GBN can add more layers to improve its performance in both unsupervisedly extracting features and predicting heldout data. For exploratory data analysis, we extract trees and subnetworks from the learned deep network to visualize how the very specific factors discovered at the first hidden layer and the increasingly more general factors discovered at deeper hidden layers are related to each other, and we generate synthetic data by propagating random variables through the deep network from the top hidden layer back to the bottom data layer.

NeurIPS Conference 2015 Conference Paper

The Poisson Gamma Belief Network

  • Mingyuan Zhou
  • Yulai Cong
  • Bo Chen

To infer a multilayer representation of high-dimensional count vectors, we propose the Poisson gamma belief network (PGBN) that factorizes each of its layers into the product of a connection weight matrix and the nonnegative real hidden units of the next layer. The PGBN's hidden layers are jointly trained with an upward-downward Gibbs sampler, each iteration of which upward samples Dirichlet distributed connection weight vectors starting from the first layer (bottom data layer), and then downward samples gamma distributed hidden units starting from the top hidden layer. The gamma-negative binomial process combined with a layer-wise training strategy allows the PGBN to infer the width of each layer given a fixed budget on the width of the first layer. The PGBN with a single hidden layer reduces to Poisson factor analysis. Example results on text analysis illustrate interesting relationships between the width of the first layer and the inferred network structure, and demonstrate that the PGBN, whose hidden units are imposed with correlated gamma priors, can add more layers to increase its performance gains over Poisson factor analysis, given the same limit on the width of the first layer.
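The generative process this abstract describes — gamma-distributed hidden units propagated downward through factor-loading matrices, with Poisson counts emitted at the bottom layer — can be sketched by ancestral sampling. This is an illustrative sketch of the generative side only (no Gibbs inference), with assumed names and shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_pgbn(phis, top_shape=1.0, n_samples=5):
    """Ancestral sampling through a Poisson gamma belief network.

    phis: factor-loading matrices [Phi1 (V x K1), Phi2 (K1 x K2), ...],
    ordered bottom-up; each column is nonnegative and sums to one.
    Hidden units at every layer are gamma distributed; the shape
    parameter of each layer is the loading matrix of the layer above
    times that layer's hidden units. The bottom layer emits Poisson
    counts with rate Phi1 @ theta1.
    """
    k_top = phis[-1].shape[1]
    theta = rng.gamma(shape=top_shape, scale=1.0, size=(k_top, n_samples))
    for phi in reversed(phis[1:]):          # propagate downward
        theta = rng.gamma(shape=phi @ theta, scale=1.0)
    return rng.poisson(phis[0] @ theta)     # (V, n_samples) count vectors

# toy two-hidden-layer network: vocabulary 6, widths 4 and 2
V, K1, K2 = 6, 4, 2
phi1 = rng.dirichlet(np.ones(V), size=K1).T   # V x K1, columns sum to 1
phi2 = rng.dirichlet(np.ones(K1), size=K2).T  # K1 x K2
counts = sample_pgbn([phi1, phi2])
```

With a single hidden layer (`phis = [phi1]`) this reduces to Poisson factor analysis, matching the special case noted in the abstract.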

IS Journal 2011 Journal Article

Agent Recommendation for Agent-Based Urban-Transportation Systems

  • Cheng Chen
  • Shuang Shuang Li
  • Bo Chen
  • Ding Wen

Mobile-agent technology has been adopted in many transportation fields to take advantage of different agents to deal with dynamic changes and uncertainty in traffic environments. However, few research studies have been conducted in urban-transportation systems on decision making about what kind of agents should be used to cope with a specific traffic state. With the increasing availability of control and service agents for agent-based urban-transportation systems, an agent recommendation system is necessary to manage and select those agents so that the original objectives can be fulfilled. In this article, the authors address issues related to the creation of such a platform.

NeurIPS Conference 2011 Conference Paper

On the Analysis of Multi-Channel Neural Spike Data

  • Bo Chen
  • David Carlson
  • Lawrence Carin

Nonparametric Bayesian methods are developed for the analysis of multi-channel spike-train data, with feature learning and spike sorting performed jointly and simultaneously across all channels. Dictionary learning is implemented via the beta-Bernoulli process, with spike sorting performed via the dynamic hierarchical Dirichlet process (dHDP), and these two models coupled. The dHDP is augmented to eliminate refractory-period violations; it allows the “appearance” and “disappearance” of neurons over time, and it models smooth variation in the spike statistics.

NeurIPS Conference 2011 Conference Paper

Predicting response time and error rates in visual search

  • Bo Chen
  • Vidhya Navalpakkam
  • Pietro Perona

A model of human visual search is proposed. It predicts both response time (RT) and error rates (ER) as a function of image parameters such as target contrast and clutter. The model is an ideal observer, in that it optimizes the Bayes ratio of target present vs. target absent. The ratio is computed on the firing pattern of V1/V2 neurons, modeled by Poisson distributions. The optimal mechanism for integrating information over time is shown to be a ‘soft max’ of diffusions, computed over the visual field by ‘hypercolumns’ of neurons that share the same receptive field and have different response properties to image features. An approximation of the optimal Bayesian observer, based on integrating local decisions rather than diffusions, is also derived; it is shown experimentally to produce very similar predictions. A psychophysics experiment is proposed that may discriminate which mechanism is used in the human brain.