Arrow Research

Author name cluster

Kun He

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

29 papers
2 author rows

Possible papers (29)

ICML 2025 · Conference Paper

CAN: Leveraging Clients As Navigators for Generative Replay in Federated Continual Learning

  • Xuankun Rong
  • Jianshu Zhang 0003
  • Kun He
  • Mang Ye

Generative replay (GR) has been extensively validated in continual learning as a mechanism to synthesize data and replay past knowledge to mitigate forgetting. By leveraging synthetic rather than real data for the replay, GR has been adopted in some federated continual learning (FCL) approaches to ensure the privacy of client-side data. While existing GR-based FCL approaches have introduced improvements, none of their enhancements specifically take into account the unique characteristics of federated learning settings. Beyond privacy constraints, what other fundamental aspects of federated learning should be explored in the context of FCL? In this work, we explore the potential benefits that come from emphasizing the role of clients throughout the process. We begin by highlighting two key observations: (a) Client Expertise Superiority, where clients, rather than the server, act as domain experts, and (b) Client Forgetting Variance, where heterogeneous data distributions across clients lead to varying levels of forgetting. Building on these insights, we propose CAN (Clients As Navigators), highlighting the pivotal role of clients in both data synthesis and data replay. Extensive evaluations demonstrate that this client-centric approach achieves state-of-the-art performance. Notably, it requires a smaller buffer size, reducing storage overhead and enhancing computational efficiency.

IJCAI 2025 · Conference Paper

Exact Algorithms with New Upper Bounds for the Maximum k-plex Problem

  • Jiongzhi Zheng
  • Mingming Jin
  • Kun He

The Maximum k-plex Problem (MKP) is a degree relaxation of the widely known Maximum Clique Problem. As a practical NP-hard problem, MKP has many important real-world applications, such as the analysis of various complex networks. Branch-and-bound (BnB) algorithms are a well-studied and effective class of exact algorithms for MKP, and the key to their performance is the bound design. Recent BnB MKP algorithms involve two kinds of upper bounds, based on graph coloring and partition respectively, that work from different perspectives and are thus complementary to each other. We first propose a new coloring-based upper bound, termed Relaxed Graph Color Bound (RelaxGCB), that significantly outperforms the previous coloring-based upper bound. We then propose another new upper bound, termed RelaxPUB, that incorporates RelaxGCB and a partition-based upper bound in a novel way, making use of their complementarity. We apply RelaxGCB and RelaxPUB to state-of-the-art BnB MKP algorithms, producing eight new BnB algorithms. Extensive experiments with diverse k values on hundreds of instances based on dense or massive sparse graphs demonstrate the excellent performance and robustness of the proposed methods.
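
As context for the coloring idea, here is a minimal sketch of the classic coloring-based bound that bounds like RelaxGCB refine: inside a k-plex, every independent set has at most k vertices (each vertex tolerates at most k-1 non-neighbors), so each color class of a proper coloring of the candidate set contributes at most k. The greedy coloring and the toy 4-cycle below are illustrative assumptions, not the paper's algorithm.

```python
# A minimal sketch (assumed toy code, not the paper's RelaxGCB): greedily
# color the candidate vertices, then cap each color class's contribution at k,
# since an independent set inside a k-plex has at most k vertices.
def coloring_upper_bound(candidates, adj, k):
    color_classes = []  # each class is a set of pairwise non-adjacent vertices
    for v in candidates:
        for cls in color_classes:
            if all(u not in adj[v] for u in cls):  # v is independent of cls
                cls.add(v)
                break
        else:
            color_classes.append({v})
    return sum(min(len(cls), k) for cls in color_classes)

# Toy instance: a 4-cycle with k = 2; the whole cycle is a 2-plex of size 4.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(coloring_upper_bound([0, 1, 2, 3], adj, k=2))  # -> 4 (tight here)
```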

AAAI 2025 · Conference Paper

Neural Collapse Inspired Knowledge Distillation

  • Shuoxi Zhang
  • Zijian Song
  • Kun He

Existing knowledge distillation (KD) methods have demonstrated their ability to achieve student network performance on par with their teachers. However, the knowledge gap between the teacher and student remains significant and may hinder the effectiveness of the distillation process. In this work, we introduce the structure of Neural Collapse (NC) into the KD framework. NC typically occurs in the final phase of training, resulting in a graceful geometric structure where the last-layer features form a simplex equiangular tight frame. We hypothesize that NC can alleviate the knowledge gap in distillation, thereby enhancing student performance. This paper begins with an empirical analysis that bridges KD and NC. Through this analysis, we establish that transferring the teacher's NC structure to the student benefits the distillation process. Therefore, instead of merely transferring instance-level logits or features, as done by existing distillation methods, we encourage students to learn the teacher's NC structure. We propose a new distillation paradigm termed Neural Collapse-inspired Knowledge Distillation (NCKD). Comprehensive experiments demonstrate that NCKD is simple yet effective, improving the generalization of all distilled student models and achieving state-of-the-art accuracy.
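
For reference, the simplex equiangular tight frame mentioned above is the configuration of K unit-norm class directions whose pairwise cosine is exactly -1/(K-1). A minimal sketch of the standard construction follows; the random rotation is an arbitrary choice, and NCKD's loss itself is not shown.

```python
# A minimal sketch of the standard simplex ETF construction (not NCKD itself).
import numpy as np

def simplex_etf(K, d):
    """Return K unit-norm rows in R^d with pairwise cosine -1/(K-1)."""
    assert d >= K  # this construction embeds via K orthonormal directions
    M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)  # K x K
    U, _ = np.linalg.qr(np.random.randn(d, K))  # arbitrary orthonormal basis
    return (U @ M).T  # rotate into R^d; rows are the class directions

E = simplex_etf(K=5, d=64)
cos = E @ E.T
print(np.allclose(np.diag(cos), 1.0))      # unit norms
print(np.allclose(cos[0, 1], -1.0 / 4.0))  # pairwise cosine -1/(K-1)
```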

NeurIPS 2025 · Conference Paper

Rethinking Tokenized Graph Transformers for Node Classification

  • Jinsong Chen
  • Chenyang Li
  • Gaichao Li
  • John Hopcroft
  • Kun He

Node tokenized graph Transformers (GTs) have shown promising performance in node classification. The key module in existing tokenized GTs is the generation of token sequences, which transforms the input graph into token sequences to facilitate node representation learning via the Transformer. In this paper, we observe that token sequence generation in existing GTs only focuses on the first-order neighbors of the constructed similarity graphs, which limits the set of nodes used to generate diverse token sequences and thus restricts the potential of tokenized GTs for node classification. To this end, we propose a new method termed SwapGT. SwapGT first introduces a novel token swapping operation, based on the characteristics of token sequences, that fully leverages the semantic relevance of nodes to generate more informative token sequences. Then, SwapGT leverages a Transformer-based backbone to learn node representations from the generated token sequences. Moreover, SwapGT develops a center alignment loss to constrain the representation learning from multiple token sequences, further enhancing the model performance. Extensive empirical results on various datasets showcase the superiority of SwapGT for node classification. Code is available at https://github.com/JHL-HUST/SwapGT.

AAAI 2024 · Conference Paper

KD-Club: An Efficient Exact Algorithm with New Coloring-Based Upper Bound for the Maximum k-Defective Clique Problem

  • Mingming Jin
  • Jiongzhi Zheng
  • Kun He

The Maximum k-Defective Clique Problem (MDCP) aims to find a maximum k-defective clique in a given graph, where a k-defective clique is a clique relaxation that allows at most k missing edges. MDCP is NP-hard and has many real-world applications in analyzing dense but not necessarily complete subgraphs. Exact algorithms for MDCP mainly follow the Branch-and-bound (BnB) framework, whose performance heavily depends on the quality of the upper bound on the cardinality of a maximum k-defective clique. The state-of-the-art BnB MDCP algorithms calculate the upper bound quickly but conservatively, as they ignore many possible missing edges. In this paper, we propose a novel CoLoring-based Upper Bound (CLUB) that uses graph coloring techniques to detect independent sets and thereby account for missing edges ignored by previous methods. We then develop a new BnB algorithm for MDCP, called KD-Club, that uses CLUB both in the preprocessing stage for graph reduction and in the BnB search for branch pruning. Extensive experiments show that KD-Club significantly outperforms state-of-the-art BnB MDCP algorithms on the number of instances solved within the cut-off time, with much smaller search trees and shorter solving times on various benchmarks.

NeurIPS 2024 · Conference Paper

Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers

  • Jinsong Chen
  • Hanpeng Liu
  • John E. Hopcroft
  • Kun He

While tokenized graph Transformers have demonstrated strong performance in node classification tasks, their reliance on a limited subset of nodes with high similarity scores for constructing token sequences overlooks valuable information from other nodes, hindering their ability to fully harness graph information for learning optimal node representations. To address this limitation, we propose a novel graph Transformer called GCFormer. Unlike previous approaches, GCFormer develops a hybrid token generator to create two types of token sequences, positive and negative, to capture diverse graph information. A tailored Transformer-based backbone is then adopted to learn meaningful node representations from these generated token sequences. Additionally, GCFormer introduces contrastive learning to extract valuable information from both positive and negative token sequences, enhancing the quality of the learned node representations. Extensive experimental results across various datasets, including homophily and heterophily graphs, demonstrate the superiority of GCFormer in node classification when compared to representative graph neural networks (GNNs) and graph Transformers.
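
The contrastive component can be pictured with a generic InfoNCE-style loss over one positive and several negatives. This is a minimal sketch of that standard form, not GCFormer's exact objective; the function name and toy vectors are assumptions.

```python
# A minimal InfoNCE-style sketch (generic form; GCFormer's loss may differ).
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, tau=0.5):
    """anchor, positive: (d,); negatives: (n, d). Pulls the anchor toward the
    positive and away from the negatives via cosine similarity / tau."""
    pos = F.cosine_similarity(anchor, positive, dim=0) / tau
    neg = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=1) / tau
    return -pos + torch.logsumexp(torch.cat([pos.unsqueeze(0), neg]), dim=0)

d = 16
a = torch.randn(d)
print(info_nce(a, a + 0.01 * torch.randn(d), torch.randn(5, d)))  # small loss
print(info_nce(a, -a, torch.randn(5, d)))                         # larger loss
```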

NeurIPS 2024 · Conference Paper

Long-range Meta-path Search on Large-scale Heterogeneous Graphs

  • Chao Li
  • Zijie Guo
  • Qiuting He
  • Kun He

Utilizing long-range dependency, a concept extensively studied in homogeneous graphs, remains underexplored in heterogeneous graphs, especially on large ones, posing two significant challenges: reducing computational costs while maximizing effective information utilization in the presence of heterogeneity, and overcoming the over-smoothing issue in graph neural networks. To address this gap, we investigate the importance of different meta-paths and introduce an automatic framework for utilizing long-range dependency on heterogeneous graphs, denoted as Long-range Meta-path Search through Progressive Sampling (LMSPS). Specifically, we develop a search space with all meta-paths related to the target node type. By employing a progressive sampling algorithm, LMSPS dynamically shrinks the search space with hop-independent time complexity. Through a sampling evaluation strategy, LMSPS conducts a specialized and effective meta-path selection, leading to retraining with only effective meta-paths, thus mitigating costs and over-smoothing. Extensive experiments across diverse heterogeneous datasets validate LMSPS's capability in discovering effective long-range meta-paths, surpassing state-of-the-art methods. Our code is available at https://github.com/JHL-HUST/LMSPS.

IJCAI 2024 · Conference Paper

Rethinking the Soft Conflict Pseudo Boolean Constraint on MaxSAT Local Search Solvers

  • Jiongzhi Zheng
  • Zhuo Chen
  • Chu-Min Li
  • Kun He

MaxSAT is an optimization version of the famous NP-complete Satisfiability problem (SAT). Algorithms for MaxSAT mainly include complete solvers and incomplete local search solvers. In many complete solvers, once a better solution is found, a Soft conflict Pseudo Boolean (SPB) constraint is generated to force the algorithm to find better solutions. In many local search algorithms, clause weighting is a key technique for effectively guiding the search direction. In this paper, we propose to transfer the SPB constraint into the clause weighting system of the local search method, leading the algorithm to better solutions. We further propose an adaptive clause weighting strategy that breaks with the tradition of using constant values to adjust clause weights. Based on the above methods, we propose a new local search algorithm called SPB-MaxSAT that provides new perspectives on clause weighting for MaxSAT local search solvers. Extensive experiments demonstrate the excellent performance of the proposed methods.
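
To illustrate the clause-weighting machinery the abstract builds on, here is a minimal dynamic clause-weighting local search in the GSAT tradition. It is a generic textbook scheme, not SPB-MaxSAT, and the instance at the end is a toy assumption.

```python
# A minimal sketch of dynamic clause weighting in local search (generic
# textbook scheme, not SPB-MaxSAT): when no flip reduces the falsified
# weight, bump the weights of the currently falsified clauses.
import random

def weighted_gsat(clauses, n_vars, steps=2000, seed=0):
    """clauses: lists of non-zero ints; +v is variable v, -v its negation."""
    rng = random.Random(seed)
    a = [rng.choice([False, True]) for _ in range(n_vars + 1)]  # a[0] unused
    w = [1] * len(clauses)
    is_sat = lambda c: any(a[abs(l)] == (l > 0) for l in c)
    fw = lambda: sum(w[i] for i, c in enumerate(clauses) if not is_sat(c))
    best = sum(1 for c in clauses if not is_sat(c))
    for _ in range(steps):
        unsat = [i for i, c in enumerate(clauses) if not is_sat(c)]
        best = min(best, len(unsat))
        if not unsat:
            break
        def flip_cost(v):              # weighted cost if we flipped v
            a[v] = not a[v]
            cost = fw()
            a[v] = not a[v]
            return cost
        cand = {abs(l) for i in unsat for l in clauses[i]}
        v = min(cand, key=flip_cost)
        if flip_cost(v) >= fw():       # local optimum: reweight falsified clauses
            for i in unsat:
                w[i] += 1
        a[v] = not a[v]
    return best

# Toy instance: (x1 or x2), (not x1 or x2), (not x2) -- one clause must fail.
print(weighted_gsat([[1, 2], [-1, 2], [-2]], n_vars=2))  # -> 1
```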

AAAI 2023 · Conference Paper

Differentiable Meta Multigraph Search with Partial Message Propagation on Heterogeneous Information Networks

  • Chao Li
  • Hao Xu
  • Kun He

Heterogeneous information networks (HINs) are widely employed for describing real-world data with intricate entities and relationships. To automatically utilize their semantic information, graph neural architecture search has recently been developed for various HIN tasks. Existing works, however, suffer from instability and inflexibility. To address these issues, we propose a novel method called Partial Message Meta Multigraph search (PMMM) to automatically optimize the neural architecture design on HINs. Specifically, to learn how graph neural networks (GNNs) propagate messages along various types of edges, PMMM adopts an efficient differentiable framework to search for a meaningful meta multigraph, which can capture more flexible and complex semantic relations than a meta graph. Since differentiable search typically suffers from performance instability, we further propose a stable algorithm called partial message search to ensure that the searched meta multigraph consistently surpasses manually designed meta-structures, i.e., meta-paths. Extensive experiments on six benchmark datasets over two representative tasks, node classification and recommendation, demonstrate the effectiveness of the proposed method. Our approach outperforms state-of-the-art heterogeneous GNNs, finds meaningful meta multigraphs, and is significantly more stable. Our code is available at https://github.com/JHL-HUST/PMMM.

NeurIPS 2023 · Conference Paper

FABind: Fast and Accurate Protein-Ligand Binding

  • Qizhi Pei
  • Kaiyuan Gao
  • Lijun Wu
  • Jinhua Zhu
  • Yingce Xia
  • Shufang Xie
  • Tao Qin
  • Kun He

Modeling the interaction between proteins and ligands and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advancements in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as two prominent approaches. However, these methods have notable limitations. Sampling-based methods often suffer from low efficiency due to the need for generating multiple candidate structures for selection. On the other hand, regression-based methods offer fast predictions but may experience decreased accuracy. Additionally, the variation in protein sizes often requires external modules for selecting suitable binding pockets, further impacting efficiency. In this work, we propose FABind, an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding. FABind incorporates a unique ligand-informed pocket prediction module, which is also leveraged for docking pose estimation. The model further enhances the docking process by incrementally integrating the predicted pocket to optimize protein-ligand binding, reducing discrepancies between training and inference. Through extensive experiments on benchmark datasets, our proposed FABind demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods. Our code is available at https://github.com/QizhiPei/FABind.

AAAI 2023 · Conference Paper

Farsighted Probabilistic Sampling: A General Strategy for Boosting Local Search MaxSAT Solvers

  • Jiongzhi Zheng
  • Kun He
  • Jianrong Zhou

Local search has been demonstrated as an efficient approach for two practical generalizations of the MaxSAT problem, namely Partial MaxSAT (PMS) and Weighted PMS (WPMS). In this work, we observe that most local search (W)PMS solvers flip a single variable per iteration. Such a mechanism may lead to relatively low-quality local optima and may limit the diversity of search directions for escaping them. To address this issue, we propose a general strategy, called farsighted probabilistic sampling (FPS), that replaces the single-flip mechanism so as to boost local search (W)PMS algorithms. FPS considers the benefit of consecutively flipping a pair of variables in order to find higher-quality local optima. Moreover, FPS provides an effective approach to escape from local optima by flipping whichever is better between the best sampled single variable and the best sampled variable pair. Extensive experiments demonstrate that the proposed FPS strategy significantly improves state-of-the-art (W)PMS solvers and generalizes well to various local search MaxSAT solvers.

AAAI 2023 · Conference Paper

Hybrid Learning with New Value Function for the Maximum Common Induced Subgraph Problem

  • Yanli Liu
  • Jiming Zhao
  • Chu-Min Li
  • Hua Jiang
  • Kun He

Maximum Common Induced Subgraph (MCIS) is an important NP-hard problem with wide real-world applications. An efficient class of MCIS algorithms uses Branch-and-Bound (BnB), successively selecting vertices to match and pruning when it can be shown that no solution better than the best found so far exists. The method of selecting the vertices to match is essential for the performance of BnB. In this paper, we propose a new value function and a hybrid selection strategy drawn from reinforcement learning to define a new vertex selection method, and propose a new BnB algorithm, called McSplitDAL, for MCIS. Extensive experiments show that McSplitDAL significantly improves the current best BnB algorithms, McSplit+LL and McSplit+RL. An empirical analysis is also performed to illustrate why the new value function and the hybrid selection strategy are effective.

AAAI 2023 · Conference Paper

Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem

  • Yan Jin
  • Yuandong Ding
  • Xuanhao Pan
  • Kun He
  • Li Zhao
  • Tao Qin
  • Lei Song
  • Jiang Bian

The Traveling Salesman Problem (TSP), a classic routing optimization problem originally arising in transportation and logistics, has become a critical task in broader domains such as manufacturing and biology. Recently, Deep Reinforcement Learning (DRL) has been increasingly employed to solve TSP due to its high inference efficiency. Nevertheless, most existing end-to-end DRL algorithms only perform well on small TSP instances and hardly generalize to large scale, because memory consumption and computation time soar drastically as the problem scale grows. In this paper, we propose a novel end-to-end DRL approach, referred to as Pointerformer, based on a multi-pointer Transformer. In particular, Pointerformer adopts a reversible residual network in the encoder and a multi-pointer network in the decoder to effectively contain the memory consumption of the encoder-decoder architecture. To further improve the quality of TSP solutions, Pointerformer employs a feature augmentation method that exploits the symmetries of TSP at both training and inference stages, as well as an enhanced context embedding approach that includes more comprehensive context information in the query. Extensive experiments on a randomly generated benchmark and a public benchmark show that, while achieving results comparable to state-of-the-art DRL approaches on most small-scale TSP instances, Pointerformer also generalizes well to large-scale TSPs.

NeurIPS 2023 · Conference Paper

Rethinking the Backward Propagation for Adversarial Transferability

  • Wang Xiaosen
  • Kangheng Tong
  • Kun He

Transfer-based attacks generate adversarial examples on a surrogate model that can mislead other black-box models without access, making them a practical threat to real-world applications. Recently, several works have been proposed to boost adversarial transferability, in which the surrogate model is usually overlooked. In this work, we identify that non-linear layers (e.g., ReLU, max-pooling) truncate the gradient during backward propagation, making the gradient w.r.t. the input image an imprecise reflection of the loss function. We hypothesize and empirically validate that such truncation undermines the transferability of adversarial examples. Based on these findings, we propose a novel method called Backward Propagation Attack (BPA) to increase the relevance between the gradient w.r.t. the input image and the loss function, so as to generate adversarial examples with higher transferability. Specifically, BPA adopts a non-monotonic function as the derivative of ReLU and incorporates softmax with temperature to smooth the derivative of max-pooling, thereby mitigating the information loss during the backward propagation of gradients. Empirical results on the ImageNet dataset demonstrate that our method not only substantially boosts adversarial transferability but also generalizes to existing transfer-based attacks. Code is available at https://github.com/Trustworthy-AI-Group/RPA.
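
The core idea, replacing a non-linearity's hard derivative with a smooth surrogate during the backward pass only, can be sketched with a custom autograd function. The sigmoid surrogate below is an assumed stand-in; BPA's actual non-monotonic derivative and its max-pooling treatment are specified in the paper.

```python
# A minimal sketch: keep the ReLU forward pass but smooth its backward pass.
# The sigmoid surrogate is an assumed stand-in, not BPA's exact derivative.
import torch

class SoftBackwardReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, temperature):
        ctx.save_for_backward(x)
        ctx.temperature = temperature
        return x.clamp(min=0)                    # ordinary ReLU forward

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Smooth the hard 0/1 derivative so slightly negative pre-activations
        # still pass gradient, reducing the truncation the abstract describes.
        return grad_out * torch.sigmoid(x / ctx.temperature), None

x = torch.randn(4, requires_grad=True)
SoftBackwardReLU.apply(x, 0.5).sum().backward()
print(x.grad)  # nonzero even where x < 0, unlike the hard ReLU derivative
```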

IJCAI 2022 · Conference Paper

A Strengthened Branch and Bound Algorithm for the Maximum Common (Connected) Subgraph Problem

  • Jianrong Zhou
  • Kun He
  • Jiongzhi Zheng
  • Chu-Min Li
  • Yanli Liu

We propose a new and strengthened Branch-and-Bound (BnB) algorithm for the maximum common (connected) induced subgraph problem based on two new operators, Long-Short Memory (LSM) and Leaf vertex Union Match (LUM). Given two graphs in which we search for the maximum common (connected) induced subgraph, the first operator, LSM, maintains a score for the branching node using the short-term reward of each vertex of the first graph and the long-term reward of each vertex pair of the two graphs. In this way, the BnB process learns to reduce the search tree size significantly and boost the algorithm's performance. The second operator, LUM, further improves performance by simultaneously matching the leaf vertices connected to the currently matched vertices, allowing the algorithm to match multiple vertex pairs without affecting the optimality of the solution. We incorporate the two operators into the state-of-the-art BnB algorithm McSplit and denote the resulting algorithm McSplit+LL. Experiments show that McSplit+LL outperforms McSplit+RL, a more recent variant of McSplit using reinforcement learning that is superior to McSplit.

IJCAI 2022 · Conference Paper

BandMaxSAT: A Local Search MaxSAT Solver with Multi-armed Bandit

  • Jiongzhi Zheng
  • Kun He
  • Jianrong Zhou
  • Yan Jin
  • Chu-Min Li
  • Felip Manyà

We address Partial MaxSAT (PMS) and Weighted PMS (WPMS), two practical generalizations of the MaxSAT problem, and propose a local search algorithm for them, called BandMaxSAT, that applies a multi-armed bandit to guide the search direction. The bandit in our method is associated with all the soft clauses in the input (W)PMS instance, with each arm corresponding to a soft clause. The bandit model helps BandMaxSAT select a good direction to escape from local optima by selecting a soft clause to be satisfied in the current step, that is, selecting an arm to be pulled. We further propose an initialization method for (W)PMS that prioritizes both unit and binary clauses when producing the initial solutions. Extensive experiments demonstrate that BandMaxSAT significantly outperforms the state-of-the-art (W)PMS local search algorithm SATLike3.0. Specifically, the number of instances on which BandMaxSAT obtains better results is about twice that of SATLike3.0. We further combine BandMaxSAT with the complete solver TT-Open-WBO-Inc. The resulting solver, BandMaxSAT-c, also outperforms some of the best state-of-the-art complete (W)PMS solvers, including SATLike-c, Loandra and TT-Open-WBO-Inc.
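
The bandit component maps naturally onto the classic UCB1 rule: one arm per soft clause, with a reward for pulling it. Below is a minimal sketch under that assumption, with toy rewards standing in for the solver's real feedback signal.

```python
# A minimal UCB1 sketch of the bandit component: one arm per soft clause,
# reward standing in for the search progress after satisfying that clause.
import math, random

class ClauseBandit:
    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.t = 0

    def select(self):
        self.t += 1
        for i, c in enumerate(self.counts):  # try every arm once first
            if c == 0:
                return i
        ucb = lambda i: self.values[i] + math.sqrt(2 * math.log(self.t) / self.counts[i])
        return max(range(len(self.counts)), key=ucb)

    def update(self, i, reward):             # incremental mean update
        self.counts[i] += 1
        self.values[i] += (reward - self.values[i]) / self.counts[i]

bandit = ClauseBandit(n_arms=3)
rng = random.Random(0)
for _ in range(200):
    arm = bandit.select()
    bandit.update(arm, rng.random() * (arm + 1))  # toy rewards: arm 2 is best
print(bandit.counts)  # arm 2 typically gets the most pulls
```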

IJCAI 2022 · Conference Paper

Combining Clause Learning and Branch and Bound for MaxSAT (Extended Abstract)

  • Chu-Min Li
  • Zhenxing Xu
  • Jordi Coll
  • Felip Manyà
  • Djamal Habet
  • Kun He

Branch and Bound (BnB) has been successfully used to solve many combinatorial optimization problems. However, BnB MaxSAT solvers perform poorly when solving real-world and academic optimization problems. They are only competitive for random and some crafted instances. Thus, it is a prevailing opinion in the community that BnB is not really useful for practical MaxSAT solving. We refute this opinion by presenting a new BnB MaxSAT solver, called MaxCDCL, which combines clause learning and an efficient bounding procedure. MaxCDCL is among the top 5 out of a total of 15 exact solvers that participated in the 2020 MaxSAT Evaluation, solving several instances that other solvers cannot solve. Furthermore, MaxCDCL solves the highest number of instances from different MaxSAT Evaluations when combined with the best existing solvers.

JAIR 2022 · Journal Article

Two-phase Multi-document Event Summarization on Core Event Graphs

  • Zengjian Chen
  • Jin Xu
  • Meng Liao
  • Tong Xue
  • Kun He

Succinct event description based on multiple documents is critical to news systems as well as search engines. Different from existing summarization or event tasks, Multi-document Event Summarization (MES) aims at query-level event sequence generation, which places extra constraints on event expression and conciseness. Identifying and summarizing the key event from a set of related articles is a challenging task that has not been sufficiently studied, mainly because online articles exhibit redundancy and sparsity, and a good event summary requires high-level information fusion across diverse sentences and articles. To address these challenges, we propose a two-phase framework for the MES task that first performs event semantic graph construction and dominant event detection via graph-sequence matching, and then summarizes the extracted key event with an event-aware pointer generator. For experiments on the new task, we construct two large-scale real-world datasets for training and assessment. Extensive evaluations show that the proposed framework significantly outperforms the related baseline methods, with the most dominant event of the articles effectively identified and correctly summarized.

AAAI 2021 · Conference Paper

Adversarial Training with Fast Gradient Projection Method against Synonym Substitution Based Text Attacks

  • Xiaosen Wang
  • Yichen Yang
  • Yihe Deng
  • Kun He

Adversarial training is the most empirically successful approach to improving the robustness of deep neural networks for image classification. For text classification, however, existing synonym substitution based adversarial attacks are effective but too inefficient to be incorporated into practical text adversarial training. Gradient-based attacks, which are very efficient for images, are hard to implement for synonym substitution based text attacks due to the lexical, grammatical and semantic constraints and the discrete text input space. We therefore propose a fast text adversarial attack method called Fast Gradient Projection Method (FGPM) based on synonym substitution, which is about 20 times faster than existing text attack methods while achieving similar attack performance. We then incorporate FGPM into adversarial training and propose a text defense method called Adversarial Training with FGPM enhanced by Logit pairing (ATFL). Experiments show that ATFL significantly improves model robustness and blocks the transferability of adversarial examples.
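
The projection step can be pictured as choosing the synonym whose substitution direction aligns best with the loss gradient in embedding space. A minimal sketch with toy embeddings follows; the arrays and the gain criterion here are illustrative assumptions, not FGPM's full procedure.

```python
# A minimal sketch of the projection idea with toy embeddings (assumed data;
# FGPM's full constraints, batching and scheduling are in the paper).
import numpy as np

def best_synonym(word_vec, synonym_vecs, grad):
    """Pick the synonym whose substitution direction (v_syn - v_word) has the
    largest dot product with the loss gradient at this position."""
    gains = (synonym_vecs - word_vec) @ grad
    return int(np.argmax(gains)), float(np.max(gains))

rng = np.random.default_rng(0)
word = rng.normal(size=50)            # embedding of the current word
synonyms = rng.normal(size=(8, 50))   # embeddings of 8 candidate synonyms
grad = rng.normal(size=50)            # gradient of the loss w.r.t. the embedding
print(best_synonym(word, synonyms, grad))
```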

AAAI 2021 · Conference Paper

Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem

  • Jiongzhi Zheng
  • Kun He
  • Jianrong Zhou
  • Yan Jin
  • Chu-Min Li

We address the Traveling Salesman Problem (TSP), a famous NP-hard combinatorial optimization problem, and propose a variable strategy reinforced approach, denoted as VSR-LKH, which combines three reinforcement learning methods (Q-learning, Sarsa and Monte Carlo) with the well-known TSP algorithm Lin-Kernighan-Helsgaun (LKH). VSR-LKH replaces the inflexible traversal operation in LKH and lets the program learn to make choices at each search step by reinforcement learning. Experimental results on 111 TSP benchmarks from the TSPLIB with up to 85,900 cities demonstrate the excellent performance of the proposed method.
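
As a reminder of the learning rule involved, here is the standard tabular Q-learning update that one of the three named strategies uses. In VSR-LKH the states, actions and rewards come from LKH's search; the dictionary-based sketch below is a generic illustration with hypothetical city labels.

```python
# A minimal tabular Q-learning step (the generic rule; VSR-LKH's states and
# rewards come from LKH's k-opt search and are not modeled here).
def q_update(Q, s, a, reward, s_next, next_actions, alpha=0.1, gamma=0.9):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((Q.get((s_next, a2), 0.0) for a2 in next_actions), default=0.0)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (reward + gamma * best_next - old)

Q = {}
q_update(Q, s="city_3", a="city_7", reward=1.0, s_next="city_7",
         next_actions=["city_1", "city_9"])
print(Q)  # {('city_3', 'city_7'): 0.1}
```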

AAAI 2020 · Conference Paper

A Learning Based Branch and Bound for Maximum Common Subgraph Related Problems

  • Yanli Liu
  • Chu-Min Li
  • Hua Jiang
  • Kun He

The performance of a branch-and-bound (BnB) algorithm for the maximum common subgraph (MCS) problem and its related problems, such as maximum common connected subgraph (MCCS) and induced subgraph isomorphism (SI), crucially depends on the branching heuristic. We propose a branching heuristic inspired by reinforcement learning, with the goal of reaching a tree leaf as early as possible to greatly reduce the search tree size. Experimental results show that the proposed heuristic consistently and significantly improves the current best BnB algorithm for the MCS, MCCS and SI problems. An analysis is carried out to give insight into why and how reinforcement learning is useful in the new branching heuristic.

AAAI 2019 · Conference Paper

Multilevel Language and Vision Integration for Text-to-Clip Retrieval

  • Huijuan Xu
  • Kun He
  • Bryan A. Plummer
  • Leonid Sigal
  • Stan Sclaroff
  • Kate Saenko

We address the problem of text-based activity retrieval in video. Given a sentence describing an activity, our task is to retrieve matching clips from an untrimmed video. To capture the inherent structures present in both text and video, we introduce a multilevel model that integrates vision and language features earlier and more tightly than prior work. First, we inject text features early on when generating clip proposals, to help eliminate unlikely clips and thus speed up processing and boost performance. Second, to learn a fine-grained similarity metric for retrieval, we use visual features to modulate the processing of query sentences at the word level in a recurrent neural network. A multi-task loss is also employed by adding query re-generation as an auxiliary task. Our approach significantly outperforms prior work on two challenging benchmarks: Charades-STA and ActivityNet Captions.

NeurIPS 2018 · Conference Paper

Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation

  • Liwei Wang
  • Lunjia Hu
  • Jiayuan Gu
  • Zhiqiang Hu
  • Yue Wu
  • Kun He
  • John Hopcroft

It is widely believed that learning good representations is one of the main reasons for the success of deep neural networks. Although highly intuitive, there is a lack of theory and systematic approach quantitatively characterizing what representations deep neural networks learn. In this work, we move a small step towards such a theory and a better understanding of representations. Specifically, we study a simpler problem: how similar are the representations learned by two networks with identical architecture but trained from different initializations? We develop a rigorous theory based on the neuron activation subspace match model. The theory gives a complete characterization of the structure of neuron activation subspace matches, where the core concepts are maximum match and simple match, which describe the overall and the finest similarity between sets of neurons in two networks, respectively. We also propose efficient algorithms to find the maximum match and simple matches. Finally, we conduct extensive experiments using our algorithms. Experimental results suggest that, surprisingly, representations learned by the same convolutional layers of networks trained from different initializations are not as similar as prevalently expected, at least in terms of subspace match.
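
A rough quantitative proxy for comparing two networks' activation spaces is the set of principal angles between the subspaces spanned by their activations. The paper's maximum/simple match definitions are combinatorial and finer-grained than this, so treat the sketch below as background rather than the paper's algorithm.

```python
# A minimal sketch: cosines of principal angles between the column spans of
# two activation matrices (a standard proxy, coarser than the paper's
# maximum/simple match definitions).
import numpy as np

def principal_cosines(A, B):
    """A, B: (n_samples, n_neurons). Returns cosines in [0, 1]; all ~1 means
    the two sets of neurons span (nearly) the same activation subspace."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
print(principal_cosines(X, X @ rng.normal(size=(10, 10)))[:3])  # ~1: same span
print(principal_cosines(X, rng.normal(size=(200, 10)))[:3])     # clearly < 1
```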

NeurIPS 2016 · Conference Paper

A Powerful Generative Model Using Random Weights for the Deep Image Representation

  • Kun He
  • Yan Wang
  • John Hopcroft

To what extent is the success of deep visualization due to training? Could we do deep visualization using untrained networks with random weights? To address this question, we explore new and powerful generative models for three popular deep visualization tasks using untrained convolutional neural networks with random weights. First, we invert representations in feature spaces and reconstruct images from white noise inputs. The reconstruction quality is statistically higher than that of the same method applied to well-trained networks with the same architecture. Next, we synthesize textures using scaled correlations of representations in multiple layers, and our results are almost indistinguishable from the original natural textures and from the textures synthesized based on the trained network. Third, by recasting the content of an image in the style of various artworks, we create artistic images of high perceptual quality, highly competitive with the prior work of Gatys et al. on pretrained networks. To our knowledge, this is the first demonstration of image representations using untrained deep neural networks. Our work provides a new and fascinating tool to study the representations of deep network architectures and sheds new light on deep visualization. It may also lead to a way to compare network architectures without training.
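
The first task, inverting representations of a random-weight network from white noise, can be sketched directly: freeze a randomly initialized CNN and optimize the input to match a target image's features. The tiny architecture and hyperparameters below are arbitrary assumptions, not the paper's setup.

```python
# A minimal sketch of feature inversion with an *untrained* (random-weight)
# convolutional network, the core procedure the abstract describes.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()).eval()
for p in net.parameters():
    p.requires_grad_(False)            # weights stay random and frozen

target_img = torch.rand(1, 3, 32, 32)
target_feat = net(target_img)

x = torch.rand(1, 3, 32, 32, requires_grad=True)   # start from white noise
opt = torch.optim.Adam([x], lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = ((net(x) - target_feat) ** 2).mean()    # match the random features
    loss.backward()
    opt.step()
print(f"final feature loss: {loss.item():.5f}")
```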