Arrow Research search

Author name cluster

Ke Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

26 papers
1 author row

Possible papers

26

JBHI Journal 2026 Journal Article

$\text{P}^\text{2}$RS: A Quantitative Rating Scale for Pain Assessment based on Pulse Wave Characterization

  • Yue He
  • Yi Sun
  • Ke Sun
  • Wei Bin
  • Quan Wang
  • Heng Yang
  • Xinxin Li

For pain intensity assessment, there are currently 11 main rating scales, from the primitive Visual Analog Scale (VAS) to the elaborate Measure of Intermittent and Constant Osteoarthritis Pain (ICOAP). However, they all depend on a self-report mechanism, making their results so subjective that their consistency, comparability, and reference value are barely satisfactory. Inspired by the phenomenon that discomfort may give rise to throbbing of the radial artery, we develop a novel objective rating scale that quantifies the severity of pain by the degree of “lateral instability” of an arterial pulse wave. To monitor this lateral instability, an ultra-small piezoresistive pressure sensor is fabricated within an area of 0.4 × 0.4 $\text{mm}^\text{2}$. With 18 such sensors, we build a flexible tactile sensing dense-array with a pitch of only 0.65 mm. Overlaid on the radial artery perpendicular to the blood flow direction, the dense-array observes the cross-section of a pulse wave. The barycenter of the cross-section of each wave cycle is taken as the feature point representing its lateral shape and drift. The standard deviation of the barycenters' horizontal coordinates is then calculated as the pulsatile perceptual rating scale ($\text{P}^\text{2}$RS) to reflect the degree of lateral instability, that is, our scale of pain intensity. Among 86 clinical samples, the pain threshold is 0.11, as determined by a binary classification model based on a support vector machine. In terms of consistency with previous rating scales, the average correlation coefficient reaches 0.804 among 43 pain samples.
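
As an illustration of the computation the abstract describes, here is a minimal sketch of a P²RS-style value from a segmented dense-array recording: a lateral barycenter per pulse cycle, then the standard deviation of those barycenters. The cycle segmentation, array layout, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def p2rs(frames, pitch_mm=0.65):
    """Illustrative P^2RS-style computation from a tactile dense-array recording.

    frames   : array of shape (n_cycles, n_sensors); one pressure profile per pulse
               cycle across the sensor line (cycle segmentation assumed done upstream).
    pitch_mm : sensor pitch along the array, perpendicular to the blood flow.
    """
    n_cycles, n_sensors = frames.shape
    positions = np.arange(n_sensors) * pitch_mm           # lateral coordinate of each sensor
    weights = frames / frames.sum(axis=1, keepdims=True)  # normalize each cycle's profile
    barycenters = weights @ positions                     # lateral barycenter per cycle
    return barycenters.std()                              # std of barycenter drift

# toy usage: 40 pulse cycles recorded on an 18-sensor array with 0.65 mm pitch
rng = np.random.default_rng(0)
demo = np.abs(rng.normal(1.0, 0.1, size=(40, 18)))
print(p2rs(demo))
```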

AAAI Conference 2026 Conference Paper

Leveraging Sparse Observations to Predict Species Abundance Across Space and Time

  • Md Zahidul Islam
  • Cameron S. Fletcher
  • Ke Sun
  • Amir Dezfouli
  • Iadine Chades

Biodiversity is declining globally at an unprecedented rate. Managers urgently need to allocate limited resources to control pest species where interventions have the highest ecological impact. However, many species are hard to detect, and data collection is often expensive, irregular, and incomplete, posing significant challenges for machine learning models that traditionally require large and regular datasets. We present a novel deep learning architecture that estimates the spatiotemporal abundance of hard-to-detect species from sparse, zero-inflated, and irregular data. Our method combines Graph Convolutional Networks (GCNs), to model spatial dependencies across monitoring sites, with Recurrent Neural Networks (RNNs), to capture long-range temporal dynamics, and explicitly addresses the challenges of data sparsity, heterogeneity, and irregular sampling. We apply our model to the Crown-of-Thorns Starfish (COTS) on Australia's Great Barrier Reef, a species with devastating impact on coral reefs and a major target of pest control programs. Our method significantly outperforms baseline approaches and the current resource-intensive approach, manta-tow surveillance, in both accuracy and detectability. Simulations indicate a 20% increase in starfish removal efficiency over a year, enabling more effective coral protection. This work demonstrates how tailored deep learning methods can overcome ecological data limitations and substantially improve conservation outcomes.

NeurIPS Conference 2025 Conference Paper

Intrinsic Benefits of Categorical Distributional Loss: Uncertainty-aware Regularized Exploration in Reinforcement Learning

  • Ke Sun
  • Yingnan Zhao
  • Enze Shi
  • Yafei Wang
  • Xiaodong Yan
  • Bei Jiang
  • Linglong Kong

The remarkable empirical performance of distributional reinforcement learning (RL) has garnered increasing attention to understanding its theoretical advantages over classical RL. By decomposing the categorical distributional loss commonly employed in distributional RL, we find that the potential superiority of distributional RL can be attributed to a derived distribution-matching entropy regularization. This less-studied entropy regularization aims to capture additional knowledge of the return distribution beyond its expectation alone, contributing to an augmented reward signal in policy optimization. In contrast to the vanilla entropy regularization in MaxEnt RL, which explicitly encourages exploration by promoting diverse actions, the novel entropy regularization derived from the categorical distributional loss implicitly updates policies to align the learned policy with (estimated) environmental uncertainty. Finally, extensive experiments verify that this uncertainty-aware regularization underlies the empirical benefits of distributional RL over classical RL. Our study offers an innovative exploration perspective to explain the intrinsic benefits of distributional learning in RL.
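
For readers unfamiliar with the loss the abstract decomposes, here is a minimal numpy sketch of the C51-style categorical distributional loss: a target return distribution is projected onto a fixed atom support and compared with the predicted distribution via cross-entropy, which splits into an entropy term plus a KL term. The paper's specific decomposition and regularizer are not reproduced here; the atom count and toy target below are illustrative assumptions.

```python
import numpy as np

def project_to_support(atoms, target_atoms, target_probs):
    """C51-style projection of a target return distribution onto a fixed atom support."""
    v_min, v_max = atoms[0], atoms[-1]
    dz = atoms[1] - atoms[0]
    b = (np.clip(target_atoms, v_min, v_max) - v_min) / dz
    lo, hi = np.floor(b).astype(int), np.ceil(b).astype(int)
    proj = np.zeros_like(atoms, dtype=float)
    same = lo == hi
    np.add.at(proj, lo[same], target_probs[same])                     # atom lands exactly on the grid
    np.add.at(proj, lo[~same], target_probs[~same] * (hi - b)[~same])
    np.add.at(proj, hi[~same], target_probs[~same] * (b - lo)[~same])
    return proj

atoms = np.linspace(-10.0, 10.0, 51)
pred = np.full(51, 1.0 / 51)                                          # predicted categorical distribution
target = project_to_support(atoms, 0.9 * atoms + 1.0, pred)           # toy shifted/shrunk target
cross_entropy = -(target * np.log(pred)).sum()                        # the categorical distributional loss
entropy = -(target[target > 0] * np.log(target[target > 0])).sum()
print(cross_entropy, entropy, cross_entropy - entropy)                # CE(m, q) = H(m) + KL(m || q)
```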

AAAI Conference 2025 Conference Paper

Security Attacks on LLM-based Code Completion Tools

  • Wen Cheng
  • Ke Sun
  • Xinyu Zhang
  • Wei Wang

The rapid development of large language models (LLMs) has significantly advanced code completion capabilities, giving rise to a new generation of LLM-based Code Completion Tools (LCCTs). Unlike general-purpose LLMs, these tools possess unique workflows, integrating multiple information sources as input and prioritizing code suggestions over natural language interaction, which introduces distinct security challenges. Additionally, LCCTs often rely on proprietary code datasets for training, raising concerns about the potential exposure of sensitive data. This paper exploits these distinct characteristics of LCCTs to develop targeted attack methodologies on two critical security risks: jailbreaking and training data extraction attacks. Our experimental results expose significant vulnerabilities within LCCTs, including a 99.4% success rate in jailbreaking attacks on GitHub Copilot and a 46.3% success rate on Amazon Q. Furthermore, we successfully extracted sensitive user data from GitHub Copilot, including 54 real email addresses and 314 physical addresses associated with GitHub usernames. Our study also demonstrates that these code-based attack methods are effective against general-purpose LLMs, highlighting a broader security misalignment in the handling of code by modern LLMs. These findings underscore critical security challenges associated with LCCTs and suggest essential directions for strengthening their security frameworks.

TMLR Journal 2024 Journal Article

A Distance-based Anomaly Detection Framework for Deep Reinforcement Learning

  • Hongming Zhang
  • Ke Sun
  • Bo Xu
  • Linglong Kong
  • Martin Müller

In deep reinforcement learning (RL) systems, abnormal states pose significant risks by potentially triggering unpredictable behaviors and unsafe actions, thus impeding the deployment of RL systems in real-world scenarios. It is crucial for reliable decision-making systems to have the capability to cast an alert whenever they encounter unfamiliar observations that they are not equipped to handle. In this paper, we propose a novel Mahalanobis distance-based (MD) anomaly detection framework, called MDX, for deep RL algorithms. MDX simultaneously addresses random, adversarial, and out-of-distribution (OOD) state outliers in both offline and online settings. It utilizes Mahalanobis distance within class-conditional distributions for each action and operates within a statistical hypothesis testing framework under the Gaussian assumption. We further extend it to robust and distribution-free versions by incorporating Robust MD and conformal inference techniques. Through extensive experiments on classical control environments, Atari games and autonomous driving scenarios, we demonstrate the effectiveness of our MD-based detection framework. MDX offers a simple, unified, and practical tool for enhancing the safety and reliability of RL systems in real-world applications.
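
A minimal sketch of the core scoring idea, assuming a Gaussian class-conditional model per action over pre-extracted state features; the feature extractor, the robust and conformal variants, and the hypothesis-testing thresholds described in the abstract are omitted, and all names here are illustrative.

```python
import numpy as np

class MahalanobisDetector:
    """Class-conditional Mahalanobis-distance anomaly score under a Gaussian assumption (sketch)."""

    def fit(self, feats, actions):
        # fit one Gaussian (mean, precision) per action over training-state features
        self.stats = {}
        for a in np.unique(actions):
            x = feats[actions == a]
            mu = x.mean(axis=0)
            cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])  # regularized covariance
            self.stats[a] = (mu, np.linalg.inv(cov))
        return self

    def score(self, feat):
        # distance to the closest class; larger means more anomalous
        return min((feat - mu) @ prec @ (feat - mu) for mu, prec in self.stats.values())

rng = np.random.default_rng(1)
feats = rng.normal(size=(500, 8))
actions = rng.integers(0, 4, size=500)
det = MahalanobisDetector().fit(feats, actions)
print(det.score(rng.normal(size=8)), det.score(rng.normal(5.0, 1.0, size=8)))  # in- vs out-of-distribution
```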

AAAI Conference 2024 Conference Paper

CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models

  • Zhongxi Chen
  • Ke Sun
  • Xianming Lin

Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings. Existing COD methods struggle with nuanced object boundaries and overconfident incorrect predictions. In response, we propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models. Our method, dubbed CamoDiffusion, employs the denoising process to progressively refine predictions while incorporating image conditions. Due to the stochastic sampling process of diffusion, our model is capable of sampling multiple possible predictions, avoiding the problem of overconfident point estimation. Moreover, we develop specialized network architecture, training, and sampling strategies to enhance the model’s expressive power and refinement capabilities and to suppress overconfident mis-segmentations, thus aptly tailoring the diffusion model to the demands of COD. Extensive experiments on three COD datasets attest to the superior performance of our model compared to existing state-of-the-art methods, particularly on the most challenging COD10K dataset, where our approach achieves 0.019 in terms of MAE. Codes and models are available at https://github.com/Rapisurazurite/CamoDiffusion.

AAAI Conference 2024 Conference Paper

Cross-Sentence Gloss Consistency for Continuous Sign Language Recognition

  • Qi Rao
  • Ke Sun
  • Xiaohan Wang
  • Qi Wang
  • Bang Zhang

Continuous sign language recognition (CSLR) aims to recognize gloss sequences from continuous sign videos. Recent works enhance the gloss representation consistency by mining correlations between visual and contextual modules within individual sentences. However, much richer correlations remain among glosses across different sentences. In this paper, we present a simple yet effective Cross-Sentence Gloss Consistency (CSGC) method, which enforces glosses belonging to the same category to be more consistent in representation than those belonging to different categories, across all training sentences. Specifically, in CSGC, a prototype is maintained for each gloss category and benefits gloss discrimination in a contrastive way. Thanks to the well-distinguished gloss prototypes, an auxiliary similarity classifier is devised to enhance the recognition clues, thus yielding more accurate results. Extensive experiments conducted on three CSLR datasets show that our proposed CSGC significantly boosts the performance of CSLR, surpassing existing state-of-the-art works by large margins (i.e., 1.6% on PHOENIX14, 2.4% on PHOENIX14-T, and 5.7% on CSL-Daily).
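
A rough sketch of the two ingredients the abstract names, per-gloss prototypes and an auxiliary similarity classifier, assuming an exponential-moving-average prototype update and cosine-similarity logits; the paper's exact update rule, contrastive loss, and feature backbone are not reproduced, and the dimensions below are illustrative.

```python
import numpy as np

def update_prototypes(prototypes, feats, labels, momentum=0.9):
    """EMA update of one prototype per gloss category (illustrative, not the paper's exact rule)."""
    for f, y in zip(feats, labels):
        prototypes[y] = momentum * prototypes[y] + (1 - momentum) * f
        prototypes[y] /= np.linalg.norm(prototypes[y]) + 1e-12
    return prototypes

def similarity_logits(prototypes, feats, temperature=0.1):
    """Auxiliary similarity classifier: cosine similarity of each feature to every gloss prototype."""
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)
    return (f @ prototypes.T) / temperature

# toy usage: 1000 gloss categories, 64-d frame features
rng = np.random.default_rng(0)
protos = rng.normal(size=(1000, 64))
protos /= np.linalg.norm(protos, axis=1, keepdims=True)
feats, labels = rng.normal(size=(8, 64)), rng.integers(0, 1000, size=8)
protos = update_prototypes(protos, feats, labels)
print(similarity_logits(protos, feats).shape)   # (8, 1000) logits over gloss categories
```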

NeurIPS Conference 2024 Conference Paper

DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion

  • Ke Sun
  • Shen Chen
  • Taiping Yao
  • Hong Liu
  • Xiaoshuai Sun
  • Shouhong Ding
  • Rongrong Ji

The rapid progress of Deepfake technology has made face swapping highly realistic, raising concerns about the malicious use of fabricated facial content. Existing methods often struggle to generalize to unseen domains due to the diverse nature of facial manipulations. In this paper, we revisit the generation process and identify a universal principle: Deepfake images inherently contain information from both source and target identities, while genuine faces maintain a consistent identity. Building upon this insight, we introduce DiffusionFake, a novel plug-and-play framework that reverses the generative process of face forgeries to enhance the generalization of detection models. DiffusionFake achieves this by injecting the features extracted by the detection model into a frozen pre-trained Stable Diffusion model, compelling it to reconstruct the corresponding target and source images. This guided reconstruction process constrains the detection network to capture source- and target-related features that facilitate the reconstruction, thereby learning rich and disentangled representations that are more resilient to unseen forgeries. Extensive experiments demonstrate that DiffusionFake significantly improves cross-domain generalization of various detector architectures without introducing additional parameters during inference. The code is available at https://github.com/skJack/DiffusionFake.git.

NeurIPS Conference 2024 Conference Paper

Distributional Reinforcement Learning with Regularized Wasserstein Loss

  • Ke Sun
  • Yingnan Zhao
  • Wulong Liu
  • Bei Jiang
  • Linglong Kong

The empirical success of distributional reinforcement learning (RL) highly relies on the choice of distribution divergence equipped with an appropriate distribution representation. In this paper, we propose Sinkhorn distributional RL (SinkhornDRL), which leverages Sinkhorn divergence, a regularized Wasserstein loss, to minimize the difference between current and target Bellman return distributions. Theoretically, we prove the contraction properties of SinkhornDRL, aligning with the interpolation nature of Sinkhorn divergence between the Wasserstein distance and Maximum Mean Discrepancy (MMD). SinkhornDRL enriches the family of distributional RL algorithms, and our investigation of its relationships with existing approaches helps interpret algorithm behaviors. Empirically, we show that SinkhornDRL consistently outperforms or matches existing algorithms on the Atari games suite and particularly stands out in the multi-dimensional reward setting. Code is available at https://github.com/datake/SinkhornDistRL.
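
A self-contained numpy sketch of the Sinkhorn divergence between two empirical return samples, the kind of quantity SinkhornDRL minimizes between current and target return distributions; the sample sizes, squared-distance cost, and regularization strength below are illustrative choices, not the paper's settings.

```python
import numpy as np

def sinkhorn_cost(x, y, eps=0.5, n_iters=200):
    """Entropy-regularized OT cost between two 1-D empirical samples with uniform weights."""
    C = (x[:, None] - y[None, :]) ** 2                    # squared-distance cost matrix
    K = np.exp(-C / eps)
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    u = np.ones_like(a)
    for _ in range(n_iters):                              # Sinkhorn fixed-point updates
        v = b / (K.T @ u + 1e-30)
        u = a / (K @ v + 1e-30)
    P = u[:, None] * K * v[None, :]                       # entropic transport plan
    return float((P * C).sum())

def sinkhorn_divergence(x, y, eps=0.5):
    """Debiased Sinkhorn divergence; interpolates between Wasserstein (eps->0) and MMD (eps->inf)."""
    return sinkhorn_cost(x, y, eps) - 0.5 * sinkhorn_cost(x, x, eps) - 0.5 * sinkhorn_cost(y, y, eps)

# stand-ins for sampled return particles of the current and target distributions
rng = np.random.default_rng(0)
print(sinkhorn_divergence(rng.normal(0.0, 1.0, 64), rng.normal(0.5, 1.0, 64)))
```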

IJCAI Conference 2024 Conference Paper

DVPE: Divided View Position Embedding for Multi-View 3D Object Detection

  • Jiasen Wang
  • Zhenglin Li
  • Ke Sun
  • Xianyuan Liu
  • Yang Zhou

Sparse query-based paradigms have achieved significant success in multi-view 3D detection for autonomous vehicles. Current research faces challenges in balancing between enlarging receptive fields and reducing interference when aggregating multi-view features. Moreover, different poses of cameras present challenges in training global attention models. To address these problems, this paper proposes a divided view method, in which features are modeled globally via the visibility cross-attention mechanism, but interact only with partial features in a divided local virtual space. This effectively reduces interference from other irrelevant features and alleviates the training difficulties of the transformer by decoupling the position embedding from camera poses. Additionally, 2D historical RoI features are incorporated into the object-centric temporal modeling to utilize high-level visual semantic information. The model is trained using a one-to-many assignment strategy to facilitate stability. Our framework, named DVPE, achieves state-of-the-art performance (57.2% mAP and 64.5% NDS) on the nuScenes test set. Codes will be available at https://github.com/dop0/DVPE.

IS Journal 2024 Journal Article

From Electroencephalogram Data to Brain Networks: Graph-Learning-Based Brain Disease Diagnosis

  • Ke Sun
  • Ciyuan Peng
  • Shuo Yu
  • Zhuoyang Han
  • Feng Xia

Brain networks are built according to the structures or neural activities of different brain regions, which can be modeled as complex networks. Many studies analyze the brain from the perspective of graph learning to diagnose neurological diseases. However, many of these algorithms are unable to automatically construct brain functional topology from electroencephalogram (EEG) signals and fail to capture the global features of multichannel EEG signals for whole-graph embedding. To address these challenging issues, we propose an attention-based whole-graph learning model for the diagnosis of brain diseases, namely MAINS, which can adaptively construct brain functional topology from EEG signals and effectively embed multiple node features and the global structural features of brain networks into whole-graph representations. We validated the model by conducting classification (diagnosis) experiments on real EEG datasets. Comprehensive experimental results demonstrate the superiority of the proposed approach over state-of-the-art methods.

AAAI Conference 2024 Conference Paper

ModWaveMLP: MLP-Based Mode Decomposition and Wavelet Denoising Model to Defeat Complex Structures in Traffic Forecasting

  • Ke Sun
  • Pei Liu
  • Pengfei Li
  • Zhifang Liao

Traffic prediction is the core issue of Intelligent Transportation Systems. Recently, researchers have tended to use complex structures, such as transformer-based structures, for tasks such as traffic prediction. Notably, traffic data is simpler to process compared to text and images, which raises questions about the necessity of these structures. Additionally, when handling traffic data, researchers tend to manually design the model structure based on the data features, which makes the structure of traffic prediction models redundant and limits their generalizability. To address the above, we introduce ModWaveMLP, a multilayer perceptron (MLP) based model designed according to mode decomposition and wavelet noise reduction information learning concepts. The model is based on a simple MLP structure, which achieves the separation and prediction of different traffic modes and does not depend on additionally introduced features such as the topology of the traffic network. By performing experiments on the real-world datasets METR-LA and PEMS-BAY, our model achieves state-of-the-art results, outperforming GNN- and transformer-based models as well as models that introduce additional feature data, with better generalizability, and we further demonstrate the effectiveness of the various parts of the model through ablation experiments. This offers new insights to subsequent researchers involved in traffic model design. The code is available at: https://github.com/Kqingzheng/ModWaveMLP.

NeurIPS Conference 2024 Conference Paper

Trade-Offs of Diagonal Fisher Information Matrix Estimators

  • Alexander Soen
  • Ke Sun

The Fisher information matrix can be used to characterize the local geometry of the parameter space of neural networks. It elucidates insightful theories and useful tools to understand and optimize neural networks. Given its high computational cost, practitioners often use random estimators and evaluate only the diagonal entries. We examine two popular estimators whose accuracy and sample complexity depend on their associated variances. We derive bounds of the variances and instantiate them in neural networks for regression and classification. We navigate trade-offs for both estimators based on analytical and numerical studies. We find that the variance quantities depend on the non-linearity w.r.t. different parameter groups and should not be neglected when estimating the Fisher information.
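
To make the two estimators concrete, the sketch below computes the diagonal Fisher information of a logistic-regression model in two ways: by averaging squared scores with labels sampled from the model, and by averaging the negative log-likelihood Hessian diagonal. The model, sample sizes, and names are illustrative assumptions, not the paper's neural-network setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def diag_fim_estimators(w, X, n_label_samples=10, seed=0):
    """Two estimators of the diagonal Fisher information for p(y=1|x) = sigmoid(w.x)."""
    rng = np.random.default_rng(seed)
    p = sigmoid(X @ w)                                    # model probabilities per example
    # Estimator 1: average squared score, with labels sampled from the model itself
    y = rng.random((n_label_samples, len(p))) < p         # shape (samples, examples)
    est1 = (((y - p)[..., None] * X) ** 2).mean(axis=(0, 1))
    # Estimator 2: average negative Hessian diagonal, -E[d^2 log p / dw^2] = E[p(1-p) x^2]
    est2 = ((p * (1 - p))[:, None] * X ** 2).mean(axis=0)
    return est1, est2

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
w = rng.normal(size=5)
e1, e2 = diag_fim_estimators(w, X)
print(e1, e2)   # both target the same diagonal; their variances differ
```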

AAAI Conference 2022 Conference Paper

Dual Contrastive Learning for General Face Forgery Detection

  • Ke Sun
  • Taiping Yao
  • Shen Chen
  • Shouhong Ding
  • Jilin Li
  • Rongrong Ji

With various facial manipulation techniques arising, face forgery detection has drawn growing attention due to security concerns. Previous works always formulate face forgery detection as a classification problem based on cross-entropy loss, which emphasizes category-level differences rather than the essential discrepancies between real and fake faces, limiting model generalization in unseen domains. To address this issue, we propose a novel face forgery detection framework, named Dual Contrastive Learning (DCL), which specially constructs positive and negative paired data and performs designed contrastive learning at different granularities to learn generalized feature representation. Concretely, combined with the hard sample selection strategy, Inter-Instance Contrastive Learning (Inter-ICL) is first proposed to promote task-related discriminative feature learning by especially constructing instance pairs. Moreover, to further explore the essential discrepancies, Intra-Instance Contrastive Learning (Intra-ICL) is introduced to focus on the local content inconsistencies prevalent in forged faces by constructing local-region pairs inside instances. Extensive experiments and visualizations on several datasets demonstrate the generalization of our method against state-of-the-art competitors. Our code is available at https://github.com/Tencent/TFace.git.

NeurIPS Conference 2022 Conference Paper

Fair Wrapping for Black-box Predictions

  • Alexander Soen
  • Ibrahim M. Alabdulmohsin
  • Sanmi Koyejo
  • Yishay Mansour
  • Nyalleng Moorosi
  • Richard Nock
  • Ke Sun
  • Lexing Xie

We introduce a new family of techniques to post-process ("wrap") a black-box classifier in order to reduce its bias. Our technique builds on the recent analysis of improper loss functions whose optimization can correct any twist in prediction, unfairness being treated as a twist. In the post-processing, we learn a wrapper function which we define as an $\alpha$-tree, which modifies the prediction. We provide two generic boosting algorithms to learn $\alpha$-trees. We show that our modification has appealing properties in terms of composition of $\alpha$-trees, generalization, interpretability, and KL divergence between modified and original predictions. We exemplify the use of our technique in three fairness notions: conditional value-at-risk, equality of opportunity, and statistical parity; and provide experiments on several readily available datasets.

NeurIPS Conference 2022 Conference Paper

Identification, Amplification and Measurement: A bridge to Gaussian Differential Privacy

  • Yi Liu
  • Ke Sun
  • Bei Jiang
  • Linglong Kong

Gaussian differential privacy (GDP) is a single-parameter family of privacy notions that provides coherent guarantees to avoid the exposure of sensitive individual information. Despite the extra interpretability and the tighter bounds under composition that GDP provides, many widely used mechanisms (e.g., the Laplace mechanism) inherently provide GDP guarantees but often fail to take advantage of this new framework because their privacy guarantees were derived under a different background. In this paper, we study the asymptotic properties of privacy profiles and develop a simple criterion to identify algorithms with GDP properties. We propose an efficient method for GDP algorithms to narrow down possible values of an optimal privacy measurement, $\mu$, with an arbitrarily small and quantifiable margin of error. For non-GDP algorithms, we provide a post-processing procedure that can amplify existing privacy guarantees to meet the GDP condition. As applications, we compare two single-parameter families of privacy notions, $\epsilon$-DP and $\mu$-GDP, and show that all $\epsilon$-DP algorithms are intrinsically also GDP. Lastly, we show that the combination of our measurement process and the composition theorem of GDP is a powerful and convenient tool for handling compositions compared to the traditional standard and advanced composition theorems.
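
Two standard GDP facts convey why the measurement machinery matters and are easy to state in code: the Gaussian mechanism with L2 sensitivity Δ and noise scale σ is (Δ/σ)-GDP, and μ-GDP guarantees compose by root-sum-of-squares. The sketch below uses only these textbook facts; the paper's identification and amplification procedures are not reproduced, and the query counts are illustrative.

```python
import math

def gaussian_mechanism_mu(sensitivity, sigma):
    """The Gaussian mechanism with L2 sensitivity Delta and noise scale sigma is (Delta/sigma)-GDP."""
    return sensitivity / sigma

def compose_gdp(mus):
    """GDP composition: running mu_1-, ..., mu_n-GDP mechanisms yields sqrt(sum mu_i^2)-GDP."""
    return math.sqrt(sum(mu * mu for mu in mus))

# e.g. 100 adaptive queries, each answered with Gaussian noise of scale sigma = 10 * Delta
per_query_mu = gaussian_mechanism_mu(1.0, 10.0)
print(compose_gdp([per_query_mu] * 100))   # overall privacy: mu = 1.0
```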

NeurIPS Conference 2021 Conference Paper

Contrastive Laplacian Eigenmaps

  • Hao Zhu
  • Ke Sun
  • Peter Koniusz

Graph contrastive learning attracts/disperses node representations for similar/dissimilar node pairs under some notion of similarity. It may be combined with a low-dimensional embedding of nodes to preserve intrinsic and structural properties of a graph. In this paper, we extend the celebrated Laplacian Eigenmaps with contrastive learning, and call them COntrastive Laplacian EigenmapS (COLES). Starting from a GAN-inspired contrastive formulation, we show that the Jensen-Shannon divergence underlying many contrastive graph embedding models fails under disjoint positive and negative distributions, which may naturally emerge during sampling in the contrastive setting. In contrast, we demonstrate analytically that COLES essentially minimizes a surrogate of Wasserstein distance, which is known to cope well under disjoint distributions. Moreover, we show that the loss of COLES belongs to the family of so-called block-contrastive losses, previously shown to be superior compared to pair-wise losses typically used by contrastive methods. We show on popular benchmarks/backbones that COLES offers favourable accuracy/scalability compared to DeepWalk, GCN, Graph2Gauss, DGI and GRACE baselines.

NeurIPS Conference 2021 Conference Paper

Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization

  • Ke Sun
  • Yafei Wang
  • Yi Liu
  • Yingnan Zhao
  • Bo Pan
  • Shangling Jui
  • Bei Jiang
  • Linglong Kong

Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms for accelerating convergence and improving the sampling efficiency of deep RL. Despite its heuristic improvement of convergence, a rigorous mathematical justification for the benefits of Anderson mixing in RL has not yet been put forward. In this paper, we provide deeper insights into a class of acceleration schemes built on Anderson mixing that improve the convergence of deep RL algorithms. Our main results establish a connection between Anderson mixing and quasi-Newton methods and prove that Anderson mixing increases the convergence radius of policy iteration schemes by an extra contraction factor. The analysis is rooted in the fixed-point iteration nature of RL. We further propose a stabilization strategy by introducing a stable regularization term in Anderson mixing and a differentiable, non-expansive MellowMax operator that allows both faster convergence and more stable behavior. Extensive experiments demonstrate that our proposed method enhances the convergence, stability, and performance of RL algorithms.
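
A minimal numerical sketch of damped Anderson mixing applied to a generic fixed-point iteration x = g(x), the setting the analysis builds on; the memory size, damping factor, and toy map are illustrative assumptions, and the RL-specific components (Bellman operators, the regularization term, the MellowMax variant) are omitted.

```python
import numpy as np

def anderson_mixing(g, x0, m=5, beta=0.5, n_iters=50, tol=1e-10):
    """Damped Anderson mixing for a fixed-point iteration x = g(x) (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    xs, fs = [], []                                        # recent iterates and residuals
    for _ in range(n_iters):
        fx = g(x) - x                                      # fixed-point residual
        xs.append(x.copy())
        fs.append(fx.copy())
        if len(xs) > m + 1:                                # keep a window of m differences
            xs.pop(0)
            fs.pop(0)
        if len(xs) == 1:
            x = x + beta * fx                              # plain damped iteration to start
        else:
            dX = np.stack([xs[i + 1] - xs[i] for i in range(len(xs) - 1)], axis=1)
            dF = np.stack([fs[i + 1] - fs[i] for i in range(len(fs) - 1)], axis=1)
            gamma, *_ = np.linalg.lstsq(dF, fx, rcond=None)
            x = x + beta * fx - (dX + beta * dF) @ gamma   # mixed update over the history
        if np.linalg.norm(fx) < tol:
            break
    return x

# toy contraction: component-wise cosine; the iteration converges to the fixed point ~0.739
print(anderson_mixing(np.cos, np.zeros(3)))
```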

AAAI Conference 2021 Conference Paper

Domain General Face Forgery Detection by Learning to Weight

  • Ke Sun
  • Hong Liu
  • Qixiang Ye
  • Yue Gao
  • Jianzhuang Liu
  • Ling Shao
  • Rongrong Ji

In this paper, we propose a domain-general model, termed learning-to-weight (LTW), that guarantees face forgery detection performance across multiple domains, particularly on target domains that are never seen before. However, various face forgery methods cause complex and biased data distributions, making it challenging to detect fake faces in unseen domains. We argue that different faces contribute differently to a detection model trained on multiple domains, making the model likely to fit domain-specific biases. As such, we propose the LTW approach based on the meta-weight learning algorithm, which configures different weights for face images from different domains. The LTW network can balance the model’s generalizability across multiple domains. Then, the meta-optimization calibrates the source domain’s gradient, enabling more discriminative features to be learned. The detection ability of the network is further improved by introducing an intra-class compact loss. Extensive experiments on several commonly used deepfake datasets demonstrate the effectiveness of our method in detecting synthetic faces. Code and supplemental material are available at https://github.com/skJack/LTW.

NeurIPS Conference 2021 Conference Paper

On the Variance of the Fisher Information for Deep Learning

  • Alexander Soen
  • Ke Sun

In the realm of deep learning, the Fisher information matrix (FIM) gives novel insights and useful tools to characterize the loss landscape, perform second-order optimization, and build geometric learning theories. The exact FIM is either unavailable in closed form or too expensive to compute. In practice, it is almost always estimated based on empirical samples. We investigate two such estimators, based on two equivalent representations of the FIM, that are both unbiased and consistent. Their estimation quality is naturally gauged by their variance given in closed form. We analyze how the parametric structure of a deep neural network can affect the variance. The meaning of this variance measure and its upper bounds are then discussed in the context of deep learning.

AAAI Conference 2020 Conference Paper

Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labeled Nodes

  • Ke Sun
  • Zhouchen Lin
  • Zhanxing Zhu

Graph Convolutional Networks (GCNs) play a crucial role in graph learning tasks; however, learning graph embeddings with few supervised signals is still a difficult problem. In this paper, we propose a novel training algorithm for Graph Convolutional Networks, called the Multi-Stage Self-Supervised (M3S) Training Algorithm, combined with a self-supervised learning approach, focusing on improving the generalization performance of GCNs on graphs with few labeled nodes. First, a Multi-Stage Training Framework is provided as the basis of the M3S training method. Then we leverage the DeepCluster technique, a popular form of self-supervised learning, and design a corresponding aligning mechanism on the embedding space to refine the Multi-Stage Training Framework, resulting in the M3S Training Algorithm. Finally, extensive experimental results verify the superior performance of our algorithm on graphs with few labeled nodes under different label rates compared with other state-of-the-art approaches.

AAAI Conference 2020 Conference Paper

Where to Go Next: Modeling Long- and Short-Term User Preferences for Point-of-Interest Recommendation

  • Ke Sun
  • Tieyun Qian
  • Tong Chen
  • Yile Liang
  • Quoc Viet Hung Nguyen
  • Hongzhi Yin

Point-of-Interest (POI) recommendation has been a trending research topic as it generates personalized suggestions on facilities for users from a large number of candidate venues. Since users’ check-in records can be viewed as a long sequence, methods based on recurrent neural networks (RNNs) have recently shown promising applicability for this task. However, existing RNN-based methods either neglect users’ long-term preferences or overlook the geographical relations among recently visited POIs when modeling users’ short-term preferences, thus making the recommendation results unreliable. To address the above limitations, we propose a novel method named Long- and Short-Term Preference Modeling (LSTPM) for next-POI recommendation. In particular, the proposed model consists of a nonlocal network for long-term preference modeling and a geo-dilated RNN for short-term preference learning. Extensive experiments on two real-world datasets demonstrate that our model yields significant improvements over the state-of-the-art methods.

NeurIPS Conference 2018 Conference Paper

Representation Learning of Compositional Data

  • Marta Avalos
  • Richard Nock
  • Cheng Soon Ong
  • Julien Rouar
  • Ke Sun

We consider the problem of learning a low dimensional representation for compositional data. Compositional data consists of a collection of nonnegative data that sum to a constant value. Since the parts of the collection are statistically dependent, many standard tools cannot be directly applied. Instead, compositional data must be first transformed before analysis. Focusing on principal component analysis (PCA), we propose an approach that allows low dimensional representation learning directly from the original data. Our approach combines the benefits of the log-ratio transformation from compositional data analysis and exponential family PCA. A key tool in its derivation is a generalization of the scaled Bregman theorem, which relates the perspective transform of a Bregman divergence to the Bregman divergence of a perspective transform and a remainder conformal divergence. Our proposed approach includes a convenient surrogate (upper bound) loss of the exponential family PCA which has an easy-to-optimize form. We also derive the corresponding form for nonlinear autoencoders. Experiments on simulated data and microbiome data show the promise of our method.
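
For context, the baseline pipeline the abstract improves upon, transforming compositions with a centered log-ratio and then applying PCA, can be sketched in a few lines; the paper's exponential-family PCA surrogate loss is not reproduced here, and the Dirichlet toy data and pseudo-count are purely illustrative.

```python
import numpy as np

def clr(composition, eps=1e-9):
    """Centered log-ratio transform for compositional rows that sum to a constant."""
    x = np.asarray(composition, dtype=float) + eps        # avoid log(0) on sparse parts
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def pca_scores(Z, k=2):
    """Plain PCA via SVD of the centered data matrix; returns k-dimensional scores."""
    Zc = Z - Z.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Zc @ Vt[:k].T

# toy compositional data: 100 samples over 20 parts drawn from a Dirichlet
rng = np.random.default_rng(0)
counts = rng.dirichlet(np.ones(20), size=100)
print(pca_scores(clr(counts), k=2).shape)                 # (100, 2)
```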

NeurIPS Conference 2015 Conference Paper

Space-Time Local Embeddings

  • Ke Sun
  • Jun Wang
  • Alexandros Kalousis
  • Stephane Marchand-Maillet

Space-time is a profound concept in physics. This concept was shown to be useful for dimensionality reduction. We present basic definitions with interesting counter-intuitions. We give theoretical propositions to show that space-time is a more powerful representation than Euclidean space. We apply this concept to manifold learning for preserving local information. Empirical results on non-metric datasets show that more information can be preserved in space-time.