Arrow Research search

Author name cluster

Junyu Dong

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

17 papers
1 author row

Possible papers

17

AAAI Conference 2026 Conference Paper

SEA-PACE: Semi-Supervised Underwater Image Enhancement via Gaussian Process–Assisted Self-Paced Learning

  • Jingyang Wang
  • Hengyue Bi
  • Jingchao Cao
  • Feng Gao
  • Junyu Dong

The scarcity of paired data severely limits the performance and generalization of learning-based underwater image enhancement (UIE) methods. This challenge is particularly prominent in scenes with complex degradations. Semi-supervised learning has emerged as a promising solution by enabling the utilization of large-scale unlabeled data. However, its effectiveness is limited by the use of static, model-agnostic metrics for pseudo-label reliability assessment. To address this, we propose SEA-PACE, a novel semi-supervised framework that integrates model-aware uncertainty modeling and self-paced consistency learning to fully exploit unlabeled data for UIE. Specifically, we design a Model-Aware Reliability Estimator (MARE) that quantifies the uncertainty of the teacher model's predictions through Gaussian Process Regression in latent feature space. The resulting uncertainty is then transformed into reliability weights via a rank-based mapping. Additionally, we apply the Self-Paced Consistency Learning (SPCL) strategy that employs a loss-aware schedule to dynamically prioritize high-confidence pseudo-labels, gradually incorporating more challenging samples during training. Extensive experiments on several public UIE benchmarks demonstrate that SEA-PACE consistently surpasses state-of-the-art methods in both visual quality and generalization capability.
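
The rank-based reliability mapping and the loss-aware self-paced schedule described above can be pictured with a few lines of NumPy. This is only a minimal sketch under assumed interfaces: the per-sample uncertainties would come from the paper's Gaussian Process estimator (not reproduced here), and the function names and linear schedule are illustrative rather than the authors' implementation.

```python
import numpy as np

def rank_based_reliability(uncertainty):
    """Map per-sample uncertainties to reliability weights in (0, 1]:
    the least uncertain sample gets weight 1, the most uncertain the smallest."""
    ranks = np.argsort(np.argsort(uncertainty))        # rank 0 = most reliable
    return 1.0 - ranks / len(uncertainty)

def self_paced_mask(consistency_losses, epoch, total_epochs, start_frac=0.3):
    """Admit only the easiest fraction of unlabeled samples early on and grow
    that fraction linearly, as in a loss-aware self-paced schedule."""
    frac = start_frac + (1.0 - start_frac) * epoch / max(total_epochs - 1, 1)
    return consistency_losses <= np.quantile(consistency_losses, frac)

# toy usage: weight the per-sample consistency loss by reliability and admission
u = np.array([0.10, 0.90, 0.40, 0.20])
loss = np.array([0.50, 2.00, 1.00, 0.70])
weighted = (rank_based_reliability(u) * self_paced_mask(loss, epoch=0, total_epochs=10) * loss).mean()
```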

TIST Journal 2025 Journal Article

An Underwater Imaging Generative Adversarial Network by Simulating the Mechanism of Light Propagation in Water

  • Yujuan Sun
  • Xing Huang
  • Yanfang Cui
  • Junyu Dong
  • Xiaofeng Zhang
  • Tao Yao

Since capturing underwater images without degradation is challenging, there are few real image datasets with paired ground truth for underwater image enhancement. In this article, we propose a generative adversarial network (UIGAN) for underwater imaging; the network can convert images and their corresponding depth maps captured in air into images in water. The underwater imaging mechanism relies on many intrinsic parameters of the water, which are difficult to estimate without field calibration. Given the strong modeling capability of deep neural networks, this article uses a deep learning model to extract these parameters from the real underwater environment. The proposed UIGAN then simulates the light propagation process (direct attenuation, backscattering, and forward scattering) in water using three modules with different constraints. This allows us to generate a large training dataset of paired in-air and underwater images. The generated UIGAN dataset serves as input to a forward-attention transfer underwater enhancement model (FATUECNN), which outputs restored images that appear as if captured in air. The proposed pipeline is verified both qualitatively and quantitatively through extensive experiments and comparisons with existing state-of-the-art methods. The source code and the pre-trained model are made publicly available.
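
The three light-propagation effects named in the abstract correspond to the classical underwater image formation model, sketched below in NumPy. The attenuation coefficients and background light are hard-coded for illustration (UIGAN learns the parameters from real underwater data), and the Gaussian blur is only a crude stand-in for forward scattering.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulate_underwater(img_air, depth, beta=(0.35, 0.15, 0.05), backlight=(0.05, 0.35, 0.45)):
    """Simplified per-channel underwater image formation:
    I = blur(J * exp(-beta * d)) + B * (1 - exp(-beta * d)),
    i.e. forward-scattered direct attenuation plus backscattering."""
    img_air = img_air.astype(np.float32)            # H x W x 3, values in [0, 1]
    d = depth.astype(np.float32)[..., None]         # H x W x 1, metres
    beta = np.asarray(beta, np.float32)
    B = np.asarray(backlight, np.float32)
    t = np.exp(-beta * d)                           # transmission per channel
    direct = img_air * t
    direct = np.stack([gaussian_filter(direct[..., c], sigma=1.5) for c in range(3)], axis=-1)
    return direct + B * (1.0 - t)
```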

AAAI Conference 2025 Conference Paper

Correlation-Attention Masked Temporal Transformer for User Identity Linkage Using Heterogeneous Mobility Data

  • Ziang Yan
  • Xingyu Zhao
  • Hanqing Ma
  • Wei Chen
  • Jianpeng Qi
  • Yanwei Yu
  • Junyu Dong

With the rise of social media and Location-Based Social Networks (LBSN), check-in data across platforms has become crucial for User Identity Linkage (UIL). These data not only reveal users' spatio-temporal information but also provide insights into their behavior patterns and interests. However, cross-platform identity linkage faces challenges like poor data quality, high sparsity, and noise interference, which hinder existing methods from extracting cross-platform user information. To address these issues, we propose a Correlation-Attention Masked Transformer for User Identity Linkage Network (MT-Link), a transformer-based framework that enhances model performance by learning spatio-temporal co-occurrence patterns of cross-platform users. Our model effectively captures spatio-temporal co-occurrence in cross-platform user check-in sequences. It employs a correlation attention mechanism to detect the spatio-temporal co-occurrence between user check-in sequences. Guided by attention weight maps, the model focuses on co-occurrence points while filtering out noise, ultimately improving classification performance. Experimental results show that our model significantly outperforms state-of-the-art baselines, with improvements of 12.92%–17.76% in Macro-F1 and 5.80%–8.38% in Area Under Curve (AUC).
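
One plausible reading of the correlation-attention idea, sketched in PyTorch: score a cross-sequence attention map between two users' check-in embeddings, keep only the strongly co-occurring steps, and pool a pair representation. The keep_ratio, the pooling, and the pair representation are illustrative assumptions, not the released MT-Link code.

```python
import torch
import torch.nn.functional as F

def correlation_attention(seq_a, seq_b, keep_ratio=0.5):
    """Cross-attention between two users' check-in embeddings (L_a x d, L_b x d).
    Low-attention (likely noisy) positions in seq_a are dropped before pooling."""
    scores = seq_a @ seq_b.t() / seq_a.size(-1) ** 0.5     # L_a x L_b correlation map
    attn = F.softmax(scores, dim=-1)
    relevance = attn.max(dim=-1).values                    # how strongly each step co-occurs
    k = max(1, int(keep_ratio * seq_a.size(0)))
    keep = relevance.topk(k).indices
    context = attn[keep] @ seq_b                            # attended features of kept steps
    return torch.cat([seq_a[keep], context], dim=-1).mean(dim=0)   # pooled pair representation

# toy usage: a binary classifier on the pooled vector would decide same-user / different-user
a, b = torch.randn(20, 64), torch.randn(15, 64)
pair_vec = correlation_attention(a, b)   # shape: (128,)
```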

IJCAI Conference 2025 Conference Paper

DGraFormer: Dynamic Graph Learning Guided Multi-Scale Transformer for Multivariate Time Series Forecasting

  • Han Yan
  • Dongliang Chen
  • Guiyuan Jiang
  • Bin Wang
  • Lei Cao
  • Junyu Dong
  • Yanwei Yu

Multivariate time series forecasting is a critical focus across many fields. Existing transformer-based models have overlooked the explicit modeling of inter-variable correlations, while graph-based methods have failed to address the dynamic nature of multivariate correlations and the noise in correlation modeling. To overcome these challenges, we propose a novel Dynamic Graph Learning Guided Multi-Scale Transformer (DGraFormer) for multivariate time series forecasting. Specifically, our method consists of two main components: Dynamic Correlation-aware Graph Learning (DCGL) and a Multi-scale Temporal Transformer (MTT). The former captures dynamic correlations across different time windows, filters out noise, and selects key weights to guide the aggregation of relevant feature representations; the latter effectively extracts temporal patterns from patch data at varying scales. Together, they capture rich local correlation graph structures and multi-scale global temporal features. Experimental results demonstrate that DGraFormer significantly outperforms existing state-of-the-art models on ten real-world datasets, achieving the best performance across multiple evaluation metrics. The source code of our model is available at https://anonymous.4open.science/r/DGraFormer.
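
The dynamic-graph side can be illustrated with a per-window correlation matrix that is sparsified to its top-k entries before guiding feature aggregation. The Pearson-style correlation, the top-k rule, and the tensor shapes below are assumptions for illustration; DGraFormer learns these weights rather than computing them in closed form.

```python
import torch

def dynamic_correlation_graphs(x, k=3):
    """x: (num_windows, window_len, num_vars). Build one sparse correlation
    graph per time window, keeping only the top-k neighbours of each variable."""
    graphs = []
    for w in x:
        w = (w - w.mean(0)) / (w.std(0) + 1e-8)
        corr = (w.t() @ w) / w.size(0)              # num_vars x num_vars
        topk = corr.abs().topk(k, dim=-1).indices
        mask = torch.zeros_like(corr).scatter_(-1, topk, 1.0)
        graphs.append(corr * mask)                  # weak / noisy correlations dropped
    return torch.stack(graphs)

def aggregate(features, graphs):
    """Mix variable features along each window's graph (row-normalised weights).
    features: (num_windows, num_vars, d)."""
    w = graphs.abs()
    w = w / w.sum(-1, keepdim=True).clamp(min=1e-8)
    return torch.einsum('wij,wjd->wid', w, features)
```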

AAAI Conference 2025 Conference Paper

PersonaMagic: Stage-Regulated High-Fidelity Face Customization with Tandem Equilibrium

  • Xinzhe Li
  • Jiahui Zhan
  • Shengfeng He
  • Yangyang Xu
  • Junyu Dong
  • Huaidong Zhang
  • Yong Du

Personalized image generation has made significant strides in adapting content to novel concepts. However, a persistent challenge remains: balancing the accurate reconstruction of unseen concepts with the need for editability according to the prompt, especially when dealing with the complex nuances of facial features. In this study, we delve into the temporal dynamics of the text-to-image conditioning process, emphasizing the crucial role of stage partitioning in introducing new concepts. We present PersonaMagic, a stage-regulated generative technique designed for high-fidelity face customization. Using a simple MLP network, our method learns a series of embeddings within a specific timestep interval to capture face concepts. Additionally, we develop a Tandem Equilibrium mechanism that adjusts self-attention responses in the text encoder, balancing text description and identity preservation, improving both areas. Extensive experiments confirm the superiority of PersonaMagic over state-of-the-art methods in both qualitative and quantitative evaluations. Moreover, its robustness and flexibility are validated in non-facial domains, and it can also serve as a valuable plug-in for enhancing the performance of pretrained personalization models.

AAAI Conference 2025 Conference Paper

Spatiotemporal-aware Trend-Seasonality Decomposition Network for Traffic Flow Forecasting

  • Lingxiao Cao
  • Bin Wang
  • Guiyuan Jiang
  • Yanwei Yu
  • Junyu Dong

Traffic prediction is critical for optimizing travel scheduling and enhancing public safety, yet the complex spatial and temporal dynamics within traffic data present significant challenges for accurate forecasting. In this paper, we introduce a novel model, the Spatiotemporal-aware Trend-Seasonality Decomposition Network (STDN). This model begins by constructing a dynamic graph structure to represent traffic flow and incorporates novel spatio-temporal embeddings to jointly capture global traffic dynamics. The learned representations are further refined by a specially designed trend-seasonality decomposition module, which disentangles the trend-cyclical and seasonal components for each traffic node at different times within the graph. These components are subsequently processed through an encoder-decoder network to generate the final predictions. Extensive experiments conducted on real-world traffic datasets demonstrate that STDN achieves superior performance with remarkable computational efficiency. Furthermore, we have released a new traffic dataset named JiNan, which features unique inner-city dynamics, thereby broadening the range of scenarios available for traffic prediction evaluation.
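
The trend-seasonality split that the module disentangles is, at its core, the classical per-node decomposition below; STDN learns a spatio-temporally conditioned version of it. The 5-minute sampling period and the moving-average window are assumed values for illustration.

```python
import numpy as np

def trend_seasonality_decompose(series, period=288, window=12):
    """Classical decomposition of one traffic node's series sampled every 5 min
    (period=288 steps/day): trend = centred moving average, seasonality = mean
    detrended value per time-of-day slot, residual = the rest."""
    kernel = np.ones(window) / window
    trend = np.convolve(series, kernel, mode='same')
    detrended = series - trend
    seasonal_profile = np.array([detrended[i::period].mean() for i in range(period)])
    seasonal = np.tile(seasonal_profile, len(series) // period + 1)[: len(series)]
    residual = series - trend - seasonal
    return trend, seasonal, residual
```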

IJCAI Conference 2024 Conference Paper

Exploring Cross-Domain Few-Shot Classification via Frequency-Aware Prompting

  • Tiange Zhang
  • Qing Cai
  • Feng Gao
  • Lin Qi
  • Junyu Dong

Cross-Domain Few-Shot Learning has made great strides with the development of meta-learning. However, most existing methods focus on learning a domain-adaptive inductive bias (meta-knowledge) through feature-wise manipulation or task diversity improvement, while neglecting that deep networks tend to rely on high-frequency cues when making classification decisions. This degrades the robustness of the learned inductive bias, since high-frequency information is vulnerable and easily disturbed by noise. Hence, in this paper we make one of the first attempts to propose a Frequency-Aware Prompting method with mutual attention for Cross-Domain Few-Shot classification, which lets networks simulate the human visual perception of selecting different frequency cues when facing new recognition tasks. Specifically, a frequency-aware prompting mechanism is first proposed, in which the high-frequency components of the decomposed source image are switched either with normal-distribution sampling or zeroing to obtain frequency-aware augmented samples. Then, a mutual attention module is designed to learn a generalizable inductive bias under CD-FSL settings. More importantly, the proposed method is a plug-and-play module that can be directly applied to most off-the-shelf CD-FSL methods. Experimental results on CD-FSL benchmarks demonstrate the effectiveness of our method and show that it robustly improves the performance of existing CD-FSL methods. Resources are available at https://github.com/tinkez/FAP_CDFSC.
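
The frequency-aware augmentation step amounts to splitting each source image in the Fourier domain and perturbing only the high-frequency part, either zeroing it or resampling it from a normal distribution. A self-contained NumPy version is below; the cut-off radius and the noise scaling are illustrative choices, not the authors' exact settings.

```python
import numpy as np

def frequency_aware_augment(img, radius=0.25, mode='zero'):
    """Split an image (H x W x C, float) into low/high-frequency parts with an
    FFT low-pass mask, then zero or resample the high frequencies, forcing the
    network to rely on more robust low-frequency cues."""
    h, w = img.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    cy, cx = h / 2, w / 2
    lowpass = ((yy - cy) ** 2 + (xx - cx) ** 2) <= (radius * min(h, w)) ** 2
    out = np.empty_like(img, dtype=np.float32)
    for c in range(img.shape[2]):
        spec = np.fft.fftshift(np.fft.fft2(img[..., c]))
        high = spec * (~lowpass)
        if mode == 'zero':
            high[:] = 0
        else:                                        # 'noise': resample high frequencies
            noise = (np.random.randn(h, w) + 1j * np.random.randn(h, w)) * np.abs(high).std()
            high = noise * (~lowpass)
        out[..., c] = np.fft.ifft2(np.fft.ifftshift(spec * lowpass + high)).real
    return out
```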

NeurIPS Conference 2024 Conference Paper

FreqBlender: Enhancing DeepFake Detection by Blending Frequency Knowledge

  • Hanzhe Li
  • Jiaran Zhou
  • Yuezun Li
  • Baoyuan Wu
  • Bin Li
  • Junyu Dong

Generating synthetic fake faces, known as pseudo-fake faces, is an effective way to improve the generalization of DeepFake detection. Existing methods typically generate these faces by blending real or fake faces in the spatial domain. While these methods have shown promise, they overlook the simulation of frequency distribution in pseudo-fake faces, limiting in-depth learning of generic forgery traces. To address this, this paper introduces FreqBlender, a new method that generates pseudo-fake faces by blending frequency knowledge. Concretely, we investigate the major frequency components and propose a Frequency Parsing Network to adaptively partition the frequency components related to forgery traces. We then blend this frequency knowledge from fake faces into real faces to generate pseudo-fake faces. Since there is no ground truth for frequency components, we describe a dedicated training strategy that leverages the inner correlations among different frequency knowledge to guide the learning process. Experimental results demonstrate the effectiveness of our method in enhancing DeepFake detection, making it a potential plug-and-play strategy for other methods.
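
The blending idea can be illustrated by swapping a frequency band of a fake face into a real face in the DCT domain. In the sketch below the band is fixed by hand, purely for illustration; FreqBlender instead learns which components carry forgery traces via its Frequency Parsing Network.

```python
import numpy as np
from scipy.fft import dctn, idctn

def blend_frequency_band(real, fake, lo=0.1, hi=0.5):
    """Swap a fixed DCT band of a fake face (grayscale H x W) into a real face.
    The band [lo, hi) is a hand-picked stand-in for the learned partition."""
    h, w = real.shape
    r_spec, f_spec = dctn(real, norm='ortho'), dctn(fake, norm='ortho')
    yy, xx = np.ogrid[:h, :w]
    radial = np.sqrt((yy / h) ** 2 + (xx / w) ** 2)   # 0 at DC, larger = higher frequency
    band = (radial >= lo) & (radial < hi)
    blended = np.where(band, f_spec, r_spec)          # take the band from the fake face
    return idctn(blended, norm='ortho')
```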

TIST Journal 2024 Journal Article

MHGCN+: Multiplex Heterogeneous Graph Convolutional Network

  • Chaofan Fu
  • Pengyang Yu
  • Yanwei Yu
  • Chao Huang
  • Zhongying Zhao
  • Junyu Dong

Heterogeneous graph convolutional networks have gained great popularity in tackling various network analytical tasks on heterogeneous graph data, ranging from link prediction to node classification. However, most existing works ignore the relation heterogeneity with multiplex networks between multi-typed nodes and the different importance of relations in meta-paths for node embedding, which can hardly capture the heterogeneous structure signals across different relations. To tackle this challenge, this work proposes a Multiplex Heterogeneous Graph Convolutional Network (MHGCN+) for multiplex heterogeneous network embedding. Our MHGCN+ can automatically learn the useful heterogeneous meta-path interactions of different lengths with different importance in multiplex heterogeneous networks through multi-layer convolution aggregation. Additionally, we effectively integrate both multi-relation structural signals and attribute semantics into the learned node embeddings with both unsupervised and semi-supervised learning paradigms. Extensive experiments on seven real-world datasets with various network analytical tasks demonstrate the significant superiority of MHGCN+ against state-of-the-art embedding baselines in terms of all evaluation metrics. The source code of our method is available at: https://github.com/FuChF/MHGCN-plus.
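
The multi-layer convolution aggregation can be pictured as a learnable weighted fusion of the per-relation adjacency matrices, followed by stacked graph convolutions whose outputs (meta-paths of length 1..L) are averaged. The PyTorch module below compresses that idea; the layer sizes, softmax fusion, and averaging are illustrative simplifications of MHGCN+, not its released implementation.

```python
import torch
import torch.nn as nn

class MultiplexGCN(nn.Module):
    """Minimal multiplex aggregation: fuse relation adjacencies with learnable
    weights, then average the outputs of successive graph convolutions."""
    def __init__(self, num_relations, in_dim, hid_dim, num_layers=2):
        super().__init__()
        self.rel_weight = nn.Parameter(torch.ones(num_relations))
        self.linears = nn.ModuleList(
            [nn.Linear(in_dim if i == 0 else hid_dim, hid_dim) for i in range(num_layers)]
        )

    def forward(self, adjs, x):
        # adjs: (R, N, N) stacked relation adjacencies; x: (N, in_dim) node attributes
        fused = torch.einsum('r,rij->ij', torch.softmax(self.rel_weight, 0), adjs)
        outs, h = [], x
        for lin in self.linears:
            h = torch.relu(fused @ lin(h))
            outs.append(h)
        return torch.stack(outs).mean(0)          # fuse meta-path lengths 1..num_layers
```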

IJCAI Conference 2024 Conference Paper

Multi-Relational Graph Attention Network for Social Relationship Inference from Human Mobility Data

  • Guangming Qin
  • Jianpeng Qi
  • Bin Wang
  • Guiyuan Jiang
  • Yanwei Yu
  • Junyu Dong

Inferring social relationships from human mobility data holds significant value in real-life spatio-temporal applications, which inspires the development of a series of graph-based methods for inferring social relationships. Despite their effectiveness, we argue that previous methods either rely solely on direct relations between users, neglecting valuable user mobility patterns, or have not fully harnessed the indirect interactions, thereby struggling to capture users' mobility preferences. To address these issues, in this work we propose the Multi-Relational Graph Attention Network (MRGAN), a novel graph attention network that explicitly models indirect relations and effectively captures their different impacts. Specifically, we first extract a multi-relational graph from the heterogeneous mobility graph to explicitly model the direct and indirect relations, and then utilize influence attention and cross-relation attention to capture both the differing influence between users and the differing importance of relations for each user. Comprehensive experiments on three real-world mobile datasets demonstrate that the proposed model significantly outperforms state-of-the-art models in predicting social relationships between users. The source code of our model is available at https://github.com/qinguangming1999/MRGAN_IJCAI.

AAAI Conference 2024 Conference Paper

Spherical Pseudo-Cylindrical Representation for Omnidirectional Image Super-resolution

  • Qing Cai
  • Mu Li
  • Dongwei Ren
  • Jun Lyu
  • Haiyong Zheng
  • Junyu Dong
  • Yee-Hong Yang

Omnidirectional images have attracted significant attention in recent years due to the rapid development of virtual reality technologies. Equirectangular projection (ERP), the naive format for storing and transferring omnidirectional images, is however challenging for existing two-dimensional (2D) image super-resolution (SR) methods due to its inhomogeneously distributed sampling density and distortion across latitudes. In this paper, we make one of the first attempts to design a spherical pseudo-cylindrical representation, which not only allows pixels at different latitudes to adaptively adopt the best distinct sampling density but is also model-agnostic to most off-the-shelf SR methods, enhancing their performance. Specifically, we start by upsampling each latitude of the input ERP image and design a computationally tractable optimization algorithm to adaptively obtain a (sub-)optimal sampling density for each latitude of the ERP image. To address the distortion of ERP, we introduce a new viewport-based training loss based on the original 3D sphere format of the omnidirectional image, which is inherently distortion-free. Finally, we present a simple yet effective recursive progressive omnidirectional SR network to showcase the feasibility of our idea. Experimental results on public datasets demonstrate the effectiveness of the proposed method as well as its consistently superior performance over most state-of-the-art methods, both quantitatively and qualitatively.
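
A rough intuition for a pseudo-cylindrical representation: each ERP row is resampled to a width proportional to the cosine of its latitude, roughly equalising sampling density across the sphere. The NumPy sketch below uses that fixed cosine law and nearest-neighbour resampling as simple stand-ins; the paper instead optimises a (sub-)optimal density per latitude.

```python
import numpy as np

def erp_to_pseudo_cylindrical(erp):
    """Resample each row of an ERP image (H x W x C) to a width proportional to
    cos(latitude), returning a ragged list of per-latitude rows."""
    h, w = erp.shape[:2]
    rows = []
    for y in range(h):
        lat = (0.5 - (y + 0.5) / h) * np.pi                  # +pi/2 (top) .. -pi/2 (bottom)
        new_w = max(1, int(round(w * np.cos(lat))))
        src = np.linspace(0, w - 1, new_w)
        idx = np.clip(np.round(src).astype(int), 0, w - 1)   # nearest-neighbour resample
        rows.append(erp[y, idx])
    return rows
```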

AAAI Conference 2023 Conference Paper

Graph Structure Learning on User Mobility Data for Social Relationship Inference

  • Guangming Qin
  • Lexue Song
  • Yanwei Yu
  • Chao Huang
  • Wenzhe Jia
  • Yuan Cao
  • Junyu Dong

With the prevalence of smart mobile devices and location-based services, uncovering social relationships from human mobility data is of great value in real-world spatio-temporal applications ranging from friend recommendation and advertisement targeting to transportation scheduling. While a handful of sophisticated graph embedding techniques have been developed for social relationship inference, they are significantly limited by the sparse and noisy nature of user mobility data, as they all ignore the fact that such data contain a large amount of noise unrelated to social activities. In this work, we present the Social Relationship Inference Network (SRINet), a novel Graph Neural Network (GNN) framework that improves inference performance by learning to remove noisy data. Specifically, we first construct a multiplex user meeting graph to model the spatial-temporal interactions among users in different semantic contexts. Our proposed SRINet tactfully combines the representation learning ability of Graph Convolutional Networks (GCNs) with the noisy-edge removal power of graph structure learning, allowing it to learn effective user embeddings on the multiplex user meeting graph in a semi-supervised manner. Extensive experiments on three real-world datasets demonstrate the superiority of SRINet against state-of-the-art techniques in inferring social relationships from user mobility data. The source code of our method is available at https://github.com/qinguangming1999/SRINet.
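
The "learning to remove noisy edges" part can be viewed as a learned per-edge gate computed from the endpoint embeddings of the multiplex meeting graph. The module below is an illustrative stand-in for SRINet's graph structure learning, with a hypothetical name and a simple MLP scorer rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class EdgeDenoiser(nn.Module):
    """Score each meeting edge from its endpoint features; low-scoring (likely
    non-social) edges are softly removed before GCN aggregation."""
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, x, edge_index):
        # x: (N, dim) node features; edge_index: (2, E) meeting edges
        src, dst = edge_index
        gate = torch.sigmoid(self.scorer(torch.cat([x[src], x[dst]], dim=-1))).squeeze(-1)
        return gate            # multiply per-edge messages by this gate during aggregation
```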

IJCAI Conference 2022 Conference Paper

Mutual Distillation Learning Network for Trajectory-User Linking

  • Wei Chen
  • ShuZhe Li
  • Chao Huang
  • Yanwei Yu
  • Yongguo Jiang
  • Junyu Dong

Trajectory-User Linking (TUL), which links trajectories to the users who generate them, has been a challenging problem due to the sparsity of check-in mobility data. Existing methods ignore the utilization of historical data or rich contextual features in check-in data, resulting in poor performance on the TUL task. In this paper, we propose MainTUL, a novel mutual distillation learning network for the TUL problem on sparse check-in mobility data. Specifically, MainTUL is composed of a Recurrent Neural Network (RNN) trajectory encoder that models sequential patterns of the input trajectory and a temporal-aware Transformer trajectory encoder that captures long-term time dependencies for the corresponding augmented historical trajectories. The knowledge learned on historical trajectories is then transferred between the two trajectory encoders to guide the learning of both, achieving mutual distillation of information. Experimental results on two real-world check-in mobility datasets demonstrate the superiority of MainTUL against state-of-the-art baselines. The source code of our model is available at https://github.com/Onedean/MainTUL.
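
The mutual distillation between the two encoders boils down to a symmetric softened-KL term added to the usual classification loss, as in standard online/mutual distillation. The PyTorch sketch below shows one common formulation; the temperature and mixing weight are assumed hyperparameters, not the paper's values.

```python
import torch
import torch.nn.functional as F

def mutual_distillation_loss(logits_rnn, logits_tf, labels, T=4.0, alpha=0.5):
    """Each encoder learns from the hard labels and, via a softened KL term,
    from the other encoder's (detached) predictions."""
    ce = F.cross_entropy(logits_rnn, labels) + F.cross_entropy(logits_tf, labels)
    log_p_rnn = F.log_softmax(logits_rnn / T, dim=-1)
    log_p_tf = F.log_softmax(logits_tf / T, dim=-1)
    kl = F.kl_div(log_p_rnn, log_p_tf.exp().detach(), reduction='batchmean') + \
         F.kl_div(log_p_tf, log_p_rnn.exp().detach(), reduction='batchmean')
    return (1 - alpha) * ce + alpha * (T * T) * kl
```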

AAAI Conference 2020 Conference Paper

Learning Deep Relations to Promote Saliency Detection

  • Changrui Chen
  • Xin Sun
  • Yang Hua
  • Junyu Dong
  • Hongwei Xv

Though saliency detectors have made stunning progress recently, the performance of state-of-the-art detectors remains unsatisfactory in some confusing areas, e.g., object boundaries. We argue that spatial independence among deep features is one of the root causes. This paper explores the ubiquitous relations among deep features to promote existing saliency detectors efficiently. We establish the relation by maximizing the mutual information of deep features of the same category via deep neural networks, breaking this independence. We introduce a threshold-constrained training-pair construction strategy to ensure that we can accurately estimate the relations between different image parts in a self-supervised way. The relation can be utilized to further excavate the salient areas and inhibit confusing backgrounds. The experiments demonstrate that our method can significantly boost the performance of state-of-the-art saliency detectors on various benchmark datasets. Moreover, our model is label-free and extremely efficient, with an inference speed of 140 FPS on a single GTX 1080 GPU.

IJCAI Conference 2020 Conference Paper

Pay Attention to Devils: A Photometric Stereo Network for Better Details

  • Yakun Ju
  • Kin-Man Lam
  • Yang Chen
  • Lin Qi
  • Junyu Dong

We present an attention-weighted loss in a photometric stereo neural network to improve 3D surface recovery accuracy in complex-structured areas, such as edges and crinkles, where existing learning-based methods often fail. Instead of using a uniform penalty for all pixels, our method employs an attention-weighted loss learned in a self-supervised manner for each pixel, avoiding blurry reconstruction results in such difficult regions. The network first estimates a surface normal map and an adaptive attention map, and the latter is then used to calculate a pixel-wise attention-weighted loss that focuses on complex regions. In these regions, the attention-weighted loss applies higher weights to the detail-preserving gradient loss to produce clear surface reconstructions. Experiments on real datasets show that our approach significantly outperforms traditional photometric stereo algorithms and state-of-the-art learning-based methods.
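
One way to read the attention-weighted loss: a predicted per-pixel map decides how much weight the detail-preserving gradient term receives relative to a plain normal-error term. The PyTorch sketch below uses cosine normal error and finite-difference gradients as assumed ingredients; the exact loss terms and the self-supervised learning of the attention map follow the paper, not this code.

```python
import torch
import torch.nn.functional as F

def attention_weighted_loss(pred_normal, gt_normal, attention):
    """Blend a per-pixel cosine normal loss with a gradient loss, using the
    predicted attention map (B x 1 x H x W, values in [0, 1]) so that
    complex regions lean on the detail-preserving gradient term."""
    cos = 1.0 - F.cosine_similarity(pred_normal, gt_normal, dim=1, eps=1e-8)   # B x H x W

    def grads(t):                                    # finite-difference image gradients
        gx = torch.abs(t[..., :, 1:] - t[..., :, :-1]).mean(1)
        gy = torch.abs(t[..., 1:, :] - t[..., :-1, :]).mean(1)
        return gx, gy

    gx_p, gy_p = grads(pred_normal)
    gx_g, gy_g = grads(gt_normal)
    grad_loss = F.pad((gx_p - gx_g).abs(), (0, 1)) + F.pad((gy_p - gy_g).abs(), (0, 0, 0, 1))
    a = attention.squeeze(1)
    return ((1.0 - a) * cos + a * grad_loss).mean()
```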

AAAI Conference 2019 Conference Paper

Network Structure and Transfer Behaviors Embedding via Deep Prediction Model

  • Xin Sun
  • Zenghui Song
  • Junyu Dong
  • Yongbo Yu
  • Claudia Plant
  • Christian Böhm

Network-structured data is becoming increasingly popular in many applications. However, such data present great challenges to feature engineering due to their high non-linearity and sparsity. How to map the link-connected nodes of a huge network into feature representations is a critical issue. As basic properties of real-world networks, local and global structure are reflected in the dynamic transfer behaviors from node to node. In this work, we propose a deep embedding framework to preserve the transfer possibilities among network nodes. We first suggest a degree-weight biased random walk model to capture the transfer behaviors of the network. Then a deep embedding framework is introduced to preserve the transfer possibilities among the nodes. A network structure embedding layer is added to a conventional Long Short-Term Memory network to exploit its sequence prediction ability. To preserve the local network neighborhood, we further perform a Laplacian-supervised space optimization on the embedding feature representations. Experimental studies are conducted on various real-world datasets, including social networks and citation networks. The results show that the learned representations can be effectively used as features in a variety of tasks, such as clustering, visualization, and classification, and achieve promising performance compared with state-of-the-art models.
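
The degree-weight biased walk can be written in a few lines: the next hop is drawn with probability proportional to a power of the neighbour's degree, and the resulting walks become node "sentences" for the LSTM-based transfer predictor. The alpha exponent and the dictionary adjacency below are illustrative assumptions, not the paper's exact formulation.

```python
import random

def degree_biased_walk(adj, start, length, alpha=1.0):
    """Random walk where the next hop is sampled in proportion to neighbour
    degree raised to alpha (alpha > 0 favours hubs). adj: {node: [neighbours]}."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = adj.get(walk[-1], [])
        if not nbrs:
            break
        weights = [len(adj.get(n, [])) ** alpha for n in nbrs]
        walk.append(random.choices(nbrs, weights=weights, k=1)[0])
    return walk

# toy usage: walks from every node feed the sequence-prediction model
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
sentences = [degree_biased_walk(adj, n, length=6) for n in adj]
```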