Author name cluster

Xiaojun Ye

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

2 author rows

NeurIPS Conference 2025 Conference Paper

GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains

Chun Wang
Xiaojun Ye
Xiaoran Pan
Zihao Pan
Haofan Wang
Yiren Song

Recent advances in Visual Language Models (VLMs) have demonstrated exceptional performance in visual reasoning tasks. However, geo-localization presents unique challenges, requiring the extraction of multigranular visual cues from images and their integration with external world knowledge for systematic reasoning. Current approaches to geo-localization tasks often lack robust reasoning mechanisms and explainability, limiting their effectiveness. To address these limitations, we propose the Geo Reason Enhancement (GRE) Suite, a novel framework that augments VLMs with structured reasoning chains for accurate and interpretable location inference. The GRE Suite is systematically developed across three key dimensions: dataset, model, and benchmark. First, we introduce GRE30K, a high-quality geo-localization reasoning dataset designed to facilitate fine-grained visual and contextual analysis. Next, we present the GRE model, which employs a multi-stage reasoning strategy to progressively infer scene attributes, local details, and semantic features, thereby narrowing down potential geographic regions with enhanced precision. Finally, we construct the Geo Reason Evaluation Benchmark (GREval-Bench), a comprehensive evaluation framework that assesses VLMs across diverse urban, natural, and landmark scenes to measure both coarse-grained (e.g., country, continent) and fine-grained (e.g., city, street) localization performance. Experimental results demonstrate that GRE significantly outperforms existing methods across all granularities of geo-localization tasks, underscoring the efficacy of reasoning-augmented VLMs in complex geographic inference. Code and data will be released at https://anonymous.4open.science/r/GRE-74C0.
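The abstract distinguishes coarse-grained from fine-grained localization but does not say how either is scored. One common convention in image geo-localization is to report the fraction of predictions that fall within fixed great-circle distances of the true location. The sketch below illustrates that convention only; the function names and the thresholds (1, 25, 200, 750, 2500 km, often read as street, city, region, country, continent) are assumptions for illustration and are not taken from GREval-Bench.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a standard approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_at_thresholds(preds, truths, thresholds_km=(1, 25, 200, 750, 2500)):
    """Fraction of predictions within each distance threshold.

    preds, truths: equal-length lists of (lat, lon) tuples in degrees.
    thresholds_km: hypothetical granularity cut-offs (an assumption, see lead-in).
    """
    if not preds or len(preds) != len(truths):
        raise ValueError("preds and truths must be non-empty and of equal length")
    dists = [haversine_km(*p, *t) for p, t in zip(preds, truths)]
    return {t: sum(d <= t for d in dists) / len(dists) for t in thresholds_km}


if __name__ == "__main__":
    paris = (48.8566, 2.3522)
    london = (51.5074, -0.1278)
    # One exact hit and one Paris-for-London miss (a few hundred km apart).
    print(accuracy_at_thresholds([paris, paris], [paris, london]))
```

With such a metric, a model can score well at the coarse thresholds while still failing at the fine ones, which is the gap a multi-granularity benchmark is meant to expose.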

IJCAI Conference 2025 Conference Paper

M4Bench: A Benchmark of Multi-domain Multi-granularity Multi-image Understanding for Multi-modal Large Language Models

Xiaojun Ye
Guanbao Liang
Chun Wang
Liangcheng Li
Pengfei Ke
Rui Wang
Bingxin Jia
Gang Huang

The increasing demands in analyzing complex associated scenes pose necessities to researching multi-image understanding abilities. Compared with understanding individual images, both the alignments and differences between images are essential aspects of understanding the intricate relationships for multi-image inference tasks. However, existing benchmarks face difficulties in addressing both of these aspects simultaneously, resulting in obstacles to modeling relationships under various granularities and domains of images. In this paper, we introduce M4Bench to enhance the capability of aligning and distinguishing multi-images with multi-domain multi-granularity comparison. We carefully design five comparison tasks related to coarse and fine-grained granularities in single and multiple domains of images and evaluate them on 13 state-of-the-art multi-modal large language models with various sizes. Besides, we analyze the evaluation results and provide several observations and viewpoints for the multi-image understanding research. The data and evaluation code are available at https://github.com/eaglelab-zju/M4Bench.

ECAI Conference 2020 Conference Paper

OpenSMax: Unknown Domain Generation Algorithm Detection

Yao Lai
Guolou Ping
Yuexin Wu
Chenhui Lu
Xiaojun Ye

Botnet has become one of the most frequent attack patterns in cyberspace, and most of them are concerned with Domain Generation Algorithms (DGAs). Therefore, many researchers have proposed various machine learning models for DGA domain name detection, but how to detect unknown classes of DGA domain names (unknown DGAs) is still a challenging problem. In fact, the problem of detecting unknown classes is also called open set recognition problem. To tackle this issue, we propose a novel classification model OpenSMax which can not only detect various DGA domain names but also classify them into known and unknown classes of DGAs. In this model, we use the one-hot encoding method and the Long Short-Term Memory (LSTM) model to extract the features of the Top Level Domain (TLD) and the Second Level Domain (SLD) respectively. Then, these two feature categories are concatenated and propagated forwards by two fully connected layers for known DGA domain name detection and classification. Finally, both the openmax layer (the layer before the softmax layer) and the softmax layer are used to build One-Class Support Vector Machine (SVM) models for unknown classes recognition. In our experiments, OpenSMax model outperforms the state-of-art methods both in known and unknown DGA domain names detection tasks. Also, OpenSMax provides a bounded open space risk in theory, and therefore it formally provides an effective solution for unknown DGA domain name detection.

AAAI Conference 2018 Conference Paper

RSDNE: Exploring Relaxed Similarity and Dissimilarity from Completely-Imbalanced Labels for Network Embedding

Zheng Wang
Xiaojun Ye
Chaokun Wang
Yuexin Wu
Changping Wang
Kaiwen Liang

Network embedding, aiming to project a network into a low-dimensional space, is increasingly becoming a focus of network research. Semi-supervised network embedding takes advantage of labeled data, and has shown promising performance. However, existing semi-supervised methods would get unappealing results in the completely-imbalanced label setting where some classes have no labeled nodes at all. To alleviate this, we propose a novel semi-supervised network embedding method, termed Relaxed Similarity and Dissimilarity Network Embedding (RSDNE). Specifically, to benefit from the completely-imbalanced labels, RSDNE guarantees both intra-class similarity and inter-class dissimilarity in an approximate way. Experimental results on several real-world datasets demonstrate the superiority of the proposed method.

AAAI Conference 2017 Conference Paper

Multiple Source Detection without Knowing the Underlying Propagation Model

Zheng Wang
Chaokun Wang
Jisheng Pei
Xiaojun Ye

Information source detection, which is the reverse problem of information diffusion, has attracted considerable research effort recently. Most existing approaches assume that the underlying propagation model is fixed and given as input, which may limit their application range. In this paper, we study the multiple source detection problem when the underlying propagation model is unknown. Our basic idea is source prominence, namely the nodes surrounded by larger proportions of infected nodes are more likely to be infection sources. As such, we propose a multiple source detection method called Label Propagation based Source Identification (LPSI). Our method lets infection status iteratively propagate in the network as labels, and finally uses local peaks of the label propagation result as source nodes. In addition, both the convergent and iterative versions of LPSI are given. Extensive experiments are conducted on several real-world datasets to demonstrate the effectiveness of the proposed method.
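The idea of propagating infection status as labels and reading off local peaks can be sketched roughly as follows. The abstract does not give the update rule, so this sketch assumes the standard degree-normalized label propagation update, g <- alpha * S * g + (1 - alpha) * y, with y = +1 for infected and -1 for uninfected nodes. The function name, the parameters `alpha` and `iters`, and the strict-inequality peak test are illustrative assumptions, not the authors' implementation.

```python
import math


def lpsi_sketch(adj, infected, alpha=0.5, iters=50):
    """Iterative label propagation followed by local-peak selection.

    adj: dict mapping each node to the set of its neighbors (undirected graph).
    infected: set of nodes observed as infected.
    alpha, iters: hypothetical propagation weight and iteration count.
    Returns (sources, scores).
    """
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    # Initial labels: +1 for infected nodes, -1 for uninfected ones.
    y = {v: (1.0 if v in infected else -1.0) for v in adj}
    g = dict(y)
    for _ in range(iters):
        new_g = {}
        for v in adj:
            # Symmetric normalization: each edge (v, u) weighs 1 / sqrt(deg_v * deg_u).
            spread = sum(g[u] / math.sqrt(deg[v] * deg[u]) for u in adj[v])
            new_g[v] = alpha * spread + (1 - alpha) * y[v]
        g = new_g
    # A source candidate is an infected node scoring strictly above all its neighbors.
    sources = {
        v for v in infected
        if adj[v] and all(g[v] > g[u] for u in adj[v])
    }
    return sources, g


if __name__ == "__main__":
    # Path graph 0-1-2-3-4 with the middle three nodes infected.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    found, scores = lpsi_sketch(path, infected={1, 2, 3})
    print(found, scores)
```

On the small path graph in the usage example, the middle node is surrounded only by infected neighbors and ends up as the single local peak, which matches the source-prominence intuition described in the abstract.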

IJCAI Conference 2016 Conference Paper

Causality Based Propagation History Ranking in Social Networks

Zheng Wang
Chaokun Wang
Jisheng Pei
Xiaojun Ye
Philip S. Yu

In social network sites (SNS), propagation histories which record the information diffusion process can be used to explain to users what happened in their networks. However, these histories easily grow in size and complexity, limiting their intuitive understanding by users. To reduce this information overload, in this paper, we present the problem of propagation history ranking. The goal is to rank participant edges/nodes by their contribution to the diffusion. Firstly, we discuss and adapt Difference of Causal Effects (DCE) as the ranking criterion. Then, to avoid the complex calculation of DCE, we propose a resp-cap ranking strategy by adopting two indicators. The first is responsibility which captures the necessary face of causal effects. We further give an approximate algorithm for this indicator. The second is capability which is defined to capture the sufficient face of causal effects. Finally, promising experimental results are presented to verify the feasibility of our method.
