Author name cluster

Nan Zhang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

36 papers

2 author rows

EAAI Journal 2026 Journal Article

Graph channel receptive field transformer for multi-agent trajectory prediction

Jiankun Peng
Jiakang Wang
Nan Zhang
Di Wu
Chunye Ma

Details DOI

EAAI Journal 2026 Journal Article

Rethinking the local constraints: Geometric continuity regularization for image alignment

Yinqi Chen
Yangting Zheng
Peiwen Li
Weijian Luo
Shuo Kang
Xiang Gao
Chao Liu
Shuo Zhang

Details DOI

AAAI Conference 2026 Conference Paper

Vista: Scene-Aware Optimization for Streaming Video Question Answering Under Post-Hoc Queries

Haocheng Lu
Nan Zhang
Wei Tao
Xiaoyang Qu
Guokuan Li
Jiguang Wan
Jianzong Wang

Streaming video question answering (Streaming Video QA) poses distinct challenges for multimodal large language models (MLLMs), as video frames arrive sequentially and user queries can be issued at arbitrary timepoints. Existing solutions relying on fixed-size memory or naive compression often suffer from context loss or memory overflow, limiting their effectiveness in long-form, real-time scenarios. We present Vista, a novel framework for scene-aware streaming video QA that enables efficient and scalable reasoning over continuous video streams. The innovation of Vista can be summarized in three aspects: (1) Scene-aware segmentation. Vista dynamically clusters incoming frames into temporally and visually coherent scene units. (2) Scene-aware compression. Each scene is compressed into a compact token representation and stored in GPU memory for efficient index-based retrieval, while the full-resolution frames are offloaded to CPU memory. (3) Scene-aware recall. Upon receiving a question, relevant scenes are selectively recalled and reintegrated into the model’s input space, enabling both efficiency and completeness. Vista is model-agnostic and integrates seamlessly with a variety of vision-language backbones, enabling long-context reasoning without compromising latency or memory efficiency. Extensive experiments on StreamingBench demonstrate that Vista achieves state-of-the-art performance, establishing a strong baseline for real-world streaming video understanding.

PDF Details DOI

TCS Journal 2025 Journal Article

Improved SARSA and DQN algorithms for reinforcement learning

Guangyu Yao
Nan Zhang
Zhenhua Duan
Cong Tian

Details DOI

NeurIPS Conference 2025 Conference Paper

Mamba Only Glances Once (MOGO): A Lightweight Framework for Efficient Video Action Detection

Yunqing Liu
Nan Zhang
Fangjun Wang
Kengo Murata
Takuma Yamamoto
Osafumi Nakayama
Genta Suzuki
Zhiming Tan

Mamba, a lightweight sequence modeling framework offering near-linear complexity, presents a promising alternative to Transformers. In this work, we introduce MOGO (Mamba Only Glances Once), an end-to-end framework for efficient video action detection built entirely on the Mamba architecture. In MOGO, our newly designed Mamba-based decoder can even use just one Mamba layer to effectively perform action detection. It uses neither Transformer structures nor RCNN-like methods for proposal detection. Our framework introduces two key innovations. First, we propose a pure Mamba-based encoder-decoder architecture. The encoder processes cross-frame video information, while the decoder incorporates two novel Mamba-based structures that leverage Mamba’s intrinsic capabilities to detect actions. Theoretical analysis and ablation experiments confirm their synergy and the necessity of each structure. Second, we design a video token construction mechanism to improve the model's performance. The token importance block can ensure that the retained token information is highly relevant to the predicted targets. These two innovations make MOGO both efficient and accurate, as demonstrated on the JHMDB and UCF101-24 benchmark datasets. Compared to SOTA action detection methods, MOGO achieves superior performance in terms of GFLOPs, model parameters, and inference speed (latency) with comparable detection precision. Additionally, it requires significantly less GPU memory than some SOTA token reconstruction methods. Code is available at https://github.com/YunqingLiu-ML/MOGO.

PDF Details

AAAI Conference 2025 Conference Paper

Point Cloud Semantic Segmentation with Sparse and Inhomogeneous Annotations

Zhiyi Pan
Nan Zhang
Wei Gao
Shan Liu
Ge Li

Utilizing uniformly distributed sparse annotations, weakly supervised learning alleviates the heavy reliance on fine-grained annotations in point cloud semantic segmentation tasks. However, few works discuss the inhomogeneity of sparse annotations, although it is common in real-world scenarios. Therefore, this work introduces the probability density function into the gradient sampling approximation method to qualitatively analyze the impact of annotation sparsity and inhomogeneity under weakly supervised learning. Based on our analysis, we propose an Adaptive Annotation Distribution Network (AADNet) capable of robust learning on arbitrarily distributed sparse annotations. Specifically, we propose a label-aware point cloud downsampling strategy to increase the proportion of annotations involved in the training stage. Furthermore, we design the multiplicative dynamic entropy as the gradient calibration function to mitigate the gradient bias caused by non-uniformly distributed sparse annotations and explicitly reduce the epistemic uncertainty. Without any prior restrictions or additional information, our proposed method achieves comprehensive performance improvements at multiple label rates and different annotation distributions.

PDF Details DOI

TCS Journal 2025 Journal Article

SAT-based bounded model checking for propositional projection temporal logic

Zhenhua Duan
Cong Tian
Nan Zhang
Chaofeng Yu
Mengfei Yang
Jia He

Details DOI

ICLR Conference 2025 Conference Paper

SiReRAG: Indexing Similar and Related Information for Multihop Reasoning

Nan Zhang
Prafulla Kumar Choubey
Alexander R. Fabbri
Gabriel Bernadett-Shapiro
Rui Zhang 0037
Prasenjit Mitra
Caiming Xiong
Chien-Sheng Wu

Indexing is an important step towards strong performance in retrieval-augmented generation (RAG) systems. However, existing methods organize data based on either semantic similarity (similarity) or related information (relatedness), but do not cover both perspectives comprehensively. Our analysis reveals that modeling only one perspective results in insufficient knowledge synthesis, leading to suboptimal performance on complex tasks requiring multihop reasoning. In this paper, we propose SiReRAG, a novel RAG indexing approach that explicitly considers both similar and related information. On the similarity side, we follow existing work and explore some variances to construct a similarity tree based on recursive summarization. On the relatedness side, SiReRAG extracts propositions and entities from texts, groups propositions via shared entities, and generates recursive summaries to construct a relatedness tree. We index and flatten both similarity and relatedness trees into a unified retrieval pool. Our experiments demonstrate that SiReRAG consistently outperforms state-of-the-art indexing methods on three multihop datasets (MuSiQue, 2WikiMultiHopQA, and HotpotQA), with an average 1.9% improvement in F1 scores. As a reasonably efficient solution, SiReRAG enhances existing reranking methods significantly, with up to 7.8% improvement in average F1 scores. Our code is available at https://github.com/SalesforceAIResearch/SiReRAG.
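
The relatedness-side step described in this abstract (grouping propositions via shared entities) can be pictured with a minimal Python sketch. It is not the SiReRAG implementation: the function name `group_by_shared_entities`, the input format, and the rule of keeping only entities mentioned by at least two propositions are illustrative assumptions, and entity extraction and recursive summarization are left out.

```python
from collections import defaultdict


def group_by_shared_entities(propositions):
    """Map each entity to the propositions that mention it.

    `propositions` is a list of (text, entities) pairs. Entity extraction
    is assumed to have been done already (hypothetical input format).
    """
    groups = defaultdict(list)
    for text, entities in propositions:
        for entity in entities:
            groups[entity].append(text)
    # Assumption for this sketch: an entity only forms a group when at
    # least two propositions share it.
    return {e: texts for e, texts in groups.items() if len(texts) >= 2}


props = [
    ("Marie Curie won the Nobel Prize in Physics.", {"Marie Curie", "Nobel Prize"}),
    ("Marie Curie was born in Warsaw.", {"Marie Curie", "Warsaw"}),
    ("Warsaw is the capital of Poland.", {"Warsaw", "Poland"}),
]
groups = group_by_shared_entities(props)
```

Here the second and third propositions end up in the same group through the shared entity "Warsaw", which is the kind of cross-passage link a multihop question needs.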

Details

UAI Conference 2024 Conference Paper

Bias-aware Boolean Matrix Factorization Using Disentangled Representation Learning

Xiao Wang 0099
Jia Wang
Tong Zhao 0002
Yijie Wang
Nan Zhang
Yong Zang
Sha Cao
Chi Zhang 0021

Boolean matrix factorization (BMF) has been widely utilized in fields such as recommendation systems, graph learning, text mining, and -omics data analysis. Traditional BMF methods decompose a binary matrix into the Boolean product of two lower-rank Boolean matrices plus homoscedastic random errors. However, real-world binary data typically involves biases arising from heterogeneous row- and column-wise signal distributions. Such biases can lead to suboptimal fitting and unexplainable predictions if not accounted for. In this study, we reconceptualize the binary data generation as the Boolean sum of three components: a binary pattern matrix, a background bias matrix influenced by heterogeneous row or column distributions, and random flipping errors. We introduce a novel Disentangled Representation Learning for Binary matrices (DRLB) method, which employs a dual auto-encoder network to reveal the true patterns. DRLB can be seamlessly integrated with existing BMF techniques to facilitate bias-aware BMF. Our experiments with both synthetic and real-world datasets show that DRLB significantly enhances the precision of traditional BMF methods while offering high scalability. Moreover, the bias matrix detected by DRLB accurately reflects the inherent biases in synthetic data, and the patterns identified in the bias-corrected real-world data exhibit enhanced interpretability.
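
For readers unfamiliar with the Boolean product mentioned in this abstract, the following is a small pure-Python sketch of the standard definition, where entry (i, j) is the OR over k of A[i][k] AND B[k][j]. The function name `boolean_product` and the example matrices are illustrative assumptions. The sketch does not cover DRLB, the bias matrix, or the flipping errors.

```python
def boolean_product(a, b):
    """Boolean matrix product: out[i][j] = OR_k (a[i][k] AND b[k][j])."""
    inner = len(b)
    cols = len(b[0])
    return [
        [int(any(row[k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for row in a
    ]


# Two rank-2 Boolean factors (hypothetical example data).
A = [[1, 0],
     [1, 1],
     [0, 1]]
B = [[1, 1, 0],
     [0, 1, 1]]
X = boolean_product(A, B)
```

Unlike the ordinary matrix product, overlapping patterns saturate at 1: the middle row of `X` is all ones even though two patterns cover its centre entry.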

Details

YNICL Journal 2024 Journal Article

Dysregulated cerebral blood flow, rather than gray matter volume, exhibits stronger correlations with blood inflammatory and lipid markers in depression

Lijun Kang
Wei Wang
Zhaowen Nie
Qian Gong
Lihua Yao
Dan Xiang
Nan Zhang
Ning Tu

Details DOI

TCS Journal 2024 Journal Article

Generating Java code pairing with ChatGPT

Zelong Zhao
Nan Zhang
Bin Yu
Zhenhua Duan

Details DOI

AAAI Conference 2024 Conference Paper

Less Is More: Label Recommendation for Weakly Supervised Point Cloud Semantic Segmentation

Zhiyi Pan
Nan Zhang
Wei Gao
Shan Liu
Ge Li

Weak supervision has proven to be an effective strategy for reducing the burden of annotating semantic segmentation tasks in 3D space. However, unconstrained or heuristic weakly supervised annotation forms may lead to suboptimal label efficiency. To address this issue, we propose a novel label recommendation framework for weakly supervised point cloud semantic segmentation. Distinct from pre-training and active learning, the label recommendation framework consists of three stages: inductive bias learning, recommendations for points to be labeled, and point cloud semantic segmentation learning. In practice, we first introduce the point cloud upsampling task to learn inductive bias from structural information. During the recommendation stage, we present a cross-scene clustering strategy to generate centers of clustering as recommended points. Then we introduce LabelAttention, an attention module over recommended point positions, to model the long-range dependency under sparse annotations. Additionally, we employ position encoding to enhance the spatial awareness of semantic features. Throughout the framework, the useful information obtained from inductive bias learning is propagated to subsequent semantic segmentation networks in the form of label positions. Experimental results demonstrate that our framework outperforms weakly supervised point cloud semantic segmentation methods and other methods for labeling efficiency on S3DIS and ScanNetV2, even at an extremely low label rate.

PDF Details DOI

EAAI Journal 2024 Journal Article

Mgformer: Multi-group transformer for multivariate time series classification

Jianfeng Wen
Nan Zhang
Xuzhe Lu
Zhongyi Hu
Hui Huang

Details DOI

TCS Journal 2023 Journal Article

A proof system for unified temporal logic

Nan Zhang
Chaofeng Yu
Zhenhua Duan
Cong Tian

Details DOI

IJCAI Conference 2023 Conference Paper

Null-Space Diffusion Sampling for Zero-Shot Point Cloud Completion

Xinhua Cheng
Nan Zhang
Jiwen Yu
Yinhuai Wang
Ge Li
Jian Zhang

Point cloud completion aims at estimating the complete data of objects from degraded observations. Although existing completion methods achieve impressive performance, they rely heavily on degraded-complete data pairs for supervision. In this work, we propose a novel framework named Null-Space Diffusion Sampling (NSDS) to solve the point cloud completion task in a zero-shot manner. By leveraging a pre-trained point cloud diffusion model as the off-the-shelf generator, our sampling approach can generate desired completion outputs with the guidance of the observed degraded data without any extra training. Furthermore, we propose a tolerant loop mechanism to improve the quality of completion results for hard cases. Experimental results demonstrate that our zero-shot framework achieves completion performance superior to that of unsupervised methods and comparable to that of supervised methods in various degraded situations.

PDF Details DOI

TCS Journal 2022 Journal Article

PPTL specification mining based on LNFG

Xinya Ning
Nan Zhang
Zhenhua Duan
Cong Tian

Details DOI

TCS Journal 2021 Journal Article

Temporal logic specification mining of programs

Nan Zhang
Bin Yu
Cong Tian
Zhenhua Duan
Xiaoshuai Yuan

Details DOI

TCS Journal 2021 Journal Article

Unified temporal logic

Nan Zhang
Zhenhua Duan
Cong Tian

Details DOI

TCS Journal 2020 Journal Article

A novel approach to verifying context free properties of programs

Nan Zhang
Zhenhua Duan
Cong Tian
Hongwei Du

Details DOI

TCS Journal 2020 Journal Article

A sound and complete proof system for a unified temporal logic

Liang Zhao
Xiaobing Wang
Xinfeng Shu
Nan Zhang

Details DOI

TCS Journal 2020 Journal Article

Efficient decision procedure for propositional projection temporal logic

Xinfeng Shu
Nan Zhang
Xiaobing Wang
Liang Zhao

Details DOI

TCS Journal 2020 Journal Article

Translating Xd-C programs to MSVL programs

Meng Wang
Cong Tian
Nan Zhang
Zhenhua Duan
Chenguang Yao

Details DOI

TCS Journal 2019 Journal Article

Index set expressions can represent temporal logic formulas

Zhenhua Duan
Cong Tian
Nan Zhang
Qian Ma
Hongwei Du

Details DOI

YNICL Journal 2019 Journal Article

Resting-state functional connectivity predicts individual language impairment of patients with left hemispheric gliomas involving language network

Binke Yuan
Nan Zhang
Jing Yan
Jingliang Cheng
Junfeng Lu
Jinsong Wu

Details DOI

TCS Journal 2018 Journal Article

A compiler for MSVL and its applications

Kai Yang
Zhenhua Duan
Cong Tian
Nan Zhang

Details DOI

EAAI Journal 2018 Journal Article

An evolving T–S fuzzy model identification approach based on a special membership function and its application on pump-turbine governing system

Chaoshun Li
Wen Zou
Nan Zhang
Xinjie Lai

Details DOI

ICRA Conference 2018 Conference Paper

Learning Place-and-Time-Dependent Binary Descriptors for Long-Term Visual Localization

Nan Zhang
Michael Warren
Tim D. Barfoot

Vision-based navigation is extremely susceptible to natural scene changes. This can result in localization failures in less than a few hours after map creation. To combat short-term illumination changes as well as long-term seasonal variations, we propose using a place-and-time-dependent binary descriptor that adapts to different scenarios in an online fashion. This is achieved by extending the GRIEF [6] evolution algorithm in two ways: correspondence generation using a known pose change and the inclusion of LATCH triplets in addition to BRIEF comparisons for descriptor generation. We show the adaptive descriptor outperforms a single descriptor scheme for localization within a single-experience Visual Teach and Repeat (VT&R) system while maintaining the efficiency of binary descriptors. By adapting the description function to different environmental conditions, it allows the system to operate for a longer period before a new experience is required. In the presence of extreme illumination changes from day to night, we obtain 40% more inlier matches compared to SURF. In the case of seasonal variations, a 70% increase is demonstrated. The increased correspondences result in more localizable sections along the paths, amounting to a 25% and 150% increase in the lighting and seasonal cases, respectively.
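
The BRIEF comparisons mentioned in this abstract are pairwise intensity tests whose results form a bit string, and such descriptors are usually matched by Hamming distance. The sketch below illustrates only that basic idea in Python. The pixel pairs are hand-picked and hypothetical, whereas the paper's approach evolves the comparisons (following GRIEF) and adds LATCH triplets, neither of which is shown here.

```python
def brief_descriptor(patch, pairs):
    """One bit per pair: 1 when the first pixel is darker than the second."""
    return [1 if patch[r1][c1] < patch[r2][c2] else 0
            for (r1, c1), (r2, c2) in pairs]


def hamming(d1, d2):
    """Number of differing bits between two equal-length descriptors."""
    return sum(x != y for x, y in zip(d1, d2))


# Hypothetical comparison pairs, given as ((row, col), (row, col)).
pairs = [((0, 0), (2, 2)), ((0, 2), (2, 0)), ((1, 0), (1, 2)), ((0, 1), (2, 1))]

day = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
dusk = [[5, 10, 15], [20, 25, 30], [35, 40, 45]]      # same patch at half brightness
other = [[90, 80, 70], [60, 50, 40], [30, 20, 10]]    # different structure

d_day = brief_descriptor(day, pairs)
d_dusk = brief_descriptor(dusk, pairs)
d_other = brief_descriptor(other, pairs)
```

Because each bit depends only on the ordering of two intensities, uniform brightness scaling leaves the descriptor unchanged (`hamming(d_day, d_dusk)` is 0), while the structurally different patch differs in every bit.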

Details

TCS Journal 2016 Journal Article

A canonical form based decision procedure and model checking approach for propositional projection temporal logic

Zhenhua Duan
Cong Tian
Nan Zhang

Details DOI

TCS Journal 2016 Journal Article

A complete axiom system for propositional projection temporal logic with cylinder computation model

Nan Zhang
Zhenhua Duan
Cong Tian

Details DOI

TCS Journal 2016 Journal Article

A mechanism of function calls in MSVL

Nan Zhang
Zhenhua Duan
Cong Tian

Details DOI

EAAI Journal 2016 Journal Article

Parameter identification of a nonlinear model of hydraulic turbine governing system with an elastic water hammer based on a modified gravitational search algorithm

Chaoshun Li
Li Chang
Zhengjun Huang
Yi Liu
Nan Zhang

Details DOI

TCS Journal 2014 Journal Article

A formal proof of the deadline driven scheduler in PPTL axiomatic system

Nan Zhang
Zhenhua Duan
Cong Tian
Dingzhu Du

Details DOI

TCS Journal 2013 Journal Article

A complete proof system for propositional projection temporal logic

Zhenhua Duan
Nan Zhang
Maciej Koutny

Details DOI

TCS Journal 2013 Journal Article

A cylinder computation model for many-core parallel computing

Nan Zhang
Zhenhua Duan
Cong Tian

Details DOI

TCS Journal 2012 Journal Article

An efficient approach for abstraction-refinement in model checking

Cong Tian
Zhenhua Duan
Nan Zhang

Details DOI

IS Journal 2008 Journal Article

DynaCAS: Computational Experiments and Decision Support for ITS

Nan Zhang
Fei-Yue Wang
Fenghua Zhu
Dongbin Zhao
Shuming Tang

Accurate, reliable, and timely traffic information is critical for deployment and operation of intelligent transportation systems (ITSs). Traffic forecasting for travelers and traffic operators should become at least as useful and convenient as weather reports. In the US, the Federal Highway Administration (FHWA) has envisioned a real-time traffic estimation and prediction system (TrEPS) as an ITS support platform that resides at traffic management centers (TMCs) for dynamic route assignment (DRA) and other transportation operations.

Details DOI