AAAI 2026 Conference Paper
Aligning Cross-View Visual Geometries in LVLMs Through Human-Like Reasoning Learning
- Yuming Qiao
- Liang Luo
- Dan Meng
- Yifan Yang
- Qingyuan Wang
- Juntuo Wang
- Yuwei Zhang
- Ru Zhen
Spatial understanding is a critical capability for LVLMs (Large Vision-Language Models) to advance embodied AI applications. Existing works primarily focus on enhancing spatial understanding within a single frame, i.e., injecting 3D spatial concepts into LVLMs under a single coordinate system. However, such improvements struggle in real-world tasks that require consistent cross-view spatial reasoning. In this paper, we propose CVVG-Reasoner (Cross-View Visual Geometries), which lifts single-frame spatial comprehension to unified cross-view spatial understanding by mimicking human-like cross-view reasoning mechanisms. First, we introduce MV3DSR (Multi-View 3D Spatial Reasoning), a scalable pipeline for generating cross-view spatial reasoning data, and construct MV3DSR-Dataset, a large-scale dataset covering diverse 3D cross-view reasoning tasks. Based on MV3DSR, we propose MV3DSR-Bench, a comprehensive benchmark for evaluating cross-view spatial reasoning capabilities. Second, we design a three-stage training strategy: the first two stages progressively equip the model with (1) fundamental spatial knowledge and (2) human-like cross-view reasoning patterns, while the final stage employs reinforcement learning to further boost performance. Extensive experiments demonstrate that CVVG-Reasoner significantly outperforms existing 3D LLMs (Large Language Models) and advanced LVLMs on cross-view tasks while maintaining robust performance on out-of-domain data. Ablations further show that injecting human-like reasoning patterns yields a 44% performance gain, validating the effectiveness of our design.