TIST 2026 Journal Article
Clue and Context Fusion for Sarcasm Detection with Large Multimodal Models
- Qiuyu Li
- Yushan Pan
- Ding Wang
- Wei Wang
- Xiaowei Huang
- Zhijie Xu
Detecting sarcasm in social media is fundamentally different from general VLM benchmarks: it is a pragmatic contradiction problem in which the literal signal in one modality is intentionally misaligned with the intended meaning, while dominant pre-training (e.g., CLIP-style contrastive agreement) biases models toward modality alignment rather than incongruity detection. We present SCARF, a contradiction-aware framework that equips large multimodal models with explicit sarcasm cues and context-sensitive retrieval. SCARF constructs coarse scene cues and fine-grained localized evidence via tag-constrained QA, then distills them together with visual tokens into a [FUSION] control vector for the LLM; a label-contrastive retriever supplies type- and context-matched exemplars, and a local multi-view encoder surfaces micro-cues. With the same backbone and training data, SCARF attains 87.92% Acc/86.67% F1 on MMSD2.0 and 77.14% Acc/76.44% F1 zero-shot on XDMSD, outperforming a comparably fine-tuned LLaVA-1.5. Ablations show that sarcasm clue fusion is the main driver of the gains, and that tag-constrained QA improves rationale grounding and reduces hallucinations.
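The abstract describes distilling clue embeddings and visual tokens into a single [FUSION] control vector. A minimal, purely illustrative sketch of that idea is below; the pooling strategy, dimensions, and function names are our assumptions for exposition, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_clues(clue_embs: np.ndarray, visual_tokens: np.ndarray,
               W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: distill clue embeddings (coarse scene cues +
    fine localized evidence from tag-constrained QA) and visual tokens
    into one control vector in the LLM's hidden space."""
    clue_summary = clue_embs.mean(axis=0)        # pool clue embeddings
    visual_summary = visual_tokens.mean(axis=0)  # pool patch-level visual tokens
    joint = np.concatenate([clue_summary, visual_summary])
    return np.tanh(W @ joint + b)                # learned projection (assumed)

d_clue, d_vis, d_llm = 64, 64, 32
clues = rng.normal(size=(5, d_clue))    # e.g., one scene cue + localized evidence
vis = rng.normal(size=(196, d_vis))     # ViT-style patch tokens
W = rng.normal(size=(d_llm, d_clue + d_vis)) * 0.05
b = np.zeros(d_llm)

fusion_vec = fuse_clues(clues, vis, W, b)
print(fusion_vec.shape)  # (32,)
```

In the actual framework this vector would condition the LLM's generation; here it is only meant to make the "distill clues + visual tokens into one control vector" step concrete.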