Author name cluster

Robert Sim

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

23 papers

2 author rows

NeurIPS Conference 2025 Conference Paper

Contextual Integrity in LLMs via Reasoning and Reinforcement Learning

Guangchen (Eric) Lan
Huseyin A. Inan
Sahar Abdelnabi
Janardhan Kulkarni
Lukas Wutschitz
Reza Shokri
Christopher Brinton
Robert Sim

As the era of autonomous agents making decisions on behalf of users unfolds, ensuring contextual integrity (CI) -- what is the appropriate information to share while carrying out a certain task -- becomes a central question to the field. We posit that CI demands a form of reasoning where the agent needs to reason about the context in which it is operating. To test this, we first prompt LLMs to reason explicitly about CI when deciding what information to disclose. We then extend this approach by developing a reinforcement learning (RL) framework that further instills in models the reasoning necessary to achieve CI. Using a synthetic, automatically created, dataset of only $\sim700$ examples but with diverse contexts and information disclosure norms, we show that our method substantially reduces inappropriate information disclosure while maintaining task performance across multiple model sizes and families. Importantly, improvements transfer from this synthetic dataset to established CI benchmarks such as PrivacyLens that has human annotations and evaluates privacy leakage of AI assistants in actions and tool calls.

PDF Details

NeurIPS Conference 2025 Conference Paper

Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings

Yehya Farhat
Hamza ElMokhtar Shili
Fangshuo Liao
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Awadallah
Robert Sim

Mixture-of-Experts (MoEs) achieve scalability by dynamically activating subsets of their components. Yet, understanding how expertise emerges through joint training of gating mechanisms and experts remains incomplete, especially in scenarios without clear task partitions. Motivated by inference costs and data heterogeneity, we study how joint training of gating functions and experts can dynamically allocate domain-specific expertise across multiple underlying data distributions. As an outcome of our framework, we develop an instance tailored specifically to decentralized training scenarios, introducing Dynamically Decentralized Orchestration of MoEs or DDOME. DDOME leverages heterogeneity emerging from distributional shifts across decentralized data sources to specialize experts dynamically. By integrating a pretrained common expert to inform a gating function, DDOME achieves personalized expert subset selection on-the-fly, facilitating just-in-time personalization. We empirically validate DDOME within a Federated Learning (FL) context: DDOME attains from 4\% up to an 24\% accuracy improvement over state-of-the-art FL baselines in image and text classification tasks, while maintaining competitive zero-shot generalization capabilities. Furthermore, we provide theoretical insights confirming that the joint gating-experts training is critical for achieving meaningful expert specialization.

PDF Details

AAAI Conference 2025 Conference Paper

Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation

Chen Dun
Mirian Del Carmen Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Robert Sim
Anastasios Kyrillidis

Prompt instruction tuning is a popular approach to better adjust pretrained LLMs for specific downstream tasks. How to extend this approach to simultaneously handle multiple tasks and data distributions is an interesting question. We propose Mixture of Prompts (MoPs) with smart gating functionality. Our proposed system identifies relevant skills embedded in different groups of prompts and dynamically weighs experts (i.e., collection of prompts) based on the target task. Experiments show that MoPs are resilient to model compression, data source, and task composition, making them highly versatile and applicable in various contexts. In practice, MoPs can simultaneously mitigate prompt training ``interference'' in multi-task, multi-source scenarios (e.g., task and data heterogeneity across sources) and possible implications from model approximations. Empirically, MoPs show particular effectiveness in compressed model scenarios, while maintaining favorable performance in uncompressed settings: MoPs can reduce final perplexity from 9% up to 70% in non-i.i.d. distributed cases and from 3% up to 30% in centralized cases, compared to baselines.

PDF Details DOI

ICLR Conference 2024 Conference Paper

Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

Dujian Ding
Ankur Mallick
Chi Wang 0001
Robert Sim
Subhabrata Mukherjee
Victor Rühle
Laks V. S. Lakshmanan
Ahmed Hassan Awadallah

Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower cost (e.g., edge) devices, tend to lag behind in terms of response quality. Therefore in this work we propose a hybrid inference approach which combines their respective strengths to save cost and maintain quality. Our approach uses a router that assigns queries to the small or large model based on the predicted query difficulty and the desired quality level. The desired quality level can be tuned dynamically at test time to seamlessly trade quality for cost as per the scenario requirements. In experiments our approach allows us to make up to 40% fewer calls to the large model, with no drop in response quality.