Arrow Research search

Author name cluster

Richard Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
1 author row

Possible papers

7

NeurIPS 2025 Conference Paper

HeavyWater and SimplexWater: Distortion-free LLM Watermarks for Low-Entropy Distributions

  • Dor Tsur
  • Carol Long
  • Claudio Mayrink Verdun
  • Sajani Vithana
  • Hsiang Hsu
  • Richard Chen
  • Haim Permuter
  • Flavio Calmon

Large language model (LLM) watermarks enable authentication of text provenance, curb misuse of machine-generated text, and promote trust in AI systems. Current watermarks operate by changing the next-token predictions output by an LLM. The updated (i.e., watermarked) predictions depend on random side information produced, for example, by hashing previously generated tokens. LLM watermarking is particularly challenging in low-entropy generation tasks, such as coding, where next-token predictions are near-deterministic. In this paper, we propose an optimization framework for watermark design. Our goal is to understand how to most effectively use random side information in order to maximize the likelihood of watermark detection and minimize the distortion of generated text. Our analysis informs the design of two new watermarks: HeavyWater and SimplexWater. Both watermarks are tunable, gracefully trading off between detection accuracy and text distortion. They can also be applied to any LLM and are agnostic to side information generation. We examine the performance of HeavyWater and SimplexWater through several benchmarks, demonstrating that they can achieve high watermark detection accuracy with minimal compromise of text generation quality, particularly in the low-entropy regime. Our theoretical analysis also reveals surprising new connections between LLM watermarking and coding theory.
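
The mechanism the abstract describes (hashing previously generated tokens into random side information that biases next-token predictions) can be sketched generically. The green-list scheme below is a deliberately simple stand-in, not the HeavyWater or SimplexWater construction; all function names are illustrative assumptions.

```python
import hashlib
import numpy as np

def side_info(prev_tokens):
    """Hash previously generated tokens into an integer seed (the side information)."""
    digest = hashlib.sha256(",".join(map(str, prev_tokens)).encode()).digest()
    return int.from_bytes(digest[:4], "big")

def watermark_logits(logits, prev_tokens, delta=2.0):
    """Bias a pseudorandom half of the vocabulary (a 'green list') upward.
    A simple illustrative watermark, not the paper's optimized schemes."""
    rng = np.random.default_rng(side_info(prev_tokens))
    green = rng.random(len(logits)) < 0.5      # context-dependent green list
    return logits + delta * green               # nudge green tokens up

def detect(tokens, vocab_size):
    """Fraction of tokens that fall in the green list of their own context."""
    hits = 0
    for i in range(1, len(tokens)):
        rng = np.random.default_rng(side_info(tokens[:i]))
        green = rng.random(vocab_size) < 0.5
        hits += bool(green[tokens[i]])
    return hits / max(len(tokens) - 1, 1)
```

A detector only needs the hashing scheme, not the model: watermarked text lands in the green list far more often than the roughly 50% chance rate, and that gap is the statistical signal the paper's schemes are designed to maximize per unit of distortion.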

AAAI 2025 Conference Paper

Improving Model Probability Calibration by Integration of Large Data Sources with Biased Labels

  • Renat Sergazinov
  • Richard Chen
  • Cheng Ji
  • Jing Wu
  • Daniel Cociorva
  • Hakan Brunzell

Probability calibration transforms the raw output of a classification model into an empirically interpretable probability. When the model's purpose is to detect rare events and only a small, expensive data source has clean labels, obtaining accurate probability calibration becomes extraordinarily challenging. Utilizing an additional large, cheap data source is very helpful; however, such data sources often suffer from biased labels. To this end, we introduce an approximate expectation-maximization (EM) algorithm to extract useful information from the large data sources. For a family of calibration methods based on the logistic likelihood, we derive closed-form updates and call the resulting iterative algorithm CalEM. We show that CalEM inherits convergence guarantees from the approximate EM algorithm. We test the proposed model in simulation and on real marketing datasets, where it shows significant performance increases.
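
The "family of calibration methods based on the logistic likelihood" includes standard Platt scaling. A minimal sketch of that base step is below; the gradient-ascent fit is an illustrative stand-in, and the EM correction for biased labels that defines CalEM is deliberately not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p(y=1|s) = sigmoid(a*s + b) by maximizing the logistic likelihood.
    CalEM would iterate an EM-style step on top of a fit like this to handle
    biased labels from the large cheap source (not shown here)."""
    a, b = 1.0, 0.0
    for _ in range(steps):
        p = sigmoid(a * scores + b)
        a += lr * np.mean((labels - p) * scores)   # d log-likelihood / da
        b += lr * np.mean(labels - p)              # d log-likelihood / db
    return a, b

def calibrate(scores, a, b):
    """Map raw scores to calibrated probabilities."""
    return sigmoid(a * scores + b)
```

On synthetic data generated with known parameters, the fit recovers them; with biased labels from a cheap source, this plain maximum-likelihood fit would be skewed, which is the failure mode the paper's EM algorithm addresses.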

NeurIPS 2025 Conference Paper

The Unseen Threat: Residual Knowledge in Machine Unlearning under Perturbed Samples

  • Hsiang Hsu
  • Pradeep Niroula
  • Zichang He
  • Ivan Brugere
  • Freddy Lecue
  • Richard Chen

Machine unlearning offers a practical alternative to avoid full model re-training by approximately removing the influence of specific user data. While existing methods certify unlearning via statistical indistinguishability from re-trained models, these guarantees do not naturally extend to model outputs when inputs are adversarially perturbed. In particular, slight perturbations of forget samples may still be correctly recognized by the unlearned model, even when a re-trained model fails to do so. This reveals a novel privacy risk: information about the forget samples may persist in their local neighborhood. In this work, we formalize this vulnerability as residual knowledge and show that it is inevitable in high-dimensional settings. To mitigate this risk, we propose a fine-tuning strategy, named RURK, that penalizes the model's ability to re-recognize perturbed forget samples. Experiments on vision benchmarks with deep neural networks demonstrate that residual knowledge is prevalent across existing unlearning methods and that our approach effectively prevents residual knowledge.
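
The vulnerability can be probed with a simple neighborhood test consistent with the abstract's description: how often does the unlearned model still assign the forgotten label to small perturbations of a forget sample? The helper below is an illustrative assumption, not the paper's formal definition or the RURK method.

```python
import numpy as np

def residual_knowledge_rate(predict, x_forget, y_forget,
                            eps=0.05, n_probes=100, seed=0):
    """Fraction of small random perturbations of a forget sample that the
    model still classifies as the forgotten label. `predict` is any callable
    mapping an input array to a class id (an assumption for this sketch)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_probes):
        delta = rng.uniform(-eps, eps, size=x_forget.shape)
        if predict(x_forget + delta) == y_forget:
            hits += 1
    return hits / n_probes
```

A rate near 1.0 means the forgotten label is still recoverable throughout the sample's neighborhood, which is exactly the residual-knowledge signal the paper shows is prevalent across existing unlearning methods.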

IJCAI 2024 Conference Paper

Enabling Sustainable Freight Forwarding Network via Collaborative Games

  • Pang-Jin Tan
  • Shih-Fen Cheng
  • Richard Chen

Freight forwarding plays a crucial role in facilitating global trade and logistics. However, as the freight forwarding market is extremely fragmented, freight forwarders often face the issue of not being able to fill the available shipping capacity. This recurrent issue motivates the creation of various freight forwarding networks that aim at exchanging capacities and demands so that the resource utilization of individual freight forwarders can be maximized. In this paper, we focus on how to design such a collaborative network based on collaborative game theory, with the Shapley value representing a fair scheme for profit sharing. Noting that the exact computation of Shapley values is intractable for large-scale real-world scenarios, we incorporate the observation that collaboration between two forwarders is only possible if their service routes and demands overlap. This leads to a new class of collaborative games called the Locally Collaborative Games (LCGs), where agents can only collaborate with their neighbors. We propose an efficient approach to compute Shapley values for LCGs, and numerically demonstrate that our approach significantly outperforms the state-of-the-art approach for a wide variety of network structures.
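
The intractability the abstract mentions is easy to see: the exact Shapley value averages each player's marginal contribution over all n! orderings. A minimal sketch of that exact computation (the characteristic function and player names are illustrative) shows why locality restrictions like LCGs matter.

```python
from itertools import permutations

def shapley_exact(players, value):
    """Exact Shapley values by averaging marginal contributions over every
    player ordering -- O(n!) work, feasible only for small n, which is why
    restricting collaboration to overlapping neighbors (LCGs) helps."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            phi[p] += value(with_p) - value(coalition)  # marginal contribution
            coalition = with_p
    return {p: v / len(orders) for p, v in phi.items()}
```

For an additive game the Shapley value reduces to each player's stand-alone worth, and the values always sum to the grand coalition's worth (efficiency), which is what makes it a fair profit-sharing scheme.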

NeurIPS 2023 Conference Paper

Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework

  • Paul Pu Liang
  • Yun Cheng
  • Xiang Fan
  • Chun Kai Ling
  • Suzanne Nie
  • Richard Chen
  • Zihao Deng
  • Nicholas Allen

The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different modalities. Despite these empirical advances, there remain fundamental research questions: How can we quantify the interactions that are necessary to solve a multimodal task? Subsequently, what are the most suitable multimodal models to capture these interactions? To answer these questions, we propose an information-theoretic approach to quantify the degree of redundancy, uniqueness, and synergy relating input modalities with an output task. We term these three measures the PID statistics of a multimodal distribution (or PID for short), and introduce two new estimators for these PID statistics that scale to high-dimensional distributions. To validate PID estimation, we conduct extensive experiments on both synthetic datasets where the PID is known and on large-scale multimodal benchmarks where PID estimations are compared with human annotations. Finally, we demonstrate their usefulness in (1) quantifying interactions within multimodal datasets, (2) quantifying interactions captured by multimodal models, (3) principled approaches for model selection, and (4) three real-world case studies engaging with domain experts in pathology, mood prediction, and robotic perception where our framework helps to recommend strong multimodal models for each application.
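
The paper's PID estimators are nontrivial, but the intuition behind synergy can be seen on a discrete toy example using plain mutual information (this is classical mutual-information bookkeeping, not the paper's PID estimators). In XOR, each input alone carries zero information about the output, yet the pair determines it fully: pure synergy.

```python
import numpy as np
from collections import Counter

def mutual_info(pairs):
    """I(A;B) in bits from a list of (a, b) samples (empirical plug-in estimate)."""
    n = len(pairs)
    joint = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    return sum(c / n * np.log2((c / n) / ((pa[a] / n) * (pb[b] / n)))
               for (a, b), c in joint.items())

# XOR over a uniform input distribution: the canonical synergy example.
samples = [(x1, x2, x1 ^ x2) for x1 in (0, 1) for x2 in (0, 1)]
i1  = mutual_info([(x1, y) for x1, _, y in samples])          # I(X1;Y) = 0
i2  = mutual_info([(x2, y) for _, x2, y in samples])          # I(X2;Y) = 0
i12 = mutual_info([((x1, x2), y) for x1, x2, y in samples])   # I(X1,X2;Y) = 1 bit
```

PID goes further by decomposing I(X1,X2;Y) into redundant, unique, and synergistic parts; here the full bit is synergistic, since neither modality alone is informative.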

NeurIPS 2022 Conference Paper

Procedural Image Programs for Representation Learning

  • Manel Baradad
  • Richard Chen
  • Jonas Wulff
  • Tongzhou Wang
  • Rogerio Feris
  • Antonio Torralba
  • Phillip Isola

Learning image representations using synthetic data allows training neural networks without some of the concerns associated with real images, such as privacy and bias. Existing work focuses on a handful of curated generative processes which require expert knowledge to design, making it hard to scale up. To overcome this, we propose training with a large dataset of twenty-one thousand programs, each one generating a diverse set of synthetic images. These programs are short code snippets, which are easy to modify and fast to execute using OpenGL. The proposed dataset can be used for both supervised and unsupervised representation learning, and reduces the gap between pre-training with real and procedurally generated images by 38%.
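
The flavor of "short code snippets that each generate a diverse set of synthetic images" can be illustrated with a toy program; the sinusoidal-wave generator below is a hypothetical stand-in for the paper's OpenGL snippets, which are not reproduced here.

```python
import numpy as np

def procedural_image(seed, size=64):
    """One toy 'program': sum a few random sinusoidal plane waves into a
    grayscale image. Varying the seed yields a diverse image family, the
    property the paper's 21,000 programs exploit at scale."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:size, 0:size] / size
    img = np.zeros((size, size))
    for _ in range(rng.integers(2, 6)):
        fx, fy = rng.uniform(-8, 8, size=2)          # random spatial frequency
        phase = rng.uniform(0, 2 * np.pi)
        img += np.sin(2 * np.pi * (fx * x + fy * y) + phase)
    img -= img.min()
    return img / max(img.max(), 1e-9)                # normalize to [0, 1]
```

Because every image is generated by code, the dataset sidesteps privacy and labeling concerns of real photos, and a program can be modified or re-seeded to produce unlimited training data.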

YNIMG 2019 Journal Article

Task activations produce spurious but systematic inflation of task functional connectivity estimates

  • Michael W. Cole
  • Takuya Ito
  • Douglas Schultz
  • Ravi Mill
  • Richard Chen
  • Carrisa Cocuzza

Most neuroscientific studies have focused on task-evoked activations (activity amplitudes at specific brain locations), providing limited insight into the functional relationships between separate brain locations. Task-state functional connectivity (FC) – statistical association between brain activity time series during task performance – moves beyond task-evoked activations by quantifying functional interactions during tasks. However, many task-state FC studies do not remove the first-order effect of task-evoked activations prior to estimating task-state FC. It has been argued that this results in the ambiguous inference "likely active or interacting during the task", rather than the intended inference "likely interacting during the task". Utilizing a neural mass computational model, we verified that task-evoked activations substantially and inappropriately inflate task-state FC estimates, especially in functional MRI (fMRI) data. Various methods attempting to address this problem have been developed, yet the efficacies of these approaches have not been systematically assessed. We found that most standard approaches for fitting and removing mean task-evoked activations were unable to correct these inflated correlations. In contrast, methods that flexibly fit mean task-evoked response shapes effectively corrected the inflated correlations without reducing effects of interest. Results with empirical fMRI data confirmed the model's predictions, revealing activation-induced task-state FC inflation for both Pearson correlation and psychophysiological interaction (PPI) approaches. These results demonstrate that removal of mean task-evoked activations using an approach that flexibly models task-evoked response shape is an important preprocessing step for valid estimation of task-state FC.
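
The inflation effect is easy to reproduce in a toy simulation: two regions with independent noise that share the same task-evoked response appear correlated, and regressing out the task response removes the spurious correlation. This sketch assumes the exact response shape is known, which is precisely the assumption the paper's flexible-fitting methods relax.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 2000
task = np.zeros(T)
task[::100] = 1.0                                       # periodic task events
task = np.convolve(task, np.hanning(30), mode="same")   # smooth evoked response

# Two 'regions': independent noise plus the SAME task-evoked activation.
a = 3.0 * task + rng.normal(size=T)
b = 3.0 * task + rng.normal(size=T)

r_raw = np.corrcoef(a, b)[0, 1]     # inflated by the shared activation

def regress_out(x, reg):
    """Remove the component of x explained by the (demeaned) regressor."""
    reg = reg - reg.mean()
    return x - (x @ reg) / (reg @ reg) * reg

r_clean = np.corrcoef(regress_out(a, task), regress_out(b, task))[0, 1]
```

The raw Pearson correlation is substantial even though the noise in the two regions is independent; after removing the task-evoked mean response, the correlation collapses toward zero, mirroring the activation-induced task-state FC inflation the paper demonstrates.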