Arrow Research search

Author name cluster

Alan Ritter

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers (7)

AAAI Conference 2025 · Conference Paper

CROSSNEWS: A Cross-Genre Authorship Verification and Attribution Benchmark

  • Marcus Ma
  • Duong Minh Le
  • Junmo Kang
  • Yao Dou
  • John Cadigan
  • Dayne Freitag
  • Alan Ritter
  • Wei Xu

Authorship models have historically generalized poorly to new domains because author-identifying signals are distributed very differently across domains. In particular, the effects of topic and genre are highly domain-dependent and greatly impact authorship analysis performance. This paper addresses the resulting data gap in authorship resources by introducing CROSSNEWS, a novel cross-genre dataset that connects formal journalistic articles and casual social media posts. CROSSNEWS is the largest authorship dataset of its kind supporting both verification and attribution tasks, with comprehensive topic and genre annotations. We use CROSSNEWS to demonstrate that current models perform poorly in genre-transfer scenarios, underscoring the need for authorship models that are robust to genre-specific effects. We also explore SELMA, a new LLM embedding approach for large-scale authorship setups that outperforms existing models in both same-genre and cross-genre settings.
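
The abstract does not detail SELMA's architecture; as a generic illustration of embedding-based authorship verification, the sketch below scores a document pair by the cosine similarity of style embeddings and applies a threshold. The `embed` function is a hypothetical stand-in for any text encoder, not SELMA itself.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for a style encoder such as SELMA: any model
    mapping text to a fixed-size vector would slot in here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(256)

def same_author(doc_a: str, doc_b: str, threshold: float = 0.8) -> bool:
    """Embedding-based verification: predict 'same author' when the cosine
    similarity of the two style embeddings exceeds a tuned threshold."""
    a, b = embed(doc_a), embed(doc_b)
    cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return cos >= threshold
```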

NeurIPS Conference 2025 · Conference Paper

Language Models can Self-Improve at State-Value Estimation for Better Search

  • Ethan Mendes
  • Alan Ritter

Collecting ground-truth rewards or human demonstrations for multi-step reasoning tasks is often prohibitively expensive, especially in interactive domains such as web tasks. We introduce Self-Taught Lookahead (STL), a reward-free framework that improves language model–based value functions by reasoning explicitly about state transitions. STL can be viewed as a chain-of-thought analogue of the value iteration algorithm: instead of regressing directly on numeric values, a value LLM is trained to simulate a step of lookahead in natural language—predicting the next action, resulting state, and rationale for its value. This process refines value estimates without any labeled data. The self-supervised procedure yields more accurate state-value predictions, which in turn enable lightweight search algorithms to expand fewer states while maintaining strong performance. Empirically, STL-trained value models built on moderately sized (8B-parameter) open-weight LLMs boost web agent success rates by over 39%, achieving performance comparable to proprietary models. STL also generalizes to multi-hop question answering and math puzzles. Overall, STL enables small open-source models to guide efficient search, reducing inference costs by integrating explicit reasoning with value learning.
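
A minimal sketch of one STL lookahead step as the abstract describes it, assuming a generic `llm(prompt) -> str` completion function (hypothetical) and prompt/parsing conventions invented here; the paper's actual prompts and fine-tuning recipe are not given in the abstract.

```python
def stl_step(llm, state: str, gamma: float = 0.99) -> dict:
    """One Self-Taught Lookahead step: simulate a step of lookahead in
    natural language, then back up the successor's value to this state."""
    # 1. The value model reasons about the best next action.
    action = llm(f"State:\n{state}\nPropose the single best next action.")
    # 2. It imagines the resulting state (a chain-of-thought transition model).
    next_state = llm(f"State:\n{state}\nAction: {action}\n"
                     "Describe the state that results from this action.")
    # 3. It appraises the imagined successor with a rationale.
    appraisal = llm(f"State:\n{next_state}\n"
                    "Explain how promising this state is, ending with "
                    "'Value: <number between 0 and 1>'.")
    # 4. One step of value iteration in natural language: the discounted
    #    successor value becomes the refined estimate for the current state.
    refined_value = gamma * parse_value(appraisal)
    # The tuple below is a self-generated training example (no labels needed)
    # for fine-tuning the value LLM.
    return {"state": state, "lookahead": (action, next_state, appraisal),
            "value": refined_value}

def parse_value(text: str) -> float:
    """Extract the trailing 'Value: x' score; assumes the prompt format above."""
    return float(text.rsplit("Value:", 1)[-1].strip().rstrip("."))
```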

NeurIPS Conference 2025 · Conference Paper

Probabilistic Reasoning with LLMs for Privacy Risk Estimation

  • Jonathan Zheng
  • Alan Ritter
  • Sauvik Das
  • Wei "Coco" Xu

Probabilistic reasoning is a key aspect of both human and artificial intelligence that allows for handling uncertainty and ambiguity in decision-making. In this paper, we introduce a new numerical reasoning task under uncertainty for large language models, focusing on estimating the privacy risk of user-generated documents containing privacy-sensitive information. We propose BRANCH, a new LLM methodology that estimates the $k$-privacy value of a text—the size of the population matching the given information. BRANCH factorizes a joint probability distribution of personal information as random variables. The probability of each factor in a population is estimated separately using a Bayesian network and combined to compute the final $k$-value. Our experiments show that this method successfully estimates the $k$-value 73% of the time, a 13% increase compared to o3-mini with chain-of-thought reasoning. We also find that LLM uncertainty is a good indicator of accuracy, as high-variance predictions are 37.47% less accurate on average.
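
The abstract pins down the shape of the computation: factor the joint probability of the disclosed attributes, estimate each factor, and scale by the population size. A back-of-the-envelope sketch follows; BRANCH estimates the factors with a Bayesian network, whereas the probabilities below are invented purely for illustration.

```python
import math

def k_value(population: int, factor_probs: list[float]) -> float:
    """k = expected number of people matching all disclosed attributes,
    i.e. population size times the joint probability of those attributes.
    Here the per-factor probabilities are simply given."""
    return population * math.prod(factor_probs)

# Illustrative only: a document revealing an age, a city, and a profession.
k = k_value(330_000_000, [0.012,    # P(age = 34)            (made up)
                          0.0009,   # P(city | age)           (made up)
                          0.01])    # P(profession | age, city) (made up)
print(f"~{k:,.0f} people match")    # ~36 people match
```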

ICLR Conference 2025 · Conference Paper

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

  • Junmo Kang
  • Leonid Karlinsky
  • Hongyin Luo
  • Zhen Wang 0041
  • Jacob A. Hansen
  • James R. Glass
  • David D. Cox
  • Rameswar Panda

We present Self-MoE, an approach that transforms a monolithic LLM into a compositional, modular system of self-specialized experts, named MiXSE (MiXture of Self-specialized Experts). Our approach leverages self-specialization, which constructs expert modules using self-generated synthetic data, each equipping a shared base LLM with distinct domain-specific capabilities, activated via self-optimized routing. This allows dynamic, capability-specific handling of various target tasks, enhancing overall capabilities without extensive human-labeled data or added parameters. Our empirical results reveal that specializing LLMs may introduce trade-offs in performance on non-specialized tasks. Our Self-MoE, on the other hand, demonstrates substantial improvements (6.5%p on average) over the base LLM across diverse benchmarks covering knowledge, reasoning, math, and coding. It also consistently outperforms other methods, including instance merging and weight merging, while offering better flexibility and interpretability by design with semantic experts and routing. Our findings highlight the critical role of modularity, the applicability of Self-MoE to multiple base LLMs, and the potential of self-improvement in achieving efficient, scalable, and adaptable systems.
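
As a rough illustration of the routing idea, the sketch below dispatches an input representation to one of several self-specialized experts via a learned linear router. In MiXSE the experts would be lightweight adapter modules on a shared base LLM; here they are placeholder functions, and the router weights are random stand-ins for learned parameters.

```python
import numpy as np

EXPERTS = {
    "knowledge": lambda h: h + 0.1,   # placeholder expert transforms
    "reasoning": lambda h: h * 1.1,
    "math":      lambda h: h - 0.1,
    "coding":    lambda h: h * 0.9,
}

def route(hidden: np.ndarray, router_w: np.ndarray) -> str:
    """Top-1 softmax routing: score each expert from the input
    representation and dispatch to the highest-probability one."""
    logits = router_w @ hidden
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return list(EXPERTS)[int(np.argmax(probs))]

hidden = np.random.default_rng(0).standard_normal(16)       # token/sequence repr
router_w = np.random.default_rng(1).standard_normal((len(EXPERTS), 16))
expert = route(hidden, router_w)
output = EXPERTS[expert](hidden)    # only the selected expert runs
```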

ICLR Conference 2024 · Conference Paper

Constrained Decoding for Cross-lingual Label Projection

  • Duong Minh Le
  • Yang Chen 0065
  • Alan Ritter
  • Wei Xu 0004

Zero-shot cross-lingual transfer utilizing multilingual LLMs has become a popular learning paradigm for low-resource languages with no labeled training data. However, for NLP tasks that involve fine-grained predictions on words and phrases, the performance of zero-shot cross-lingual transfer learning lags far behind supervised fine-tuning methods. Therefore, it is common to exploit translation and label projection to further improve the performance by (1) translating training data that is available in a high-resource language (e.g., English) together with the gold labels into low-resource languages, and/or (2) translating test data in low-resource languages to a high-resource language to run inference on, then projecting the predicted span-level labels back onto the original test data. However, state-of-the-art marker-based label projection methods suffer from translation quality degradation due to the extra label markers injected into the input to the translation model. In this work, we explore a new direction that leverages constrained decoding for label projection to overcome the aforementioned issues. Our new method not only preserves the quality of the translated texts but is also versatile: it applies to both strategies of translating training data and translating test data. This versatility is crucial, as our experiments reveal that translating test data can lead to a considerable boost in performance compared to translating only training data. We evaluate on two cross-lingual transfer tasks, namely Named Entity Recognition and Event Argument Extraction, spanning 20 languages. The results demonstrate that our approach outperforms the state-of-the-art marker-based method by a large margin and also shows better performance than other label projection methods that rely on external word alignment.
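
A schematic of the marker-free pipeline this suggests, assuming two hypothetical helpers supplied by the caller: `translate` for ordinary machine translation and `constrained_translate` for lexically constrained decoding that forces given phrases to appear in the output. The paper's exact decoding algorithm is not spelled out in the abstract; this only shows where the constraint enters.

```python
def project_labels(sentence, spans, translate, constrained_translate):
    """Marker-free label projection sketch.

    sentence: source-language text; spans: (surface_text, label) pairs
    annotated on it; translate / constrained_translate: hypothetical MT calls.
    """
    # 1. Translate each labeled span on its own; no markers are injected
    #    into the sentence, so full-sentence translation quality is preserved.
    translated_spans = [(translate(text), label) for text, label in spans]
    # 2. Translate the whole sentence under lexical constraints requiring
    #    each span translation to appear verbatim in the output.
    target = constrained_translate(
        sentence, must_include=[t for t, _ in translated_spans])
    # 3. Labels then project by exact substring match in the output.
    projected = []
    for text, label in translated_spans:
        start = target.find(text)
        projected.append((start, start + len(text), label))
    return target, projected
```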

ICLR Conference 2024 · Conference Paper

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

  • Sam Toyer
  • Olivia Watkins
  • Ethan Adrian Mendes
  • Justin Svegliato
  • Luke Bailey
  • Tiffany Wang
  • Isaac Ong
  • Karim Elmaaroufi

While Large Language Models (LLMs) are increasingly being used in real-world applications, they remain vulnerable to *prompt injection attacks*: malicious third-party prompts that subvert the intent of the system designer. To help researchers study this problem, we present a dataset of over 563,000 prompt injection attacks and 118,000 prompt-based "defenses" against prompt injection, all created by players of an online game called Tensor Trust. To the best of our knowledge, this is the first dataset that includes both human-generated attacks and defenses for instruction-following LLMs. The attacks in our dataset have easily interpretable structure, and shed light on the weaknesses of LLMs. We also use the dataset to create a benchmark for resistance to two types of prompt injection, which we refer to as *prompt extraction* and *prompt hijacking*. Our benchmark results show that many models are vulnerable to the attack strategies in the Tensor Trust dataset. Furthermore, we show that some attack strategies from the dataset generalize to deployed LLM-based applications, even though they have a very different set of constraints to the game. We release data and code at [tensortrust.ai/paper](https://tensortrust.ai/paper).
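
Based on the game mechanics the paper describes (a defender hides an access code and the model should grant access only for that code), the two benchmark conditions could plausibly be scored as below. `model` is a hypothetical chat-completion callable, and the defense sandwiches the attacker input between an opening and a closing prompt; the dataset's real evaluation code lives at tensortrust.ai/paper.

```python
def hijacking_success(model, defense_pre: str, defense_post: str,
                      attack: str) -> bool:
    """Prompt hijacking: the attack wins if it makes the model grant
    access without supplying the real access code."""
    out = model(defense_pre + attack + defense_post)
    return "access granted" in out.lower()

def extraction_success(model, defense_pre: str, defense_post: str,
                       attack: str, access_code: str) -> bool:
    """Prompt extraction: the attack wins if the model's reply leaks the
    hidden access code."""
    out = model(defense_pre + attack + defense_post)
    return access_code.lower() in out.lower()
```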

AAAI Conference 2015 · Conference Paper

Never-Ending Learning

  • Tom Mitchell
  • William Cohen
  • Estevam Hruschka
  • Partha Talukdar
  • Justin Betteridge
  • Andrew Carlson
  • Bhavana Dalvi Mishra
  • Matthew Gardner

Whereas people learn many different types of knowledge from diverse experiences over many years, most current machine learning systems acquire just a single function or data model from just a single data set. We propose a never-ending learning paradigm for machine learning, to better reflect the more ambitious and encompassing type of learning performed by humans. As a case study, we describe the Never-Ending Language Learner (NELL), which achieves some of the desired properties of a never-ending learner, and we discuss lessons learned. NELL has been learning to read the web 24 hours/day since January 2010, and so far has acquired a knowledge base with over 80 million confidence-weighted beliefs (e.g., servedWith(tea, biscuits)), while learning continually to improve its reading competence over time. NELL has also learned to reason over its knowledge base to infer new beliefs from old ones, and is now beginning to extend its ontology by synthesizing new relational predicates. NELL can be tracked online at http://rtw.ml.cmu.edu, and followed on Twitter at @CMUNELL.
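
As a toy picture of the knowledge-base shape the abstract describes (confidence-weighted beliefs plus inference of new beliefs from old ones), the sketch below stores triples with confidences and applies one hand-written rule. NELL's actual learners, inference machinery, and confidence calibration are far richer; the rule and numbers here are illustrative only.

```python
# Beliefs as confidence-weighted triples, e.g. servedWith(tea, biscuits).
beliefs = {
    ("tea", "servedWith", "biscuits"): 0.93,
    ("tea", "isA", "beverage"): 0.99,
    ("biscuits", "isA", "food"): 0.97,
}

def infer_symmetric(beliefs: dict) -> dict:
    """Toy inference rule: treat servedWith as symmetric, deriving the
    reversed triple with a discounted confidence."""
    new = {}
    for (x, rel, y), conf in beliefs.items():
        if rel == "servedWith" and (y, rel, x) not in beliefs:
            new[(y, rel, x)] = 0.9 * conf
    return new

beliefs.update(infer_symmetric(beliefs))
# Adds servedWith(biscuits, tea) with confidence 0.837.
```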