Arrow Research search

Author name cluster

Fan Wei

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact-name matches; it is not a full identity-disambiguation profile.

5 papers
2 author rows

Possible papers (5)

FOCS 2025 · Conference Paper

Density Measures for Language Generation

  • Jon M. Kleinberg
  • Fan Wei

The recent successes of large language models (LLMs) have led to a surge of theoretical research into the properties of language generation. A recent line of work has proposed an abstract view of the question — called language generation in the limit — in which we view language generation as a game played between an adversary and an algorithm: the adversary generates strings from an unknown language K, known only to come from a countable collection of candidate languages, and after observing a finite set of these strings, the algorithm must generate new strings from the language K that it hasn't seen before. This formalism highlights an important tension: the trade-off between validity (that the algorithm should only produce strings that come from the language) and breadth (that the algorithm should be able to produce "many" strings from the language). This validity-breadth trade-off is a central issue in applied work on language generation as well, where it arises in the balance between hallucination, when models generate invalid utterances, and mode collapse, when models only generate from a very restricted set of feasible outputs. Despite its importance, this trade-off has been challenging to study quantitatively. In this work we develop ways of quantifying this trade-off, by formalizing the notion of breadth through measures of density. Roughly speaking, the density of one language L in another language L' is the limiting fraction of strings from L among the strings of L', where we take the limit over longer and longer finite prefixes of L'. Existing algorithms for language generation in the limit produce output sets that can have zero density in the true language K, in this asymptotic sense, and this represents an important failure of breadth that might seem necessary in any solution to the problem. 
We show here that such a failure is not in fact necessary: we provide an algorithm for language generation in the limit whose outputs have strictly positive density in the true language K. We also study the internal representations built by algorithms for this problem — the sequence of hypothesized candidate languages they iterate through as they perform generation — showing a precise sense in which the strongest form of breadth achievable is one that may need to "oscillate" indefinitely between hypothesized representations of high density and low density. Our analysis introduces a novel topology on language families, with notions of convergence and limit points in this topology playing a key role in the analysis.
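The density notion in the abstract can be made concrete with a toy computation. The sketch below is illustrative only, not from the paper: the choice of L' as all binary strings in length-lexicographic order and L as the even-length strings is an assumption made for the example. It measures the fraction of strings from L among the first n strings of L':

```python
# Toy sketch (not the paper's algorithm): estimating the density of a
# language L inside a language L', following the abstract's definition --
# the limiting fraction of strings from L among longer and longer finite
# prefixes of L'. Here L' = all binary strings in length-lexicographic
# order and L = binary strings of even length (both illustrative choices).

from itertools import product

def binary_strings():
    """Enumerate all binary strings in length-lexicographic order."""
    length = 0
    while True:
        for bits in product("01", repeat=length):
            yield "".join(bits)
        length += 1

def empirical_density(in_L, n):
    """Fraction of the first n strings of L' that belong to L."""
    gen = binary_strings()
    hits = sum(1 for _ in range(n) if in_L(next(gen)))
    return hits / n

in_even = lambda s: len(s) % 2 == 0

# Among the first 100 strings (all strings of length <= 5, plus 37 of
# length 6), 58 have even length.
print(empirical_density(in_even, 100))  # 0.58
```

Varying n shows the fraction drifting as the prefix boundary moves through strings of different lengths, which is why the definition takes a limit over longer and longer prefixes rather than a count at any fixed cutoff.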

STOC 2017 · Conference Paper

Local max-cut in smoothed polynomial time

  • Omer Angel
  • Sébastien Bubeck
  • Yuval Peres
  • Fan Wei

In 1988, Johnson, Papadimitriou and Yannakakis wrote that "Practically all the empirical evidence would lead us to conclude that finding locally optimal solutions is much easier than solving NP-hard problems". Since then the empirical evidence has continued to amass, but formal proofs of this phenomenon have remained elusive. A canonical (and indeed complete) example is the local max-cut problem, for which no polynomial-time method is known. In a breakthrough paper, Etscheid and Röglin proved that the smoothed complexity of local max-cut is quasi-polynomial, i.e., if arbitrary bounded weights are randomly perturbed, a local maximum can be found in ϕ·n^{O(log n)} steps, where ϕ is an upper bound on the density of the random edge weights. In this paper we prove smoothed polynomial complexity for local max-cut, thus confirming that finding local optima for max-cut is much easier than solving it.
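For intuition, the local-search dynamics that the smoothed analysis concerns can be sketched in a few lines. This is an illustrative toy, not the paper's model: the graph, the noise magnitude, and the pivot rule (scan vertices in order, flip any improving one) are assumptions made for the example. Each edge weight receives a small random perturbation, mirroring the smoothed-analysis setup, and vertices are flipped until no single flip increases the cut weight.

```python
# Illustrative sketch (assumptions, not the paper's construction): the
# FLIP local-search dynamics for max-cut under randomly perturbed weights.

import random

def local_max_cut(n, edges, noise=1e-3, seed=0):
    """Flip vertices greedily until no single flip improves the cut.

    edges: dict mapping (u, v) with u < v to a weight; each weight gets
    a small uniform perturbation, as in the smoothed model.
    Returns a locally optimal +/-1 side assignment for the n vertices.
    """
    rng = random.Random(seed)
    w = {e: wt + rng.uniform(0, noise) for e, wt in edges.items()}
    side = [rng.choice((-1, 1)) for _ in range(n)]

    def gain(v):
        # Change in cut weight from flipping vertex v: an edge at v is
        # cut iff its endpoints differ, and flipping v toggles that.
        g = 0.0
        for (a, b), wt in w.items():
            if v in (a, b):
                other = b if v == a else a
                g += wt if side[v] == side[other] else -wt
        return g

    improved = True
    while improved:
        improved = False
        for v in range(n):
            if gain(v) > 0:
                side[v] = -side[v]
                improved = True
    return side

# Tiny example: a triangle with unit weights; any local optimum must
# place the vertices on both sides, cutting two of the three edges.
cut = local_max_cut(3, {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0})
```

The smoothed-complexity question is exactly how many flips this loop can perform before terminating; the perturbation rules out the carefully tuned weight configurations that force exponentially long improvement sequences.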

AAAI 2017 · Conference Paper

Self-Paced Multi-Task Learning

  • Changsheng Li
  • Junchi Yan
  • Fan Wei
  • Weishan Dong
  • Qingshan Liu
  • Hongyuan Zha

Multi-task learning is a paradigm in which multiple tasks are learned jointly. Previous multi-task learning models usually treat all tasks, and all instances within each task, equally during learning. Inspired by the fact that humans often learn from easy concepts to hard ones in the cognitive process, in this paper we propose a novel multi-task learning framework that learns the tasks while simultaneously taking into account the complexities of both tasks and instances per task. We propose a novel formulation built on a new task-oriented regularizer that can jointly prioritize tasks and instances, so the model can be interpreted as a self-paced learner for multi-task learning. An efficient block coordinate descent algorithm is developed to solve the proposed objective function, and the convergence of the algorithm is guaranteed. Experimental results on toy and real-world datasets demonstrate the effectiveness of the proposed approach compared to state-of-the-art methods.
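The alternating scheme the abstract describes can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the hard binary instance weights (the classic self-paced rule), the least-squares task model, and the geometric schedule for the pace parameter lam are not the paper's actual formulation, which jointly prioritizes tasks and instances through its task-oriented regularizer.

```python
# Minimal self-paced multi-task sketch (illustrative assumptions, not the
# paper's objective): alternate between fitting per-task least-squares
# models and re-selecting instances easy-first, admitting harder
# instances as the pace parameter lam grows.

import numpy as np

def self_paced_multitask(tasks, lam=0.5, growth=2.0, rounds=4):
    """tasks: list of (X, y) least-squares problems with a shared width.

    Alternates between (a) fitting each task's weights on the currently
    selected instances and (b) re-selecting instances with the hard
    self-paced rule v_i = 1 if loss_i < lam else 0, raising lam each
    round; this mimics one block-coordinate-descent pass per round.
    """
    W = []
    for X, y in tasks:
        w, *_ = np.linalg.lstsq(X, y, rcond=None)  # warm start on all data
        W.append(w)
    for _ in range(rounds):
        for t, (X, y) in enumerate(tasks):
            losses = (X @ W[t] - y) ** 2           # per-instance losses
            v = losses < lam                       # easy-first selection
            if v.any():
                W[t], *_ = np.linalg.lstsq(X[v], y[v], rcond=None)
        lam *= growth                              # self-paced schedule
    return W

# Tiny illustrative run: two exactly linear tasks are both recovered.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W = self_paced_multitask([(X, X @ np.array([2.0, 3.0])),
                          (X, X @ np.array([1.0, -1.0]))])
```

In a noisy setting the selection step matters: early rounds fit only the low-loss "easy" instances of each task, and the growing threshold gradually folds in the harder ones, which is the easy-to-hard curriculum the abstract motivates.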