
Author name cluster

Tsachy Weissman

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers

NeurIPS 2025 · Conference Paper

ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model

  • Sagnik Bhattacharya
  • Abhiram Gorle
  • Ahsan Bilal
  • Connor Ding
  • Amit Kumar Singh Yadav
  • Tsachy Weissman

Generative modeling of non-negative, discrete data, such as symbolic music, remains challenging due to two persistent limitations in existing methods. First, many approaches rely on modeling continuous embeddings, which is suboptimal for inherently discrete data distributions. Second, most models optimize variational bounds rather than the exact data likelihood, resulting in inaccurate likelihood estimates and degraded sampling quality. While recent diffusion-based models have addressed these issues separately, we tackle them jointly. In this work, we introduce the Information-Theoretic Discrete Poisson Diffusion Model (ItDPDM), inspired by the photon arrival process, which combines exact likelihood estimation with fully discrete-state modeling. Central to our approach is an information-theoretic Poisson Reconstruction Loss (PRL) that has a provable exact relationship with the true data likelihood. ItDPDM achieves improved likelihood and sampling performance over prior discrete and continuous diffusion models on a variety of synthetic discrete datasets. Furthermore, on real-world datasets such as symbolic music and images, ItDPDM attains superior likelihood estimates and competitive generation quality, demonstrating a proof of concept for distribution-robust discrete generative modeling.
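
As a rough illustration of the kind of quantity a Poisson-based reconstruction loss builds on, the sketch below evaluates the exact Poisson negative log-likelihood of non-negative integer data under predicted rates. It is not the paper's PRL or its exact likelihood identity; the data and rate values are hypothetical.

```python
import numpy as np
from scipy.special import gammaln

def poisson_nll(x, rate):
    """Negative log-likelihood of non-negative integer data x under an
    elementwise Poisson model with the given rates.

    Illustrative stand-in for a Poisson-based reconstruction loss; the
    paper's PRL and its likelihood relationship are not reproduced here.
    """
    x = np.asarray(x, dtype=np.float64)
    rate = np.asarray(rate, dtype=np.float64)
    # -log P(x | rate) for a Poisson: rate - x*log(rate) + log(x!)
    return np.sum(rate - x * np.log(rate) + gammaln(x + 1.0))

# Toy usage: score a reconstruction of a small count vector (e.g. note counts).
x = np.array([0, 3, 1, 7, 2])                 # hypothetical discrete data
rate = np.array([0.5, 2.8, 1.2, 6.5, 2.1])    # hypothetical predicted rates
print(poisson_nll(x, rate))
```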

NeurIPS 2025 · Conference Paper

Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

  • Hao Kang
  • Qingru Zhang
  • Han Cai
  • Weiyuan Xu
  • Tushar Krishna
  • Yilun Du
  • Tsachy Weissman

Large language models (LLMs) have shown remarkable performance across diverse reasoning and generation tasks, and are increasingly deployed as agents in dynamic environments such as code generation and recommendation systems. However, many real-world applications, such as high-frequency trading and real-time competitive gaming, require decisions under strict latency constraints, where faster responses directly translate into higher rewards. Despite the importance of this latency–quality trade-off, it remains underexplored in the context of LLM-based agents. In this work, we present the first systematic study of this trade-off in real-time decision-making tasks. To support our investigation, we introduce two new benchmarks: HFTBench, a high-frequency trading simulation, and StreetFighter, a competitive gaming platform. Our analysis reveals that the optimal latency–quality balance varies by task, and that sacrificing quality for lower latency can significantly enhance downstream performance. To address this, we propose FPX, an adaptive framework that dynamically selects model size and quantization level based on real-time demands. Our method achieves the best performance on both benchmarks, improving win rate by up to 80% in Street Fighter and boosting daily yield by up to 26.52% in trading. These results underscore the need for latency-aware evaluation and deployment strategies for real-world LLM-based agents.
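
As a minimal sketch of latency-aware configuration selection under a deadline (not the FPX policy itself), the snippet below picks among hypothetical model/quantization options by trading estimated quality against measured latency. All configuration names, latencies, quality numbers, and the penalty weight are made up.

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    name: str          # hypothetical model/quantization identifier
    latency_s: float   # profiled response latency in seconds
    quality: float     # estimated task quality proxy in [0, 1]

def pick_config(configs, deadline_s, latency_penalty=1.0):
    """Pick the config with the best quality-minus-latency score that still
    meets the deadline; fall back to the fastest config if none does."""
    feasible = [c for c in configs if c.latency_s <= deadline_s]
    if not feasible:
        return min(configs, key=lambda c: c.latency_s)
    return max(feasible, key=lambda c: c.quality - latency_penalty * c.latency_s)

configs = [
    ModelConfig("large-fp16", latency_s=1.20, quality=0.90),
    ModelConfig("large-int4", latency_s=0.55, quality=0.84),
    ModelConfig("small-int4", latency_s=0.12, quality=0.70),
]
print(pick_config(configs, deadline_s=0.6, latency_penalty=0.05).name)  # -> "large-int4"
print(pick_config(configs, deadline_s=0.05).name)  # nothing meets the deadline -> "small-int4"
```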

NeurIPS 2023 · Conference Paper

Exact Optimality of Communication-Privacy-Utility Tradeoffs in Distributed Mean Estimation

  • Berivan Isik
  • Wei-Ning Chen
  • Ayfer Ozgur
  • Tsachy Weissman
  • Albert No

We study the mean estimation problem under communication and local differential privacy constraints. While previous work has proposed order-optimal algorithms for the same problem (i.e., asymptotically optimal as we spend more bits), exact optimality (in the non-asymptotic setting) has still not been achieved. In this work, we take a step towards characterizing the exact-optimal approach in the presence of shared randomness (a random variable shared between the server and the user) and identify several conditions for exact optimality. We prove that one of the conditions is to utilize a rotationally symmetric shared random codebook. Based on this, we propose a randomization mechanism where the codebook is a randomly rotated simplex, satisfying the properties of the exact-optimal codebook. The proposed mechanism is based on a $k$-closest encoding which we prove to be exact-optimal for the randomly rotated simplex codebook.
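
A minimal geometric sketch of the two ingredients named above, assuming a shared random seed stands in for the shared randomness: build a regular simplex codebook, rotate it by a random orthogonal matrix, and encode a vector by its $k$ closest codewords. The privacy randomization and unbiased decoding parts of the mechanism are not shown, and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def rotated_simplex_codebook(d, rng):
    """d+1 unit vectors forming a regular simplex, then randomly rotated.

    The centered standard basis of R^{d+1} gives a regular simplex in a
    d-dimensional subspace; QR of a Gaussian matrix supplies a random
    orthogonal matrix (a sign correction would make it exactly Haar-uniform).
    """
    verts = np.eye(d + 1) - 1.0 / (d + 1)
    verts /= np.linalg.norm(verts, axis=1, keepdims=True)
    q, _ = np.linalg.qr(rng.standard_normal((d + 1, d + 1)))
    return verts @ q.T                               # rotate every codeword

def k_closest(x, codebook, k):
    """Indices of the k codewords closest to x (largest inner product)."""
    scores = codebook @ x
    return np.argsort(scores)[::-1][:k]

codebook = rotated_simplex_codebook(d=8, rng=rng)    # 9 codewords in R^9
x = rng.standard_normal(9)
x /= np.linalg.norm(x)                               # toy unit-norm input
print(k_closest(x, codebook, k=3))
```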

ICLR 2023 · Conference Paper

Sparse Random Networks for Communication-Efficient Federated Learning

  • Berivan Isik
  • Francesco Pase
  • Deniz Gündüz
  • Tsachy Weissman
  • Michele Zorzi

One main challenge in federated learning is the large communication cost of exchanging weight updates from clients to the server at each round. While prior work has made great progress in compressing the weight updates through gradient compression methods, we propose a radically different approach that does not update the weights at all. Instead, our method freezes the weights at their initial random values and learns how to sparsify the random network for the best performance. To this end, the clients collaborate in training a stochastic binary mask to find the optimal sparse random network within the original one. At the end of the training, the final model is a sparse network with random weights, or a subnetwork inside the dense random network. We show improvements in accuracy, communication (less than $1$ bit per parameter (bpp)), convergence speed, and final model size (less than $1$ bpp) over relevant baselines on MNIST, EMNIST, CIFAR-10, and CIFAR-100 datasets, in the low bitrate regime.
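
A minimal sketch of the frozen-weights-plus-mask idea, assuming a single linear layer: the random weights come from a shared seed and never need to be transmitted, a sigmoid over learnable scores gives per-weight keep-probabilities, and only the sampled binary mask would be uploaded. How the scores are trained and aggregated across clients follows the paper and is not shown.

```python
import numpy as np

rng = np.random.default_rng(42)

# Frozen random weights: never updated, so every party can re-derive them
# from a shared seed instead of communicating them.
W = rng.standard_normal((784, 10))

# Learnable per-weight scores; sigmoid(scores) is each weight's keep-probability.
scores = np.zeros_like(W)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, scores, W, rng):
    """Apply the random network through a sampled stochastic binary mask."""
    mask = rng.random(W.shape) < sigmoid(scores)   # Bernoulli(sigmoid(scores))
    return x @ (W * mask)

x = rng.standard_normal((1, 784))                  # toy input batch
print(forward(x, scores, W, rng).shape)            # (1, 10)

# What a client would upload each round in this sketch: one bit per parameter.
mask = (rng.random(W.shape) < sigmoid(scores)).astype(np.uint8)
print(mask.size, "parameters ->", mask.size / 8, "bytes before entropy coding")
```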

NeurIPS 2022 · Conference Paper

Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions

  • Wei Zhang
  • Yanjun Han
  • Zhengyuan Zhou
  • Aaron Flores
  • Tsachy Weissman

With the advent and increasing consolidation of e-commerce, digital advertising has very recently replaced traditional advertising as the main marketing force in the economy. In the past four years, a particularly important development in the digital advertising industry has been the shift from second-price auctions to first-price auctions for online display ads. This shift immediately motivated the intellectually challenging question of how to bid in first-price auctions, because unlike in second-price auctions, bidding one's private value truthfully is no longer optimal. Following a series of recent works in this area, we consider a differentiated setup: we do not make any assumption about other bidders' maximum bid (i.e., it can be adversarial over time), and instead assume that we have access to a hint that serves as a prediction of other bidders' maximum bid, where the prediction is learned through some black-box machine learning model. We consider two types of hints: one where a single point prediction is available, and the other where a hint interval (representing a type of confidence region into which others' maximum bid falls) is available. We establish minimax optimal regret bounds for both cases and highlight the quantitatively different behavior between the two settings. We also provide improved regret bounds when the others' maximum bid exhibits the further structure of sparsity. Finally, we complement the theoretical results with demonstrations using real bidding data.
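
To make the setting concrete, here is a toy first-price round with a point-prediction hint: the bidder pays its own bid when it wins, so overbidding relative to the predicted highest competing bid wastes surplus. This fixed rule and its numbers are illustrative only; the paper studies online algorithms with minimax regret guarantees, not this heuristic.

```python
def bid_with_hint(value, hint, margin=0.01):
    """Toy rule: if the predicted highest competing bid is below our value,
    bid just above the hint; otherwise sit out."""
    if hint + margin <= value:
        return hint + margin
    return 0.0

def utility(value, bid, others_max):
    """First-price utility: pay your own bid if you win, nothing otherwise."""
    return (value - bid) if bid >= others_max else 0.0

# Hypothetical round: private value 1.0, hint 0.6, realized competing max 0.58.
print(utility(value=1.0, bid=bid_with_hint(1.0, hint=0.6), others_max=0.58))
```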

JMLR 2019 · Journal Article

Approximate Profile Maximum Likelihood

  • Dmitri S. Pavlichin
  • Jiantao Jiao
  • Tsachy Weissman

We propose an efficient algorithm for approximate computation of the profile maximum likelihood (PML), a variant of maximum likelihood that maximizes the probability of observing a sufficient statistic rather than the empirical sample. The PML has appealing theoretical properties, but is difficult to compute exactly. Inspired by observations gleaned from exactly solvable cases, we look for an approximate PML solution, which, intuitively, clumps comparably frequent symbols into one symbol. This amounts to lower-bounding a certain matrix permanent by summing over a subgroup of the symmetric group rather than the whole group during the computation. We extensively experiment with the approximate solution, and the empirical performance of our approach is competitive with, and sometimes significantly better than, state-of-the-art performance for various estimation problems.
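
The sketch below computes the profile (the multiset of symbol multiplicities, i.e., the sufficient statistic the PML works with) and a greedy grouping of symbols with comparable empirical counts. The ratio threshold and the greedy grouping are illustrative choices meant only to convey the "clump comparably frequent symbols" intuition, not the paper's approximation algorithm.

```python
from collections import Counter

def profile(sample):
    """Multiset of multiplicities: how many symbols appeared k times, per k."""
    return dict(sorted(Counter(Counter(sample).values()).items()))

def clump_by_frequency(sample, ratio=2.0):
    """Greedy illustration of clumping: group symbols whose empirical counts
    are within a factor of `ratio` of the group's smallest count."""
    counts = sorted(Counter(sample).items(), key=lambda kv: kv[1])
    groups, current = [], [counts[0]]
    for sym, c in counts[1:]:
        if c <= ratio * current[0][1]:
            current.append((sym, c))
        else:
            groups.append(current)
            current = [(sym, c)]
    groups.append(current)
    return groups

sample = list("abracadabra")
print(profile(sample))            # {1: 2, 2: 2, 5: 1}
print(clump_by_frequency(sample))
```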

ICML 2019 · Conference Paper

Neural Joint Source-Channel Coding

  • Kristy Choi
  • Kedar Tatwawadi
  • Aditya Grover
  • Tsachy Weissman
  • Stefano Ermon

For reliable transmission across a noisy communication channel, classical results from information theory show that it is asymptotically optimal to separate out the source and channel coding processes. However, this decomposition can fall short in the finite bit-length regime, as it requires non-trivial tuning of hand-crafted codes and assumes infinite computational power for decoding. In this work, we propose to jointly learn the encoding and decoding processes using a new discrete variational autoencoder model. By adding noise into the latent codes to simulate the channel during training, we learn to both compress and error-correct given a fixed bit-length and computational budget. We obtain codes that are not only competitive against several separation schemes, but also learn useful robust representations of the data for downstream tasks such as classification. Finally, inference amortization yields an extremely fast neural decoder, almost an order of magnitude faster than standard decoding methods based on iterative belief propagation.
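
The channel-noise injection step can be pictured as follows: a discrete latent code passes through a simulated noisy channel during training, here a binary symmetric channel that flips each bit independently. This is only the noise-injection piece on a hypothetical 16-bit latent code, not the discrete VAE or its training objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_symmetric_channel(bits, flip_prob, rng):
    """Simulate the channel applied to a discrete latent code during training:
    each bit is flipped independently with probability flip_prob."""
    flips = rng.random(bits.shape) < flip_prob
    return np.bitwise_xor(bits, flips.astype(bits.dtype))

# Toy latent code of 16 bits, as an encoder might produce for one input.
z = rng.integers(0, 2, size=16, dtype=np.uint8)
z_noisy = binary_symmetric_channel(z, flip_prob=0.1, rng=rng)
print(z)
print(z_noisy)
```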

NeurIPS 2018 · Conference Paper

Entropy Rate Estimation for Markov Chains with Large State Space

  • Yanjun Han
  • Jiantao Jiao
  • Chuan-Zheng Lee
  • Tsachy Weissman
  • Yihong Wu
  • Tiancheng Yu

Entropy estimation is one of the prototypical problems in distribution property testing. To consistently estimate the Shannon entropy of a distribution on $S$ elements with independent samples, the optimal sample complexity scales sublinearly with $S$ as $\Theta(\frac{S}{\log S})$, as shown by Valiant and Valiant (2011). Extending the theory and algorithms for entropy estimation to dependent data, this paper considers the problem of estimating the entropy rate of a stationary reversible Markov chain with $S$ states from a sample path of $n$ observations. We show that:

  • Provided the Markov chain mixes not too slowly, i.e., the relaxation time is at most $O(\frac{S}{\ln^3 S})$, consistent estimation is achievable when $n \gg \frac{S^2}{\log S}$.
  • Provided the Markov chain has some slight dependency, i.e., the relaxation time is at least $1+\Omega(\frac{\ln^2 S}{\sqrt{S}})$, consistent estimation is impossible when $n \lesssim \frac{S^2}{\log S}$.

Under both assumptions, the optimal estimation accuracy is shown to be $\Theta(\frac{S^2}{n \log S})$. In comparison, the empirical entropy rate requires at least $\Omega(S^2)$ samples to be consistent, even when the Markov chain is memoryless. In addition to synthetic experiments, we also apply the estimators that achieve the optimal sample complexity to estimate the entropy rate of the English language in the Penn Treebank and the Google One Billion Words corpora, which provides a natural benchmark for language modeling and relates it directly to the widely used perplexity measure.
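
For reference, the naive empirical (plug-in) entropy rate mentioned above can be computed from a single sample path as the empirical transition entropies weighted by the empirical state frequencies. This is the baseline estimator the paper shows needs $\Omega(S^2)$ samples; the sample-optimal estimators it constructs are not shown here, and the test path is synthetic.

```python
import numpy as np

def empirical_entropy_rate(path, S):
    """Plug-in entropy rate estimate (bits per symbol) for a Markov chain on
    states {0, ..., S-1}, from one sample path: empirical transition-row
    entropies weighted by empirical state frequencies."""
    counts = np.zeros((S, S))
    for a, b in zip(path[:-1], path[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1)
    pi_hat = row_sums / row_sums.sum()                 # empirical state frequencies
    rate = 0.0
    for i in range(S):
        if row_sums[i] == 0:
            continue
        p = counts[i] / row_sums[i]
        p = p[p > 0]
        rate += pi_hat[i] * -(p * np.log2(p)).sum()
    return rate

rng = np.random.default_rng(1)
path = rng.integers(0, 4, size=10_000)                 # i.i.d. uniform path: rate ~ 2 bits
print(empirical_entropy_rate(path, S=4))
```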