Arrow Research search

Author name cluster

Dimitris Paparas

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
1 author row

Possible papers

6

ICLR Conference 2025 Conference Paper

Scaling Laws for Downstream Task Performance in Machine Translation

  • Berivan Isik
  • Natalia Ponomareva 0001
  • Hussein Hazimeh 0001
  • Dimitris Paparas
  • Sergei Vassilvitskii
  • Sanmi Koyejo

Scaling laws provide important insights that can guide the design of large language models (LLMs). Existing work has primarily focused on studying scaling laws for pretraining (upstream) loss. However, in transfer learning settings, in which LLMs are pretrained on an unsupervised dataset and then finetuned on a downstream task, we often also care about the downstream performance. In this work, we study the scaling behavior in a transfer learning setting, where LLMs are finetuned for machine translation tasks. Specifically, we investigate how the choice of the pretraining data and its size affect downstream performance (translation quality) as judged by: downstream cross-entropy and translation quality metrics such as BLEU and COMET scores. Our experiments indicate that the size of the finetuning dataset and the distribution alignment between the pretraining and downstream data significantly influence the scaling behavior. With sufficient alignment, both downstream cross-entropy and translation quality scores improve monotonically with more pretraining data. In such cases, we show that it is possible to predict the downstream translation quality metrics with good accuracy using a log-law. However, there are cases where moderate misalignment causes the downstream translation scores to fluctuate or get worse with more pretraining, whereas downstream cross-entropy monotonically improves. By analyzing these, we provide new practical insights for choosing appropriate pretraining data.

FOCS Conference 2015 Conference Paper

On the Complexity of Optimal Lottery Pricing and Randomized Mechanisms

  • Xi Chen 0001
  • Ilias Diakonikolas
  • Anthi Orfanou
  • Dimitris Paparas
  • Xiaorui Sun
  • Mihalis Yannakakis

We study the optimal lottery problem and the optimal mechanism design problem in the setting of a single unit-demand buyer with item values drawn from independent distributions. Optimal solutions to both problems are characterized by a linear program with exponentially many variables. For the menu size complexity of the optimal lottery problem, we present an explicit, simple instance with distributions of support size 2, and show that exponentially many lotteries are required to achieve the optimal revenue. We also show that, when distributions have support size 2 and share the same high value, the simpler scheme of item pricing can achieve the same revenue as the optimal menu of lotteries. The same holds for the case of two items with support size 2 (but not necessarily the same high value). For the computational complexity of the optimal mechanism design problem, we show that unless the polynomial-time hierarchy collapses (more exactly, PNP = P#P), there is no universal efficient randomized algorithm to implement an optimal mechanism even when distributions have support size 3.

MFCS Conference 2013 Conference Paper

Clustering on k-Edge-Colored Graphs

  • Eric Angel
  • Evripidis Bampis
  • Alexander V. Kononov
  • Dimitris Paparas
  • Emmanouil Pountourakis
  • Vassilis Zissimopoulos

Abstract We study the Max k -colored clustering problem, where, given an edge-colored graph with k colors, we seek to color the vertices of the graph so as to find a clustering of the vertices maximizing the number (or the weight) of matched edges, i. e. the edges having the same color as their extremities. We show that the cardinality problem is NP-hard even for edge-colored bipartite graphs with a chromatic degree equal to two and k ≥ 3. Our main result is a constant approximation algorithm for the weighted version of the Max k -colored clustering problem which is based on a rounding of a natural linear programming relaxation. For graphs with chromatic degree equal to two, we improve this ratio by exploiting the relation of our problem with the Max 2-and problem. We also present a reduction to the maximum-weight independent set (IS) problem in bipartite graphs which leads to a polynomial time algorithm for the case of two colors.

STOC Conference 2013 Conference Paper

The complexity of non-monotone markets

  • Xi Chen 0001
  • Dimitris Paparas
  • Mihalis Yannakakis

We introduce the notion of non-monotone utilities, which covers a wide variety of utility functions in economic theory. We show that it is PPAD-hard to compute an approximate Arrow-Debreu market equilibrium in markets with linear and non-monotone utilities. Building on this result, we settle the long-standing open problem regarding the computation of an approximate Arrow-Debreu market equilibrium in markets with CES utilities, by proving that it is PPAD-complete when the Constant Elasticity of Substitution parameter, ρ, is any constant less than -1.