Arrow Research search

Author name cluster

Orgad Keller

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
1 author row

Possible papers

9

NeurIPS Conference 2024 Conference Paper

Multi-turn Reinforcement Learning with Preference Human Feedback

  • Lior Shani
  • Aviv Rosenberg
  • Asaf Cassel
  • Oran Lang
  • Daniele Calandriello
  • Avital Zipori
  • Hila Noga
  • Orgad Keller

Reinforcement Learning from Human Feedback (RLHF) has become the standard approach for aligning Large Language Models (LLMs) with human preferences, allowing LLMs to demonstrate remarkable abilities in various tasks. Existing methods work by emulating the human preference at the single decision (turn) level, limiting their capabilities in settings that require planning or multi-turn interactions to achieve a long-term goal. In this paper, we address this issue by developing novel methods for Reinforcement Learning (RL) from preference feedback between two full multi-turn conversations. In the tabular setting, we present a novel mirror-descent-based policy optimization algorithm for the general multi-turn preference-based RL problem, and prove its convergence to Nash equilibrium. To evaluate performance, we create a new environment, Education Dialogue, where a teacher agent guides a student in learning a random topic, and show that a deep RL variant of our algorithm outperforms RLHF baselines. Finally, we show that in an environment with explicit rewards, our algorithm recovers the same performance as a reward-based RL baseline, despite relying solely on a weaker preference signal.

AAAI Conference 2021 Conference Paper

Targeted Negative Campaigning: Complexity and Approximations

  • ‪Avishai Zagoury‬‏
  • Orgad Keller
  • Avinatan Hassidim
  • Noam Hazon

Given the ubiquity of negative campaigning in recent political elections, we find it important to study its properties from a computational perspective. To this end, we present a model where elections can be manipulated by convincing voters to demote specific non-favored candidates, and study its properties in the classic setting of scoring rules. When the goal is constructive (making a preferred candidate win), we prove that finding such a demotion strategy is easy for Plurality and Veto, while generally hard for t-approval and Borda. We also provide a t-factor approximation for tapproval for every fixed t, and a 3-factor approximation algorithm for Borda. Interestingly enough—following recent trends in political science that show that the effectiveness of negative campaigning depends on the type of candidate and demographic—when assigning varying prices to different possible demotion operations, we are able to provide inapproximability results. When the goal is destructive (making the leading opponent lose), we show that the problem is easy for a broad class of scoring rules.

JAIR Journal 2019 Journal Article

Approximating Weighted and Priced Bribery in Scoring Rules

  • Orgad Keller
  • Avinatan Hassidim
  • Noam Hazon

The classic Bribery problem is to find a minimal subset of voters who need to change their vote to make some preferred candidate win. Its important generalizations consider voters who are weighted and also have different prices. We provide an approximate solution for these problems for a broad family of scoring rules (which includes Borda, t-approval, and Dowdall).Our algorithm is based on a randomized reduction from these Bribery generalizations to weighted coalitional manipulation (WCM). To solve this WCM instance, we apply the Birkhoff-von Neumann (BvN) decomposition to a fractional manipulation matrix. This allows us to limit the size of the possible ballot search space reducing it from exponential to polynomial, while still obtaining good approximation guarantees. Finding a solution in the truncated search space yields a new algorithm for WCM, which is of independent interest.

JAIR Journal 2019 Journal Article

New Approximations for Coalitional Manipulation in Scoring Rules

  • Orgad Keller
  • Avinatan Hassidim
  • Noam Hazon

We study the problem of coalitional manipulation---where k manipulators try to manipulate an election on m candidates---for any scoring rule, with focus on the Borda protocol. We do so in both the weighted and unweighted settings. For these problems, recent approximation approaches have tried to minimize k, the number of manipulators needed to make some preferred candidate p win (thus assuming that the number of manipulators is not limited in advance). In contrast, we focus on minimizing the score margin of p which is the difference between the maximum score of a candidate and the score of p. We provide algorithms that approximate the optimum score margin, which are applicable to any scoring rule. For the specific case of the Borda protocol in the unweighted setting, our algorithm provides a superior approximation factor for lower values of k.Our methods are novel and adapt techniques from multiprocessor scheduling by carefully rounding an exponentially-large configuration linear program that is solved by using the ellipsoid method with an efficient separation oracle. We believe that such methods could be beneficial in other social choice settings as well.

AAAI Conference 2018 Conference Paper

Approximating Bribery in Scoring Rules

  • Orgad Keller
  • Avinatan Hassidim
  • Noam Hazon

The classic bribery problem is to find a minimal subset of voters who need to change their vote to make some preferred candidate win. We find an approximate solution for this problem for a broad family of scoring rules (which includes Borda and t-approval), in the following sense: if there is a strategy which requires bribing k voters, we efficiently find a strategy which requires bribing at most k + O( √ k) voters. Our algorithm is based on a randomized reduction from bribery to coalitional manipulation (UCM). To solve the UCM problem, we apply the Birkhoff-von Neumann (BvN) decomposition to a fractional manipulation matrix. This allows us to limit the size of the possible ballot search space reducing it from exponential to polynomial, while still obtaining good approximation guarantees. Finding the optimal solution in the truncated search space yields a new algorithm for UCM, which is of independent interest.

AAMAS Conference 2017 Conference Paper

New Approximation for Borda Coalitional Manipulation

  • Orgad Keller
  • Avinatan Hassidim
  • Noam Hazon

We study the problem of Borda Unweighted Coalitional Manipulation, where k manipulators try to manipulate an election on m candidates under the Borda protocol. This problem is known to be NP-hard. While most recent approaches to approximation tried to minimize k, the number of manipulators needed to make the preferred candidate win (thus assuming that the number of manipulators is not limited in advance), we focus instead on minimizing the maximum score obtainable by a non-preferred candidate. We provide a randomized, additive O(k √ m log m) approximation to this value; in other words, if there exists a strategy enabling the preferred candidate to win by an Ω(k √ m log m) margin, our method, with high probability, will find a strategy enabling her to win (albeit with a possibly smaller margin). It thus provides a somewhat stronger guarantee compared to the previous methods, where the addition of an extra manipulator implied (with respect to the original k) a strategy that provides an Ω(m)-additive approximation to a runnerup’s score: when k is o( p m/ log m), our strategy provides a stronger approximation. Our algorithm can also be viewed as a (1 + o(1))-multiplicative approximation since the value we approximate has a natural Ω(km) lower bound. Our methods are novel and adapt techniques from multiprocessor scheduling by carefully rounding an exponentiallylarge configuration linear program that is solved by using the ellipsoid method with an efficient separation oracle. We believe that such methods could be beneficial in approximating coalitional manipulation in other election protocols as well. CCS Concepts •Computing methodologies → Multi-agent systems;

TCS Journal 2014 Journal Article

Generalized substring compression

  • Orgad Keller
  • Tsvi Kopelowitz
  • Shir Landau Feibish
  • Moshe Lewenstein

In substring compression one is given a text to preprocess so that, upon request, a compressed substring is returned. Generalized substring compression is the same with the following twist. The queries contain an additional context substring (or a collection of context substrings) and the answers are the substring in compressed format, where the context substring is used to make the compression more efficient. We focus our attention on generalized substring compression and present the first non-trivial correct algorithm for this problem. Inherent to our algorithm is a new method for finding the bounded longest common prefix of substrings, which may be of independent interest. In addition, we propose an efficient algorithm for substring compression which makes use of range successor queries. We present several tradeoffs for both problems. For compressing the substring S [ i. . j ] (possibly with the substring S [ α. . β ] as a context), the best query times we achieve are O ( C ) and O ( C log ( j − i C ) ) for substring compression query and generalized substring compression query, respectively, where C is the number of phrases encoded. A preliminary version of this paper has been presented in [21].