Arrow Research search

Author name cluster

Konrad Czechowski

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers


ICLR Conference 2023 Conference Paper

Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search

  • Michal Zawalski
  • Michal Tyrolski
  • Konrad Czechowski
  • Tomasz Odrzygóźdź
  • Damian Stachura
  • Piotr Piekos
  • Yuhuai Wu
  • Lukasz Kucinski

Complex reasoning problems contain states that vary in the computational cost required to determine the right action plan. To take advantage of this property, we propose Adaptive Subgoal Search (AdaSubS), a search method that adaptively adjusts the planning horizon. To this end, AdaSubS generates diverse sets of subgoals at different distances. A verification mechanism is employed to filter out unreachable subgoals swiftly, making it possible to focus on feasible further subgoals. In this way, AdaSubS benefits from the efficiency of planning with longer-term subgoals and the fine control with shorter-term ones, and thus scales well to difficult planning problems. We show that AdaSubS significantly surpasses hierarchical planning algorithms on three complex reasoning tasks: Sokoban, the Rubik’s Cube, and the inequality-proving benchmark INT.
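The idea of preferring long-horizon subgoals and falling back to shorter ones can be illustrated with a toy sketch. This is not the authors' implementation: the grid world, the straight-line `propose` generator, the exact `verify` check, and the horizons `(4, 2, 1)` are all assumptions made for illustration, standing in for the learned generators and the learned verifier that the abstract describes.

```python
import heapq

DIRS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def propose(state, k):
    """Hypothetical stand-in for a learned generator: subgoals k cells away."""
    x, y = state
    return [(x + k * dx, y + k * dy) for dx, dy in DIRS]

def verify(state, subgoal, walls, size):
    """Hypothetical stand-in for the verifier: the straight segment must be free."""
    (x, y), (gx, gy) = state, subgoal
    steps = max(abs(gx - x), abs(gy - y))
    dx, dy = (gx - x) // steps, (gy - y) // steps
    for i in range(1, steps + 1):
        cx, cy = x + i * dx, y + i * dy
        if not (0 <= cx < size and 0 <= cy < size) or (cx, cy) in walls:
            return False
    return True

def adaptive_subgoal_search(start, goal, walls, size, horizons=(4, 2, 1)):
    """Expand each state with the longest horizon first; shorter horizons are
    tried only after all pending long-horizon expansions are exhausted."""
    def dist(s):
        return abs(s[0] - goal[0]) + abs(s[1] - goal[1])

    # Queue entries: (horizon index, distance to goal, state, path so far).
    frontier = [(0, dist(start), start, [start])]
    seen = {start}
    while frontier:
        h, _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for sub in propose(state, horizons[h]):
            if sub not in seen and verify(state, sub, walls, size):
                seen.add(sub)
                heapq.heappush(frontier, (0, dist(sub), sub, path + [sub]))
        if h + 1 < len(horizons):
            # Re-queue the same state so it is later expanded at a shorter horizon.
            heapq.heappush(frontier, (h + 1, dist(state), state, path))
    return None

# Toy 7x7 grid with a wall in column 3 that leaves a single gap at (3, 6).
walls = {(3, y) for y in range(6)}
path = adaptive_subgoal_search((0, 0), (6, 6), walls, size=7)
print(path)
```

Because every state is eventually re-expanded at horizon 1, this toy version still finds a path whenever one exists, while long jumps are used wherever the verifier accepts them.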

NeurIPS Conference 2022 Conference Paper

Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

  • Albert Qiaochu Jiang
  • Wenda Li
  • Szymon Tworkowski
  • Konrad Czechowski
  • Tomasz Odrzygóźdź
  • Piotr Miłoś
  • Yuhuai Wu
  • Mateja Jamnik

In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form. This paper introduces Thor, a framework integrating language models and automated theorem provers to overcome this difficulty. In Thor, a class of methods called hammers that leverage the power of automated theorem provers are used for premise selection, while all other tasks are designated to language models. Thor increases a language model's success rate on the PISA dataset from $39\%$ to $57\%$, while solving $8.2\%$ of problems neither language models nor automated theorem provers are able to solve on their own. Furthermore, with a significantly smaller computational budget, Thor can achieve a success rate on the MiniF2F dataset that is on par with the best existing methods. Thor can be instantiated for the majority of popular interactive theorem provers via a straightforward protocol we provide.
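The division of labor described above, with a language model proposing proof steps and a hammer handling premise selection, can be sketched as a simple dispatch loop. Everything here is hypothetical: the `<hammer>` marker, the `prove` function, the toy model, the toy premise library, and the proof-step strings are placeholders for illustration and do not reflect Thor's actual protocol or any real prover API.

```python
HAMMER = "<hammer>"  # hypothetical marker meaning "delegate this step to the hammer"

def prove(goal, propose_step, hammer, max_steps=10):
    """Build a proof script; premise selection is delegated to `hammer`."""
    script = []
    for _ in range(max_steps):
        step = propose_step(goal, script)
        if step is None:
            return script  # the model signals that the proof is complete
        if step == HAMMER:
            step = hammer(goal, script)  # the tool picks premises, not the model
            if step is None:
                return None  # the hammer found no usable premises
        script.append(step)
    return None  # step budget exhausted

def toy_model(goal, script):
    """Stand-in for a language model: follows a fixed three-step plan."""
    plan = ["intro x", HAMMER, None]
    return plan[len(script)]

def toy_hammer(goal, script):
    """Stand-in for a hammer: looks the goal up in a tiny premise library."""
    library = {"add_comm": "x + y = y + x", "mul_comm": "x * y = y * x"}
    used = [name for name, stmt in library.items() if stmt == goal]
    return f"by (metis {' '.join(used)})" if used else None

print(prove("x + y = y + x", toy_model, toy_hammer))
```

The point of the sketch is only that the model never has to read the premise library: it emits the marker and the tool fills in the premises.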

NeurIPS Conference 2021 Conference Paper

Subgoal Search For Complex Reasoning Tasks

  • Konrad Czechowski
  • Tomasz Odrzygóźdź
  • Marek Zbysiński
  • Michał Zawalski
  • Krzysztof Olejnik
  • Yuhuai Wu
  • Łukasz Kuciński
  • Piotr Miłoś

Humans excel in solving complex reasoning tasks through a mental process of moving from one idea to a related one. Inspired by this, we propose the Subgoal Search (kSubS) method. Its key component is a learned subgoal generator that produces a diversity of subgoals that are both achievable and closer to the solution. Using subgoals reduces the search space and induces a high-level search graph suitable for efficient planning. In this paper, we implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework. We show that a simple approach of generating $k$-th step ahead subgoals is surprisingly efficient on three challenging domains: two popular puzzle games, Sokoban and the Rubik's Cube, and the inequality-proving benchmark INT. kSubS achieves strong results, including state-of-the-art on INT, within a modest computational budget.
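As a rough illustration of best-first search over $k$-step-ahead subgoals, the sketch below works on a toy integer domain. It is not the paper's implementation: the three low-level moves, the brute-force `generate_subgoals` function (standing in for the learned transformer generator), and the distance-to-goal priority are all assumptions chosen to keep the example self-contained.

```python
import heapq
from itertools import product

# Toy low-level moves on integers (an assumption for this sketch).
ACTIONS = (lambda s: s + 1, lambda s: s - 1, lambda s: s * 2)

def generate_subgoals(state, k, goal, width=3):
    """Hypothetical stand-in for a learned generator: enumerate states exactly
    k moves ahead and keep the `width` candidates closest to the goal."""
    candidates = set()
    for seq in product(ACTIONS, repeat=k):
        s = state
        for act in seq:
            s = act(s)
        candidates.add(s)
    return sorted(candidates, key=lambda s: abs(goal - s))[:width]

def subgoal_best_first_search(start, goal, k=3, max_expansions=1000):
    """Best-first search on the high-level graph whose edges are subgoals."""
    frontier = [(abs(goal - start), start, [start])]
    visited = {start}
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for sub in generate_subgoals(state, k, goal):
            if sub not in visited:
                visited.add(sub)
                heapq.heappush(frontier, (abs(goal - sub), sub, path + [sub]))
    return None

path = subgoal_best_first_search(start=1, goal=37)
print(path)
```

Each hop in the returned path spans three low-level moves, so the search explores far fewer nodes than a search over single moves would, which is the effect the abstract attributes to subgoals.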

ICLR Conference 2020 Conference Paper

Model Based Reinforcement Learning for Atari

  • Lukasz Kaiser
  • Mohammad Babaeizadeh
  • Piotr Milos
  • Blazej Osinski
  • Roy H. Campbell
  • Konrad Czechowski
  • Dumitru Erhan
  • Chelsea Finn

Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in the low-data regime of 100k interactions between the agent and the environment, which corresponds to two hours of real-time play. In most games SimPLe outperforms state-of-the-art model-free algorithms, in some games by over an order of magnitude.
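The "two hours" figure can be checked with a short calculation. The abstract does not state the conversion, so the frame skip of 4 and the 60 frames per second used below are assumptions based on common Atari emulator settings.

```python
def interactions_to_hours(interactions, frame_skip=4, fps=60):
    """Convert agent interactions to hours of real-time play.

    frame_skip and fps are assumed values, not taken from the abstract.
    """
    frames = interactions * frame_skip
    return frames / fps / 3600

hours = interactions_to_hours(100_000)
print(f"{hours:.2f} hours")  # prints "1.85 hours"
```

Under these assumptions, 100k interactions correspond to 400k frames, or roughly 1.85 hours, which is consistent with the rounded figure of two hours.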