Arrow Research

Author name cluster

Andoni Rodríguez

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

5

KR Conference 2025 Conference Paper

Explanations for Unrealizability of Infinite-State Safety Shields

  • Andoni Rodríguez
  • Irfansha Shaik
  • Davide Corsi
  • Roy Fox
  • César Sánchez

Safe Reinforcement Learning focuses on developing optimal policies while ensuring safety. A popular method to address this task is shielding, in which a correct-by-construction safety component is synthesized from logical specifications. Recently, shield synthesis has been extended to infinite-state domains, such as continuous environments, which makes shielding more applicable to realistic scenarios. However, shields are often unrealizable because the specification is inconsistent. To address this gap, we present a method to obtain simple unconditional and conditional explanations that witness unrealizability, based on temporal formula unrolling, i.e., bounded strategy search. In this paper, we show different variants of the technique as well as its applicability.
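
For intuition only, here is a minimal, hypothetical sketch (not the paper's algorithm) of a one-step unrealizability witness for an infinite-state safety constraint, using the z3 SMT solver: if some environment input admits no safe system response, that input serves as a simple unconditional explanation of unrealizability. The constraint itself is invented for illustration.

```python
# Illustrative sketch: depth-1 "unrolling" check of a safety constraint over reals.
from z3 import Real, ForAll, And, Not, Solver, sat

x = Real("x")  # environment-controlled input (theory variable)
y = Real("y")  # system-controlled output (theory variable)

# Hypothetical safety constraint: the system must keep y strictly between x and 0.
safe = And(y > x, y < 0)

s = Solver()
# Ask for an input x against which no output y is safe (x left free, y quantified).
s.add(ForAll([y], Not(safe)))

if s.check() == sat:
    # Any such x (e.g., x >= 0) witnesses that the shield is unrealizable.
    print("unrealizable; witness input x =", s.model()[x])
else:
    print("no one-step witness found")
```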

AAAI Conference 2025 Conference Paper

Shield Synthesis for LTL Modulo Theories

  • Andoni Rodríguez
  • Guy Amir
  • Davide Corsi
  • César Sánchez
  • Guy Katz

In recent years, Machine Learning (ML) models have achieved remarkable success in various domains. However, these models also tend to demonstrate unsafe behaviors, precluding their deployment in safety-critical systems. To cope with this issue, ample research focuses on developing methods that guarantee the safe behavior of a given ML model. A prominent example is shielding, which incorporates an external component (a “shield”) that blocks unwanted behavior. Despite significant progress, shielding suffers from a main setback: it is currently geared towards properties encoded solely in propositional logics (e.g., LTL) and is unsuitable for richer logics. This, in turn, limits the widespread applicability of shielding in many real-world systems. In this work, we address this gap and extend shielding to LTL modulo theories, building upon recent advances in reactive synthesis modulo theories. This allows us to develop a novel approach for generating shields conforming to complex safety specifications in these more expressive logics. We evaluate our shields and demonstrate their ability to handle rich data with temporal dynamics. To the best of our knowledge, this is the first approach for synthesizing shields for such expressivity.
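
As a purely illustrative sketch (the predicate, bound, and function names are hypothetical and not taken from the paper), a shield for an LTL-modulo-theories safety atom can be viewed as a runtime filter over real-valued observations and actions rather than over Boolean propositions:

```python
# Hypothetical example: a runtime shield over theory-valued data that overrides
# the agent's action whenever it would violate a numeric safety predicate.

def safe(distance: float, speed: float) -> bool:
    # Example theory atom: always keep speed below a distance-dependent bound
    # (a predicate over reals, not a propositional input).
    return speed <= 0.5 * distance

def shielded_action(distance: float, proposed_speed: float) -> float:
    # Pass the agent's action through unchanged when it is safe; otherwise
    # substitute the closest safe alternative (here, the bound itself).
    if safe(distance, proposed_speed):
        return proposed_speed
    return 0.5 * distance

# Usage: the learned policy proposes speed 3.0 at distance 4.0; the shield
# clamps it to 2.0 so the safety predicate keeps holding at every step.
print(shielded_action(4.0, 3.0))
```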

AAAI Conference 2024 Conference Paper

Adaptive Reactive Synthesis for LTL and LTLf Modulo Theories

  • Andoni Rodríguez
  • César Sánchez

Reactive synthesis is the process of generating correct controllers from temporal logic specifications. Typically, synthesis is restricted to Boolean specifications in LTL. Recently, a Boolean abstraction technique has made it possible to translate LTL_T specifications that contain literals in theories into equi-realizable LTL specifications, but no full synthesis procedure exists yet. In synthesis modulo theories, the system receives valuations of environment variables (from a first-order theory T) and outputs valuations of system variables from T. In this paper, we address how to synthesize a full controller by combining the static Boolean controller obtained from the Booleanized LTL specification with on-the-fly queries to a solver that produces models of satisfiable existential T formulae. This is the first synthesis method for LTL modulo theories. Additionally, our method can produce adaptive responses, which increases explainability and can improve runtime properties like performance. Our approach is applicable to both LTL modulo theories and LTLf modulo theories.
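
To illustrate the on-the-fly step with a hypothetical one-variable specification (not the paper's actual procedure): once the static Booleanized controller fixes which theory literal must hold in the current step, an SMT query, e.g. via z3, produces concrete theory values for the system outputs:

```python
# Sketch under invented names: the Boolean controller chooses an abstract output
# literal, and a solver query instantiates concrete system values satisfying it.
from z3 import Real, RealVal, Solver, sat

def concretize(env_value: float, output_literal: str) -> float:
    x = RealVal(env_value)   # environment valuation received this step
    y = Real("y")            # system valuation to be produced this step
    s = Solver()
    # Suppose the Booleanized controller chose between the abstract literals
    # "y > x" and "y <= x"; the solver models the chosen existential formula.
    s.add(y > x if output_literal == "y > x" else y <= x)
    assert s.check() == sat  # the chosen literal is satisfiable by construction
    return float(s.model()[y].as_decimal(6).rstrip("?"))

# Usage: the static Boolean controller selects "y > x" for input 2.5; the SMT
# query returns some concrete y greater than 2.5, completing the reactive step.
print(concretize(2.5, "y > x"))
```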

RLJ Journal 2024 Journal Article

Verification-Guided Shielding for Deep Reinforcement Learning

  • Davide Corsi
  • Guy Amir
  • Andoni Rodríguez
  • Guy Katz
  • César Sánchez
  • Roy Fox

In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach to solving real-world tasks. However, despite their successes, DRL-based policies suffer from poor reliability, which limits their deployment in safety-critical domains. Various methods have been put forth to address this issue by providing formal safety guarantees. Two main approaches are shielding and verification. While shielding ensures the safe behavior of the policy by employing an external online component (i.e., a “shield”) that overrides potentially dangerous actions, this approach has a significant computational cost, as the shield must be invoked at runtime to validate every decision. Verification, on the other hand, is an offline process that can identify unsafe policies prior to their deployment, yet without providing alternative actions when a policy is deemed unsafe. In this work, we present verification-guided shielding, a novel approach that bridges the DRL reliability gap by integrating these two methods. Our approach combines formal and probabilistic verification tools to partition the input domain into safe and unsafe regions. In addition, we employ clustering and symbolic representation procedures that compress the unsafe regions into a compact representation. This, in turn, allows us to temporarily activate the shield solely in (potentially) unsafe regions, in an efficient manner. Our approach significantly reduces runtime overhead while still preserving formal safety guarantees. We extensively evaluate our approach on two benchmarks from the robotic navigation domain and provide an in-depth analysis of its scalability and completeness.
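
A minimal sketch of the runtime idea, under assumed region and policy representations (not the authors' implementation): offline verification marks the (potentially) unsafe input regions, and at runtime the shield is consulted only when the current state falls inside one of them.

```python
# Hypothetical compact representation of the unsafe regions, e.g. the output of
# formal/probabilistic verification plus clustering: axis-aligned boxes here.
UNSAFE_REGIONS = [((0.0, 0.0), (1.0, 1.0))]  # list of (lower corner, upper corner)

def in_unsafe_region(state):
    # Check whether the state lies in any precomputed unsafe box.
    return any(all(lo <= s <= hi for s, lo, hi in zip(state, low, high))
               for low, high in UNSAFE_REGIONS)

def act(state, policy_action, shield):
    # Outside the unsafe regions the policy runs unshielded (no runtime overhead);
    # inside them the shield validates or overrides the proposed action.
    if in_unsafe_region(state):
        return shield(state, policy_action)
    return policy_action

# Usage with stub components: the shield is only consulted for states in the box.
print(act((0.5, 0.5), "forward", lambda s, a: "brake"))  # -> "brake"
print(act((2.0, 2.0), "forward", lambda s, a: "brake"))  # -> "forward"
```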

RLC Conference 2024 Conference Paper

Verification-Guided Shielding for Deep Reinforcement Learning

  • Davide Corsi
  • Guy Amir
  • Andoni Rodríguez
  • Guy Katz
  • César Sánchez
  • Roy Fox

In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach to solving real-world tasks. However, despite their successes, DRL-based policies suffer from poor reliability, which limits their deployment in safety-critical domains. Various methods have been put forth to address this issue by providing formal safety guarantees. Two main approaches are shielding and verification. While shielding ensures the safe behavior of the policy by employing an external online component (i.e., a “shield”) that overrides potentially dangerous actions, this approach has a significant computational cost, as the shield must be invoked at runtime to validate every decision. Verification, on the other hand, is an offline process that can identify unsafe policies prior to their deployment, yet without providing alternative actions when a policy is deemed unsafe. In this work, we present verification-guided shielding, a novel approach that bridges the DRL reliability gap by integrating these two methods. Our approach combines formal and probabilistic verification tools to partition the input domain into safe and unsafe regions. In addition, we employ clustering and symbolic representation procedures that compress the unsafe regions into a compact representation. This, in turn, allows us to temporarily activate the shield solely in (potentially) unsafe regions, in an efficient manner. Our approach significantly reduces runtime overhead while still preserving formal safety guarantees. We extensively evaluate our approach on two benchmarks from the robotic navigation domain and provide an in-depth analysis of its scalability and completeness.