Arrow Research search

Author name cluster

Federico Soldà

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
1 author row

Possible papers

ICLR Conference 2025 Conference Paper

Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory

  • Nikola Zubic
  • Federico Soldà
  • Aurelio L. Sulser
  • Davide Scaramuzza

Despite their successes, deep learning models struggle with tasks requiring complex reasoning and function composition. We present a theoretical and empirical investigation into the limitations of Structured State Space Models (SSMs) and Transformers in such tasks. We prove that one-layer SSMs cannot efficiently perform function composition over large domains without impractically large state sizes, and even with Chain-of-Thought prompting, they require a number of steps that scales unfavorably with the complexity of the function composition. Moreover, the language of a finite-precision SSM is within the class of regular languages. Our experiments corroborate these theoretical findings. Evaluating models on tasks including various function composition settings, multi-digit multiplication, dynamic programming, and Einstein's puzzle, we find significant performance degradation even with advanced prompting techniques. Models often resort to shortcuts, leading to compounding errors. These findings highlight fundamental barriers within current deep learning architectures rooted in their computational capacities. We underscore the need for innovative solutions to transcend these constraints and achieve reliable multi-step reasoning and compositional task-solving, which is critical for advancing toward general artificial intelligence.

SODA Conference 2022 Conference Paper

Scalar and Matrix Chernoff Bounds from ℓ∞-Independence

  • Tali Kaufman
  • Rasmus Kyng
  • Federico Soldà

We present new scalar and matrix Chernoff-style concentration bounds for a broad class of probability distributions over the binary hypercube {0, 1}^n. Motivated by recent tools developed for the study of mixing times of Markov chains on discrete distributions, we say that a distribution is ℓ∞-independent when the infinity norm of its influence matrix is bounded by a constant. We show that any distribution which is ℓ∞-independent satisfies a matrix Chernoff bound that matches the matrix Chernoff bound for independent random variables due to Tropp. Our matrix Chernoff bound is a broad generalization and strengthening of the matrix Chernoff bound of Kyng and Song (FOCS'18). Using our bound, we can conclude as a corollary that a union of O(log |V|) random spanning trees gives a spectral graph sparsifier of a graph with |V| vertices with high probability, matching results for independent edge sampling and matching lower bounds from Kyng and Song.