Arrow Research search

Author name cluster

Claudio Gallicchio

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
2 author rows

Possible papers (11)

ECAI 2025 · Conference Paper

Fed2RC: Federated Rocket Kernels and Ridge Classifier for Time Series Classification

  • Bruno Casella
  • Samuele Fonio
  • Lorenzo Sciandra
  • Claudio Gallicchio
  • Marco Aldinucci
  • Mirko Polato
  • Roberto Esposito

Time series classification is a pivotal task in modern machine learning, with widespread applications in fields such as healthcare, finance, and cybersecurity. While deep learning methods dominate recent developments, their resource demands and privacy limitations hinder deployment in low-power and decentralized environments. To address these challenges, we introduce Fed2RC, a fully federated and gradient-free approach that integrates the efficiency of Rocket-based feature extraction with the robustness of ridge regression in a privacy-preserving setting. Fed2RC builds upon two key ideas: (i) federated selection and aggregation of high-performing random convolution kernels, and (ii) incremental and communication-efficient updates of ridge classifier parameters using closed-form solutions. Additionally, we propose a novel federated protocol for selecting the global ridge regularization parameter λ, and show how to improve communication efficiency via matrix factorization techniques. Extensive experiments on the UCR benchmark demonstrate that Fed2RC achieves state-of-the-art results at a fraction of the computation and communication costs. Code to reproduce the experiments can be found at: https://github.com/CasellaJr/Fed2RC.
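
The closed-form ridge update the abstract refers to is compact enough to sketch. The numpy fragment below illustrates the sufficient-statistics idea behind communication-efficient federated ridge (a sketch only: the kernel selection step, the federated λ protocol, and the matrix-factorization compression are omitted, and none of the names below come from the authors' code): clients share only Gram and cross-covariance matrices, never raw series, and the server solves the ridge system exactly.

```python
import numpy as np

def local_stats(X, Y):
    # Per-client sufficient statistics for the ridge closed form.
    # Only these d x d and d x c matrices leave the client, never raw data.
    return X.T @ X, X.T @ Y

def aggregate_and_solve(stats, lam):
    # Server side: sum the client statistics and solve ridge exactly,
    # W = (sum_i X_i^T X_i + lam * I)^(-1) * (sum_i X_i^T Y_i).
    G = sum(g for g, _ in stats)   # aggregated Gram matrix
    C = sum(c for _, c in stats)   # aggregated cross-covariance
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), C)

# Toy run: two clients, 8 (e.g. Rocket-style) features, 3 classes.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 8)), np.eye(3)[rng.integers(0, 3, size=50)])
           for _ in range(2)]
W = aggregate_and_solve([local_stats(X, Y) for X, Y in clients], lam=1.0)
print(W.shape)  # (8, 3)
```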

ICML 2025 · Conference Paper

Graph Adaptive Autoregressive Moving Average Models

  • Moshe Eliasof
  • Alessio Gravina
  • Andrea Ceni
  • Claudio Gallicchio
  • Davide Bacciu
  • Carola Schönlieb

Graph State Space Models (SSMs) have recently been introduced to enhance Graph Neural Networks (GNNs) in modeling long-range interactions. Despite their success, existing methods either compromise on permutation equivariance or limit their focus to pairwise interactions rather than sequences. Building on the connection between Autoregressive Moving Average (ARMA) models and SSMs, in this paper we introduce GRAMA, a Graph Adaptive method based on a learnable ARMA framework that addresses these limitations. By recasting static graph data in sequential form, GRAMA leverages the strengths of the ARMA framework while preserving permutation equivariance. Moreover, GRAMA incorporates a selective attention mechanism for dynamic learning of ARMA coefficients, enabling efficient and flexible long-range information propagation. We also establish theoretical connections between GRAMA and Selective SSMs, providing insights into its ability to capture long-range dependencies. Experiments on 26 synthetic and real-world datasets demonstrate that GRAMA consistently outperforms backbone models and performs competitively with state-of-the-art methods.
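
As a rough illustration of the ARMA framing (GRAMA's actual coefficients are learned dynamically through selective attention, and the "innovation" terms below are an assumed choice, taken here to be neighborhood aggregations), a frozen-coefficient ARMA(p, q) recurrence over node-feature sequences might look like this:

```python
import numpy as np

def arma_graph_step(X_hist, E_hist, phi, theta):
    # One ARMA(p, q)-style update on node features:
    #   X_t = tanh( sum_i phi_i * X_{t-i} + sum_j theta_j * E_{t-j} ),
    # where the E terms play the role of the moving-average inputs.
    ar = sum(p_i * X for p_i, X in zip(phi, X_hist))
    ma = sum(t_j * E for t_j, E in zip(theta, E_hist))
    return np.tanh(ar + ma)

# Toy run: 4-node path graph, ARMA(2, 2) with fixed scalar coefficients.
n, d = 4, 3
A = np.diag(np.ones(n - 1), 1); A = A + A.T                # adjacency
A_hat = A / np.maximum(A.sum(axis=1, keepdims=True), 1)    # row-normalized
rng = np.random.default_rng(0)
X_hist = [rng.normal(size=(n, d)) for _ in range(2)]       # two past states
E_hist = [A_hat @ X for X in X_hist]                       # graph "innovations"
X_next = arma_graph_step(X_hist, E_hist, phi=[0.5, 0.3], theta=[0.4, 0.2])
print(X_next.shape)  # (4, 3)
```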

AAAI 2025 · Conference Paper

On Oversquashing in Graph Neural Networks Through the Lens of Dynamical Systems

  • Alessio Gravina
  • Moshe Eliasof
  • Claudio Gallicchio
  • Davide Bacciu
  • Carola-Bibiane Schönlieb

A common problem in Message-Passing Neural Networks is oversquashing -- the limited ability to facilitate effective information flow between distant nodes. Oversquashing is attributed to the exponential decay in information transmission as node distances increase. This paper introduces a novel perspective to address oversquashing, leveraging the dynamical-systems properties of global and local non-dissipativity, which enable the maintenance of a constant information flow rate. We present SWAN, a uniquely parameterized GNN model with antisymmetry both in space and weight domains, as a means to obtain non-dissipativity. Our theoretical analysis asserts that by implementing these properties, SWAN offers an enhanced ability to transmit information over extended distances. Empirical evaluations on synthetic and real-world benchmarks that emphasize long-range interactions validate the theoretical understanding of SWAN and its ability to mitigate oversquashing.
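
The weight-domain half of SWAN's antisymmetry is easy to demonstrate; the sketch below shows only that half (the spatial antisymmetry and the paper's exact parameterization are not reproduced), with `A_hat` an assumed normalized adjacency and `V` an assumed aggregation weight:

```python
import numpy as np

def swan_like_step(X, A_hat, W, V, eps=0.1):
    # Forward-Euler step with an antisymmetric weight (W - W^T): its
    # eigenvalues are purely imaginary, so the linear part of the
    # dynamics neither dissipates nor amplifies the node states.
    return X + eps * np.tanh(X @ (W - W.T) + A_hat @ X @ V)

# Eigenvalues of W - W^T lie on the imaginary axis:
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 5))
print(np.allclose(np.linalg.eigvals(W - W.T).real, 0.0))  # True
```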

NeurIPS 2025 · Conference Paper

On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning

  • Alvaro Arroyo
  • Alessio Gravina
  • Benjamin Gutteridge
  • Federico Barbero
  • Claudio Gallicchio
  • Xiaowen Dong
  • Michael Bronstein
  • Pierre Vandergheynst

Graph Neural Networks (GNNs) are models that leverage the graph structure to transmit information between nodes, typically through the message-passing operation. While widely successful, this approach is well known to suffer from representational collapse as the number of layers increases and from insensitivity to the information contained at distant and poorly connected nodes. In this paper, we present a unified view of the appearance of these issues through the lens of vanishing gradients, using ideas from linear control theory for our analysis. We propose an interpretation of GNNs as recurrent models and empirically demonstrate that a simple state-space formulation of a GNN effectively alleviates these issues at no extra trainable-parameter cost. Further, we show theoretically and empirically that (i) traditional GNNs are by design prone to extreme gradient vanishing even after a few layers; (ii) feature collapse is directly related to the mechanism causing vanishing gradients; (iii) long-range modeling is most easily achieved by a combination of graph rewiring and vanishing-gradient mitigation. We believe our work will help bridge the gap between the recurrent and graph neural network literature and will unlock the design of new deep and performant GNNs.
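
As a hedged illustration of the recurrent reading (this is a generic state-space-style recurrence over depth, not the paper's exact formulation), a layer update with a fixed, untrained transition coefficient could look like:

```python
import numpy as np

def ssm_gnn_layer(H, A_hat, W, alpha=0.9):
    # One layer read as a state-space recurrence over depth:
    #   H_{k+1} = alpha * H_k + (1 - alpha) * tanh(A_hat @ H_k @ W).
    # The fixed transition coefficient alpha (no extra trainable
    # parameters) keeps a direct linear path through depth, which is
    # what stops the gradient from vanishing layer after layer.
    return alpha * H + (1 - alpha) * np.tanh(A_hat @ H @ W)
```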

ICLR 2025 · Conference Paper

Port-Hamiltonian Architectural Bias for Long-Range Propagation in Deep Graph Networks

  • Simon Heilig
  • Alessio Gravina
  • Alessandro Trenta
  • Claudio Gallicchio
  • Davide Bacciu

The dynamics of information diffusion within graphs is a critical open issue that heavily influences graph representation learning, especially when considering long-range propagation. This calls for principled approaches that control and regulate the degree of propagation and dissipation of information throughout the neural flow. Motivated by this, we introduce port-Hamiltonian Deep Graph Networks, a novel framework that models neural information flow in graphs by building on the laws of conservation of Hamiltonian dynamical systems. We reconcile under a single theoretical and practical framework both non-dissipative long-range propagation and non-conservative behaviors, introducing tools from mechanical systems to gauge the equilibrium between the two components. Our approach can be applied to general message-passing architectures, and it provides theoretical guarantees on information conservation in time. Empirical results demonstrate the effectiveness of our port-Hamiltonian scheme in pushing simple graph convolutional architectures to state-of-the-art performance on long-range benchmarks.
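
The port-Hamiltonian split between conservative and dissipative flow can be sketched directly from the standard form dz/dt = (J - R) ∇H(z), with J antisymmetric and R positive semi-definite; the toy below is a generic numerical illustration of that balance, not the paper's graph-specific construction:

```python
import numpy as np

def port_hamiltonian_step(z, J, R, grad_H, eps=0.01):
    # Forward-Euler step of dz/dt = (J - R) @ grad_H(z):
    # J = -J^T generates purely conservative (Hamiltonian) flow,
    # R = R^T >= 0 adds controlled dissipation; their balance gauges
    # how much information is preserved versus damped along the flow.
    return z + eps * (J - R) @ grad_H(z)

# Toy run with H(z) = ||z||^2 / 2, so grad_H(z) = z.
rng = np.random.default_rng(0)
S = rng.normal(size=(4, 4))
J = S - S.T                      # antisymmetric: conservative part
B = rng.normal(size=(4, 4))
R = 0.1 * B @ B.T                # positive semi-definite: dissipative part
z = rng.normal(size=4)
for _ in range(200):
    z = port_hamiltonian_step(z, J, R, grad_H=lambda z: z)
# Norm decays due to R; with R = 0 it drifts only by O(eps^2) Euler error.
print(np.linalg.norm(z))
```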

ICML 2024 · Conference Paper

Long Range Propagation on Continuous-Time Dynamic Graphs

  • Alessio Gravina
  • Giulio Lovisotto
  • Claudio Gallicchio
  • Davide Bacciu
  • Claas Grohnfeldt

Learning Continuous-Time Dynamic Graphs (C-TDGs) requires accurately modeling spatio-temporal information on streams of irregularly sampled events. While many methods have been proposed recently, we find that most message passing-, recurrent-, or self-attention-based methods perform poorly on long-range tasks. These tasks require correlating information that occurred "far" away from the current event, either spatially (higher-order node information) or along the time dimension (events that occurred in the past). To address long-range dependencies, we introduce the Continuous-Time Graph Anti-Symmetric Network (CTAN). Grounded in the ordinary differential equations framework, our method is designed for efficient propagation of information. In this paper, we show (i) how CTAN's long-range modeling capabilities are substantiated by theoretical findings and (ii) how its empirical performance on synthetic and real-world long-range benchmarks is superior to that of other methods. Our results highlight CTAN's ability to propagate long-range information in C-TDGs and motivate the inclusion of long-range tasks in the evaluation of temporal graph models.
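
A heavily simplified sketch of the event-driven, ODE-grounded update style (not CTAN's actual equations; `msg` stands in for whatever event-feature encoding is used, and the antisymmetric weight is the assumed source of non-dissipativity):

```python
import numpy as np

def ctan_like_event(h, u, v, msg, W, eps=0.1):
    # Process one interaction event (u, v) on a continuous-time dynamic
    # graph: each endpoint's state takes a forward-Euler step of a
    # non-dissipative ODE driven by the event message, so information
    # from past events is retained rather than exponentially forgotten.
    Wa = W - W.T                  # antisymmetric: stable, non-dissipative
    for node in (u, v):
        h[node] = h[node] + eps * np.tanh(Wa @ h[node] + msg)
    return h

# Toy run: 5 nodes with 3-dim states, one event between nodes 0 and 2.
rng = np.random.default_rng(0)
h = rng.normal(size=(5, 3))
h = ctan_like_event(h, 0, 2, msg=rng.normal(size=3), W=rng.normal(size=(3, 3)))
```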

ICLR 2023 · Conference Paper

Anti-Symmetric DGN: a stable architecture for Deep Graph Networks

  • Alessio Gravina
  • Davide Bacciu
  • Claudio Gallicchio

Deep Graph Networks (DGNs) currently dominate the research landscape of learning from graphs, due to their efficiency and ability to implement an adaptive message-passing scheme between the nodes. However, DGNs are typically limited in their ability to propagate and preserve long-term dependencies between nodes, i.e., they suffer from the over-squashing phenomenon. As a result, we can expect them to under-perform, since different problems require capturing interactions at different (and possibly large) radii in order to be effectively solved. In this work, we present Anti-Symmetric Deep Graph Networks (A-DGNs), a framework for stable and non-dissipative DGN design, conceived through the lens of ordinary differential equations. We give theoretical proof that our method is stable and non-dissipative, leading to two key results: long-range information between nodes is preserved, and no gradient vanishing or explosion occurs during training. We empirically validate the proposed approach on several graph benchmarks, showing that A-DGN yields improved performance and enables effective learning even when dozens of layers are used.
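
A minimal sketch of the kind of update the A-DGN framework prescribes: a forward-Euler step of a node-wise ODE with an antisymmetric weight (the aggregation `Phi` and all hyperparameter values here are illustrative assumptions, not the paper's exact configuration):

```python
import numpy as np

def adgn_layer(X, A_hat, W, V, b, eps=0.1, gamma=0.1):
    # Forward-Euler step of an anti-symmetric node ODE (sketch):
    #   dX/dt = tanh((W - W^T - gamma*I) X + Phi(X, neighbors) + b).
    # W - W^T has purely imaginary eigenvalues, giving stable,
    # non-dissipative dynamics; the small -gamma*I term keeps the
    # discretized system from drifting unstable.
    Wa = W - W.T - gamma * np.eye(W.shape[0])
    Phi = A_hat @ X @ V           # an assumed simple neighbor aggregation
    return X + eps * np.tanh(X @ Wa.T + Phi + b)
```

Because each layer is one small Euler step of a stable ODE, stacking dozens of such layers keeps both node information and gradients from decaying, which is the mechanism behind the "dozens of layers" result in the abstract.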

AAAI 2020 · Conference Paper

Fast and Deep Graph Neural Networks

  • Claudio Gallicchio
  • Alessio Micheli

We address the efficiency issue for the construction of a deep graph neural network (GNN). The approach exploits the idea of representing each input graph as a fixed point of a dynamical system (implemented through a recurrent neural network), and leverages a deep architectural organization of the recurrent units. Efficiency is gained in several ways, including the use of small and very sparse networks, where the weights of the recurrent units are left untrained under the stability condition introduced in this work. This can be viewed as a way to study the intrinsic power of the architecture of a deep GNN, and also to provide insights for the set-up of more complex fully-trained models. Through experimental results, we show that even without training of the recurrent connections, the architecture of a small deep GNN is surprisingly able to achieve or improve upon state-of-the-art performance on a significant set of graph classification tasks.
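
In the reservoir-computing spirit the abstract describes, a sketch of untrained sparse recurrent weights under an echo-state-style stability rescaling might look as follows (the paper introduces its own stability condition for the graph setting; the fixed-point iteration and pooling below are illustrative assumptions):

```python
import numpy as np

def sparse_reservoir(d, density=0.1, rho=0.9, seed=0):
    # Small, very sparse, *untrained* recurrent weights, rescaled so the
    # spectral radius sits below 1 (an echo-state-style stability
    # rescaling; the graph setting needs its own, stricter condition).
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, size=(d, d)) * (rng.random((d, d)) < density)
    return W * (rho / np.max(np.abs(np.linalg.eigvals(W))))

def graph_embedding(A_hat, X, W, iters=30):
    # Drive the state toward a fixed point of H = tanh(A_hat @ H @ W + X).
    # A_hat should be a normalized adjacency so that, together with the
    # rescaled W, the map is contractive in practice; only a linear
    # readout on the pooled embedding would then be trained.
    H = np.zeros_like(X)
    for _ in range(iters):
        H = np.tanh(A_hat @ H @ W + X)
    return H.mean(axis=0)         # mean pooling over nodes
```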

ICML 2018 · Conference Paper

Tree Edit Distance Learning via Adaptive Symbol Embeddings

  • Benjamin Paaßen
  • Claudio Gallicchio
  • Alessio Micheli
  • Barbara Hammer

Metric learning aims to improve classification accuracy by learning a distance measure which brings data points from the same class closer together and pushes data points from different classes further apart. Recent research has demonstrated that metric learning approaches can also be applied to trees, such as molecular structures, abstract syntax trees of computer programs, or syntax trees of natural language, by learning the cost function of an edit distance, i.e., the costs of replacing, deleting, or inserting nodes in a tree. However, learning such costs directly may yield an edit distance which violates metric axioms, is challenging to interpret, and may not generalize well. In this contribution, we propose a novel metric learning approach for trees which we call embedding edit distance learning (BEDL) and which learns an edit distance indirectly by embedding the tree nodes as vectors, such that the Euclidean distance between those vectors supports class discrimination. We learn such embeddings by reducing the distance to prototypical trees from the same class and increasing the distance to prototypical trees from different classes. In our experiments, we show that BEDL improves upon the state of the art in metric learning for trees on six benchmark data sets, ranging from computer science through biomedical data to a natural language processing data set containing over 300,000 nodes.
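
The embedding-to-edit-cost construction is concrete enough for a short sketch. Assuming symbol embeddings have already been learned, the fragment below derives replacement and insertion/deletion costs from them (the zero "gap" vector is an illustrative choice, not necessarily the paper's):

```python
import numpy as np

def edit_costs(emb, gap=None):
    # Derive edit costs from learned symbol embeddings: replacing x with
    # y costs the Euclidean distance between their vectors; deleting or
    # inserting x costs its distance to a "gap" vector. Costs built this
    # way are symmetric and obey the triangle inequality, so the induced
    # edit distance remains a proper metric by construction, which is
    # exactly what direct cost learning cannot guarantee.
    d = len(next(iter(emb.values())))
    gap = np.zeros(d) if gap is None else gap
    rep = {(x, y): float(np.linalg.norm(emb[x] - emb[y]))
           for x in emb for y in emb}
    indel = {x: float(np.linalg.norm(emb[x] - gap)) for x in emb}
    return rep, indel

# Toy run with two symbols embedded in the plane.
rep, indel = edit_costs({"a": np.array([0.0, 1.0]), "b": np.array([1.0, 0.0])})
print(rep[("a", "b")], indel["a"])  # 1.414..., 1.0
```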