Arrow Research search

Author name cluster

Stefano Melacci

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers
2 author rows

Possible papers

16

IJCAI Conference 2025 Conference Paper

A Neuro-Symbolic Framework for Sequence Classification with Relational and Temporal Knowledge

  • Luca Salvatore Lorello
  • Marco Lippi
  • Stefano Melacci

One of the goals of neuro-symbolic artificial intelligence is to exploit background knowledge to improve the performance of learning tasks. However, most existing frameworks focus on the simplified scenario in which knowledge does not change over time, thus ignoring the temporal dimension. In this work we consider the much more challenging problem of knowledge-driven sequence classification, where different portions of knowledge must be employed at different timesteps and temporal relations are available. Our extensive experimental evaluation compares multi-stage neuro-symbolic and neural-only architectures, and it is conducted on a newly introduced benchmarking framework. The results not only demonstrate the challenging nature of this novel setting, but also highlight under-explored shortcomings of neuro-symbolic methods, providing a valuable reference for future research.

ECAI Conference 2025 Conference Paper

Pirates of the RAG: Adaptively Attacking LLMs to Leak Knowledge Bases

  • Christian Di Maio
  • Cristian Cosci
  • Marco Maggini
  • Valentina Poggioni
  • Stefano Melacci

The growing ubiquity of Retrieval-Augmented Generation (RAG) systems in several real-world services triggers severe concerns about their security. A RAG system improves the generative capabilities of a Large Language Model (LLM) through a retrieval mechanism that operates on a private knowledge base, whose unintended exposure could lead to severe consequences, including breaches of private and sensitive information. This paper presents a black-box attack that forces a RAG system to leak its private knowledge base and which, unlike existing approaches, is both adaptive and automatic. A relevance-based mechanism and an attacker-side open-source LLM favor the generation of effective queries that leak most of the (hidden) knowledge base. Extensive experimentation demonstrates the quality of the proposed algorithm across different RAG pipelines and domains, compared to very recent related approaches, which turn out to be either not fully black-box, not adaptive, or not based on open-source models. Our findings highlight the urgent need for more robust privacy safeguards in the design and deployment of RAG systems. The open-source code for our experimental procedure is publicly available [12].

ECAI Conference 2024 Conference Paper

Bridging Continual Learning of Motion and Self-Supervised Representations

  • Matteo Tiezzi
  • Simone Marullo
  • Alessandro Betti
  • Michele Casoni
  • Stefano Melacci

Efficiently learning unsupervised pixel-wise visual representations is crucial for training agents that can perceive their environment without relying on heavy human supervision or abundant annotated data. Motivated by recent work that promotes motion as a key source of information in representation learning, we propose a novel instance of contrastive criteria over time and space. In our architecture, the pixel-wise motion field and the representations are extracted by neural models trained from scratch in an integrated fashion. Learning proceeds online over time, also exploiting a momentum-based moving average to update the feature extractor, without replaying large buffers of past data. Experiments on real-world videos and on a recently introduced benchmark, with photorealistic streams generated from a 3D environment, confirm that the proposed model can learn to estimate motion and jointly develop representations. Our model nicely encodes the variable appearance of the visual information in space and time, significantly outperforming a recent approach, and it also compares favourably with convolutional and Transformer-based networks pre-trained offline on large collections of supervised and unsupervised images.

AAAI Conference 2024 Conference Paper

Neural Time-Reversed Generalized Riccati Equation

  • Alessandro Betti
  • Michele Casoni
  • Marco Gori
  • Simone Marullo
  • Stefano Melacci
  • Matteo Tiezzi

Optimal control deals with optimization problems in which variables steer a dynamical system, and its outcome contributes to the objective function. Two classical approaches to solving these problems are Dynamic Programming and the Pontryagin Maximum Principle. In both approaches, Hamiltonian equations offer an interpretation of optimality through auxiliary variables known as costates. However, Hamiltonian equations are rarely used due to their reliance on forward-backward algorithms across the entire temporal domain. This paper introduces a novel neural-based approach to optimal control. Neural networks are employed not only for implementing state dynamics but also for estimating costate variables. The parameters of the latter network are determined at each time step using a newly introduced local policy referred to as the time-reversed generalized Riccati equation. This policy is inspired by a result discussed in the Linear Quadratic (LQ) problem, which we conjecture stabilizes state dynamics. We support this conjecture by discussing experimental results from a range of optimal control case studies.
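As background for the policy named in the abstract, the classic finite-horizon discrete-time LQ problem is solved by a backward Riccati recursion, which the paper's time-reversed generalized Riccati equation builds upon; a minimal sketch follows (the matrices and horizon in the usage are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def lq_backward_riccati(A, B, Q, R, Qf, T):
    """Backward Riccati recursion for the finite-horizon discrete LQ problem.

    Returns the cost-to-go matrices P_t (from which the costate is
    lambda_t = P_t x_t) and the feedback gains K_t, with u_t = -K_t x_t.
    """
    P = [None] * (T + 1)
    K = [None] * T
    P[T] = Qf
    for t in range(T - 1, -1, -1):  # sweep backward in time
        S = R + B.T @ P[t + 1] @ B
        K[t] = np.linalg.solve(S, B.T @ P[t + 1] @ A)
        P[t] = Q + A.T @ P[t + 1] @ (A - B @ K[t])
    return P, K
```

The costate relation lambda_t = P_t x_t is the quantity that, in the paper, a second neural network is trained to estimate instead of being obtained by this full backward sweep.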

AAAI Conference 2022 Conference Paper

Being Friends Instead of Adversaries: Deep Networks Learn from Data Simplified by Other Networks

  • Simone Marullo
  • Matteo Tiezzi
  • Marco Gori
  • Stefano Melacci

Amongst a variety of approaches aimed at making the learning procedure of neural networks more effective, the scientific community has developed strategies to order the examples according to their estimated complexity, to distil knowledge from larger networks, or to exploit the principles behind adversarial machine learning. A different idea has recently been proposed, named Friendly Training, which consists in altering the input data by adding an automatically estimated perturbation, with the goal of facilitating the learning process of a neural classifier. The transformation progressively fades out as training proceeds, until it completely vanishes. In this work we revisit and extend this idea, introducing a radically different and novel approach inspired by the effectiveness of neural generators in the context of Adversarial Machine Learning. We propose an auxiliary multi-layer network that is responsible for altering the input data to make them easier to handle by the classifier at the current stage of the training procedure. The auxiliary network is trained jointly with the neural classifier, thus intrinsically increasing the “depth” of the classifier, and it is expected to spot general regularities in the data alteration process. The effect of the auxiliary network is progressively reduced up to the end of training, when it is fully dropped and the classifier is deployed for applications. We refer to this approach as Neural Friendly Training. An extended experimental procedure involving several datasets and different neural architectures shows that Neural Friendly Training outperforms the originally proposed Friendly Training technique, improving the generalization of the classifier, especially in the case of noisy data.
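The fade-out mechanism described above can be illustrated with a deliberately small sketch: a logistic classifier trained jointly with a linear "auxiliary" module that perturbs the inputs, with the perturbation's effect scaled by a factor that decays to zero so that only the plain classifier remains at deployment (the paper uses a multi-layer auxiliary network and deep classifiers; everything below, including the schedule, is a simplified assumption):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neural_friendly_training(X, y, epochs=200, lr=0.5, lr_aux=0.1):
    """Jointly train a logistic classifier (w, b) and an auxiliary
    input-altering module A whose additive perturbation fades out."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    A = np.zeros((d, d))                          # auxiliary module weights
    for t in range(epochs):
        beta = max(0.0, 1.0 - t / (0.8 * epochs))  # fade-out schedule
        Xs = X + beta * (X @ A.T)                  # "simplified" inputs
        g = sigmoid(Xs @ w + b) - y                # dLoss/dlogit (cross-entropy)
        # classifier step on the simplified data
        w -= lr * (Xs.T @ g) / n
        b -= lr * g.mean()
        # auxiliary step: it too descends the classifier's loss
        A -= lr_aux * beta * np.outer(w, X.T @ g) / n
    return w, b                     # deployed without the auxiliary module
```

By the end of training beta is zero, so the returned classifier is evaluated on unaltered inputs, mirroring the deployment scenario in the abstract.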

NeurIPS Conference 2022 Conference Paper

Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off

  • Mateo Espinosa Zarlenga
  • Pietro Barbiero
  • Gabriele Ciravegna
  • Giuseppe Marra
  • Francesco Giannini
  • Michelangelo Diligenti
  • Zohreh Shams
  • Frederic Precioso

Deploying AI-powered systems requires trustworthy models supporting effective human interactions, going beyond raw prediction accuracy. Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. This enables human interventions which can correct mispredicted concepts to improve the model's performance. However, existing concept bottleneck models are unable to find optimal compromises between high task accuracy, robust concept-based explanations, and effective interventions on concepts, particularly in real-world conditions where complete and accurate concept supervisions are scarce. To address this, we propose Concept Embedding Models, a novel family of concept bottleneck models which goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations. Our experiments demonstrate that Concept Embedding Models (1) attain better or competitive task accuracy w.r.t. standard neural models without concepts, (2) provide concept representations capturing meaningful semantics, including and beyond their ground-truth labels, (3) support test-time concept interventions whose effect on test accuracy surpasses that of standard concept bottleneck models, and (4) scale to real-world conditions where complete concept supervisions are scarce.
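The core idea of high-dimensional concept representations can be sketched as a mixture of two learned per-concept embeddings, weighted by the predicted concept probability (a minimal illustration of the embedding mixture; variable names and shapes are assumptions, not the paper's code):

```python
import numpy as np

def concept_embeddings(c_hat, e_pos, e_neg):
    """Build one m-dimensional embedding per concept.

    c_hat : (k,)    predicted probability that each of the k concepts is active
    e_pos : (k, m)  learned embedding of each concept in its "active" state
    e_neg : (k, m)  learned embedding of each concept in its "inactive" state
    """
    return c_hat[:, None] * e_pos + (1.0 - c_hat)[:, None] * e_neg
```

A downstream label predictor consumes the concatenation of the k embeddings; a test-time intervention simply overwrites an entry of c_hat with its ground-truth 0/1 value, snapping that concept onto the corresponding embedding.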

AAAI Conference 2022 Conference Paper

Entropy-Based Logic Explanations of Neural Networks

  • Pietro Barbiero
  • Gabriele Ciravegna
  • Francesco Giannini
  • Pietro Lió
  • Marco Gori
  • Stefano Melacci

Explainable artificial intelligence has rapidly emerged since lawmakers began requiring interpretable models for safety-critical domains. Concept-based neural networks have arisen as explainable-by-design methods, as they leverage human-understandable symbols (i.e., concepts) to predict class memberships. However, most of these approaches focus on identifying the most relevant concepts but do not provide concise, formal explanations of how such concepts are leveraged by the classifier to make predictions. In this paper, we propose a novel end-to-end differentiable approach enabling the extraction of logic explanations from neural networks using the formalism of First-Order Logic. The method relies on an entropy-based criterion which automatically identifies the most relevant concepts. We consider four different case studies to demonstrate that: (i) this entropy-based criterion enables the distillation of concise logic explanations in safety-critical domains from clinical data to computer vision; (ii) the proposed approach outperforms state-of-the-art white-box models in terms of classification accuracy and matches black-box performance.
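The entropy-based selection of relevant concepts can be sketched as follows: turn each concept's first-layer weight norms into a distribution with a softmax, penalize the entropy of that distribution during training so relevance concentrates, and keep only the concepts carrying most of the mass for the extracted formula (a simplified illustration; the temperature, mass threshold, and concept names below are assumptions, not the paper's values):

```python
import numpy as np

def concept_relevance(W, temperature=0.5):
    """Relevance distribution over input concepts from the norms of their
    outgoing first-layer weights (one column of W per concept)."""
    gamma = np.linalg.norm(W, axis=0)
    z = gamma / temperature
    p = np.exp(z - z.max())
    return p / p.sum()

def entropy(p, eps=1e-12):
    # the training objective penalizes this quantity so that
    # relevance concentrates on few concepts
    return -np.sum(p * np.log(p + eps))

def select_concepts(p, names, mass=0.9):
    """Smallest set of concepts covering the requested relevance mass;
    only these enter the extracted First-Order Logic explanation."""
    kept, covered = [], 0.0
    for j in np.argsort(p)[::-1]:
        kept.append(names[j])
        covered += p[j]
        if covered >= mass:
            break
    return kept
```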

IJCAI Conference 2022 Conference Paper

Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams

  • Matteo Tiezzi
  • Simone Marullo
  • Lapo Faggi
  • Enrico Meloni
  • Alessandro Betti
  • Stefano Melacci

Devising intelligent agents able to live in an environment and learn by observing their surroundings is a longstanding goal of Artificial Intelligence. From a bare Machine Learning perspective, challenges arise when the agent is prevented from leveraging large fully-annotated datasets, and interactions with supervisory signals are instead sparsely distributed over space and time. This paper proposes a novel neural-network-based approach to progressively and autonomously develop pixel-wise representations in a video stream. The proposed method is based on a human-like attention mechanism that allows the agent to learn by observing what is moving in the attended locations. Spatio-temporal stochastic coherence along the attention trajectory, paired with a contrastive term, leads to an unsupervised learning criterion that naturally copes with the considered setting. Differently from most existing works, the learned representations are used in open-set class-incremental classification of each frame pixel, relying on few supervisions. Our experiments leverage 3D virtual environments, and they show that the proposed agents can learn to distinguish objects just by observing the video stream. Inheriting features from state-of-the-art models is not as powerful as one might expect.
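The coherence-plus-contrastive criterion can be illustrated with a small sketch: the feature of the attended pixel at time t should match the feature of the corresponding location at t+1, while differing from features sampled at random locations (an InfoNCE-style simplification under assumed inputs, not the paper's exact formulation):

```python
import numpy as np

def coherence_contrastive_loss(f_t, f_tp1, negatives, tau=0.1):
    """f_t, f_tp1: features of the attended pixel in two consecutive frames;
    negatives: features from random, non-attended locations."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    pos = np.exp(cos(f_t, f_tp1) / tau)          # coherence along the trajectory
    neg = sum(np.exp(cos(f_t, n) / tau) for n in negatives)
    return -np.log(pos / (pos + neg))            # contrastive normalization
```

Minimizing this quantity pushes the representation to stay stable on the attended, moving region while remaining discriminative with respect to the rest of the frame.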

AAAI Conference 2020 Conference Paper

A Constraint-Based Approach to Learning and Explanation

  • Gabriele Ciravegna
  • Francesco Giannini
  • Stefano Melacci
  • Marco Maggini
  • Marco Gori

In the last few years we have seen remarkable progress stemming from the idea of expressing domain knowledge through the mathematical notion of constraint. However, the progress has mostly involved the process of providing solutions consistent with a given set of constraints, whereas learning “new” constraints, which express new knowledge, is still an open challenge. In this paper we propose a novel approach to the learning of constraints that is based on information-theoretic principles. The basic idea consists in maximizing the transfer of information between task functions and a set of learnable constraints, implemented using neural networks subject to L1 regularization. This process leads to the unsupervised development of new constraints that are fulfilled in different subportions of the input domain. In addition, we define a simple procedure that can explain the behaviour of the newly devised constraints in terms of First-Order Logic formulas, thus extracting novel knowledge on the relationships between the original tasks. An experimental evaluation is provided to support the proposed approach, in which we also explore the regularization effects introduced by the proposed Information-Based Learning of Constraint (IBLC) algorithm.

ECAI Conference 2020 Conference Paper

A Lagrangian Approach to Information Propagation in Graph Neural Networks

  • Matteo Tiezzi
  • Giuseppe Marra
  • Stefano Melacci
  • Marco Maggini
  • Marco Gori

In many real-world applications, data are characterized by a complex structure that can be naturally encoded as a graph. In recent years, the popularity of deep learning techniques has renewed interest in neural models able to process complex patterns. In particular, inspired by the Graph Neural Network (GNN) model, different architectures have been proposed to extend the original GNN scheme. GNNs exploit a set of state variables, each assigned to a graph node, and a diffusion mechanism of the states among neighboring nodes, implementing an iterative procedure to compute the fixed point of the (learnable) state transition function. In this paper, we propose a novel approach to the state computation and the learning algorithm for GNNs, based on a constraint optimisation task solved in the Lagrangian framework. The state convergence procedure is implicitly expressed by the constraint satisfaction mechanism and does not require a separate iterative phase for each epoch of the learning procedure. In fact, the computational structure is based on the search for saddle points of the Lagrangian in the adjoint space composed of weights, neural outputs (node states), and Lagrange multipliers. The proposed approach is compared experimentally with other popular models for processing graphs.

NeurIPS Conference 2020 Conference Paper

Focus of Attention Improves Information Transfer in Visual Features

  • Matteo Tiezzi
  • Stefano Melacci
  • Alessandro Betti
  • Marco Maggini
  • Marco Gori

Unsupervised learning from continuous visual streams is a challenging problem that cannot be naturally and efficiently managed in the classic batch-mode setting of computation. The information stream must be carefully processed according to an appropriate spatio-temporal distribution of the visual data, whereas most learning approaches commonly assume a uniform probability density. In this paper we focus on unsupervised learning for transferring visual information in a truly online setting, using a computational model inspired by the principle of least action in physics. The maximization of mutual information is carried out by a temporal process which yields an online estimation of the entropy terms. The model, which is based on second-order differential equations, maximizes the information transfer from the input to a discrete space of symbols related to the visual features of the input, whose computation is supported by hidden neurons. In order to better structure the input probability distribution, we use a human-like focus-of-attention model that, coherently with the information maximization model, is also based on second-order differential equations. We provide experimental results to support the theory, showing that the spatio-temporal filtering induced by the focus of attention allows the system to globally transfer more information from the input stream over the focused areas and, in some contexts, over the whole frames, with respect to the unfiltered case that yields uniform probability distributions.

IJCAI Conference 2020 Conference Paper

Human-Driven FOL Explanations of Deep Learning

  • Gabriele Ciravegna
  • Francesco Giannini
  • Marco Gori
  • Marco Maggini
  • Stefano Melacci

Deep neural networks are usually considered black boxes due to their complex internal architecture, which cannot straightforwardly provide human-understandable explanations of how they behave. Indeed, Deep Learning is still viewed with skepticism in those real-world domains in which incorrect predictions may produce critical effects. This is one of the reasons why, in the last few years, Explainable Artificial Intelligence (XAI) techniques have gained a lot of attention in the scientific community. In this paper, we focus on the case of multi-label classification, proposing a neural network that learns the relationships among the predictors associated with each class, yielding First-Order Logic (FOL)-based descriptions. Both the explanation-related network and the classification-related network are jointly learned, thus implicitly introducing a latent dependency between the development of the explanation mechanism and the development of the classifiers. Our model can integrate human-driven preferences that guide the learning-to-explain process, and it is presented in a unified framework. Different typologies of explanations are evaluated in distinct experiments, showing that the proposed approach discovers new knowledge and can improve classifier performance.

IJCAI Conference 2019 Conference Paper

Motion Invariance in Visual Environments

  • Alessandro Betti
  • Marco Gori
  • Stefano Melacci

The puzzle of computer vision might find new challenging solutions when we realize that most successful methods work at the image level, which is remarkably more difficult than directly processing visual streams, just as happens in nature. In this paper, we claim that the processing of a stream of frames naturally leads to formulating the motion invariance principle, which enables the construction of a new theory of visual learning based on convolutional features. The theory addresses a number of intriguing questions that arise in natural vision, and offers a well-posed computational scheme for the discovery of convolutional filters over the retina. They are driven by the Euler-Lagrange differential equations derived from the principle of least cognitive action, which parallels the laws of mechanics. Unlike traditional convolutional networks, which need massive supervision, the proposed theory offers a truly new scenario in which feature learning takes place by unsupervised processing of video signals. An experimental report of the theory is presented, where we show that features extracted under motion invariance yield an improvement that can be assessed by measuring information-based indexes.

JMLR Journal 2011 Journal Article

Laplacian Support Vector Machines Trained in the Primal

  • Stefano Melacci
  • Mikhail Belkin

In the last few years, due to the growing ubiquity of unlabeled data, much effort has been spent by the machine learning community to develop better understanding and to improve the quality of classifiers exploiting unlabeled data. Following the manifold regularization approach, Laplacian Support Vector Machines (LapSVMs) have shown state-of-the-art performance in semi-supervised classification. In this paper we present two strategies to solve the primal LapSVM problem, in order to overcome some issues of the original dual formulation. In particular, training a LapSVM in the primal can be efficiently performed with preconditioned conjugate gradient. We speed up training by using an early stopping strategy based on the prediction on unlabeled data or, if available, on labeled validation examples. This allows the algorithm to quickly compute approximate solutions with roughly the same classification accuracy as the optimal ones, considerably reducing the training time. The computational complexity of the training algorithm is reduced from O(n^3) to O(kn^2), where n is the combined number of labeled and unlabeled examples and k is empirically evaluated to be significantly smaller than n. Due to its simplicity, training LapSVM in the primal can be the starting point for additional enhancements of the original LapSVM formulation, such as those for dealing with large data sets. We present an extensive experimental evaluation on real-world data showing the benefits of the proposed approach.
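The primal objective behind LapSVM (squared hinge loss on the labeled examples, an ambient regularizer on the expansion coefficients, and the Laplacian manifold term over all examples) can be sketched as follows; for brevity, plain gradient descent replaces the paper's preconditioned conjugate gradient, and all hyperparameters are illustrative:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def knn_laplacian(X, k=5):
    """Unnormalized graph Laplacian over a symmetrized kNN graph."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(d2[i])[1:k + 1]] = 1.0
    W = np.maximum(W, W.T)
    return np.diag(W.sum(1)) - W

def train_lapsvm_primal(X, y, labeled, gamma_A=1e-2, gamma_I=1e-2,
                        lr=1e-2, steps=500):
    """Minimize (1/l) * sum_i max(0, 1 - y_i f(x_i))^2 over labeled i
                + gamma_A * alpha^T K alpha + gamma_I * f^T L f,
    with f = K alpha, by gradient descent on alpha."""
    K = rbf_kernel(X, X)
    L = knn_laplacian(X)
    alpha = np.zeros(len(X))
    l = len(labeled)
    for _ in range(steps):
        f = K @ alpha
        slack = np.maximum(0.0, 1.0 - y[labeled] * f[labeled])
        grad = (-2.0 / l) * K[:, labeled] @ (slack * y[labeled]) \
               + 2.0 * gamma_A * f + 2.0 * gamma_I * (K @ (L @ f))
        alpha -= lr * grad
    return alpha, K
```

With only a couple of labeled points, the manifold term propagates label information along the kNN graph, which is the semi-supervised effect the abstract refers to; swapping the descent loop for preconditioned conjugate gradient recovers the paper's O(kn^2) training scheme.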