Author name cluster

Bruno Ribeiro

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers

1 author row

NeurIPS Conference 2025 Conference Paper

Differentiable Constraint-Based Causal Discovery

Jincheng Zhou
Mengbo Wang
Anqi He
Yumeng Zhou
Hessam Olya
Murat Kocaoglu
Bruno Ribeiro

Causal discovery from observational data is a fundamental task in artificial intelligence, with far-reaching implications for decision-making, predictions, and interventions. Despite significant advances, existing methods can be broadly categorized as constraint-based or score-based approaches. Constraint-based methods offer rigorous causal discovery but are often hindered by small sample sizes, while score-based methods provide flexible optimization but typically forgo explicit conditional independence testing. This work explores a third avenue: developing differentiable $d$-separation scores, obtained through a percolation theory using soft logic. This enables the implementation of a new type of causal discovery method: gradient-based optimization of conditional independence constraints. Empirical evaluations demonstrate the robust performance of our approach in low-sample regimes, surpassing traditional constraint-based and score-based baselines on a real-world dataset. Code implementing the proposed method is publicly available at [https: //github. com/PurdueMINDS/DAGPA](https: //github. com/PurdueMINDS/DAGPA).

PDF Details

NeurIPS Conference 2024 Conference Paper

A Foundation Model for Zero-shot Logical Query Reasoning

Mikhail Galkin
Jincheng Zhou
Bruno Ribeiro
Jian Tang
Zhaocheng Zhu

Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional queries comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on which requires substantial training time before being deployed on a new graph. Here we present UltraQuery, the first foundation model for inductive reasoning that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG. With the projection operation initialized from a pre-trained inductive KG completion model, UltraQuery can solve CLQA on any KG after finetuning on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than best available baselines and sets a new state of the art on 15 of them.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

DiGRAF: Diffeomorphic Graph-Adaptive Activation Function

Krishna S. Mantri
Xinzhi Wang
Carola-Bibiane Schönlieb
Bruno Ribeiro
Beatrice Bevilacqua
Moshe Eliasof

In this paper, we propose a novel activation function tailored specifically for graph data in Graph Neural Networks (GNNs). Motivated by the need for graph-adaptive and flexible activation functions, we introduce DiGRAF, leveraging Continuous Piecewise-Affine Based (CPAB) transformations, which we augment with an additional GNN to learn a graph-adaptive diffeomorphic activation function in an end-to-end manner. In addition to its graph-adaptivity and flexibility, DiGRAF also possesses properties that are widely recognized as desirable for activation functions, such as differentiability, boundness within the domain, and computational efficiency. We conduct an extensive set of experiments across diverse datasets and tasks, demonstrating a consistent and superior performance of DiGRAF compared to traditional and graph-specific activation functions, highlighting its effectiveness as an activation function for GNNs. Our code is available at https: //github. com/ipsitmantri/DiGRAF.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts

Shirley Wu
Kaidi Cao
Bruno Ribeiro
James Zou
Jure Leskovec

Graph data are inherently complex and heterogeneous, leading to a high natural diversity of distributional shifts. However, it remains unclear how to build machine learning architectures that generalize to the complex distributional shifts naturally occurring in the real world. Here, we develop GraphMETRO, a Graph Neural Network architecture that models natural diversity and captures complex distributional shifts. GraphMETRO employs a Mixture-of-Experts (MoE) architecture with a gating model and multiple expert models, where each expert model targets a specific distributional shift to produce a referential representation w. r. t. a reference model, and the gating model identifies shift components. Additionally, we design a novel objective that aligns the representations from different expert models to ensure reliable optimization. GraphMETRO achieves state-of-the-art results on four datasets from the GOOD benchmark, which is comprised of complex and natural real-world distribution shifts, improving by 67% and 4. 2% on the WebKB and Twitch datasets. Code and data are available at https: //github. com/Wuyxin/GraphMETRO.

PDF Details DOI

NeurIPS Conference 2022 Conference Paper

OOD Link Prediction Generalization Capabilities of Message-Passing GNNs in Larger Test Graphs

Yangze Zhou
Gitta Kutyniok
Bruno Ribeiro

This work provides the first theoretical study on the ability of graph Message Passing Neural Networks (gMPNNs) ---such as Graph Neural Networks (GNNs)--- to perform inductive out-of-distribution (OOD) link prediction tasks, where deployment (test) graph sizes are larger than training graphs. We first prove non-asymptotic bounds showing that link predictors based on permutation-equivariant (structural) node embeddings obtained by gMPNNs can converge to a random guess as test graphs get larger. We then propose a theoretically-sound gMPNN that outputs structural pairwise (2-node) embeddings and prove non-asymptotic bounds showing that, as test graphs grow, these embeddings converge to embeddings of a continuous function that retains its ability to predict links OOD. Empirical results on random graphs show agreement with our theoretical results.

PDF Details

NeurIPS Conference 2021 Conference Paper

Reconstruction for Powerful Graph Representations

Leonardo Cotta
Christopher Morris
Bruno Ribeiro

Graph neural networks (GNNs) have limited expressive power, failing to represent many graph classes correctly. While more expressive graph representation learning (GRL) alternatives can distinguish some of these classes, they are significantly harder to implement, may not scale well, and have not been shown to outperform well-tuned GNNs in real-world tasks. Thus, devising simple, scalable, and expressive GRL architectures that also achieve real-world improvements remains an open challenge. In this work, we show the extent to which graph reconstruction---reconstructing a graph from its subgraphs---can mitigate the theoretical and practical problems currently faced by GRL architectures. First, we leverage graph reconstruction to build two new classes of expressive graph representations. Secondly, we show how graph reconstruction boosts the expressive power of any GNN architecture while being a (provably) powerful inductive bias for invariances to vertex removals. Empirically, we show how reconstruction can boost GNN's expressive power---while maintaining its invariance to permutations of the vertices---by solving seven graph property tasks not solvable by the original GNN. Further, we demonstrate how it boosts state-of-the-art GNN's performance across nine real-world benchmark datasets.

PDF Details

AAAI Conference 2020 Conference Paper

Infinity Learning: Learning Markov Chains from Aggregate Steady-State Observations

Jianfei Gao
Mohamed A. Zahran
Amit Sheoran
Sonia Fahmy
Bruno Ribeiro

We consider the task of learning a parametric Continuous Time Markov Chain (CTMC) sequence model without examples of sequences, where the training data consists entirely of aggregate steady-state statistics. Making the problem harder, we assume that the states we wish to predict are unobserved in the training data. Speciﬁcally, given a parametric model over the transition rates of a CTMC and some known transition rates, we wish to extrapolate its steady state distribution to states that are unobserved. A technical roadblock to learn a CTMC from its steady state has been that the chain rule to compute gradients will not work over the arbitrarily long sequences necessary to reach steady state —from where the aggregate statistics are sampled. To overcome this optimization challenge, we propose ∞-SGD, a principled stochastic gradient descent method that uses randomly-stopped estimators to avoid inﬁnite sums required by the steady state computation, while learning even when only a subset of the CTMC states can be observed. We apply ∞-SGD to a real-world testbed and synthetic experiments showcasing its accuracy, ability to extrapolate the steady state distribution to unobserved states under unobserved conditions (heavy loads, when training under light loads), and succeeding in difﬁcult scenarios where even a tailor-made extension of existing methods fails.

PDF Details

NeurIPS Conference 2020 Conference Paper

Unsupervised Joint k-node Graph Representations with Compositional Energy-Based Models

Leonardo Cotta
Carlos H. C. Teixeira
Ananthram Swami
Bruno Ribeiro

Existing Graph Neural Network (GNN) methods that learn inductive unsupervised graph representations focus on learning node and edge representations by predicting observed edges in the graph. Although such approaches have shown advances in downstream node classification tasks, they are ineffective in jointly representing larger k-node sets, k{>}2. We propose MHM-GNN, an inductive unsupervised graph representation approach that combines joint k-node representations with energy-based models (hypergraph Markov networks) and GNNs. To address the intractability of the loss that arises from this combination, we endow our optimization with a loss upper bound using a finite-sample unbiased Markov Chain Monte Carlo estimator. Our experiments show that the unsupervised joint k-node representations of MHM-GNN produce better unsupervised representations than existing approaches from the literature.

PDF Details

AAAI Conference 2018 Conference Paper

From Monte Carlo to Las Vegas: Improving Restricted Boltzmann Machine Training Through Stopping Sets

Pedro Savarese
Mayank Kakodkar
Bruno Ribeiro

We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators of Restricted Boltzmann Machines (RBMs). We denote our approach Markov Chain Las Vegas (MCLV). MCLV gives statistical guarantees in exchange for random running times. MCLV uses a stopping set built from the training data and has maximum number of Markov chain steps K (referred as MCLV-K). We present a MCLV-K gradient estimator (LVS-K) for RBMs and explore the correspondence and differences between LVS-K and Contrastive Divergence (CD-K), with LVS-K signiﬁcantly outperforming CD-K training RBMs over the MNIST dataset, indicating MCLV to be a promising direction in learning generative models.

PDF Details

AAAI Conference 2018 Conference Paper

Subgraph Pattern Neural Networks for High-Order Graph Evolution Prediction

Changping Meng
S Chandra Mouli
Bruno Ribeiro
Jennifer Neville

In this work we generalize traditional node/link prediction tasks in dynamic heterogeneous networks, to consider joint prediction over larger k-node induced subgraphs. Our key insight is to incorporate the unavoidable dependencies in the training observations of induced subgraphs into both the input features and the model architecture itself via high-order dependencies. The strength of the representation is its invariance to isomorphisms and varying local neighborhood sizes, while still being able to take node/edge labels into account, and facilitating inductive reasoning (i. e. , generalization to unseen portions of the network). Empirical results show that our proposed method signiﬁcantly outperforms other stateof-the-art methods designed for static and/or single node/link prediction tasks. In addition, we show that our method is scalable and learns interpretable parameters.

PDF Details