Arrow Research search

Author name cluster

Juan Parras

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers

JBHI Journal 2026 Journal Article

Advancing Cancer Research With Synthetic Data Generation in Low-Data Scenarios

  • Patricia A. Apellániz
  • Borja Arroyo Galende
  • Ana Jiménez
  • Juan Parras
  • Santiago Zazo

The scarcity of medical data, particularly in Survival Analysis (SA) for cancer-related diseases, challenges data-driven healthcare research. While Synthetic Tabular Data Generation (STDG) models have been proposed to address this issue, most rely on datasets with abundant samples, which do not reflect real-world limitations. We suggest using an STDG approach that leverages transfer learning and meta-learning techniques to create an artificial inductive bias, guiding generative models trained on limited samples. Experiments on classification datasets across varying sample sizes validated the method’s robustness, with further clinical utility assessment on cancer-related SA data. While divergence-based similarity validation proved effective in capturing improvements in generation quality, clinical utility validation showed limited sensitivity to sample size, highlighting its shortcomings. In SA experiments, we observed that altering the task can reveal whether relationships among variables are accurately generated, with most cases benefiting from the proposed methodology. Our findings confirm the method’s ability to generate high-quality synthetic data under constrained conditions. We emphasize the need to complement utility-based validation with similarity metrics, particularly in low-data settings, to assess STDG performance reliably.

NeurIPS Conference 2025 Conference Paper

DeCaFlow: A deconfounding causal generative model

  • Alejandro Almodóvar
  • Adrián Javaloy
  • Juan Parras
  • Santiago Zazo
  • Isabel Valera

We introduce DeCaFlow, a deconfounding causal generative model. Training once per dataset using just observational data and the underlying causal graph, DeCaFlow enables accurate causal inference on continuous variables in the presence of hidden confounders. Specifically, we extend previous results on causal estimation under hidden confounding to show that a single instance of DeCaFlow provides correct estimates for all causal queries identifiable with do-calculus, leveraging proxy variables to adjust for the causal effects when do-calculus alone is insufficient. Moreover, we show that counterfactual queries are identifiable as long as their interventional counterparts are identifiable, and thus are also correctly estimated by DeCaFlow. Our empirical results on diverse settings, including the Ecoli70 dataset (3 independent hidden confounders, tens of observed variables, and hundreds of causal queries), show that DeCaFlow outperforms existing approaches while demonstrating its out-of-the-box applicability to any given causal graph.