Author name cluster

Danial Dervovic

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

2 author rows

NeurIPS Conference 2025 Conference Paper

Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models

Wenzhuo Tang
Haitao Mao
Danial Dervovic
Ivan Brugere
Saumitra Mishra
Yuying Xie
Jiliang Tang

Models for natural language and images benefit from data scaling behavior: the more data fed into the model, the better they perform. This 'better with more' phenomenon enables the effectiveness of large-scale pre-training on vast amounts of data. However, current graph pre-training methods struggle to scale up data due to heterogeneity across graphs. To achieve effective data scaling, we aim to develop a general model that is able to capture diverse data patterns of graphs and can be utilized to adaptively help the downstream tasks. To this end, we propose UniAug, a universal graph structure augmentor built on a diffusion model. We first pre-train a discrete diffusion model on thousands of graphs across domains to learn the graph structural patterns. In the downstream phase, we provide adaptive enhancement by conducting graph structure augmentation with the help of the pre-trained diffusion model via guided generation. By leveraging the pre-trained diffusion model for structure augmentation, we consistently achieve performance improvements across various downstream tasks in a plug-and-play manner. To the best of our knowledge, this study represents the first demonstration of a data-scaling graph structure augmentor on graphs across domains.

PDF Details

IJCAI Conference 2024 Conference Paper

Are Logistic Models Really Interpretable?

Danial Dervovic
Freddy Lecue
Nicolas Marchesotti
Daniele Magazzeni

The demand for open and trustworthy AI models points towards widespread publishing of model weights. Consumers of these model weights must be able to act accordingly with the information provided. That said, one of the simplest AI classification models, Logistic Regression (LR), has an unwieldy interpretation of its model weights, with greater difficulties when extending LR to generalised additive models. In this work, we show via a User Study that skilled participants are unable to reliably reproduce the action of small LR models given the trained parameters. As an antidote to this, we define Linearised Additive Models (LAMs), an optimal piecewise linear approximation that augments any trained additive model equipped with a sigmoid link function, requiring no retraining. We argue that LAMs are more interpretable than logistic models -- survey participants are shown to solve model reasoning tasks with LAMs much more accurately than with LR given the same information. Furthermore, we show that LAMs do not suffer from large performance penalties in terms of ROC-AUC and calibration with respect to their logistic counterparts on a broad suite of public financial modelling data.

PDF Details DOI

ICML Conference 2024 Conference Paper

Bounding the Excess Risk for Linear Models Trained on Marginal-Preserving, Differentially-Private, Synthetic Data

Yvonne Zhou
Mingyu Liang
Ivan Brugere
Danial Dervovic
Antigoni Polychroniadou
Min Wu 0001
Dana Dachman-Soled

The growing use of machine learning (ML) has raised concerns that an ML model may reveal private information about an individual who has contributed to the training dataset. To prevent leakage of sensitive data, we consider using differentially- private (DP), synthetic training data instead of real training data to train an ML model. A key desirable property of synthetic data is its ability to preserve the low-order marginals of the original distribution. Our main contribution comprises novel upper and lower bounds on the excess empirical risk of linear models trained on such synthetic data, for continuous and Lipschitz loss functions. We perform extensive experimentation alongside our theoretical results.

Details

AAAI Conference 2022 Conference Paper

Optimal Admission Control for Multiclass Queues with Time-Varying Arrival Rates via State Abstraction

Marc Rigter
Danial Dervovic
Parisa Hassanzadeh
Jason Long
Parisa Zehtabi
Daniele Magazzeni

We consider a novel queuing problem where the decisionmaker must choose to accept or reject randomly arriving tasks into a no buffer queue which are processed by N identical servers. Each task has a price, which is a positive real number, and a class. Each class of task has a different price distribution, service rate, and arrives according to an inhomogenous Poisson process. The objective is to decide which tasks to accept so that the total price of tasks processed is maximised over a finite horizon. We formulate the problem using a discrete time Markov Decision Process (MDP) with a hybrid state space. We show that the optimal value function has a specific structure, which enables us to solve the hybrid MDP exactly. Moreover, we rigorously prove that as the gap between successive decision epochs grows smaller, the discrete time solution approaches the optimal solution to the original continuous time problem. To improve the scalability of our approach to a greater number of servers and task classes, we present an approximation based on state abstraction. We validate our approach on synthetic data, as well as a real financial fraud data set, which is the motivating application for this work.

PDF Details

IJCAI Conference 2021 Conference Paper

Non-Parametric Stochastic Sequential Assignment With Random Arrival Times

Danial Dervovic
Parisa Hassanzadeh
Samuel Assefa
Prashant Reddy

We consider a problem wherein jobs arrive at random times and assume random values. Upon each job arrival, the decision-maker must decide immediately whether or not to accept the job and gain the value on offer as a reward, with the constraint that they may only accept at most n jobs over some reference time period. The decision-maker only has access to M independent realisations of the job arrival process. We propose an algorithm, Non-Parametric Sequential Allocation (NPSA), for solving this problem. Moreover, we prove that the expected reward returned by the NPSA algorithm converges in probability to optimality as M grows large. We demonstrate the effectiveness of the algorithm empirically on synthetic data and on public fraud-detection datasets, from where the motivation for this work is derived.

PDF Details DOI