Author name cluster

Sandipan Sikdar

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

2 author rows

AAAI Conference 2024 Conference Paper

FairTrade: Achieving Pareto-Optimal Trade-Offs between Balanced Accuracy and Fairness in Federated Learning

Maryam Badar
Sandipan Sikdar
Wolfgang Nejdl
Marco Fisichella

As Federated Learning (FL) gains prominence in distributed machine learning applications, achieving fairness without compromising predictive performance becomes paramount. The data being gathered from distributed clients in an FL environment often leads to class imbalance. In such scenarios, balanced accuracy rather than accuracy is the true representation of model performance. However, most state-of-the-art fair FL methods report accuracy as the measure of performance, which can lead to misguided interpretations of the model's effectiveness in mitigating discrimination. To the best of our knowledge, this work presents the first attempt towards achieving Pareto-optimal trade-offs between balanced accuracy and fairness in a federated environment (FairTrade). By utilizing multi-objective optimization, the framework negotiates the intricate balance between the model's balanced accuracy and fairness. The framework's agnostic design adeptly accommodates both statistical and causal fairness notions, ensuring its adaptability across diverse FL contexts. We provide empirical evidence of our framework's efficacy through extensive experiments on five real-world datasets and comparisons with six baselines. The empirical results underscore the potential of our framework in improving the trade-off between fairness and balanced accuracy in FL applications.
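The abstract's point that plain accuracy can mislead under class imbalance can be illustrated with a minimal sketch. This is not code from the FairTrade paper; the function names and the toy labels below are assumptions chosen for illustration, and balanced accuracy is computed here as the mean of per-class recall.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of the samples that truly belong to this class.
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A degenerate model that always predicts the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))           # 0.95
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

The always-negative model looks strong on accuracy yet never detects the minority class, which balanced accuracy exposes.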

ECAI Conference 2024 Conference Paper

IndMask: Inductive Explanation for Multivariate Time Series Black-Box Models

Seham Nasr
Sandipan Sikdar

In this paper, we introduce IndMask, a framework for explaining decisions of black-box time series models. While there exists a plethora of methods for providing explanations of machine learning models, time series data requires additional considerations. One needs to consider the time aspect in the explanations as well as deal with a large number of input features. Recent work has proposed explaining a time series prediction by generating a mask over the input time series. Each entry in the mask corresponds to an importance score for each feature at each time step. However, these methods only generate instancewise explanations, which means a mask needs to be computed for each input individually, thereby making them unsuited for inductive settings, where explanations need to be generated for numerous inputs, and instancewise explanation generation is severely prohibitive. Additionally, these methods have mostly been evaluated on simple recurrent neural networks and are often only applicable to a specific downstream task. Our proposed framework IndMask addresses these issues by utilizing a parameterized model for mask generation. We also go beyond recurrent neural networks and deploy IndMask to transformer architectures, thereby genuinely demonstrating its model-agnostic nature. The effectiveness of IndMask is further demonstrated through experiments over real-world datasets and time series classification and forecasting tasks. It is also computationally efficient and can be deployed in conjunction with any time series model.

AAAI Conference 2024 Conference Paper

IVP-VAE: Modeling EHR Time Series with Initial Value Problem Solvers

Jingge Xiao
Leonie Basso
Wolfgang Nejdl
Niloy Ganguly
Sandipan Sikdar

Continuous-time models such as Neural ODEs and Neural Flows have shown promising results in analyzing irregularly sampled time series frequently encountered in electronic health records. Based on these models, time series are typically processed with a hybrid of an initial value problem (IVP) solver and a recurrent neural network within the variational autoencoder architecture. Sequentially solving IVPs makes such models computationally less efficient. In this paper, we propose to model time series purely with continuous processes whose state evolution can be approximated directly by IVPs. This eliminates the need for recurrent computation and enables multiple states to evolve in parallel. We further fuse the encoder and decoder with one IVP solver utilizing its invertibility, which leads to fewer parameters and faster convergence. Experiments on three real-world datasets show that the proposed method can systematically outperform its predecessors, achieve state-of-the-art results, and have significant advantages in terms of data efficiency.

ECAI Conference 2024 Conference Paper

TrustFed: Navigating Trade-offs Between Performance, Fairness, and Privacy in Federated Learning

Maryam Badar
Sandipan Sikdar
Wolfgang Nejdl
Marco Fisichella

As Federated Learning (FL) gains prominence in secure machine learning applications, achieving trustworthy predictions without compromising predictive performance becomes paramount. While Differential Privacy (DP) is extensively used for its effective privacy protection, its application as a lossy protection method can lower the predictive performance of the machine learning model. Also, the data being gathered from distributed clients in an FL environment often leads to class imbalance, making the traditional accuracy measure less reflective of the true performance of the prediction model. In this context, we introduce a fairness-aware FL framework (TrustFed) based on Gaussian differential privacy and Multi-Objective Optimization (MOO), which effectively protects privacy while providing fair and accurate predictions. To the best of our knowledge, this is the first attempt towards achieving Pareto-optimal trade-offs between balanced accuracy and fairness in a federated environment while safeguarding the privacy of individual clients. The framework’s flexible design adeptly accommodates both statistical parity and equal opportunity fairness notions, ensuring its applicability in various FL scenarios. We demonstrate our framework’s effectiveness through comprehensive experiments on five real-world datasets. TrustFed consistently achieves a performance-fairness trade-off comparable to the state-of-the-art (SoTA) baseline models while preserving the anonymization rights of users in FL applications.
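The two fairness notions named in the abstract, statistical parity and equal opportunity, are commonly measured as differences between groups. The sketch below uses those common textbook definitions and is not taken from the TrustFed paper; the function names, the binary group encoding (1 and 0), and the toy data are all assumptions for illustration.

```python
def _rate(values):
    """Mean of a list of 0/1 values; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def statistical_parity_difference(y_pred, group):
    """P(pred = 1 | group = 1) - P(pred = 1 | group = 0)."""
    g1 = [p for p, g in zip(y_pred, group) if g == 1]
    g0 = [p for p, g in zip(y_pred, group) if g == 0]
    return _rate(g1) - _rate(g0)


def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true positive rate between group 1 and group 0."""
    # Only truly positive samples enter the true positive rate.
    g1 = [p for t, p, g in zip(y_true, y_pred, group) if t == 1 and g == 1]
    g0 = [p for t, p, g in zip(y_true, y_pred, group) if t == 1 and g == 0]
    return _rate(g1) - _rate(g0)


# Hypothetical toy data: four samples per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]

print(statistical_parity_difference(y_pred, group))         # 0.5
print(equal_opportunity_difference(y_true, y_pred, group))  # 0.5
```

A value of 0 under either measure means both groups are treated alike by that criterion; the sign shows which group is favored.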

AAMAS Conference 2018 Conference Paper

ComPAS: Community Preserving Sampling for Streaming Graphs

Sandipan Sikdar
Tanmoy Chakraborty
Soumya Sarkar
Niloy Ganguly
Animesh Mukherjee

In the era of big data, graph sampling is indispensable in many settings. Existing sampling methods are mostly designed for static graphs, and aim to preserve basic structural properties of the original graph (such as degree distribution, clustering coefficient, etc.) in the sample. We argue that for any sampling method it is impossible to produce a universal representative sample that can preserve all the properties of the original graph; rather, sampling should be application specific (such as preserving hubs - needed for information diffusion). Here we consider community detection as an application scenario. We propose ComPAS, a novel sampling strategy that, unlike previous methods, is not only designed for streaming graphs (which are a more realistic representation of a real-world scenario) but also preserves the community structure of the original graph in the sample. Empirical results on both synthetic and different real-world graphs show that ComPAS preserves the underlying community structure best, with average performance reaching 73.2% of the most informed algorithm for static graphs.
