Author name cluster

Sidharth Kumar

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

2 author rows

ICLR Conference 2025 Conference Paper

Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models Trained on Corrupted Data

Asad Aali
Giannis Daras
Brett Levac
Sidharth Kumar
Alexandros G. Dimakis
Jonathan I. Tamir

We provide a framework for solving inverse problems with diffusion models learned from linearly corrupted data. First, we extend the Ambient Diffusion framework to enable training directly from measurements corrupted in the Fourier domain. We then train diffusion models for MRI with access only to Fourier subsampled multi-coil measurements at acceleration factors $R = 2, 4, 6, 8$. Second, we propose $\textit{Ambient Diffusion Posterior Sampling}$ (A-DPS), a reconstruction algorithm that leverages generative models pre-trained on one type of corruption (e.g. image inpainting) to perform posterior sampling on measurements from a different forward process (e.g. image blurring). For MRI reconstruction in high acceleration regimes, we observe that A-DPS models trained on subsampled data are better suited to solving inverse problems than models trained on fully sampled data. We also test the efficacy of A-DPS on natural image datasets (CelebA, FFHQ, and AFHQ) and show that A-DPS can sometimes outperform models trained on clean data for several image restoration tasks in both speed and performance.

AAAI Conference 2025 Conference Paper

Column-Oriented Datalog on the GPU

Yihao Sun
Sidharth Kumar
Thomas Gilray
Kristopher Micinski

Datalog is a logic programming language widely used in knowledge representation and reasoning (KRR), program analysis, and social media mining due to its expressiveness and high performance. Traditionally, Datalog engines use either row-oriented or column-oriented storage. Engines like VLog and Nemo favor column-oriented storage for efficiency on limited-resource machines, while row-oriented engines like Soufflé use advanced data structures with locking to perform better on multi-core CPUs. The advent of modern datacenter GPUs, such as the NVIDIA H100 with its ability to run over 16k threads simultaneously and its high memory bandwidth, has reopened the debate on which storage layout is more effective. This paper presents the first column-oriented Datalog engines tailored to the strengths of modern GPUs. We present VFLog, a CUDA-based Datalog runtime library with a column-oriented GPU data structure that supports all necessary relational algebra operations. Our results demonstrate over 200x performance gains over SOTA CPU-based column-oriented Datalog engines and a 2.5x speedup over GPU Datalog engines in various workloads, including KRR.
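To make the storage distinction in this abstract concrete, here is a minimal CPU-only sketch of a column-oriented binary relation together with a semi-naive evaluation of transitive closure, the classic Datalog program `path(x, y) :- edge(x, y). path(x, z) :- path(x, y), edge(y, z).` It is not VFLog's data structure or API: the `ColumnRelation` class and `transitive_closure` function are hypothetical names introduced only for illustration, and the sketch says nothing about how the paper lays out columns on the GPU.

```python
from collections import defaultdict


class ColumnRelation:
    """Binary relation stored as two parallel columns (hypothetical helper)."""

    def __init__(self, rows=()):
        self.col0 = []      # first attribute of every tuple
        self.col1 = []      # second attribute of every tuple
        self._seen = set()  # used only for deduplication
        for a, b in rows:
            self.add(a, b)

    def add(self, a, b):
        """Append (a, b) if it is new; return True when a tuple was added."""
        if (a, b) in self._seen:
            return False
        self._seen.add((a, b))
        self.col0.append(a)
        self.col1.append(b)
        return True

    def rows(self):
        return zip(self.col0, self.col1)

    def __len__(self):
        return len(self.col0)


def transitive_closure(edge):
    """Semi-naive evaluation: only tuples derived in the previous round are joined."""
    # Index edge on its first column: value -> row positions in the columns.
    index = defaultdict(list)
    for pos, src in enumerate(edge.col0):
        index[src].append(pos)

    path = ColumnRelation(edge.rows())
    delta = ColumnRelation(edge.rows())
    while len(delta) > 0:
        new_delta = ColumnRelation()
        for x, y in delta.rows():
            for pos in index.get(y, ()):
                z = edge.col1[pos]
                if path.add(x, z):
                    new_delta.add(x, z)
        delta = new_delta
    return path


edge = ColumnRelation([(1, 2), (2, 3), (3, 4)])
closure = sorted(transitive_closure(edge).rows())
print(closure)
```

Keeping each attribute in its own contiguous column means a join only reads the columns it needs; a row-oriented engine would instead store each tuple as one record.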

IJCAI Conference 2025 Conference Paper

Faster Annotation for Elevation-Guided Flood Extent Mapping by Consistency-Enhanced Active Learning

Saugat Adhikari
Da Yan
Tianyang Wang
Landon Dyken
Sidharth Kumar
Lyuheng Yuan
Akhlaque Ahmad
Jiao Han

Flood extent mapping is crucial for disaster response and damage assessment. While Earth imagery and terrain data (in the form of DEM) are now readily available, little flood annotation data exists for training machine learning models, which hinders the automated mapping of flooded areas. We propose ALFA, an interactive active-learning-based approach to minimize the annotators' efforts when preparing the ground-truth flood map in a satellite image. ALFA calibrates the prediction consistency of a segmentation model (1) across training cycles and (2) for various data augmentations. The two consistencies are integrated into the design of both the acquisition function and the loss function to enhance the robustness of active learning with limited annotation inputs. ALFA recommends those superpixels that the underlying model is most uncertain about, and users can annotate their pixels with minimal clicks with the help of elevation guidance. Extensive experiments on various regions hit by flooding show that we can reduce the annotation time from hours to around 20 minutes. ALFA is open sourced at https://github.com/saugatadhikari/alfa.
