Author name cluster

Tomas Tokar

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
1 author row

Possible papers

AAAI 2026 Conference Paper

Efficient Modality Translation via Arbitrary Conditioning and Wasserstein Regularization

  • Tomas Tokar
  • Scott Sanner

The central challenge in multimodal generative modeling lies in accurately approximating the joint data distribution, even when some modalities are missing. Existing multimodal VAEs solve this by designing increasingly complex encoding architectures, relying on modality-specific encoders, factorized posteriors, and custom inference procedures. This restricts their ability to capture relations among modalities by amortizing the encoding parameters. We challenge this paradigm by introducing a model trained for arbitrary conditioning, i.e., generating any modality given a subset of observed modalities and a logical index indicating which modalities are present or missing. This enables a single unified encoder to handle any subset of modalities while capturing inter-modal relationships via a compact, shared posterior. We find that to work efficiently in the multimodal setup, arbitrary conditioning requires replacing the KL divergence with Wasserstein regularization, which allows more dispersed latent embeddings to support learning over diverse data and modality subsets. This key insight exposes a critical deficiency in existing methods, which rely on KL regularization that tends to concentrate individual embeddings near the standard Gaussian prior, despite coming from very diverse subsets of multimodal inputs. We prove that Wasserstein regularization ensures that the aggregate latent distribution -- spanning all conditioning subsets -- aligns with the prior without requiring mixture models or auxiliary inference tricks. Empirically, the proposed model improves cross-modal generation and yields better reconstructions than state-of-the-art multimodal VAEs.
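The KL-vs-Wasserstein distinction the abstract draws has a simple closed form for diagonal Gaussian posteriors against a standard normal prior. A minimal NumPy sketch (the function names are illustrative, not from the paper): KL diverges as the posterior sharpens (the log term), while the squared 2-Wasserstein penalty stays bounded, which is one way to see why it tolerates more dispersed, confident embeddings.

```python
import numpy as np

def kl_to_standard_normal(mu, sigma):
    # KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims.
    # Blows up as sigma -> 0: each posterior is pulled toward the prior.
    return 0.5 * np.sum(mu**2 + sigma**2 - 1.0 - 2.0 * np.log(sigma), axis=-1)

def w2_to_standard_normal(mu, sigma):
    # Squared 2-Wasserstein distance between diagonal Gaussians:
    # W2^2( N(mu, diag(sigma^2)), N(0, I) ) = ||mu||^2 + ||sigma - 1||^2.
    # Stays finite for sharp posteriors (sigma -> 0 costs at most 1 per dim).
    return np.sum(mu**2 + (sigma - 1.0)**2, axis=-1)

mu = np.zeros((1, 4))
sharp = 0.1 * np.ones((1, 4))   # a confident (low-variance) posterior
print(kl_to_standard_normal(mu, sharp), w2_to_standard_normal(mu, sharp))
```

Both terms vanish exactly when the posterior equals the prior; for a sharp posterior the KL penalty is substantially larger than the Wasserstein one, illustrating the concentration effect the abstract criticizes.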

AAAI 2025 Conference Paper

ICE-T: Interactions-aware Cross-column Contrastive Embedding for Heterogeneous Tabular Datasets

  • Tomas Tokar
  • Scott Sanner

Finding high-quality representations of heterogeneous tabular datasets is crucial for their effective use in downstream machine learning tasks. Contrastive representation learning (CRL) methods have been previously shown to provide a straightforward way to learn such representations across various data domains. Current tabular CRL methods learn joint embeddings of data instances (tabular rows) by minimizing a contrastive loss between the original instance and its perturbations. Unlike existing tabular CRL methods, we propose leveraging frameworks established in multimodal representation learning, treating each tabular column as a distinct modality. A naive approach that applies a contrastive loss pairwise to tabular columns is not only prohibitively expensive as the number of columns increases, but as we demonstrate, it also fails to capture interactions between variables. Instead, we propose a novel method called ICE-T that learns each columnar embedding by contrasting it with aggregate embeddings of the complementary part of the table, thus capturing interactions and scaling linearly with the number of columns. Unlike existing tabular CRL methods, ICE-T allows for column-specific embeddings to be obtained independently of the rest of the table, enabling the inference of missing values and translation between columnar variables. We provide a comprehensive evaluation of ICE-T across diverse datasets, demonstrating that it generally surpasses the performance of the state-of-the-art alternatives.
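The complement-aggregate idea in the abstract can be sketched as follows. This is a hypothetical illustration, not the paper's exact formulation: the mean aggregation of the complementary columns and the InfoNCE-style objective are assumptions. Each column embedding is contrasted against the aggregate of all other columns, so there is one contrastive term per column (linear in the number of columns) rather than one per column pair (quadratic).

```python
import numpy as np

def ice_t_style_loss(col_embs, temperature=0.1):
    """Hypothetical complement-aggregate contrastive loss.

    col_embs: array of shape (n_cols, batch, dim), one embedding per
    column per table row; requires n_cols >= 2.
    """
    n_cols, batch, dim = col_embs.shape
    total = np.sum(col_embs, axis=0)                        # (batch, dim)
    loss = 0.0
    for c in range(n_cols):
        # Mean embedding of the complementary part of the table.
        comp = (total - col_embs[c]) / (n_cols - 1)
        a = col_embs[c] / np.linalg.norm(col_embs[c], axis=-1, keepdims=True)
        b = comp / np.linalg.norm(comp, axis=-1, keepdims=True)
        logits = a @ b.T / temperature                      # (batch, batch)
        logits = logits - logits.max(axis=1, keepdims=True)  # stability
        # InfoNCE: the same table row (diagonal) is the positive pair.
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        loss += -np.mean(np.diag(log_probs))
    return loss / n_cols
```

One loop iteration per column replaces the O(n_cols^2) pairwise scheme the abstract describes as prohibitive, while the complement aggregate still exposes each column to interactions with all the others.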