Author name cluster

Grigorios Chrysos

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers

1 author row

AAAI Conference 2026 Conference Paper

Beyond Neuron-Level Sparsity: Achieving Faithful and Interpretable LLMs with Mixture of Decoders

Grigorios Chrysos

As large language models (LLMs) scale, ensuring interpretability and privacy becomes critical. This talk addresses these interconnected challenges with novel approaches to model specialization and safety. First, we tackle the dense, distributed nature of LLM representations by casting Mixture-of-Experts (MoE) as a tensor decomposition, enabling specialized experts in a factorized space. Second, we argue that current neuron-level sparsity methods create a severe accuracy-sparsity trade-off, and we propose a paradigm shift to layer-level sparsity with the Mixture of Decoders (MxD). We explain how MxD uses tensor factorization to expand dense layers into thousands of specialized, full-rank sublayers, demonstrating how it significantly outperforms alternatives in preserving model faithfulness and performance across LLMs up to 3B parameters. Finally, we address privacy in open-weight models by proposing a scalable and certifiable algorithm that induces maximal uncertainty on protected instances, proving tight bounds that characterize the resulting privacy-utility tradeoff.

PDF Details DOI

TMLR Journal 2026 Journal Article

LJ-Bench: Ontology-Based Benchmark for U.S. Crime

Hung Yun Tseng
Wuzhen Li
Blerina Gkotse
Grigorios Chrysos

The potential of Large Language Models (LLMs) to provide harmful information remains a significant concern due to the vast breadth of illegal queries they may encounter. Unfortunately, existing benchmarks only focus on a handful types of illegal activities, and are not grounded in legal works. In this work, we introduce an ontology of crime-related concepts grounded in the legal frameworks of Model Panel Code, which serves as an influential reference for criminal law and has been adopted by many U.S. states, and instantiated using Californian Law. This structured knowledge forms the foundation for LJ-Bench, the first comprehensive benchmark designed to evaluate LLM robustness against a wide range of illegal activities. Spanning 76 distinct crime types organized taxonomically, LJ-Bench enables systematic assessment of diverse attacks, revealing valuable insights into LLM vulnerabilities across various crime categories — LLMs exhibit heightened susceptibility to attacks targeting societal harm rather than those directly impacting individuals. Our benchmark aims to facilitate the development of more robust and trustworthy LLMs. The LJ-Bench benchmark and LJ-Ontology, along with experiments implementation for reproducibility are publicly available at https://github.com/AndreaTseng/LJ-Bench.

PDF Details

TMLR Journal 2025 Journal Article

Single-pass Detection of Jailbreaking Input in Large Language Models

Leyla Naz Candogan
Yongtao Wu
Elias Abad Rocamora
Grigorios Chrysos
Volkan Cevher

Defending aligned Large Language Models (LLMs) against jailbreaking attacks is a challenging problem, with existing approaches requiring multiple requests or even queries to auxiliary LLMs, making them computationally heavy. Instead, we focus on detecting jailbreaking input in a single forward pass. Our method, called SPD, leverages the information carried by the logits to predict whether the output sentence will be harmful. This allows us to defend in just a forward pass. SPD can not only detect attacks effectively on open-source models, but also minimizes the misclassification of harmless inputs. Furthermore, we show that SPD remains effective even without complete logit access in GPT-3.5 and GPT-4. We believe that our proposed method offers a promising approach to efficiently safeguard LLMs against adversarial attacks.

PDF Details

NeurIPS Conference 2025 Conference Paper

Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders

James Oldfield
Shawn Im
Sharon Li
Mihalis Nicolaou
Ioannis Patras
Grigorios Chrysos

Multilayer perceptrons (MLPs) are an integral part of large language models, yet their dense representations render them difficult to understand, edit, and steer. Recent methods learn interpretable approximations via neuron-level sparsity, yet fail to faithfully reconstruct the original mapping--significantly increasing model's next-token cross-entropy loss. In this paper, we advocate for moving to layer-level sparsity to overcome the accuracy trade-off in sparse layer approximation. Under this paradigm, we introduce Mixture of Decoders (MxDs). MxDs generalize MLPs and Gated Linear Units, expanding pre-trained dense layers into tens of thousands of specialized sublayers. Through a flexible form of tensor factorization, each sparsely activating MxD sublayer implements a linear transformation with full-rank weights--preserving the original decoders' expressive capacity even under heavy sparsity. Experimentally, we show that MxDs significantly outperform state-of-the-art methods (e. g. , Transcoders) on the sparsity-accuracy frontier in language models with up to 3B parameters. Further evaluations on sparse probing and feature steering demonstrate that MxDs learn similarly specialized features of natural language--opening up a promising new avenue for designing interpretable yet faithful decompositions. Our code is included at: https: //github. com/james-oldfield/MxD.

PDF Details

TMLR Journal 2024 Journal Article

PNeRV: A Polynomial Neural Representation for Videos

Sonam Gupta
Snehal Singh Tomar
Grigorios Chrysos
Sukhendu Das
Rajagopalan N Ambasamduram

Extracting Implicit Neural Representations (INRs) on video data poses unique challenges due to the additional temporal dimension. In the context of videos, INRs have predominantly relied on a frame-only parameterization, which sacrifices the spatiotemporal continuity observed in pixel-level (spatial) representations. To mitigate this, we introduce Polynomial Neural Representation for Videos (PNeRV), a parameter-wise efficient, patch-wise INR for videos that preserves spatiotemporal continuity. PNeRV leverages the modeling capabilities of Polynomial Neural Networks to perform the modulation of a continuous spatial (patch) signal with a continuous time (frame) signal. We further propose a custom Hierarchical Patch-wise Spatial Sampling Scheme that ensures spatial continuity while retaining parameter efficiency. We also employ a carefully designed Positional Embedding methodology to further enhance PNeRV's performance. Our extensive experimentation demonstrates that PNeRV outperforms the baselines in conventional Implicit Neural Representation tasks like compression along with downstream applications that require spatiotemporal continuity in the underlying representation. PNeRV not only addresses the challenges posed by video data in the realm of INRs but also opens new avenues for advanced video processing and analysis.

PDF Details

TMLR Journal 2023 Journal Article

Federated Learning under Covariate Shifts with Generalization Guarantees

Ali Ramezani-Kebrya
Fanghui Liu
Thomas Pethick
Grigorios Chrysos
Volkan Cevher

This paper addresses intra-client and inter-client covariate shifts in federated learning (FL) with a focus on the overall generalization performance. To handle covariate shifts, we formulate a new global model training paradigm and propose Federated Importance-Weighted Empirical Risk Minimization (FTW-ERM) along with improving density ratio matching methods without requiring perfect knowledge of the supremum over true ratios. We also propose the communication-efficient variant FITW-ERM with the same level of privacy guarantees as those of classical ERM in FL. We theoretically show that FTW-ERM achieves smaller generalization error than classical ERM under certain settings. Experimental results demonstrate the superiority of FTW-ERM over existing FL baselines in challenging imbalanced federated settings in terms of data distribution shifts across clients.

PDF Details

NeurIPS Conference 2023 Conference Paper

Maximum Independent Set: Self-Training through Dynamic Programming

Lorenzo Brusca
Lars C. P. M. Quaedvlieg
Stratis Skoulakis
Grigorios Chrysos
Volkan Cevher

This work presents a graph neural network (GNN) framework for solving the maximum independent set (MIS) problem, inspired by dynamic programming (DP). Specifically, given a graph, we propose a DP-like recursive algorithm based on GNNs that firstly constructs two smaller sub-graphs, predicts the one with the larger MIS, and then uses it in the next recursive call. To train our algorithm, we require annotated comparisons of different graphs concerning their MIS size. Annotating the comparisons with the output of our algorithm leads to a self-training process that results in more accurate self-annotation of the comparisons and vice versa. We provide numerical evidence showing the superiority of our method vs prior methods in multiple synthetic and real-world datasets.

PDF Details

NeurIPS Conference 2023 Conference Paper

On the Convergence of Encoder-only Shallow Transformers

Yongtao Wu
Fanghui Liu
Grigorios Chrysos
Volkan Cevher

In this paper, we aim to build the global convergence theory of encoder-only shallow Transformers under a realistic setting from the perspective of architectures, initialization, and scaling under a finite width regime. The difficulty lies in how to tackle the softmax in self-attention mechanism, the core ingredient of Transformer. In particular, we diagnose the scaling scheme, carefully tackle the input/output of softmax, and prove that quadratic overparameterization is sufficient for global convergence of our shallow Transformers under commonly-used He/LeCun initialization in practice. Besides, neural tangent kernel (NTK) based analysis is also given, which facilitates a comprehensive comparison. Our theory demonstrates the separation on the importance of different scaling schemes and initialization. We believe our results can pave the way for a better understanding of modern Transformers, particularly on training dynamics.

PDF Details

TMLR Journal 2023 Journal Article

Revisiting adversarial training for the worst-performing class

Thomas Pethick
Grigorios Chrysos
Volkan Cevher

Despite progress in adversarial training (AT), there is a substantial gap between the top-performing and worst-performing classes in many datasets. For example, on CIFAR10, the accuracies for the best and worst classes are 74% and 23%, respectively. We argue that this gap can be reduced by explicitly optimizing for the worst-performing class, resulting in a min-max-max optimization formulation. Our method, called class focused online learning (CFOL), includes high probability convergence guarantees for the worst class loss and can be easily integrated into existing training setups with minimal computational overhead. We demonstrate an improvement to 32% in the worst class accuracy on CIFAR10, and we observe consistent behavior across CIFAR100 and STL10. Our study highlights the importance of moving beyond average accuracy, which is particularly important in safety-critical applications.

PDF Details

NeurIPS Conference 2022 Conference Paper

Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study

Yongtao Wu
Zhenyu Zhu
Fanghui Liu
Grigorios Chrysos
Volkan Cevher

Neural tangent kernel (NTK) is a powerful tool to analyze training dynamics of neural networks and their generalization bounds. The study on NTK has been devoted to typical neural network architectures, but it is incomplete for neural networks with Hadamard products (NNs-Hp), e. g. , StyleGAN and polynomial neural networks (PNNs). In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i. e. , polynomial neural networks. We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK. Based on our results, we elucidate the separation of PNNs over standard neural networks with respect to extrapolation and spectral bias. Our two key insights are that when compared to standard neural networks, PNNs can fit more complicated functions in the extrapolation regime and admit a slower eigenvalue decay of the respective NTK, leading to a faster learning towards high-frequency functions. Besides, our theoretical results can be extended to other types of NNs-Hp, which expand the scope of our work. Our empirical results validate the separations in broader classes of NNs-Hp, which provide a good justification for a deeper understanding of neural architectures.

PDF Details

NeurIPS Conference 2022 Conference Paper

Generalization Properties of NAS under Activation and Skip Connection Search

Zhenyu Zhu
Fanghui Liu
Grigorios Chrysos
Volkan Cevher

Neural Architecture Search (NAS) has fostered the automatic discovery of state-of-the-art neural architectures. Despite the progress achieved with NAS, so far there is little attention to theoretical guarantees on NAS. In this work, we study the generalization properties of NAS under a unifying framework enabling (deep) layer skip connection search and activation function search. To this end, we derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime using a certain search space including mixed activation functions, fully connected, and residual neural networks. We use the minimum eigenvalue to establish generalization error bounds of NAS in the stochastic gradient descent training. Importantly, we theoretically and experimentally show how the derived results can guide NAS to select the top-performing architectures, even in the case without training, leading to a train-free algorithm based on our theory. Accordingly, our numerical validation shed light on the design of computationally efficient methods for NAS. Our analysis is non-trivial due to the coupling of various architectures and activation functions under the unifying framework and has its own interest in providing the lower bound of the minimum eigenvalue of NTK in deep learning theory.

PDF Details

NeurIPS Conference 2022 Conference Paper

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)

Zhenyu Zhu
Fanghui Liu
Grigorios Chrysos
Volkan Cevher

We study the average robustness notion in deep neural networks in (selected) wide and narrow, deep and shallow, as well as lazy and non-lazy training settings. We prove that in the under-parameterized setting, width has a negative effect while it improves robustness in the over-parameterized setting. The effect of depth closely depends on the initialization and the training mode. In particular, when initialized with LeCun initialization, depth helps robustness with the lazy training regime. In contrast, when initialized with Neural Tangent Kernel (NTK) and He-initialization, depth hurts the robustness. Moreover, under the non-lazy training regime, we demonstrate how the width of a two-layer ReLU network benefits robustness. Our theoretical developments improve the results by [Huang et al. NeurIPS21; Wu et al. NeurIPS21] and are consistent with [Bubeck and Sellke NeurIPS21; Bubeck et al. COLT21].

PDF Details

NeurIPS Conference 2022 Conference Paper

Sound and Complete Verification of Polynomial Networks

Elias Abad Rocamora
Mehmet Fatih Sahin
Fanghui Liu
Grigorios Chrysos
Volkan Cevher

Polynomial Networks (PNs) have demonstrated promising performance on face and image recognition recently. However, robustness of PNs is unclear and thus obtaining certificates becomes imperative for enabling their adoption in real-world applications. Existing verification algorithms on ReLU neural networks (NNs) based on classical branch and bound (BaB) techniques cannot be trivially applied to PN verification. In this work, we devise a new bounding method, equipped with BaB for global convergence guarantees, called Verification of Polynomial Networks or VPN for short. One key insight is that we obtain much tighter bounds than the interval bound propagation (IBP) and DeepT-Fast [Bonaert et al. , 2021] baselines. This enables sound and complete PN verification with empirical validation on MNIST, CIFAR10 and STL10 datasets. We believe our method has its own interest to NN verification. The source code is publicly available at https: //github. com/megaelius/PNVerification.

PDF Details

NeurIPS Conference 2021 Conference Paper

Conditional Generation Using Polynomial Expansions

Grigorios Chrysos
Markos Georgopoulos
Yannis Panagakis

Generative modeling has evolved to a notable field of machine learning. Deep polynomial neural networks (PNNs) have demonstrated impressive results in unsupervised image generation, where the task is to map an input vector (i. e. , noise) to a synthesized image. However, the success of PNNs has not been replicated in conditional generation tasks, such as super-resolution. Existing PNNs focus on single-variable polynomial expansions which do not fare well to two-variable inputs, i. e. , the noise variable and the conditional variable. In this work, we introduce a general framework, called CoPE, that enables a polynomial expansion of two input variables and captures their auto- and cross-correlations. We exhibit how CoPE can be trivially augmented to accept an arbitrary number of input variables. CoPE is evaluated in five tasks (class-conditional generation, inverse problems, edges-to-image translation, image-to-image translation, attribute-guided generation) involving eight datasets. The thorough evaluation suggests that CoPE can be useful for tackling diverse conditional generation tasks. The source code of CoPE is available at https: //github. com/grigorisg9gr/polynomial nets for conditional generation.

PDF Details