Arrow Research search

Author name cluster

Abbas Rahimi

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

19 papers
2 author rows

Possible papers

19

JBHI Journal 2026 Journal Article

A Composable Channel-Adaptive Architecture for Seizure Classification

  • Francesco S. Carzaniga
  • Michael Hersche
  • Kaspar A. Schindler
  • Abbas Rahimi

Multi-variate time-series are one of the primary data modalities involved in large classes of problems, where deep learning models represent the state-of-the-art solution. In the healthcare domain, electrophysiological data such as intracranial electroencephalography (iEEG) are used to perform a variety of tasks. However, iEEG models require that the number of channels be fixed, while iEEG setups in clinics are highly personalized and thus vary considerably from one subject to the next. To address this concern, we propose a channel-adaptive (CA) architecture that seamlessly functions on any multi-variate signal with an arbitrary number of channels. Each CA-model can be pre-trained on a large corpus of iEEG recordings from multiple heterogeneous subjects, and then fine-tuned to each subject using equal or lower amounts of data compared to existing state-of-the-art models, and in only 1/5 of the time. We evaluate our CA-models on a seizure detection task on both a short-term (~15 hours) and a long-term (~2600 hours) dataset. In particular, our CA-EEGWaveNet, based on EEGWaveNet, is trained on a single seizure of the tested subject, while the baseline EEGWaveNet is trained on all but one. CA-EEGWaveNet surpasses the baseline in median F1-score (0.78 vs. 0.76). Similarly, CA-EEGNet, based on EEGNet, also surpasses its baseline (0.79 vs. 0.74). Overall, we show that the CA architecture is a drop-in replacement for existing seizure classification models, bringing better characteristics and performance across the board.

NAI Journal 2026 Journal Article

Towards Learning to Reason: Comparing LLMs With Neuro-Symbolic on Arithmetic Relations in Abstract Reasoning

  • Michael Hersche
  • Giacomo Camposampiero
  • Roger Wattenhofer
  • Abu Sebastian
  • Abbas Rahimi

This work compares large language models (LLMs) and neuro-symbolic approaches in solving Raven’s progressive matrices (RPMs), a visual abstract reasoning test that involves the understanding of mathematical rules such as progression or arithmetic addition. Providing the visual attributes directly as textual prompts, which assumes an oracle visual perception module, allows us to measure the model’s abstract reasoning capability in isolation. Despite providing such compositionally-structured representations from the oracle visual perception and advanced prompting techniques, both GPT-4 and Llama-3 70B cannot achieve perfect accuracy on the center constellation of the I-RAVEN dataset. Our analysis reveals that the root cause lies in the LLM’s weakness in understanding and executing arithmetic rules. As a potential remedy, we analyze the Abductive Rule Learner with Context-awareness (ARLC), a neuro-symbolic approach that learns to reason with vector-symbolic architectures. Here, concepts are represented with distributed vectors such that dot products between encoded vectors define a similarity kernel, and element-wise vector operations perform addition/subtraction on the encoded values. We find that ARLC achieves almost perfect accuracy on the center constellation of I-RAVEN, demonstrating high fidelity in arithmetic rules. To stress the length generalization capabilities, we extend the RPM tests to larger matrices (3×10 instead of the typical 3×3) and larger dynamic ranges of the attribute values (from 10 up to 1000). We find that the LLM’s accuracy in solving arithmetic rules drops to sub-10%, especially as the dynamic range expands, while ARLC maintains a high accuracy due to emulating symbolic computations on top of distributed representations.
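The kernel property described in this abstract (dot products as a similarity kernel, element-wise vector operations as arithmetic on encoded values) can be sketched with fractional power encoding over random phasor hypervectors. This is an illustrative toy in NumPy, not ARLC's actual implementation; the dimension and encoding are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4096
theta = rng.uniform(-np.pi, np.pi, D)  # one random base angle per dimension

def encode(x):
    # Fractional power encoding: unit phasors whose phases scale with x.
    return np.exp(1j * theta * x)

def sim(u, v):
    # Normalized dot product: a similarity kernel peaked where values match.
    return float(np.real(np.mean(u * np.conj(v))))

# Element-wise multiplication of encodings adds the encoded values:
v23 = encode(2) * encode(3)        # equals encode(5) exactly
print(sim(v23, encode(5)))         # ≈ 1.0
print(sim(v23, encode(6)))         # ≈ 0.0
```

The multiply-to-add identity holds exactly here; only the cross-similarity between distinct values is approximate, shrinking with the dimension D.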

NeurIPS Conference 2025 Conference Paper

Analog Foundation Models

  • Julian Büchel
  • Iason Chalas
  • Giovanni Acampa
  • An Chen
  • Omobayode Fagbohungbe
  • Hsinyu Tsai
  • Kaoutar El Maghraoui
  • Manuel Le Gallo

Analog in-memory computing (AIMC) is a promising compute paradigm to improve speed and power efficiency of neural network inference beyond the limits of conventional von Neumann-based architectures. However, AIMC introduces fundamental challenges such as noisy computations and strict constraints on input and output quantization. Because of these constraints and imprecisions, off-the-shelf LLMs are not able to achieve 4-bit-level performance when deployed on AIMC-based hardware. While researchers previously investigated recovering this accuracy gap on small, mostly vision-based models, a generic method applicable to LLMs pre-trained on trillions of tokens does not yet exist. In this work, we introduce a general and scalable method to robustly adapt LLMs for execution on noisy, low-precision analog hardware. Our approach enables state-of-the-art models, including Phi-3-mini-4k-instruct and Llama-3.2-1B-Instruct, to retain performance comparable to 4-bit weight, 8-bit activation baselines, despite the presence of analog noise and quantization constraints. Additionally, we show that as a byproduct of our training methodology, analog foundation models can be quantized for inference on low-precision digital hardware. Finally, we show that our models also benefit from test-time compute scaling, showing better scaling behavior than models trained with 4-bit weight and 8-bit static input quantization. Our work bridges the gap between high-capacity LLMs and efficient analog hardware, offering a path toward energy-efficient foundation models. Code is available at github.com/IBM/analog-foundation-models.

NeSy Conference 2025 Conference Paper

Can Large Reasoning Models do Analogical Reasoning under Perceptual Uncertainty?

  • Giacomo Camposampiero
  • Michael Hersche
  • Roger Wattenhofer
  • Abu Sebastian
  • Abbas Rahimi

This work presents a first evaluation of two state-of-the-art Large Reasoning Models (LRMs), OpenAI’s o3-mini and DeepSeek R1, on analogical reasoning, focusing on well-established nonverbal human IQ tests based on Raven’s progressive matrices. We benchmark with the I-RAVEN dataset and its extension, I-RAVEN-X, which tests the ability to generalize to longer reasoning rules and larger ranges of the attribute values. To assess the influence of visual uncertainties on these symbolic analogical reasoning tests, we extend the I-RAVEN-X dataset, which otherwise assumes an oracle perception. We adopt a two-fold strategy to simulate this imperfect visual perception: 1) we introduce confounding attributes which, being sampled at random, do not contribute to the prediction of the correct answer of the puzzles, and 2) we smooth the distributions of the input attributes’ values. We observe a sharp decline in OpenAI’s o3-mini task accuracy, dropping from 86.6% on the original I-RAVEN to just 17.0% (approaching random chance) on the more challenging I-RAVEN-X, which increases input length and range and emulates perceptual uncertainty. This drop occurred despite spending 3.4× more reasoning tokens. A similar trend is observed for DeepSeek R1: from 80.6% to 23.2%. On the other hand, ARLC, a neuro-symbolic probabilistic abductive model that achieves state-of-the-art performance on I-RAVEN, can robustly reason under all these out-of-distribution tests, with only a modest accuracy reduction from 98.6% to 88.0%. Our code is available at https://github.com/IBM/raven-large-language-models.

NAI Journal 2025 Journal Article

Factorizers for distributed sparse block codes

  • Michael Hersche
  • Aleksandar Terzić
  • Geethan Karunaratne
  • Jovin Langenegger
  • Angéline Pouget
  • Giovanni Cherubini
  • Luca Benini
  • Abu Sebastian

Distributed sparse block codes (SBCs) exhibit compact representations for encoding and manipulating symbolic data structures using fixed-width vectors. One major challenge, however, is to disentangle, or factorize, the distributed representation of data structures into their constituent elements without having to search through all possible combinations. This factorization becomes more challenging when SBC vectors are noisy due to perceptual uncertainty and approximations made by modern neural networks to generate the query SBC vectors. To address these challenges, we first propose a fast and highly accurate method for factorizing a more flexible and hence generalized form of SBCs, dubbed GSBCs. Our iterative factorizer introduces a threshold-based nonlinear activation, conditional random sampling, and an ℓ∞-based similarity metric. Its random sampling mechanism, in combination with the search in superposition, allows us to analytically determine the expected number of decoding iterations, which matches the empirical observations up to the GSBC’s bundling capacity. Secondly, the proposed factorizer maintains a high accuracy when queried by noisy product vectors generated using deep convolutional neural networks (CNNs). This facilitates its application in replacing the large fully connected layer (FCL) in CNNs, whereby C trainable class vectors, or attribute combinations, can be implicitly represented by our factorizer having F-factor codebooks, each with C^{1/F} fixed codevectors. We provide a methodology to flexibly integrate our factorizer in the classification layer of CNNs with a novel loss function. With this integration, the convolutional layers can generate a noisy product vector that our factorizer can still decode, whereby the decoded factors can have different interpretations based on downstream tasks. We demonstrate the feasibility of our method on four deep CNN architectures over the CIFAR-100, ImageNet-1K, and RAVEN datasets. In all use cases, the number of parameters and operations is notably reduced compared to the FCL.

AAAI Conference 2025 Conference Paper

On the Expressiveness and Length Generalization of Selective State Space Models on Regular Languages

  • Aleksandar Terzic
  • Michael Hersche
  • Giacomo Camposampiero
  • Thomas Hofmann
  • Abu Sebastian
  • Abbas Rahimi

Selective state-space models (SSMs) are an emerging alternative to the Transformer, offering the unique advantage of parallel training and sequential inference. Although these models have shown promising performance on a variety of tasks, their formal expressiveness and length generalization properties remain underexplored. In this work, we provide insight into the workings of selective SSMs by analyzing their expressiveness and length generalization performance on regular language tasks, i.e., finite-state automaton (FSA) emulation. We address certain limitations of modern SSM-based architectures by introducing the Selective Dense State-Space Model (SD-SSM), the first selective SSM that exhibits perfect length generalization on a set of various regular language tasks using a single layer. It utilizes a dictionary of dense transition matrices, a softmax selection mechanism that creates a convex combination of dictionary matrices at each time step, and a readout consisting of layer normalization followed by a linear map. We then proceed to evaluate variants of diagonal selective SSMs by considering their empirical performance on commutative and non-commutative automata. We explain the experimental results with theoretical considerations.

NeSy Conference 2025 Conference Paper

Practical Lessons on Vector-Symbolic Architectures in Deep Learning-Inspired Environments

  • Francesco S. Carzaniga
  • Michael Hersche
  • Kaspar Schindler
  • Abbas Rahimi

Neural networks have shown unprecedented capabilities, rivaling human performance in many tasks. However, current neural architectures are not capable of symbolic manipulation, which is thought to be a hallmark of human intelligence. Vector-symbolic architectures (VSAs) promise to bring this ability through simple vector manipulation, highly amenable to current and emerging hardware and software stacks built for their neural counterparts. Integrating the two models into the paradigm of neuro-vector-symbolic architectures may achieve even more human-like performance. However, despite ongoing efforts, there are no clear guidelines on the deployment of VSAs in deep learning-based training situations. In this work, we aim to begin providing such guidelines by offering four practical lessons we have observed through the analysis of many VSA models and implementations. We provide thorough benchmarks and results that corroborate such lessons. First, we observe that multiply-add-permute (MAP) and Hadamard linear binding (HLB) are up to 3-4× faster than holographic reduced representations (HRR), even when the latter is equipped with optimized FFT-based convolutions. Second, we propose further speed improvements by replacing similarity search with a linear readout, with no effect on retrieval. Third, we analyze the retrieval performance of MAP, HRR and HLB in a noise-free and noisy scenario to simulate processing by a neural network, and show that they are equivalent. Finally, we implement a hierarchical multi-level composition scheme, with notable benefits to the flexibility of integration of VSAs inside existing neural architectures. Overall, we show that these four lessons lead to faster and more effective deployment of VSAs.
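The binding operators compared in this abstract can be sketched in a few lines of NumPy. This is a hedged illustration of MAP (element-wise multiplication of bipolar vectors) versus HRR (circular convolution via FFT) with their respective unbinding steps, not the paper's benchmark code; dimensions and distributions are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 8192

# MAP: bipolar hypervectors; binding is element-wise multiply and self-inverse.
a = rng.choice([-1.0, 1.0], D)
b = rng.choice([-1.0, 1.0], D)
retrieved = (a * b) * a                      # unbinding with a recovers b exactly
print(np.dot(retrieved, b) / D)              # 1.0

# HRR: Gaussian hypervectors; binding is circular convolution (via FFT).
x = rng.normal(0.0, 1.0 / np.sqrt(D), D)
y = rng.normal(0.0, 1.0 / np.sqrt(D), D)
bound = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)))
# Unbinding is circular correlation with x; recovery is only approximate.
approx_y = np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(bound)))
print(np.dot(approx_y, y) / np.dot(y, y))    # ≈ 1.0 plus noise
```

The sketch also hints at the speed gap the lesson reports: MAP binding is a single element-wise product, while HRR needs forward and inverse FFTs per binding.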

NeurIPS Conference 2025 Conference Paper

Scalable Evaluation and Neural Models for Compositional Generalization

  • Giacomo Camposampiero
  • Pietro Barbiero
  • Michael Hersche
  • Roger Wattenhofer
  • Abbas Rahimi

Compositional generalization, a key open challenge in modern machine learning, requires models to predict unknown combinations of known concepts. However, assessing compositional generalization remains a fundamental challenge due to the lack of standardized evaluation protocols and the limitations of current benchmarks, which often favor efficiency over rigor. At the same time, general-purpose vision architectures lack the necessary inductive biases, and existing approaches to endow them with such biases compromise scalability. As a remedy, this paper introduces: 1) a rigorous evaluation framework that unifies and extends previous approaches while reducing computational requirements from combinatorial to constant; 2) an extensive and modern evaluation of the status of compositional generalization in supervised vision backbones, training more than 5000 models; 3) Attribute Invariant Networks, a class of models establishing a new Pareto frontier in compositional generalization, achieving a 23.43% accuracy improvement over baselines while reducing parameter overhead from 600% to 16% compared to fully disentangled counterparts.

NeurIPS Conference 2025 Conference Paper

Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models

  • Aleksandar Terzic
  • Nicolas Menet
  • Michael Hersche
  • Thomas Hofmann
  • Abbas Rahimi

Modern state-space models (SSMs) often utilize structured transition matrices which enable efficient computation but pose restrictions on the model’s expressivity, as measured in terms of the ability to emulate finite-state automata (FSA). While unstructured transition matrices are optimal in terms of expressivity, they come at a prohibitively high compute and memory cost, even for moderate state sizes. We propose a structured sparse parametrization of transition matrices in SSMs that enables FSA state tracking with provably optimal state size and depth, while keeping the computational cost of the recurrence comparable to that of diagonal SSMs. Our method, \emph{PD-SSM}, parametrizes the transition matrix as the product of a column one-hot matrix ($P$) and a complex-valued diagonal matrix ($D$). As a result, the computational cost of parallel scans scales linearly with the state size. Theoretically, the model is BIBO-stable and can emulate any $N$-state FSA with one layer of dimension $N$ and a linear readout of size $N \times N$, significantly improving on all current structured SSM guarantees. Experimentally, the model significantly outperforms a wide collection of modern SSM variants on various FSA state tracking tasks. On multivariate time-series classification, it outperforms neural controlled differential equations, a paradigm explicitly built for time-series analysis. Finally, we integrate PD-SSM into a hybrid Transformer-SSM architecture and demonstrate that the model can effectively track the states of a complex FSA in which transitions are encoded into sets of variable-length English sentences. The code is available at https://github.com/IBM/expressive-sparse-state-space-model.

ICLR Conference 2025 Conference Paper

The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG

  • Francesco S. Carzaniga
  • Gary Tom Hoppeler
  • Michael Hersche
  • Kaspar Schindler
  • Abbas Rahimi

All data modalities are not created equal, even when the signal they measure comes from the same source. In the case of the brain, two of the most important data modalities are the scalp electroencephalogram (EEG), and the intracranial electroencephalogram (iEEG). iEEG benefits from a higher signal-to-noise ratio (SNR), as it measures the electrical activity directly in the brain, while EEG is noisier and has lower spatial and temporal resolutions. Nonetheless, both EEG and iEEG are important sources of data for human neurology, from healthcare to brain–machine interfaces. They are used by human experts, supported by deep learning (DL) models, to accomplish a variety of tasks, such as seizure detection and motor imagery classification. Although the differences between EEG and iEEG are well understood by human experts, the performance of DL models across these two modalities remains under-explored. To help characterize the importance of clean data on the performance of DL models, we propose BrainCodec, a high-fidelity EEG and iEEG neural compressor. We find that training BrainCodec on iEEG and then transferring to EEG yields higher reconstruction quality than training on EEG directly. In addition, we also find that training BrainCodec on both EEG and iEEG improves fidelity when reconstructing EEG. Our work indicates that data sources with higher SNR, such as iEEG, provide better performance across the board in the medical time-series domain as well. This finding is consistent with reports from natural language processing, where clean data sources appear to have an outsized effect on the performance of the DL model overall. BrainCodec also achieves up to 64× compression on iEEG and EEG without a notable decrease in quality. BrainCodec markedly surpasses current state-of-the-art compression models both in final compression ratio and in reconstruction fidelity.
We also evaluate the fidelity of the compressed signals objectively on a seizure detection and a motor imagery task performed by standard DL models. Here, we find that BrainCodec achieves a reconstruction fidelity high enough to ensure no performance degradation on the downstream tasks. Finally, we collect the subjective assessment of an expert neurologist, who confirms the high reconstruction quality of BrainCodec in a realistic scenario. The code is available at https://github.com/IBM/eeg-ieeg-brain-compressor.

NeurIPS Conference 2024 Conference Paper

Limits of Transformer Language Models on Learning to Compose Algorithms

  • Jonathan Thomm
  • Giacomo Camposampiero
  • Aleksandar Terzic
  • Michael Hersche
  • Bernhard Schölkopf
  • Abbas Rahimi

We analyze the capabilities of Transformer language models in learning compositional discrete tasks. To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks that require learning a composition of several discrete sub-tasks. In particular, we measure how well these models can reuse primitives observable in the sub-tasks to learn the composition task. Our results indicate that compositional learning in state-of-the-art Transformer language models is highly sample inefficient: LLaMA requires more data samples than relearning all sub-tasks from scratch to learn the compositional task; in-context prompting with few samples is unreliable and fails at executing the sub-tasks or correcting the errors in multi-round code generation. Further, by leveraging complexity theory, we support these findings with a theoretical analysis focused on the sample inefficiency of gradient descent in memorizing feedforward models. We open-source our code at https://github.com/IBM/limitations-lm-algorithmic-compositional-learning.

ECAI Conference 2024 Conference Paper

RETRO-LI: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization

  • Gentiana Rashiti
  • Geethan Karunaratne
  • Mrinmaya Sachan
  • Abu Sebastian
  • Abbas Rahimi

Retrieval-augmented generation (RAG) systems such as RETRO have been shown to improve language modeling capabilities and reduce toxicity and hallucinations by retrieving from a database of non-parametric memory containing trillions of entries. We introduce RETRO-LI, which shows that retrieval can also help with a small-scale database; a smaller and hence sparser non-parametric memory, however, demands more accurate and better neighbors, which can be met by using a proper semantic similarity search. We further propose, for the first time, adding a regularization to the non-parametric memory: it significantly reduces perplexity when the neighbor search operations are noisy during inference, and it improves generalization when a domain shift occurs. We also show that RETRO-LI’s non-parametric memory can potentially be implemented on analog in-memory computing hardware, exhibiting O(1) search time while causing noise in retrieving neighbors, with minimal (<1%) performance loss. Our code is available at https://github.com/IBM/Retrieval-Enhanced-Transformer-Little.

NeSy Conference 2024 Conference Paper

Terminating Differentiable Tree Experts

  • Jonathan Thomm
  • Michael Hersche
  • Giacomo Camposampiero
  • Aleksandar Terzic
  • Bernhard Schölkopf
  • Abbas Rahimi

We advance the recently proposed neuro-symbolic Differentiable Tree Machine, which learns tree operations using a combination of transformers and Tensor Product Representations. We investigate the architecture and propose two key components. We first replace the series of different transformer layers used in every step with a mixture of experts. This results in a Differentiable Tree Experts model with a constant number of parameters for any arbitrary number of steps in the computation, whereas the previous Differentiable Tree Machine grows linearly. Given this flexibility in the number of steps, we additionally propose a new termination algorithm that gives the model the power to choose automatically how many steps to take. The resulting Terminating Differentiable Tree Experts model gradually learns to predict the number of steps without an oracle. It does so while maintaining the learning capabilities of the model, converging to the optimal number of steps.

NeSy Conference 2024 Conference Paper

Towards Learning Abductive Reasoning Using VSA Distributed Representations

  • Giacomo Camposampiero
  • Michael Hersche
  • Aleksandar Terzic
  • Roger Wattenhofer
  • Abu Sebastian
  • Abbas Rahimi

We introduce the Abductive Rule Learner with Context-awareness (ARLC), a model that solves abstract reasoning tasks based on Learn-VRF. ARLC features a novel and more broadly applicable training objective for abductive reasoning, resulting in better interpretability and higher accuracy when solving Raven’s progressive matrices (RPM). ARLC allows both programming domain knowledge and learning the rules underlying a data distribution. We evaluate ARLC on the I-RAVEN dataset, showcasing state-of-the-art accuracy across both in-distribution and out-of-distribution (unseen attribute-rule pairs) tests. ARLC surpasses neuro-symbolic and connectionist baselines, including large language models, despite having orders of magnitude fewer parameters. We show ARLC’s robustness to post-programming training by incrementally learning from examples on top of programmed knowledge, which only improves its performance and does not result in catastrophic forgetting of the programmed solution. We validate ARLC’s seamless transfer learning from a 2×2 RPM constellation to unseen constellations. Our code is available at https://github.com/IBM/abductive-rule-learner-with-context-awareness.

NeSy Conference 2023 Conference Paper

Decoding Superpositions of Bound Symbols Represented by Distributed Representations

  • Michael Hersche
  • Zuzanna Opala
  • Geethan Karunaratne
  • Abu Sebastian
  • Abbas Rahimi

Vector-symbolic architectures (VSAs) express data structures of arbitrary complexity and perform symbolic computations on them by exploiting high-dimensional distributed representations and associated key operations. VSAs typically use dense random vectors, aka hypervectors, to represent atomic symbols that can be combined into compound symbols by multiplicative binding and additive superposition operators. For instance, a VSA-based neural encoder can bind two atomic symbols, and further superpose a set of such bound symbols—all by distributed vectors that have the same dimension. Nevertheless, decoding such an additive-multiplicative vector, to the atomic symbols from which it is built, is not a trivial task. Recently, a solution based on resonator networks was proposed to iteratively factorize one of the bound symbols. After finding the factorization, it is explained away by subtracting it from the superposition. This explaining away, however, causes noise amplification that limits the number of symbols that can be reliably decoded in large problem sizes. Here, we present novel methods that efficiently decode VSA-based data structures consisting of multiplicative binding and additive superposition of symbols. We expand the pure sequential explaining away approach by performing multiple decodings in parallel using a dedicated query sampler. Compared to the baseline resonator network, this mix of sequential and parallel decoding retrieves up to 8× more additive components from larger problems in synthetic and real-world experiments.
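The bind-superpose-decode pipeline this abstract describes can be illustrated with a toy MAP-style example that uses a simple codebook cleanup instead of a resonator network; all names, sizes, and the choice of binding operator below are illustrative assumptions, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(2)
D, M = 8192, 5                               # dimension, superposed bound pairs
keys = rng.choice([-1.0, 1.0], (M, D))
symbols = rng.choice([-1.0, 1.0], (M, D))    # codebook of atomic symbols

# One fixed-width vector holding M multiplicative bindings in superposition.
s = np.sum(keys * symbols, axis=0)

# Decode each slot: unbind with its key, then clean up against the codebook.
decoded = [int(np.argmax(symbols @ (s * k))) for k in keys]
print(decoded)                               # [0, 1, 2, 3, 4]
```

The cross-terms left over after unbinding act as noise whose magnitude grows with the number of superposed components, which is the capacity limit the paper's sequential and parallel decoding methods address.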

NeurIPS Conference 2023 Conference Paper

MIMONets: Multiple-Input-Multiple-Output Neural Networks Exploiting Computation in Superposition

  • Nicolas Menet
  • Michael Hersche
  • Geethan Karunaratne
  • Luca Benini
  • Abu Sebastian
  • Abbas Rahimi

With the advent of deep learning, progressively larger neural networks have been designed to solve complex tasks. We take advantage of these capacity-rich models to lower the cost of inference by exploiting computation in superposition. To reduce the computational burden per input, we propose Multiple-Input-Multiple-Output Neural Networks (MIMONets) capable of handling many inputs at once. MIMONets augment various deep neural network architectures with variable binding mechanisms to represent an arbitrary number of inputs in a compositional data structure via fixed-width distributed representations. Accordingly, MIMONets adapt nonlinear neural transformations to process the data structure holistically, leading to a speedup nearly proportional to the number of superposed input items in the data structure. After processing in superposition, an unbinding mechanism recovers each transformed input of interest. MIMONets also provide a dynamic trade-off between accuracy and throughput by an instantaneous on-demand switching between a set of accuracy-throughput operating points, yet within a single set of fixed parameters. We apply the concept of MIMONets to both CNN and Transformer architectures resulting in MIMOConv and MIMOFormer, respectively. Empirical evaluations show that MIMOConv achieves ≈2–4× speedup at an accuracy delta within [+0.68, −3.18]% compared to WideResNet CNNs on CIFAR-10 and CIFAR-100. Similarly, MIMOFormer can handle 2–4 inputs at once while maintaining a high average accuracy within a [−1.07, −3.43]% delta on the Long Range Arena benchmark. Finally, we provide mathematical bounds on the interference between superposition channels in MIMOFormer. Our code is available at https://github.com/IBM/multiple-input-multiple-output-nets.

NeSy Conference 2023 Conference Paper

VSA-based Positional Encoding Can Replace Recurrent Networks in Emergent Symbol Binding

  • Francesco S. Carzaniga
  • Michael Hersche
  • Kaspar Schindler
  • Abbas Rahimi

Variable binding is an open problem in both neuroscience and machine learning relating to how neural circuits combine multiple features into a single entity. Emergent Symbols through Binding in External Memory is a recent development tackling variable binding with a compelling solution. An emergent symbolic binding network (ESBN) is able to infer abstract rules through indirection using a dual-stack setup—whereby one stack contains variables and the other contains the associated keys—by autonomously learning a relationship between the two. New keys are generated from previous ones by maintaining a strict time-ordering through the usage of recurrent networks, in particular LSTMs. It is then a natural question whether such an expensive requirement could be replaced by a more economical alternative. In this work, we explore the viability of replacing LSTMs with simpler multi-layer perceptrons (MLPs) by exploiting the properties of high-dimensional spaces through a bundling-based positional encoding. We show how a combination of vector symbolic architectures and appropriate activation functions can achieve and surpass the results reported in the ESBN work, highlighting the role that imbuing the latent space with an explicit structure can play for these unconventional symbolic models.
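The bundling-based positional encoding idea, i.e. position vectors formed by cumulatively bundling random seed hypervectors so the latent space carries an explicit order structure without recurrence, can be sketched as follows. This is a toy illustration under assumed dimensions, not the paper's exact scheme:

```python
import numpy as np

rng = np.random.default_rng(3)
D, T = 8192, 10
seeds = rng.choice([-1.0, 1.0], (T, D))      # one random seed vector per step

# Position k is the bipolarized bundle of the first k+1 seeds, so nearby
# positions share most of their seeds and stay similar to each other.
pos = np.sign(np.cumsum(seeds, axis=0))

def sim(u, v):
    return float(u @ v) / D

print(sim(pos[3], pos[4]))                   # high: neighboring positions
print(sim(pos[0], pos[9]))                   # low: distant positions
```

Because similarity decays with positional distance, a feedforward network can read off relative order from these codes, which is the property that lets an MLP stand in for the LSTM's strict time-ordering.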

JBHI Journal 2021 Journal Article

An Ensemble of Hyperdimensional Classifiers: Hardware-Friendly Short-Latency Seizure Detection With Automatic iEEG Electrode Selection

  • Alessio Burrello
  • Simone Benatti
  • Kaspar Schindler
  • Luca Benini
  • Abbas Rahimi

We propose a new algorithm for detecting epileptic seizures. Our algorithm first extracts three features, namely mean amplitude, line length, and local binary patterns, that are fed to an ensemble of classifiers using hyperdimensional (HD) computing. These features are embedded into prototype vectors representing ictal (during seizures) and interictal (between seizures) brain states. These vectors can be computed at different spatial scales, ranging from a single electrode up to many electrodes. This flexibility allows our algorithm to identify the electrodes that discriminate best between ictal and interictal brain states. We assess our algorithm on the SWEC-ETHZ iEEG dataset that includes 99 short-time iEEG seizures recorded with 36 to 100 electrodes from 16 drug-resistant epilepsy patients. Using k-fold cross-validation and all electrodes, our algorithm surpasses state-of-the-art algorithms, yielding significantly shorter latency (8.81 s vs. 11.57 s) in seizure onset detection, and higher specificity (97.31% vs. 94.84%) and accuracy (96.85% vs. 95.42%). We can further reduce the latency of our algorithm to 3.74 s by allowing a slightly higher percentage of false alarms (2% specificity loss). Using only the top 10% of the electrodes ranked by our algorithm, we still maintain superior latency, sensitivity, and specificity compared to the other algorithms with all the electrodes. We finally demonstrate the suitability of our algorithm for deployment on low-cost embedded hardware platforms, thanks to its robustness to noise/artifacts affecting the signal, its low computational complexity, and its small memory footprint on a RISC-V microcontroller.
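The prototype-based HD classification scheme this abstract relies on (encode features into hypervectors, bundle them into per-class prototypes, classify by similarity) can be sketched as a toy example. The random-projection encoder and the synthetic feature windows below are illustrative assumptions, not the paper's three-feature pipeline:

```python
import numpy as np

rng = np.random.default_rng(4)
D = 8192
proj = rng.normal(size=(D, 3))               # hypothetical 3-feature encoder

def encode(features):
    # Toy encoder: random projection followed by bipolarization.
    return np.sign(proj @ features)

# Synthetic "ictal" and "interictal" feature windows (illustrative only).
ictal_train = rng.normal(loc=2.0, size=(51, 3))
inter_train = rng.normal(loc=0.0, size=(51, 3))

# Bundle the training encodings into one prototype hypervector per brain state.
proto_ictal = np.sign(sum(encode(f) for f in ictal_train))
proto_inter = np.sign(sum(encode(f) for f in inter_train))

def classify(features):
    q = encode(features)
    return "ictal" if q @ proto_ictal > q @ proto_inter else "interictal"

print(classify(np.array([2.1, 1.9, 2.2])))   # "ictal"
```

Prototype construction and classification here are a handful of element-wise operations and dot products, which is what makes this style of classifier attractive for low-cost embedded hardware.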