Arrow Research search

Author name cluster

Daniel Kifer

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers
2 author rows

Possible papers (13)

NeSy Conference 2025 Conference Paper

Bridging Neural and Symbolic Computation: A Learnability Study of RNNs on Counter and Dyck Languages

  • Neisarg Dave
  • Daniel Kifer
  • C. Lee Giles
  • Ankur Mali

This work presents a neuro-symbolic analysis of the learnability of Recurrent Neural Networks (RNNs) in classifying structured formal languages—specifically, **counter languages** and **Dyck languages**, which serve as canonical examples of context-free and mildly context-sensitive grammars. While prior studies have highlighted the expressive power of first-order (LSTM) and second-order (O2RNN) architectures within the Chomsky hierarchy, we challenge this perspective by shifting the focus from theoretical expressivity to *practical learnability under finite precision constraints*. Our results suggest that RNNs function more as finite-state machines than stack-based automata when implemented with realistic training regimes and embedding representations. We show that classification performance degrades sharply as structural similarities between positive and negative sequences increase—highlighting a core limitation in the RNN’s ability to internalize hierarchical structure without symbolic scaffolding. Interestingly, even simple linear classifiers built on top of RNN-derived embeddings outperform chance, underscoring the hidden representational capacity within learned states. To probe generalization, we train models on input lengths up to 40 and evaluate on lengths extending to 500, using 10 distinct seeds to measure statistical robustness. O2RNNs consistently demonstrate greater stability and generalization compared to LSTMs, particularly under varied initialization strategies. These findings expose the fragility of learned language representations and emphasize the role of architectural bias, initialization, and data sampling in determining what is truly learnable. Ultimately, our study reframes RNN learnability through the lens of *symbolic structure and computational constraints*, advocating for stronger formal criteria when assessing neural models’ capacity to reason over structured sequences. We argue that expressivity alone is insufficient—**stability, precision, and symbolic alignment** are essential for true neuro-symbolic generalization.
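
As a concrete illustration of the data regime these experiments rely on, here is a minimal Python sketch (not the authors' code) that samples valid Dyck-1 strings and produces structurally similar negatives by flipping a single bracket; the near-miss construction mirrors the abstract's point that negatives close to positives are what make the classification hard.

```python
# Minimal sketch (illustrative, not the paper's code): Dyck-1 data generation
# with near-miss negatives for an RNN classification probe.
import random

def dyck1(n_pairs, rng=random):
    """Sample a valid Dyck-1 string with n_pairs bracket pairs."""
    s, open_count, remaining = [], 0, n_pairs
    while remaining > 0 or open_count > 0:
        if remaining > 0 and (open_count == 0 or rng.random() < 0.5):
            s.append("(")
            open_count += 1
            remaining -= 1
        else:
            s.append(")")
            open_count -= 1
    return "".join(s)

def is_dyck1(s):
    depth = 0
    for c in s:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False
    return depth == 0

def near_miss(s, rng=random):
    """Flip one bracket to create a structurally similar negative example."""
    i = rng.randrange(len(s))
    return s[:i] + (")" if s[i] == "(" else "(") + s[i + 1:]

pos = dyck1(20)
neg = near_miss(pos)
print(pos, is_dyck1(pos))   # True
print(neg, is_dyck1(neg))   # False: one flipped bracket breaks the balance
```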

ICLR Conference 2025 Conference Paper

Sensitivity-Constrained Fourier Neural Operators for Forward and Inverse Problems in Parametric Differential Equations

  • Abdolmehdi Behroozi
  • Chaopeng Shen
  • Daniel Kifer

Parametric differential equations of the form $\frac{\partial u}{\partial t} = f(u, x, t, p)$ are fundamental in science and engineering. While deep learning frameworks like the Fourier Neural Operator (FNO) efficiently approximate differential equation solutions, they struggle with inverse problems, sensitivity calculations $\frac{\partial u}{\partial p}$, and concept drift. We address these challenges by introducing a novel sensitivity loss regularizer, demonstrated through Sensitivity-Constrained Fourier Neural Operators (SC-FNO). Our approach maintains high accuracy for solution paths and outperforms both standard FNO and FNO with Physics-Informed Neural Network regularization. SC-FNO exhibits superior performance in parameter inversion tasks, accommodates more complex parameter spaces (tested with up to 82 parameters), reduces training data requirements, and decreases training time while maintaining accuracy. These improvements apply across various differential equations and neural operators, enhancing their reliability without significant computational overhead (30%–130% extra training time per epoch). Models and selected experiment code are available at: [https://github.com/AMBehroozi/SC_Neural_Operators](https://github.com/AMBehroozi/SC_Neural_Operators).
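
The sensitivity regularizer can be pictured with a small PyTorch sketch. Everything here is an illustrative assumption rather than the SC-FNO implementation: a toy scalar model stands in for the FNO, and `dudp_true` stands in for reference sensitivities $\frac{\partial u}{\partial p}$.

```python
# Hedged sketch of a sensitivity-loss regularizer (assumed names throughout).
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

def sc_loss(x, p, u_true, dudp_true, lambda_s=0.1):
    p = p.detach().clone().requires_grad_(True)
    u = model(torch.cat([x, p], dim=-1))      # one scalar u per sample
    data_loss = ((u - u_true) ** 2).mean()
    # du/dp via autograd; since u_i depends only on p_i, this is exact here.
    # create_graph=True keeps the penalty differentiable w.r.t. the weights.
    (dudp,) = torch.autograd.grad(u.sum(), p, create_graph=True)
    sens_loss = ((dudp - dudp_true) ** 2).mean()
    return data_loss + lambda_s * sens_loss

x, p = torch.randn(8, 1), torch.randn(8, 1)
loss = sc_loss(x, p, torch.zeros(8, 1), torch.zeros(8, 1))
loss.backward()
```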

NeurIPS Conference 2024 Conference Paper

Efficient and Private Marginal Reconstruction with Local Non-Negativity

  • Brett Mullins
  • Miguel Fuentes
  • Yingtai Xiao
  • Daniel Kifer
  • Cameron Musco
  • Daniel Sheldon

Differential privacy is the dominant standard for formal and quantifiable privacy and has been used in major deployments that impact millions of people. Many differentially private algorithms for query release and synthetic data contain steps that reconstruct answers to queries from answers to other queries that have been measured privately. Reconstruction is an important subproblem for such mechanisms to economize the privacy budget, minimize error on reconstructed answers, and allow for scalability to high-dimensional datasets. In this paper, we introduce a principled and efficient postprocessing method ReM (Residuals-to-Marginals) for reconstructing answers to marginal queries. Our method builds on recent work on efficient mechanisms for marginal query release, based on making measurements using a residual query basis that admits efficient pseudoinversion, which is an important primitive used in reconstruction. An extension GReM-LNN (Gaussian Residuals-to-Marginals with Local Non-negativity) reconstructs marginals under Gaussian noise satisfying consistency and non-negativity, which often reduces error on reconstructed answers. We demonstrate the utility of ReM and GReM-LNN by applying them to improve existing private query answering mechanisms.
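
A toy version of the reconstruction subproblem (not ReM itself) can be written in a few lines of NumPy: answer a measurement basis with Gaussian noise, pseudo-invert to estimate the data vector, clip for non-negativity, and read off marginals. The matrices below are arbitrary placeholders, not a residual query basis.

```python
# Toy reconstruction sketch: y ~ A x + noise, recover marginal answers W x.
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 50, size=8).astype(float)        # toy histogram, 8 cells

A = rng.integers(0, 2, size=(12, 8)).astype(float)   # placeholder measurements
W = np.vstack([np.eye(8), np.ones((1, 8))])          # marginals: cells + total

y = A @ x + rng.normal(0, 2.0, size=12)              # Gaussian-noise answers
x_hat = np.linalg.pinv(A) @ y                        # least-squares estimate
x_hat = np.clip(x_hat, 0, None)                      # local non-negativity
print(W @ x_hat)                                     # reconstructed marginals
```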

NeurIPS Conference 2023 Conference Paper

An Optimal and Scalable Matrix Mechanism for Noisy Marginals under Convex Loss Functions

  • Yingtai Xiao
  • Guanlin He
  • Danfeng Zhang
  • Daniel Kifer

Noisy marginals are a common form of confidentiality-protecting data release and are useful for many downstream tasks such as contingency table analysis, construction of Bayesian networks, and even synthetic data generation. Privacy mechanisms that provide unbiased noisy answers to linear queries (such as marginals) are known as matrix mechanisms. We propose ResidualPlanner, a matrix mechanism for marginals with Gaussian noise that is both optimal and scalable. ResidualPlanner can optimize for many loss functions that can be written as a convex function of marginal variances (prior work was restricted to just one predefined objective function). ResidualPlanner can optimize the accuracy of marginals in large-scale settings in seconds, even when the previous state of the art (HDMM) runs out of memory. It even runs on datasets with 100 attributes in a couple of minutes. Furthermore, ResidualPlanner can efficiently compute variance/covariance values for each marginal (prior methods quickly run out of memory, even for relatively small datasets).
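
For readers unfamiliar with matrix mechanisms, a minimal Gaussian example (illustrative only, not ResidualPlanner's optimized strategy) shows the two ingredients the abstract refers to: unbiased noisy answers $y = Ax + \mathcal{N}(0, \sigma^2 I)$ and closed-form per-query variances for a reconstructed workload.

```python
# Minimal Gaussian matrix-mechanism sketch with a trivial identity strategy.
import numpy as np

rng = np.random.default_rng(1)
n = 8
x = rng.integers(0, 100, size=n).astype(float)

A = np.eye(n)                                  # strategy: one query per cell
W = np.vstack([np.eye(n), np.ones((1, n))])    # workload: cells plus one sum
sigma = 1.0

y = A @ x + rng.normal(0, sigma, size=A.shape[0])   # unbiased noisy answers
W_pinvA = W @ np.linalg.pinv(A)                     # reconstruction matrix
answers = W_pinvA @ y
variances = sigma**2 * (W_pinvA ** 2).sum(axis=1)   # per-query variance
print(answers[-1], variances[-1])                   # sum query and its variance
```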

AAAI Conference 2023 Conference Paper

Backpropagation-Free Deep Learning with Recursive Local Representation Alignment

  • Alexander G. Ororbia
  • Ankur Mali
  • Daniel Kifer
  • C. Lee Giles

Training deep neural networks on large-scale datasets requires significant hardware resources whose costs (even on cloud platforms) put them out of reach of smaller organizations, groups, and individuals. Backpropagation (backprop), the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize. Furthermore, researchers must continually develop various specialized techniques, such as particular weight initializations and enhanced activation functions, to ensure stable parameter optimization. Our goal is to seek an effective, neuro-biologically plausible alternative to backprop that can be used to train deep networks. In this paper, we propose a backprop-free procedure, recursive local representation alignment, for training large-scale architectures. Experiments with residual networks on CIFAR-10 and the large benchmark, ImageNet, show that our algorithm generalizes as well as backprop while converging sooner due to weight updates that are parallelizable and computationally less demanding. This is empirical evidence that a backprop-free algorithm can scale up to larger datasets.
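
The flavor of local, target-driven updates can be sketched schematically in NumPy. This is a simplified stand-in under stated assumptions (fixed random feedback matrices `E`, the tanh derivative ignored), not the paper's recursive local representation alignment procedure.

```python
# Schematic local-alignment sketch: each layer chases its own target, and
# weight updates use only that layer's activations and error.
import numpy as np

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 2]
W = [rng.normal(0, 0.1, (sizes[l], sizes[l + 1])) for l in range(3)]
E = [rng.normal(0, 0.1, (sizes[l + 1], sizes[l])) for l in (1, 2)]  # fixed feedback

def local_step(x, y, lr=0.05, gamma=0.5):
    h = [x]
    for Wl in W:                          # forward pass, caching activations
        h.append(np.tanh(h[-1] @ Wl))
    t = [None] * 4
    t[3] = h[3] - gamma * (h[3] - y)      # top target from the output error
    for l in (2, 1):                      # lower targets via feedback weights
        t[l] = h[l] - gamma * ((h[l + 1] - t[l + 1]) @ E[l - 1])
    for l in range(3):                    # purely local weight updates
        W[l] -= lr * h[l].T @ (h[l + 1] - t[l + 1])

x = rng.normal(size=(1, 4))
y = np.array([[1.0, -1.0]])
for _ in range(100):
    local_step(x, y)
```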

NeurIPS Conference 2022 Conference Paper

Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

  • Alex Ororbia
  • Ankur Mali
  • C. Lee Giles
  • Daniel Kifer

In lifelong learning systems based on artificial neural networks, one of the biggest obstacles is the inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this paper, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the popular back-propagation of errors. Grounded in the neurocognitive theory of predictive coding, our model adapts its synapses in a biologically-plausible fashion while another neural system learns to direct and control this cortex-like structure, mimicking some of the task-executive control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting compared to standard neural models, outperforming a swath of previously proposed methods, including rehearsal/data buffer-based methods, on both standard (SplitMNIST, Split Fashion MNIST, etc.) and custom benchmarks even though it is trained in a stream-like fashion. Our work offers evidence that emulating mechanisms in real neuronal systems, e.g., local learning and lateral competition, can yield new directions and possibilities for tackling the grand challenge of lifelong machine learning.
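
A toy predictive-coding update helps make "adapts its synapses in a biologically-plausible fashion" concrete: the latent state settles by reducing local prediction error, and the same error drives a Hebbian-style weight update. This is a generic illustration, not the Sequential Neural Coding Network.

```python
# Generic predictive-coding sketch: settle latent z, then update W locally.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (8, 4))          # generative weights: latent -> observation

def settle_and_learn(x, W, n_steps=20, lr_z=0.1, lr_w=0.01):
    z = np.zeros(W.shape[0])
    for _ in range(n_steps):            # inference: settle the latent state
        err = x - z @ W                 # local prediction error
        z += lr_z * (err @ W.T - z)     # leaky, error-driven state update
    W += lr_w * np.outer(z, x - z @ W)  # local Hebbian-style synaptic update
    return z

for x in rng.normal(size=(50, 4)):      # stream-like presentation of inputs
    settle_and_learn(x, W)
```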

NeurIPS Conference 2021 Conference Paper

An Uncertainty Principle is a Price of Privacy-Preserving Microdata

  • John Abowd
  • Robert Ashmead
  • Ryan Cumings-Menon
  • Simson Garfinkel
  • Daniel Kifer
  • Philip Leclerc
  • William Sexton
  • Ashley Simpson

Privacy-protected microdata are often the desired output of a differentially private algorithm since microdata are familiar and convenient for downstream users. However, there is a statistical price for this kind of convenience. We show that an uncertainty principle governs the trade-off between accuracy for a population of interest ("sum query") vs. accuracy for its component sub-populations ("point queries"). Compared to differentially private query answering systems that are not required to produce microdata, accuracy can degrade by a logarithmic factor. For example, in the case of pure differential privacy, without the microdata requirement, one can provide noisy answers to the sum query and all point queries while guaranteeing that each answer has squared error $O(1/\epsilon^2)$. With the microdata requirement, one must choose between allowing an additional $\log^2(d)$ factor ($d$ is the number of point queries) for some point queries or allowing an extra $O(d^2)$ factor for the sum query. We present lower bounds for pure, approximate, and concentrated differential privacy. We propose mitigation strategies and create a collection of benchmark datasets that can be used for public study of this problem.
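
One horn of this trade-off is easy to simulate: when the sum query must be answered consistently from the same noisy values that answer the point queries, its error inherits all $d$ point-query noises. The noise scales below are placeholders for the $O(1/\epsilon)$ regime, not the paper's mechanisms.

```python
# Crude simulation: direct noisy sum vs. a sum forced to be consistent with
# the noisy point answers (as microdata-style consistency requires).
import numpy as np

rng = np.random.default_rng(0)
d, eps, trials = 256, 1.0, 2000
x = rng.integers(0, 10, size=d).astype(float)

pt = x + rng.laplace(0, 2 / eps, size=(trials, d))    # point queries
sm = x.sum() + rng.laplace(0, 2 / eps, size=trials)   # sum answered directly

# consistency route: the sum inherits all d point noises, so its squared
# error grows with d instead of staying O(1/eps^2)
micro_sum = pt.sum(axis=1)
print("direct sum MSE:    ", np.mean((sm - x.sum()) ** 2))
print("consistent sum MSE:", np.mean((micro_sum - x.sum()) ** 2))
```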

AAAI Conference 2021 Conference Paper

Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units

  • Ankur Mali
  • Alexander G. Ororbia
  • Daniel Kifer
  • C. Lee Giles

Automated mathematical reasoning is a challenging problem that requires an agent to learn algebraic patterns that contain long-range dependencies. Two particular tasks that test this type of reasoning are (1) mathematical equation verification, which requires determining whether trigonometric and linear algebraic statements are valid identities or not, and (2) equation completion, which entails filling in a blank within an expression to make it true. Solving these tasks with deep learning requires that the neural model learn how to manipulate and compose various algebraic symbols, carrying this ability over to previously unseen expressions. Artificial neural networks, including recurrent networks and transformers, struggle to generalize on these kinds of difficult compositional problems, often exhibiting poor extrapolation performance. In contrast, recursive neural networks (recursive-NNs) are, theoretically, capable of achieving better extrapolation due to their tree-like design but are difficult to optimize as the depth of their underlying tree structure increases. To overcome this issue, we extend recursive-NNs to utilize multiplicative, higher-order synaptic connections and, furthermore, to learn to dynamically control and manipulate an external memory. We argue that this key modification gives the neural system the ability to capture powerful transition functions for each possible input. We demonstrate the effectiveness of our proposed higher-order, memory-augmented recursive-NN models on two challenging mathematical equation tasks, showing improved extrapolation, stable performance, and faster convergence. Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
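
The "multiplicative, higher-order synaptic connections" can be illustrated with a second-order recurrent step, where the state update is bilinear in the previous state and the input. Dimensions and initialization below are arbitrary, and this omits the external memory the paper adds.

```python
# Second-order (multiplicative) recurrent step via a third-order synaptic tensor.
import numpy as np

rng = np.random.default_rng(0)
H, X = 16, 8
W = rng.normal(0, 0.1, (H, X, H))   # third-order tensor of synapses

def step(h, x):
    # bilinear update: h_new[k] = tanh(sum_ij h[i] * W[i, j, k] * x[j])
    return np.tanh(np.einsum("i,ijk,j->k", h, W, x))

h = rng.normal(0, 0.1, H)           # small nonzero start state
for _ in range(5):
    h = step(h, rng.normal(size=X))
print(h)
```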

TIST Journal 2019 Journal Article

A Simple Baseline for Travel Time Estimation using Large-scale Trip Data

  • Hongjian Wang
  • Xianfeng Tang
  • Yu-Hsuan Kuo
  • Daniel Kifer
  • Zhenhui Li

The increased availability of large-scale trajectory data provides rich information for the study of urban dynamics. For example, the New York City Taxi & Limousine Commission regularly releases source/destination information of taxi trips, with 173 million taxi trips released for the year 2013 [29]. Such a big dataset offers potential new perspectives on traditional traffic problems. In this article, we study the travel time estimation problem. Instead of following the traditional route-based travel time estimation, we propose to simply use a large amount of taxi trips, without the intermediate trajectory points, to estimate the travel time between source and destination. Our experiments show very promising results. The proposed big-data-driven approach significantly outperforms both the state-of-the-art route-based method and online map services. Our study indicates that novel simple approaches can be empowered by big data and can serve as new baselines for some traditional computational problems.
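
The core idea, estimating travel time directly from historical trips with similar endpoints rather than from a route, fits in a short sketch. The neighbor count, distance metric, and synthetic trip log below are illustrative choices, not the paper's estimator.

```python
# Neighbor-based travel time estimate from raw origin/destination trips.
import numpy as np

rng = np.random.default_rng(0)
# toy trip log: origin (x, y), destination (x, y), travel time in minutes
origins = rng.uniform(0, 10, (5000, 2))
dests = rng.uniform(0, 10, (5000, 2))
times = np.linalg.norm(origins - dests, axis=1) * 3 + rng.normal(0, 1, 5000)

def estimate(o, d, k=50):
    """Average the travel times of the k historical trips whose endpoints
    are closest to the queried origin o and destination d."""
    dist = np.linalg.norm(origins - o, axis=1) + np.linalg.norm(dests - d, axis=1)
    idx = np.argsort(dist)[:k]
    return times[idx].mean()

print(estimate(np.array([1.0, 1.0]), np.array([8.0, 5.0])))
```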

AAAI Conference 2019 Conference Paper

Adversarial Training for Community Question Answer Selection Based on Multi-Scale Matching

  • Xiao Yang
  • Madian Khabsa
  • Miaosen Wang
  • Wei Wang
  • Ahmed Hassan Awadallah
  • Daniel Kifer
  • C. Lee Giles

Community-based question answering (CQA) websites represent an important source of information. As a result, the problem of matching the most valuable answers to their corresponding questions has become an increasingly popular research topic. We frame this task as a binary (relevant/irrelevant) classification problem, and present an adversarial training framework to alleviate the label imbalance issue. We employ a generative model to iteratively sample a subset of challenging negative samples to fool our classification model. Both models are alternately optimized using the REINFORCE algorithm. The proposed method is completely different from previous ones, where negative samples in the training set are directly used or uniformly down-sampled. Further, we propose Multi-scale Matching, which explicitly inspects the correlation between words and n-grams at different levels of granularity. We evaluate the proposed method on the SemEval 2016 and SemEval 2017 datasets and achieve state-of-the-art or comparable performance.
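
The REINFORCE step for a softmax sampler over candidate negatives is compact enough to sketch. Everything here is schematic: a fixed score array stands in for the classifier, and the multi-scale matcher is omitted.

```python
# REINFORCE-style hard-negative sampling over a fixed candidate pool.
import numpy as np

rng = np.random.default_rng(0)
n_cands = 100
theta = np.zeros(n_cands)                 # generator logits over candidates
clf_score = rng.uniform(0, 1, n_cands)    # stand-in: classifier P(relevant)

for _ in range(200):
    p = np.exp(theta - theta.max())
    p /= p.sum()                          # softmax sampling distribution
    i = rng.choice(n_cands, p=p)          # sample one candidate negative
    reward = clf_score[i]                 # fooling the classifier = high reward
    grad = -p
    grad[i] += 1.0                        # gradient of log p[i] w.r.t. theta
    theta += 0.5 * (reward - clf_score.mean()) * grad   # baseline-subtracted
```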

IJCAI Conference 2017 Conference Paper

Learning to Read Irregular Text with Attention Mechanisms

  • Xiao Yang
  • Dafang He
  • Zihan Zhou
  • Daniel Kifer
  • C. Lee Giles

We present a robust end-to-end neural-based model to attentively recognize text in natural images. Particularly, we focus on accurately identifying irregular (perspectively distorted or curved) text, which has not been well addressed in the previous literature. Previous research on text reading often works with regular (horizontal and frontal) text and does not adequately generalize to processing text with perspective distortion or curving effects. Our work proposes to overcome this difficulty by introducing two learning components: (1) an auxiliary dense character detection task that helps to learn text-specific visual patterns, and (2) an alignment loss that provides guidance to the training of an attention model. We show with experiments that these two components are crucial for achieving fast convergence and high classification accuracy for irregular text recognition. Our model outperforms previous work on two irregular-text datasets, SVT-Perspective and CUTE80, and is also highly competitive on several regular-text datasets containing primarily horizontal and frontal text.
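
An alignment loss of this kind can be read as supervising the attention map itself: at each decoding step, push the attention distribution toward the ground-truth character position. The formulation below (negative log-likelihood over attention rows) is a simplified assumption, not the paper's exact loss.

```python
# Simplified attention-alignment loss: each decoding step's attention row
# should concentrate on that character's ground-truth position.
import torch
import torch.nn.functional as F

def alignment_loss(attn, char_pos):
    """attn: (T, L) attention over L image positions, one row per decoding
    step; char_pos: (T,) ground-truth position index of each character."""
    return F.nll_loss(torch.log(attn + 1e-8), char_pos)

attn = torch.softmax(torch.randn(7, 32), dim=-1)   # toy attention maps
char_pos = torch.randint(0, 32, (7,))              # toy character positions
print(alignment_loss(attn, char_pos))
```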

AAAI Conference 2017 Conference Paper

Predicting Demographics of High-Resolution Geographies with Geotagged Tweets

  • Omar Montasser
  • Daniel Kifer

In this paper, we consider the problem of predicting demographics of geographic units given geotagged Tweets that are composed within these units. Traditional survey methods that offer demographics estimates are usually limited in terms of geographic resolution, geographic boundaries, and time intervals. Thus, it would be highly useful to develop computational methods that can complement traditional survey methods by offering demographics estimates at finer geographic resolutions, with flexible geographic boundaries (i.e., not confined to administrative boundaries), and at different time intervals. While prior work has focused on predicting demographics and health statistics at relatively coarse geographic resolutions such as the county-level or state-level, we introduce an approach to predict demographics at finer geographic resolutions such as the blockgroup-level. For the task of predicting gender and race/ethnicity counts at the blockgroup level, an approach adapted from prior work to our problem achieves an average correlation of 0.389 (gender) and 0.569 (race) on a held-out test dataset. Our approach outperforms this prior approach with an average correlation of 0.671 (gender) and 0.692 (race).
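
A toy version of the prediction setup: ridge regression from aggregated tweet features to demographic counts, evaluated by held-out correlation as in the abstract. All features and dimensions are synthetic placeholders.

```python
# Ridge regression from per-unit tweet features to demographic counts,
# scored by correlation on a held-out split.
import numpy as np

rng = np.random.default_rng(0)
n_units, n_feats = 400, 50
X = rng.poisson(2, (n_units, n_feats)).astype(float)   # tweet features per unit
w_true = rng.normal(0, 1, n_feats)
y = X @ w_true + rng.normal(0, 5, n_units)             # synthetic counts

tr, te = slice(0, 300), slice(300, None)
lam = 1.0                                              # ridge penalty
w = np.linalg.solve(X[tr].T @ X[tr] + lam * np.eye(n_feats), X[tr].T @ y[tr])
pred = X[te] @ w
print(np.corrcoef(pred, y[te])[0, 1])                  # held-out correlation
```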

AAAI Conference 2010 Conference Paper

What Is an Opinion About? Exploring Political Standpoints Using Opinion Scoring Model

  • Bi Chen
  • Leilei Zhu
  • Daniel Kifer
  • Dongwon Lee

In this paper, we propose a generative model to automatically discover the hidden associations between topic words and opinion words. By applying those discovered hidden associations, we construct opinion scoring models to extract statements that best express opinionists’ standpoints on certain topics. For experiments, we apply our model to the political domain. First, we visualize the similarities and dissimilarities between Republican and Democratic senators with respect to various topics. Second, we compare the performance of the opinion scoring models with 14 kinds of methods to find the best ones. We find that sentences extracted by our opinion scoring models can effectively express opinionists’ standpoints.
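
At scoring time, a model of this kind reduces to summing learned topic-opinion association weights over a sentence. The tiny association table below is hypothetical; the paper learns these associations with a generative model.

```python
# Minimal stand-in for an opinion scoring model: sum topic-opinion weights.
assoc = {("tax", "unfair"): 0.9, ("tax", "necessary"): 0.7,
         ("healthcare", "universal"): 0.8}   # hypothetical learned weights

def opinion_score(sentence, topic):
    """Score how strongly a sentence expresses an opinion on a topic."""
    words = sentence.lower().split()
    return sum(assoc.get((topic, w), 0.0) for w in words)

print(opinion_score("The tax burden is unfair to families", "tax"))  # 0.9
```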