Arrow Research search

Author name cluster

Prashanth Krishnamurthy

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers
2 author rows

Possible papers

16

TIST Journal 2026 Journal Article

REMEND: Neural Decompilation for Reverse Engineering Math Equations from Binary Executables

  • Meet Udeshi
  • Prashanth Krishnamurthy
  • Ramesh Karri
  • Farshad Khorrami

Analysis of binary executables implementing mathematical equations can benefit from the reverse engineering of semantic information about the implementation. Traditional algorithmic reverse engineering tools either do not recover semantic information or rely on dynamic analysis and symbolic execution with high reverse engineering time. Algorithmic tools also require significant re-engineering effort to target new platforms and languages. Recently, neural methods for decompilation have been developed to recover human-like source code, but they do not extract semantic information explicitly. We develop REMEND, a neural decompilation framework to reverse engineer math equations from binaries to explicitly recover program semantics like dataflow and order of operations. REMEND combines a transformer encoder–decoder model for neural decompilation with algorithmic processing for the enhanced symbolic reasoning necessary for math equations. REMEND is the first work to demonstrate that transformers for neural decompilation go beyond source code and reason about program semantics in the form of math equations. We train on a synthetically generated dataset containing multiple implementations and compilations of math equations to produce a robust neural decompilation model and demonstrate retargetability. REMEND obtains an accuracy of 89.8% to 92.4% across three Instruction Set Architectures (ISAs), three optimization levels, and two programming languages with a single trained model, extending the capability of state-of-the-art neural decompilers. We achieve high accuracy with a small model of up to 12 million parameters and an average execution time of 0.132 seconds per function. On a real-world dataset collected from open-source programs, REMEND generalizes better than state-of-the-art neural decompilers despite being trained with synthetic data, achieving 8% higher accuracy. The synthetic and real-world datasets are provided at https://hf.co/udiboy1209/REMEND.
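One way to illustrate what recovering an equation (rather than raw source) buys downstream is checking a recovered expression against the original by random-input testing. This is a hypothetical sketch: the `equivalent` helper, its tolerance, and the example expressions are illustrative assumptions, not REMEND's actual evaluation pipeline.

```python
import math
import random

def equivalent(expr_a, expr_b, variables, trials=100, tol=1e-9):
    """Probabilistic semantic check: two expressions that agree on many
    random inputs are very likely the same function (illustrative only)."""
    rng = random.Random(0)
    for _ in range(trials):
        # Sample positive inputs to avoid domain issues (log, sqrt, division).
        env = {v: rng.uniform(0.1, 10.0) for v in variables}
        env["math"] = math  # allow math.sin, math.exp, ... in expressions
        a, b = eval(expr_a, env), eval(expr_b, env)
        if abs(a - b) > tol * max(1.0, abs(a), abs(b)):
            return False
    return True

# Same function written two ways, e.g. a source equation vs. a recovered one:
assert equivalent("(x + y) ** 2", "x*x + 2*x*y + y*y", ["x", "y"])
assert not equivalent("x * y", "x + y", ["x", "y"])
```

Random-input agreement is not a proof of equivalence, but it catches wrong dataflow or operation order, which is exactly the semantic information the abstract says neural decompilers usually leave implicit.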

AAAI Conference 2026 Conference Paper

Towards Effective Offensive Security LLM Agents: Hyperparameter Tuning, LLM as a Judge, and a Lightweight CTF Benchmark

  • Minghao Shao
  • Nanda Rani
  • Kimberly Milner
  • Haoran Xi
  • Meet Udeshi
  • Saksham Aggarwal
  • Venkata Sai Charan Putrevu
  • Sandeep K. Shukla

Recent advances in LLM agentic systems have improved the automation of offensive security tasks, particularly for Capture the Flag (CTF) challenges. We systematically investigate the key factors that drive agent success and provide a detailed recipe for building effective LLM-based offensive security agents. First, we present CTFJudge, a framework leveraging LLM as a judge to analyze agent trajectories and provide granular evaluation across CTF solving steps. Second, we propose a novel metric, the CTF Competency Index (CCI), for partial correctness, revealing how closely agent solutions align with human-crafted gold standards. Third, we examine how LLM hyperparameters, namely temperature, top-p, and maximum token length, influence agent performance and automated cybersecurity task planning. For rapid evaluation, we present CTFTiny, a curated benchmark of 50 representative CTF challenges across binary exploitation, web, reverse engineering, forensics, and cryptography. Our findings identify optimal multi-agent coordination settings and lay the groundwork for future LLM agent research in cybersecurity.

TMLR Journal 2025 Journal Article

EMMA: Efficient Visual Alignment in Multi-Modal LLMs

  • Sara Ghazanfari
  • Alexandre Araujo
  • Prashanth Krishnamurthy
  • Siddharth Garg
  • Farshad Khorrami

Multi-modal Large Language Models (MLLMs) have recently exhibited impressive general-purpose capabilities by leveraging vision foundation models to encode the core concepts of images into representations. These are then combined with instructions and processed by the language model to generate high-quality responses. Despite significant progress in enhancing the language component, challenges persist in optimally fusing visual encodings within the language model for task-specific adaptability. Recent research has focused on improving this fusion through modality adaptation modules but at the cost of significantly increased model complexity and training data needs. In this paper, we propose EMMA (Efficient Multi-Modal Adaptation), a lightweight cross-modality module designed to efficiently fuse visual and textual encodings, generating instruction-aware visual representations for the language model. Our key contributions include: (1) an efficient early fusion mechanism that integrates vision and language representations with minimal added parameters (less than 0.2% increase in model size), (2) an in-depth interpretability analysis that sheds light on the internal mechanisms of the proposed method, and (3) comprehensive experiments that demonstrate notable improvements on both specialized and general benchmarks for MLLMs. Empirical results show that EMMA boosts performance across multiple tasks by up to 9.3% while significantly improving robustness against hallucinations.

ICML Conference 2025 Conference Paper

EnIGMA: Interactive Tools Substantially Assist LM Agents in Finding Security Vulnerabilities

  • Talor Abramovich
  • Meet Udeshi
  • Minghao Shao
  • Kilian Lieret
  • Haoran Xi
  • Kimberly Milner
  • Sofija Jancheska
  • John Yang 0002

Although language model (LM) agents have demonstrated increased performance in multiple domains, including coding and web-browsing, their success in cybersecurity has been limited. We present EnIGMA, an LM agent for autonomously solving Capture The Flag (CTF) challenges. We introduce new tools and interfaces to improve the agent's ability to find and exploit security vulnerabilities, focusing on interactive terminal programs. These novel Interactive Agent Tools enable LM agents, for the first time, to run interactive utilities, such as a debugger and a server connection tool, which are essential for solving these challenges. Empirical analysis on 390 CTF challenges across four benchmarks demonstrates that these new tools and interfaces substantially improve our agent's performance, achieving state-of-the-art results on NYU CTF, Intercode-CTF, and CyBench. Finally, we analyze data leakage, developing new methods to quantify it and identifying a new phenomenon we term soliloquizing, where the model self-generates hallucinated observations without interacting with the environment.

ICML Conference 2025 Conference Paper

MP-Nav: Enhancing Data Poisoning Attacks against Multimodal Learning

  • Jingfeng Zhang
  • Prashanth Krishnamurthy
  • Naman Patel
  • Anthony Tzes
  • Farshad Khorrami

Despite the success of current multimodal learning at scale, its susceptibility to data poisoning attacks poses security concerns in critical applications. An attacker can manipulate model behavior by injecting maliciously crafted yet minute instances into the training set, stealthily mismatching distinct concepts. Recent studies have demonstrated this vulnerability by poisoning multimodal tasks such as Text-Image Retrieval (TIR) and Visual Question Answering (VQA). However, current attack methods rely only on a random choice of concepts for misassociation and random instance selection for injecting the poisoning noise, which often achieves a suboptimal effect and even risks failure due to the dilution of poisons by the large number of benign instances. This study introduces MP-Nav (Multimodal Poison Navigator), a plug-and-play module designed to evaluate and even enhance data poisoning attacks against multimodal models. MP-Nav operates at both the concept and instance levels, identifying semantically similar concept pairs and selecting robust instances to maximize attack efficacy. The experiments corroborate that MP-Nav can significantly improve the efficacy of state-of-the-art data poisoning attacks such as AtoB and ShadowCast in multimodal tasks, and maintain model utility across diverse datasets. Notably, this study underscores the vulnerabilities of multimodal models and calls for corresponding defenses.

ICRA Conference 2025 Conference Paper

MultiTalk: Introspective and Extrospective Dialogue for Human-Environment-LLM Alignment

  • Venkata Naren Devarakonda
  • Ali Umut Kaypak
  • Shuaihang Yuan
  • Prashanth Krishnamurthy
  • Yi Fang 0006
  • Farshad Khorrami

LLMs have shown promising results in task planning due to their strong natural language understanding and reasoning capabilities. However, issues such as hallucinations, ambiguities in human instructions, environmental constraints, and limitations in the executing agent's capabilities often lead to flawed or incomplete plans. This paper proposes MultiTalk, an LLM-based task planning methodology that addresses these issues through a framework of introspective and extrospective dialogue loops. This approach helps ground generated plans in the context of the environment and the agent's capabilities, while also resolving uncertainties and ambiguities in the given task. These loops are enabled by specialized systems designed to extract and predict task-specific states, and flag mismatches or misalignments among the human user, the LLM agent, and the environment. Effective feedback pathways between these systems and the LLM planner foster meaningful dialogue. The efficacy of this methodology is demonstrated through its application to robotic manipulation tasks. Experiments and ablations highlight the robustness and reliability of our method, and comparisons with baselines further illustrate the superiority of MultiTalk in task planning for embodied agents. Project Website: https://llm-multitalk.github.io/

NeurIPS Conference 2025 Conference Paper

OSVI-WM: One-Shot Visual Imitation for Unseen Tasks using World-Model-Guided Trajectory Generation

  • Raktim Goswami
  • Prashanth Krishnamurthy
  • Yann LeCun
  • Farshad Khorrami

Visual imitation learning enables robotic agents to acquire skills by observing expert demonstration videos. In the one-shot setting, the agent generates a policy after observing a single expert demonstration without additional fine-tuning. Existing approaches typically train and evaluate on the same set of tasks, varying only object configurations, and struggle to generalize to unseen tasks with different semantic or structural requirements. While some recent methods attempt to address this, they exhibit low success rates on hard test tasks that, despite being visually similar to some training tasks, differ in context and require distinct responses. Additionally, most existing methods lack an explicit model of environment dynamics, limiting their ability to reason about future states. To address these limitations, we propose a novel framework for one-shot visual imitation learning via world-model-guided trajectory generation. Given an expert demonstration video and the agent’s initial observation, our method leverages a learned world model to predict a sequence of latent states and actions. This latent trajectory is then decoded into physical waypoints that guide the agent’s execution. Our method is evaluated on two simulated benchmarks and three real-world robotic platforms, where it consistently outperforms prior approaches, with over 30% improvement in some cases.

ICRA Conference 2024 Conference Paper

High-Dimensional Controller Tuning through Latent Representations

  • Alireza Sarmadi
  • Prashanth Krishnamurthy
  • Farshad Khorrami

In this paper, we propose a method to automatically and efficiently tune high-dimensional vectors of controller parameters. The proposed method first learns a mapping from the high-dimensional controller parameter space to a lower-dimensional space using a machine learning-based algorithm. This mapping is then utilized in an actor-critic framework using Bayesian optimization (BO). The proposed approach is applicable to complex systems (such as quadruped robots) and enables efficient generalization to different control tasks while reducing the number of evaluations required to tune the controller parameters. We evaluate our method on a legged locomotion application, showing the efficacy of the algorithm in tuning the high-dimensional controller parameters with fewer evaluations. Moreover, the method is shown to generalize to new tasks and to be transferable to other robot dynamics.
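A minimal sketch of the latent-space tuning idea, with loudly labeled stand-ins: PCA substitutes for the paper's learned dimensionality-reduction mapping, plain random search substitutes for Bayesian optimization, and the quadratic `cost` substitutes for a real controller rollout (e.g. a quadruped locomotion trial).

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, N = 50, 3, 200  # high-dim params, latent dim, "demonstration" samples

# Controller parameter vectors that happen to lie near a d-dimensional
# subspace, mimicking the structure a learned mapping can exploit.
basis = rng.standard_normal((D, d))
samples = rng.standard_normal((N, d)) @ basis.T + 0.01 * rng.standard_normal((N, D))

# "Learn" the mapping: top-d principal directions of the sample set.
mean = samples.mean(axis=0)
_, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
decode = lambda z: mean + z @ vt[:d]  # latent vector -> full controller params

# Toy tuning objective over the full D-dim parameter vector.
target = decode(np.array([1.0, -0.5, 0.25]))
cost = lambda theta: float(np.sum((theta - target) ** 2))

# Search in the d-dim latent space instead of the D-dim parameter space:
# each candidate costs one "rollout", so low dimensionality means far
# fewer evaluations are needed to find good parameters.
best_z = min((rng.standard_normal(d) for _ in range(2000)),
             key=lambda z: cost(decode(z)))
print("best cost found:", cost(decode(best_z)))
```

The design point this illustrates is the one in the abstract: the expensive search happens in the low-dimensional space, and the learned mapping decodes each candidate back to a full parameter vector for evaluation.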

ICLR Conference 2024 Conference Paper

LipSim: A Provably Robust Perceptual Similarity Metric

  • Sara Ghazanfari
  • Alexandre Araujo
  • Prashanth Krishnamurthy
  • Farshad Khorrami
  • Siddharth Garg

Recent years have seen growing interest in developing and applying perceptual similarity metrics. Research has shown the superiority of perceptual metrics over pixel-wise metrics in aligning with human perception and serving as a proxy for the human visual system. On the other hand, as perceptual metrics rely on neural networks, there is a growing concern regarding their resilience, given the established vulnerability of neural networks to adversarial attacks. It is indeed logical to infer that perceptual metrics may inherit both the strengths and shortcomings of neural networks. In this work, we demonstrate the vulnerability of state-of-the-art perceptual similarity metrics based on an ensemble of ViT-based feature extractors to adversarial attacks. We then propose a framework to train a robust perceptual similarity metric called LipSim (Lipschitz Similarity Metric) with provable guarantees. By leveraging 1-Lipschitz neural networks as the backbone, LipSim provides guarded areas around each data point and certificates for all perturbations within an $\ell_2$ ball. Finally, a comprehensive set of experiments shows the performance of LipSim in terms of natural and certified scores and on the image retrieval application.
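The certificate behind a 1-Lipschitz backbone can be sketched in a few lines: if the embedding cannot expand distances, an ℓ2 perturbation of size eps can move the metric by at most eps. The toy orthogonal map below stands in for LipSim's actual network (an assumption for illustration; orthogonal matrices preserve the ℓ2 norm, so the map has Lipschitz constant exactly 1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-Lipschitz "feature extractor": an orthogonal linear map.
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))
f = lambda x: Q @ x

def metric(x1, x2):
    """Perceptual-style distance: l2 gap between 1-Lipschitz embeddings."""
    return float(np.linalg.norm(f(x1) - f(x2)))

x1, x2 = rng.standard_normal(16), rng.standard_normal(16)
d0 = metric(x1, x2)

# Certificate: for any perturbation with ||delta||_2 <= eps, the metric can
# move by at most eps (triangle inequality + 1-Lipschitzness of f).
eps = 0.1
delta = rng.standard_normal(16)
delta = eps * delta / np.linalg.norm(delta)
assert abs(metric(x1 + delta, x2) - d0) <= eps + 1e-9
```

This is the "guarded area" in miniature: the certified radius follows from the Lipschitz bound alone, with no need to enumerate perturbations.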

NeurIPS Conference 2024 Conference Paper

NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security

  • Minghao Shao
  • Sofija Jancheska
  • Meet Udeshi
  • Brendan Dolan-Gavitt
  • Haoran Xi
  • Kimberly Milner
  • Boyuan Chen
  • Max Yin

Large Language Models (LLMs) are being deployed across various domains today. However, their capacity to solve Capture the Flag (CTF) challenges in cybersecurity has not been thoroughly evaluated. To address this, we develop a novel method to assess LLMs in solving CTF challenges by creating a scalable, open-source benchmark database specifically designed for these applications. This database includes metadata for LLM testing and adaptive learning, compiling a diverse range of CTF challenges from popular competitions. Utilizing the advanced function calling capabilities of LLMs, we build a fully automated system with an enhanced workflow and support for external tool calls. Our benchmark dataset and automated framework allow us to evaluate the performance of five LLMs, encompassing both black-box and open-source models. This work lays the foundation for future research into improving the efficiency of LLMs in interactive cybersecurity tasks and automated task planning. By providing a specialized benchmark, our project offers an ideal platform for developing, testing, and refining LLM-based approaches to vulnerability detection and resolution. Evaluating LLMs on these challenges and comparing with human performance yields insights into their potential for AI-driven cybersecurity solutions to perform real-world threat management. We make our benchmark dataset open source to the public at https://github.com/NYU-LLM-CTF/NYU_CTF_Bench along with our playground automation framework at https://github.com/NYU-LLM-CTF/llm_ctf_automation.

IROS Conference 2019 Conference Paper

Adaptive Adversarial Videos on Roadside Billboards: Dynamically Modifying Trajectories of Autonomous Vehicles

  • Naman Patel
  • Prashanth Krishnamurthy
  • Siddharth Garg
  • Farshad Khorrami

Deep neural networks (DNNs) are being incorporated into various autonomous systems like self-driving cars and robots. However, there is a rising concern about the robustness of these systems because of their susceptibility to adversarial attacks on DNNs. Past research has established that DNNs used for classification and object detection are prone to attacks causing targeted misclassification. In this paper, we show the effectiveness of an adversarial dynamic attack on an end-to-end trained DNN controlling an autonomous vehicle. We launch the attack by installing a billboard on the roadside and displaying videos to approaching vehicles to cause the DNN controller in the vehicle to generate steering commands that cause, for example, unintended lane changes or motion off the road, causing accidents. The billboard has an integrated camera estimating the pose of the oncoming vehicle. The approach enables dynamic adversarial perturbation that adapts to the relative pose of the vehicle and uses the dynamics of the vehicle to steer it along adversary-chosen trajectories while being robust to variations in view, lighting, and weather. We demonstrate the effectiveness of the attack on a recently published off-the-shelf end-to-end learning-based autonomous navigation system in a high-fidelity simulator, CARLA (CAR Learning to Act). The proposed approach may also be applied to other systems driven by an end-to-end trained network.

IROS Conference 2018 Conference Paper

Adversarial Learning-Based On-Line Anomaly Monitoring for Assured Autonomy

  • Naman Patel
  • Apoorva Nandini Saridena
  • Anna Choromanska
  • Prashanth Krishnamurthy
  • Farshad Khorrami

The paper proposes an on-line monitoring framework for continuous real-time safety/security in learning-based control systems (specifically applied to an unmanned ground vehicle). We monitor the validity of mappings from sensor inputs to actuator commands (controller-focused anomaly detection, CFAM) and from actuator commands to sensor inputs (system-focused anomaly detection, SFAM). CFAM is an image-conditioned energy-based generative adversarial network (EBGAN) in which the energy-based discriminator distinguishes between proper and anomalous actuator commands. SFAM is based on an action-conditioned video prediction framework to detect anomalies between predicted and observed temporal evolution of sensor data. We demonstrate the effectiveness of the approach on our autonomous ground vehicle for indoor environments and on the Udacity dataset for outdoor environments.

IROS Conference 2017 Conference Paper

Sensor modality fusion with CNNs for UGV autonomous driving in indoor environments

  • Naman Patel
  • Anna Choromanska
  • Prashanth Krishnamurthy
  • Farshad Khorrami

We present a novel end-to-end learning framework to enable ground vehicles to autonomously navigate unknown environments by fusing raw pixels from cameras and depth measurements from a LiDAR. A deep neural network architecture is introduced to effectively perform modality fusion and reliably predict steering commands even in the presence of sensor failures. The proposed network is trained on our own dataset, collected from a LiDAR and a camera mounted on a UGV in an indoor corridor environment. A comprehensive experimental evaluation demonstrates the robustness of our network architecture and shows that the proposed network is able to autonomously navigate the corridor environment. Furthermore, we demonstrate that the fusion of the camera and LiDAR modalities provides further benefits beyond robustness to sensor failures. Specifically, the multimodal fused system shows a potential to navigate around static and dynamic obstacles and to handle changes in environment geometry without being trained for these tasks.

IROS Conference 2015 Conference Paper

Low-profile crawling for humanoid motion in tight spaces

  • Griswald Brooks
  • Prashanth Krishnamurthy
  • Farshad Khorrami

While humanoid robots and various associated algorithms for gait control and motion planning have been considered in the prior literature, the possibility of utilizing a low-profile crawling gait to operate in tight spaces (e.g., going under obstacles) has not been considered previously; such a gait enables new capabilities for humanoid robots. In this paper, a new gait for humanoid robots is presented that enables humanoid motion in tight spaces that are vertically constrained. The gait is laterally symmetric and utilizes a cooperative motion of both the hands and feet. The addition of this gait expands the set of environments that can be handled by the humanoid robot. The efficacy of the proposed gait is demonstrated experimentally on a NAO humanoid robot.

IROS Conference 2007 Conference Paper

Control design for unmanned sea surface vehicles: hardware-in-the-loop simulator and experimental results

  • Prashanth Krishnamurthy
  • Farshad Khorrami
  • Tzer Leei Ng

We address the control design problem for stabilization and tracking of unmanned sea surface vehicles (USSVs). To this end, we describe the design and implementation of a high-accuracy real-time six degree-of-freedom (DOF) hardware-in-the-loop (HITL) simulation platform for use in development and evaluation of controllers for USSVs. The HITL platform incorporates a nonlinear dynamic model of the USSV, emulation of sensors and instrumentation onboard the USSV, and the actual hardware and software components used for control of the USSV in the experimental testbed. Detailed models of hydrodynamic effects, actuators including thrusters/propellers and control surfaces, and disturbances including ocean currents, waves, and wind are included in the dynamic simulation. The fidelity of the developed HITL simulator is demonstrated through comparisons with experimental data collected from a USSV. We also propose a nonlinear backstepping-based controller for stabilization and tracking for USSVs and present closed-loop results from HITL simulation and experimental testing.

IROS Conference 2003 Conference Paper

TriM: an ultra-accurate high-speed six degree-of-freedom manipulator using linear motors

  • Prashanth Krishnamurthy
  • Farshad Khorrami

In this paper, we propose a novel six degree-of-freedom positioning system. This device is a tripod structure with inextensible limbs actuated at the base by two-dimensional linear stepper motors (although other types of actuators may be utilized). The kinematics (both direct and inverse) and dynamics are presented in detail. The dynamics of the actuators (Sawyer motors) are also included in the dynamic modeling. We also carry out a kinematic optimization of the system parameters to maximize the manipulator workspace. The proposed manipulator achieves a large range of motion in all six degrees of freedom. Furthermore, high-resolution and high-speed motion may be achieved in all axes.