
Author name cluster

Thai Le

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers

AAAI Conference 2026 Short Paper

Style-First Authorship Verification for Academic Integrity in the Generative AI Era (Student Abstract)

  • Jun Jang
  • Thai Le
  • Bo Wang

With the rise of generative artificial intelligence (GenAI), academic dishonesty in classrooms has skyrocketed, yet the existing solutions for detecting such dishonesty often fall short. Standard "AI detectors" merely analyze one text at a time, failing to account for students' previous writings, which risks erroneous predictions. Meanwhile, existing token-based authorship verification (AV) models fail to analyze the nuances in writing styles that truly distinguish authorship. To fill this gap, we propose a novel AV framework that combines token-level stylometric features (e.g., POS tag patterns) with handcrafted stylistic features (e.g., sentence structure variation) to construct a comprehensive feature set. Using both benchmark corpora and real-world high school student essays, we trained multiple machine learning classifiers on the proposed feature set. Our initial experiments show that our approach outperforms the standard token-only baselines by over 25%, while offering interpretable, style-based insights. These preliminary results highlight the importance of nuanced stylistic features and suggest that a holistic AV system can provide educators with more reliable and transparent detection tools. Looking ahead, we plan to extend this work with large language models and multi-agent approaches to further enhance robustness and adaptability.
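
The abstract names two concrete feature groups: token-level POS-tag patterns and handcrafted stylistic features such as sentence-length variation, combined and fed to standard classifiers. The sketch below is a rough illustration of that kind of pipeline, not the authors' implementation; the POS-bigram representation, the pair-difference encoding, and the logistic-regression classifier are assumptions made for illustration.

```python
# Illustrative sketch only (not the paper's code): combine POS-tag bigram counts with
# handcrafted stylistic features for a same-author / different-author classifier.
# Assumes the nltk data packages "punkt" and "averaged_perceptron_tagger" are installed.
import numpy as np
import nltk
from collections import Counter
from sklearn.linear_model import LogisticRegression

def pos_ngram_counts(text, n=2):
    """Count POS-tag bigrams as a crude proxy for token-level style."""
    tags = [t for _, t in nltk.pos_tag(nltk.word_tokenize(text))]
    return Counter(zip(tags, tags[1:]))

def stylistic_features(text):
    """Handcrafted features: mean and std of sentence length (in tokens)."""
    lengths = [len(nltk.word_tokenize(s)) for s in nltk.sent_tokenize(text)]
    return np.array([np.mean(lengths), np.std(lengths)]) if lengths else np.zeros(2)

def pair_features(doc_a, doc_b, vocab):
    """Represent a document pair by absolute differences of both feature groups."""
    ca, cb = pos_ngram_counts(doc_a), pos_ngram_counts(doc_b)
    pos_diff = np.array([abs(ca[g] - cb[g]) for g in vocab], dtype=float)
    style_diff = np.abs(stylistic_features(doc_a) - stylistic_features(doc_b))
    return np.concatenate([pos_diff, style_diff])

def train_verifier(pairs, labels):
    """pairs: list of (text_a, text_b); labels: 1 = same author, 0 = different author."""
    vocab = sorted({g for a, b in pairs for g in pos_ngram_counts(a) | pos_ngram_counts(b)})
    X = np.array([pair_features(a, b, vocab) for a, b in pairs])
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    return clf, vocab
```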

AAAI Conference 2025 Conference Paper

NOMATTERXAI: Generating “No Matter What” Alterfactual Examples for Explaining Black-Box Text Classification Models

  • Tuc Van Nguyen
  • James Michels
  • Hua Shen
  • Thai Le

In Explainable AI (XAI), counterfactual explanations (CEs) are a well-studied method to communicate feature relevance through contrastive reasoning of "what if" to explain AI models' predictions. However, they only focus on important (i.e., relevant) features and largely disregard less important (i.e., irrelevant) ones. Such irrelevant features can be crucial in many applications, especially when users need to ensure that an AI model's decisions are not affected or biased against specific attributes such as gender, race, religion, or political affiliation. To address this gap, the concept of alterfactual explanations (AEs) has been proposed. AEs explore an alternative reality of "no matter what", where irrelevant features are substituted with alternative features (e.g., "republicans" → "democrats") within the same attribute (e.g., "politics") while maintaining a similar prediction output. This serves to validate whether the specified attributes influence AI model predictions. Despite the promise of AEs, there is a lack of computational approaches to systematically generate them, particularly in the text domain, where creating AEs for AI text classifiers presents unique challenges. This paper addresses this challenge by formulating AE generation as an optimization problem and introducing NoMatterXAI, a novel algorithm that generates AEs for text classification tasks. Our approach achieves high fidelity of up to 95% while preserving context similarity of over 90% across multiple models and datasets. A human study further validates the effectiveness of AEs in explaining AI text classifiers to end users.
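
The core idea stated in the abstract is substituting attribute words with in-attribute alternatives while keeping the classifier's output essentially unchanged. The sketch below mimics that idea with a greedy word-substitution loop; it is not the NoMatterXAI algorithm, and the lexicon, the `classify` black box, and the probability-shift threshold are hypothetical.

```python
# Illustrative sketch only: greedily swap attribute words for in-attribute alternatives
# and keep a swap only if the black-box prediction stays (nearly) unchanged.
# `classify` is a hypothetical function returning class probabilities for a text.
from typing import Callable, Dict, List

ATTRIBUTE_LEXICON: Dict[str, List[str]] = {
    # words belonging to the same attribute that may be exchanged for one another
    "republicans": ["democrats"],
    "democrats": ["republicans"],
}

def generate_alterfactual(text: str,
                          classify: Callable[[str], List[float]],
                          max_prob_shift: float = 0.05) -> str:
    base_probs = classify(text)
    base_label = max(range(len(base_probs)), key=base_probs.__getitem__)
    tokens = text.split()
    for i, tok in enumerate(tokens):
        for alt in ATTRIBUTE_LEXICON.get(tok.lower(), []):
            candidate_tokens = tokens[:i] + [alt] + tokens[i + 1:]
            probs = classify(" ".join(candidate_tokens))
            label = max(range(len(probs)), key=probs.__getitem__)
            # keep the substitution only if label and confidence are preserved
            if label == base_label and abs(probs[label] - base_probs[base_label]) <= max_prob_shift:
                tokens = candidate_tokens
                break
    return " ".join(tokens)
```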

AAMAS Conference 2025 Conference Paper

xSRL: Safety-Aware Explainable Reinforcement Learning - Safety as a Product of Explainability

  • Risal Shahriar Shefin
  • Md Asifur Rahman
  • Thai Le
  • Sarra Alqahtani

Reinforcement learning (RL) has shown great promise in simulated environments, such as games, where failures have minimal consequences. However, the deployment of RL agents in real-world systems such as autonomous vehicles, robotics, UAVs, and medical devices demands a higher level of safety and transparency, particularly when facing adversarial threats. Safe RL algorithms aim to address these concerns by optimizing both task performance and safety constraints. However, errors are inevitable, and when they occur, it is essential that RL agents can explain their actions to human operators. This makes trust in the safety mechanisms of RL systems crucial for effective deployment. Explainability plays a key role in building this trust by providing clear, actionable insights into the agent’s decision-making process, ensuring that safety-critical decisions are well understood. While machine learning (ML) has seen significant advances in interpretability and visualization, explainability methods for RL remain limited. Current tools fail to address the dynamic, sequential nature of RL and its need to balance task performance with safety constraints over time. The re-purposing of traditional ML methods, such as saliency maps, is inadequate for safety-critical RL applications where mistakes can result in severe consequences. To bridge this gap, we propose xSRL, a framework that integrates both local and global explanations to provide a comprehensive understanding of RL agents’ behavior. In addition, xSRL enables developers to identify policy vulnerabilities through adversarial attacks, offering tools to debug and patch agents without retraining. Thus, xSRL enhances RL safety as a byproduct of explainability and transparency. Our experiments and user studies demonstrate xSRL’s effectiveness in increasing safety in RL systems, making them more reliable and trustworthy for real-world deployment. Code is available at https://github.com/risal-shefin/xSRL

AAAI Conference 2024 Conference Paper

ALISON: Fast and Effective Stylometric Authorship Obfuscation

  • Eric Xing
  • Saranya Venkatraman
  • Thai Le
  • Dongwon Lee

Authorship Attribution (AA) and Authorship Obfuscation (AO) are two competing tasks of increasing importance in privacy research. Modern AA leverages an author's consistent writing style to match a text to its author using an AA classifier. AO is the corresponding adversarial task, aiming to modify a text in such a way that its semantics are preserved, yet an AA model cannot correctly infer its authorship. To address privacy concerns raised by state-of-the-art (SOTA) AA methods, new AO methods have been proposed but remain largely impractical to use due to their prohibitively slow training and obfuscation speed, often taking hours. To address this challenge, we propose a practical AO method, ALISON, that (1) dramatically reduces training/obfuscation time, demonstrating more than 10x faster obfuscation than SOTA AO methods, (2) achieves better obfuscation success when attacking three transformer-based AA methods on two benchmark datasets, typically performing 15% better than competing methods, (3) does not require direct signals from a target AA classifier during obfuscation, and (4) utilizes unique stylometric features, allowing sound model interpretation for explainable obfuscation. We also demonstrate that ALISON can effectively prevent four SOTA AA methods from accurately determining the authorship of ChatGPT-generated texts, all while minimally changing the original text semantics. To ensure the reproducibility of our findings, our code and data are available at: https://github.com/EricX003/ALISON.
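
The abstract emphasizes interpretable stylometric features and obfuscation that needs no signal from the target AA classifier. The sketch below shows one generic way such a pipeline could look, using an internal surrogate model to rank author-revealing character n-grams and flag the sentences that carry them; it is not the ALISON algorithm, and the feature choice, surrogate model, and "flag for rewriting" step are assumptions.

```python
# Hypothetical sketch (not ALISON): rank author-revealing character n-grams with a
# surrogate classifier trained only on reference texts, then flag sentences containing
# the most revealing n-grams as candidates for paraphrasing.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def top_revealing_ngrams(train_texts, train_authors, author, k=20):
    """Rank character n-grams whose weights most push predictions toward `author`."""
    vec = TfidfVectorizer(analyzer="char", ngram_range=(2, 4))
    X = vec.fit_transform(train_texts)
    clf = LogisticRegression(max_iter=1000).fit(X, train_authors)
    idx = list(clf.classes_).index(author)
    coefs = clf.coef_[idx] if clf.coef_.shape[0] > 1 else clf.coef_[0]
    order = np.argsort(coefs)[::-1][:k]
    names = vec.get_feature_names_out()
    return [names[i] for i in order]

def flag_sentences(text, revealing_ngrams):
    """Mark sentences containing highly author-revealing n-grams for rewriting."""
    return [s for s in text.split(".") if any(g in s.lower() for g in revealing_ngrams)]
```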

ECAI Conference 2024 Conference Paper

TopFormer: Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles

  • Adaku Uchendu
  • Thai Le
  • Dongwon Lee 0001

Recent advances in Large Language Models (LLMs) have enabled the generation of open-ended, high-quality texts that are non-trivial to distinguish from human-written texts. We refer to such LLM-generated texts as deepfake texts. There are currently over 72K text generation models in the Hugging Face model repository. As such, users with malicious intent can easily use these open-sourced LLMs to generate harmful texts and dis/misinformation at scale. To mitigate this problem, a computational method to determine if a given text is a deepfake text or not is desired, i.e., a Turing Test (TT). In particular, in this work, we investigate the more general version of the problem, known as Authorship Attribution (AA), in a multi-class setting, i.e., not only determining if a given text is a deepfake text or not but also being able to pinpoint which LLM is the author. We propose TopFormer to improve existing AA solutions by capturing more linguistic patterns in deepfake texts through a Topological Data Analysis (TDA) layer in the Transformer-based model. We show the benefits of having a TDA layer when dealing with imbalanced and multi-style datasets by extracting TDA features from the reshaped pooled_output of our backbone as input. This Transformer-based model captures contextual representations (i.e., semantic and syntactic linguistic features), while TDA captures the shape and structure of data (i.e., linguistic structures). Finally, TopFormer outperforms all baselines on all 3 datasets, achieving up to a 7% increase in Macro F1 score. Our code and datasets are available at: https://github.com/AdaUchendu/topformer
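
The abstract describes extracting TDA features from a reshaped pooled output and combining them with the Transformer's contextual representation. The sketch below illustrates that general idea only; it is not the TopFormer implementation, and the reshape size, the use of the `ripser` library, the persistence summary, and the backbone choice are assumptions for illustration.

```python
# Hypothetical sketch: augment a Transformer's pooled output with simple TDA features
# computed from the pooled vector reshaped into a small point cloud, then classify.
import numpy as np
import torch
import torch.nn as nn
from ripser import ripser
from transformers import AutoModel

def persistence_summary(pooled: torch.Tensor, shape=(24, 32), maxdim=1) -> np.ndarray:
    """Reshape a 768-dim pooled vector into a point cloud and summarize its
    persistence diagrams (total finite persistence per homology dimension)."""
    cloud = pooled.detach().cpu().numpy().reshape(shape)
    dgms = ripser(cloud, maxdim=maxdim)["dgms"]
    feats = []
    for dgm in dgms:
        finite = dgm[np.isfinite(dgm).all(axis=1)]
        feats.append(float((finite[:, 1] - finite[:, 0]).sum()) if len(finite) else 0.0)
    return np.array(feats, dtype=np.float32)

class TopoAugmentedClassifier(nn.Module):
    """Pooled backbone representation concatenated with a TDA summary, then a linear head."""
    def __init__(self, backbone_name="roberta-base", num_labels=3, tda_dim=2):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone_name)
        hidden = self.backbone.config.hidden_size  # 768 for roberta-base
        self.head = nn.Linear(hidden + tda_dim, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # [CLS]-style pooled vector
        tda = torch.stack([torch.from_numpy(persistence_summary(p)) for p in pooled])
        return self.head(torch.cat([pooled, tda.to(pooled.device)], dim=-1))
```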

AAMAS Conference 2022 Conference Paper

CAPS: Comprehensible Abstract Policy Summaries for Explaining Reinforcement Learning Agents

  • Joe McCalmon
  • Thai Le
  • Sarra Alqahtani
  • Dongwon Lee

As reinforcement learning (RL) continues to improve and be applied in situations alongside humans, the need to explain the learned behaviors of RL agents to end-users becomes more important. Strategies for explaining the reasoning behind an agent’s policy, called policy-level explanations, can lead to important insights about both the task and the agent’s behaviors. Following this line of research, in this work we propose a novel approach, named CAPS, that summarizes an agent’s policy in the form of a directed graph with natural language descriptions. A decision-tree-based clustering method is used to abstract the state space of the task into fewer, condensed states, which makes the policy graphs more digestible to end-users. This abstraction allows users to control the size of the policy graph to achieve their desired balance between comprehensibility and accuracy. In addition, we develop a heuristic optimization method to find the most explainable graph policy and present it to the users. Finally, we use user-defined predicates to enrich the abstract states with semantic meaning. We test our approach on 5 RL tasks, using both deterministic and stochastic policies, and show that our method is: (1) agnostic to the algorithms used to train the policies, and (2) comparable in accuracy and superior in explanation capabilities to existing baselines. In particular, when provided with our explanation graph, end-users are able to accurately interpret policies of trained RL agents 80% of the time, compared to 10% when provided with the next best baseline. We make our code and datasets available to ensure the reproducibility of our research findings: https://github.com/mccajl/CAPS
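
The abstract outlines a concrete pipeline: cluster states with a decision tree into a small number of abstract states and summarize the policy as a directed graph over them. The sketch below is a simplified, hypothetical version of that idea, not the CAPS code; the tree depth, the use of actions as the tree's training target, and the networkx representation are illustrative assumptions.

```python
# Hypothetical sketch (not CAPS): a small decision tree groups visited states into
# abstract states (its leaves), and observed transitions between abstract states are
# collected into a directed graph. Assumes discrete integer actions.
import networkx as nx
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def abstract_policy_graph(states: np.ndarray, actions: np.ndarray, max_leaves: int = 8):
    # Fit a tree so that states mapped to the same leaf tend to share an action.
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaves).fit(states, actions)
    leaves = tree.apply(states)  # abstract-state id for each visited state
    graph = nx.DiGraph()
    for t in range(len(states) - 1):
        src, dst = int(leaves[t]), int(leaves[t + 1])
        if graph.has_edge(src, dst):
            graph[src][dst]["count"] += 1
        else:
            graph.add_edge(src, dst, count=1)
    # Label each abstract state with its most common action (a stand-in for a
    # natural-language description derived from user-defined predicates).
    for leaf in set(leaves):
        acts = actions[leaves == leaf]
        graph.add_node(int(leaf), action=int(np.bincount(acts).argmax()))
    return graph
```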