
AAAI 2026

Do Large Language Models (LLMs) Understand Chronology? (Student Abstract)

Short Paper · AAAI Student Abstract and Poster Program · Artificial Intelligence

Abstract

Large language models (LLMs) have shown great potential as forecasting tools in finance and economics, but backtesting performance is subject to look-ahead bias if the evaluation period overlaps with an LLM's training window. Prompt-based attempts to avoid look-ahead bias require that LLMs understand chronology. We test LLMs' ability to understand and enforce chronological order in three task types: sorting randomly shuffled historical events; conditional sorting, in which events must first be filtered by a stated condition and then ordered; and anachronism detection based on intersections of multiple timelines. Our experiments use only events that we first confirm are known to the LLM, ensuring that we test chronological understanding on the LLM's pretrained internal knowledge. Across three LLM families (GPT-4.1, a standard model; GPT-5, a hybrid-reasoning model; and Claude 3.7 Sonnet, a large-reasoning model, evaluated with and without Extended Thinking), we find that performance degrades rapidly with problem complexity but improves substantially for reasoning models under test-time extended reasoning. These patterns matter for real-time applications of LLMs in finance.
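The sorting tasks described above can be scored by comparing a model's ordering against the ground-truth timeline. A minimal sketch, assuming a pairwise-concordance metric and illustrative event names (the paper does not specify its exact metric or event set):

```python
import random

def pairwise_accuracy(pred, truth):
    """Fraction of event pairs whose relative order in `pred`
    agrees with the ground-truth chronology in `truth`."""
    pos = {event: i for i, event in enumerate(truth)}
    concordant = total = 0
    for i in range(len(pred)):
        for j in range(i + 1, len(pred)):
            total += 1
            if pos[pred[i]] < pos[pred[j]]:
                concordant += 1
    return concordant / total

# Hypothetical ground-truth timeline, in chronological order.
truth = [
    "Bretton Woods agreement (1944)",
    "Nixon shock (1971)",
    "Black Monday (1987)",
    "Global financial crisis (2008)",
]

# The shuffled list is what would be sent to the LLM in a sorting task.
shuffled = truth[:]
random.seed(0)
random.shuffle(shuffled)

# Here we score a stand-in "model answer" rather than a live LLM call.
model_answer = truth[:]  # a perfect response
print(pairwise_accuracy(model_answer, truth))  # 1.0
print(model_answer == truth)                   # exact-match check: True
```

Pairwise concordance degrades gracefully with partial errors, whereas an exact-match check gives credit only for a fully correct ordering; both views are useful when performance drops with problem complexity.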

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
AAAI Conference on Artificial Intelligence
Archive span
1980-2026
Indexed papers
28718
Paper id
321548795509601040