Introduction
Human perception is inherently temporal, shaped by deadlines and uncertainty. While behavioral science interprets hesitation and response switching as signals of cognitive control, AI evaluation typically relies on static accuracy, ignoring how decisions evolve. As Vision-Language Models (VLMs) become decision partners in time-sensitive tasks, we need metrics that quantify their reliability over time. We introduce the Temporal Hallucination Index (THI), a behaviorally grounded metric designed to measure temporal instability and enable direct human–AI comparison.
Methods
THI captures response failures including delays, timeouts, drift, and persistence. We operationalized THI using a classic Tumbling-E visual acuity task in which a randomized staircase modulates difficulty by varying stimulus size (in arcmin). The protocol imposes a response deadline (3 s for humans; 17 s for AI) and records choices, reaction times, and confidence. This granular tracking allows us to dissociate simple perceptual limits from failures in decisional stability and temporal control.
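As an illustration only, the sketch below shows one way per-trial logs of this kind could be turned into a single instability score. The abstract does not state the THI formula, so everything here is an assumption: the `Trial` fields, the `flag_trial` and `temporal_instability` names, the 80% "delay" cutoff, the reading of "drift" as a within-trial response switch, and the equal-weight aggregation (fraction of trials with at least one flagged failure) are all hypothetical and are not the authors' definition.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Trial:
    """One logged trial (hypothetical schema, not taken from the source)."""
    choice: Optional[str]   # final response, e.g. "up"; None if no response
    correct: str            # true orientation of the E
    rt: Optional[float]     # reaction time in seconds; None if no response
    n_switches: int         # response changes made before the final answer


def flag_trial(trial: Trial, prev: Optional[Trial], deadline: float,
               slow_fraction: float = 0.8) -> Set[str]:
    """Return the temporal failure labels that apply to a single trial."""
    # Timeout: no response, or a response after the deadline.
    if trial.choice is None or trial.rt is None or trial.rt > deadline:
        return {"timeout"}

    flags: Set[str] = set()
    # Delay (assumed cutoff): answered in time, but late in the window.
    if trial.rt > slow_fraction * deadline:
        flags.add("delay")
    # Drift (assumed reading): the response changed within the trial.
    if trial.n_switches > 0:
        flags.add("drift")
    # Persistence (assumed reading): the previous wrong answer is repeated
    # even though the stimulus orientation has changed.
    if (prev is not None and prev.choice is not None
            and prev.choice != prev.correct
            and trial.choice == prev.choice
            and trial.correct != prev.correct):
        flags.add("persistence")
    return flags


def temporal_instability(trials: List[Trial], deadline: float) -> float:
    """Fraction of trials showing at least one temporal failure."""
    if not trials:
        raise ValueError("need at least one trial")
    flagged = 0
    prev: Optional[Trial] = None
    for trial in trials:
        if flag_trial(trial, prev, deadline):
            flagged += 1
        prev = trial
    return flagged / len(trials)
```

Under these assumptions, a human session would be scored with `deadline=3.0` and an AI session with `deadline=17.0`, so the delay and timeout flags are relative to each agent's own response window. A plain incorrect answer is deliberately not flagged, which keeps perceptual errors separate from temporal ones.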
Results
Under otherwise matched task conditions, humans exhibited high temporal stability (THI = 0.03), maintaining consistent responses and sub-two-second reaction times even as perceptual difficulty increased near threshold. In contrast, AI systems showed substantial instability (THI = 0.23), marked by frequent timeouts, rapid response reversals ("flip-flopping"), and persistence errors, even though they were given a much longer response window (17 s versus 3 s). These findings point to qualitative differences in processing consistency, not just in visual acuity.
Conclusions
THI provides a robust tool for quantifying temporal instability. By mapping AI "temporal hallucinations" to established human behaviors, such as inconsistency and perseveration, THI enables principled comparisons of reliability. This framework highlights critical divergences in temporal cognition, offering a new lens for evaluating confidence and control in artificial agents relative to biological ones.
