Direct vs. Indirect: What Is a Vocal Biomarker Actually Measuring?
The Accuracy Trap: When High Performance Masks the Wrong Signal
Imagine This: A vendor pitches you a voice-based traumatic brain injury (TBI) screening tool. The numbers look impressive: it distinguishes brain injury patients from healthy controls with better than 90% accuracy. It works over the phone, requires no special equipment, and could slot into your existing intake workflow. You're intrigued.
But there's a question the pitch doesn't answer: what is the model actually learning?
Voice reflects many physiological processes at once. It carries signatures of neurological damage, but also of cognitive load, emotional state, and autonomic arousal. These produce overlapping acoustic patterns. A model trained to detect one condition may inadvertently learn to detect another that shares similar features.
Imagine deploying that tool in your urgent care triage line. A patient calls after a fall, reporting dizziness and confusion. The model flags high risk based on their voice. You escalate them to an in-person visit. The CT is clean. The neuro exam is unremarkable. But on the phone, the patient's voice was shaky, their breathing shallow, their speech halting. They were anxious about what the fall might mean.
The model detected a real pattern. It just wasn't brain injury.
This is the specificity problem. Moving vocal biomarkers from research to clinical use requires understanding not just whether we can detect a condition, but through which physiological pathways that detection is happening.
How Voice Actually Encodes Physiology
Not all pathways from pathology to acoustic signal work the same way. Some conditions directly alter the systems that produce voice. Damage to the motor cortex impairs neural circuits controlling laryngeal muscles. The vocal folds vibrate less stably, producing measurable perturbations in the acoustic waveform. The pathway is relatively straightforward: neurological damage affects motor control, motor control affects vocal fold biomechanics, biomechanics appear in acoustics. The signature is mechanistically anchored to the pathology.
Other conditions affect voice through intermediate states. Anxiety elevates autonomic arousal, which may tense laryngeal muscles and alter pitch. Depression often flattens prosody. Cognitive load slows speech and increases pauses. The autonomic nervous system continuously regulates physiological parameters supporting phonation: subglottal pressure, respiratory coordination, laryngeal tension. When cognitive load increases or emotional state shifts, those adjustments manifest in the voice. The acoustic signature reflects the state, not necessarily the underlying condition that produced it.
We can call these direct pathways and indirect pathways. It's a simplification, but a useful one. Direct pathway features narrow the differential. They indicate neurological involvement, something affecting the motor systems that produce voice, even if they don't pinpoint which specific neurological condition. Indirect pathway features offer no such anchoring. They appear across neurological, psychiatric, and general medical conditions alike. A model relying primarily on indirect features may detect a real pattern without being able to distinguish whether it reflects brain injury, a primary psychiatric condition, or general medical illness.
This framework helps identify where specificity challenges are likely to arise and what methodological choices can address them. TBI illustrates this well, precisely because it engages both pathway types simultaneously.
Why Traumatic Brain Injury Exposes the Core Challenge
Current screening for TBI, particularly mild TBI and concussion, relies heavily on patient self-report. The first clinical interaction is often a phone call in which a clinician asks what happened, when, and how the patient feels. The problem is obvious: the person reporting is often the person impaired. Vocal biomarkers offer something different: an objective signal extracted from natural conversation, independent of the patient's ability to self-assess, and available in any encounter where the patient speaks.
TBI affects voice through direct pathways in well-documented ways. Damage to the motor cortex impairs laryngeal muscle control. Cerebellar injury disrupts speech timing and rhythm. Brainstem involvement compromises respiratory-phonatory coordination. These often produce measurable acoustic changes: elevated jitter and shimmer, reduced cepstral peak prominence, decreased articulatory precision. Estimates suggest 30 to 86 percent of people with acute or subacute TBI develop some form of dysarthria, depending on severity and timing. These motor speech features point toward neurological involvement rather than purely cognitive or emotional origins.
TBI also engages indirect pathways through its cognitive and psychiatric sequelae. Over half of TBI patients meet criteria for major depression within the first year, nearly eight times the general population rate. Among those with depression, 60 percent also develop anxiety disorders. In mild TBI specifically, anxiety affects approximately 16 percent of patients, PTSD about 11 percent, and chronic pain about 16 percent. Each produces its own vocal signature: depression often flattens prosody, anxiety alters pitch patterns and increases muscle tension, cognitive deficits slow speech and increase pauses. Beyond psychiatric diagnoses, many TBI patients develop autonomic dysregulation from the injury itself, which can alter voice even without obvious motor deficits or a diagnosable psychological condition.
Motor speech deficits also appear in Parkinson's, stroke, and ALS. Cognitive and affective changes also appear in primary depression, anxiety disorders, and chronic fatigue. But the pattern of motor control disruption appearing alongside the cognitive, emotional, and autonomic profile typical of brain injury creates a richer signal than either pathway alone. A primary anxiety disorder does not produce the pattern of motor speech deficits characteristic of dysarthria. Anxiety can cause laryngeal tension and phonatory changes, but these differ from the coordination and timing deficits seen in neurological motor impairment. Parkinson's produces motor deficits but with a different trajectory and comorbidity profile. The confluence of both pathway types makes TBI harder to attribute to a single alternative explanation.
What This Means for Building Real Systems
Our work reflects this logic at multiple stages. We build cohorts with intentional comorbidity structure, ensuring conditions sharing indirect pathway features appear in controls so models must learn what distinguishes TBI specifically. We also prioritize mechanistic interpretability, checking whether the patterns driving predictions align with direct and indirect pathway expectations. In real-world primary and urgent care tests, this approach yields strong discrimination (>0.90 AUC) even against the heterogeneous mix of conditions that actually present to these settings.
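To make the control-composition point concrete, here is a minimal sketch, assuming a table of per-recording model scores and condition labels (the file and column names are hypothetical). The same scores can produce very different AUCs depending on who sits in the control group:

```python
# Minimal sketch: evaluate a TBI classifier against condition-mixed
# controls rather than healthy controls only. File and column names
# are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score

df = pd.read_csv("predictions.csv")  # columns: score, condition

# Healthy-only evaluation: the easy (and potentially misleading) comparison.
healthy = df[df.condition.isin(["tbi", "healthy"])]
auc_easy = roc_auc_score(healthy.condition == "tbi", healthy.score)

# Condition-mixed evaluation: controls include disorders that share
# indirect-pathway vocal features with TBI.
confusable = ["healthy", "depression", "anxiety", "chronic_pain"]
mixed = df[df.condition.isin(["tbi"] + confusable)]
auc_hard = roc_auc_score(mixed.condition == "tbi", mixed.score)

print(f"AUC vs healthy only:     {auc_easy:.3f}")
print(f"AUC vs mixed conditions: {auc_hard:.3f}")
```

A large gap between the two numbers is a warning sign that the model leans on indirect pathway features rather than anything specific to brain injury.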
Designing for Real-World Use: Screening vs. Monitoring
This pathway framework also shapes how vocal biomarkers can be used. Screening prioritizes sensitivity: catch the cases that warrant further evaluation, and tolerate some false positives because they get ruled out downstream. Both pathway types contribute here. Direct pathway features help separate neurological involvement from purely psychological presentations. Indirect pathway features add sensitivity by capturing the cognitive and affective disruption that accompanies brain injury.
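In practice, that asymmetry becomes a threshold choice. Here is a minimal sketch, assuming arrays of ground-truth labels and model scores, of selecting a screening operating point that meets a sensitivity target and reporting the specificity it costs:

```python
# Minimal sketch: choose a screening threshold that meets a target
# sensitivity, then report the specificity paid for it.
import numpy as np
from sklearn.metrics import roc_curve

def screening_threshold(y_true, y_score, target_sensitivity=0.95):
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    # roc_curve orders operating points by decreasing threshold, so TPR
    # is non-decreasing; take the first point meeting the target.
    idx = int(np.argmax(tpr >= target_sensitivity))
    return thresholds[idx], tpr[idx], 1 - fpr[idx]

# Hypothetical usage:
# threshold, sensitivity, specificity = screening_threshold(y_true, y_score)
```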
Monitoring asks whether the patient is recovering, and here tracking both pathways becomes essential. Anyone who has managed a concussion knows the struggle of post-diagnosis recovery tracking: self-assessing balance, motor control, language fluency, memory, light sensitivity, sleep quality, emotional state, headaches. TBI affects multiple systems, and recovery means tracking all of them. Vocal biomarkers that capture both pathway types align naturally with this. Motor speech features reflect coordination and control. Prosodic and temporal features reflect cognitive load, emotional regulation, and autonomic stability. If motor features improve while cognitive load features remain elevated, that tells you which systems are recovering and which are lagging. This is the kind of granularity the pathway framework makes possible.
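A minimal sketch of what that granularity could look like, assuming one feature table per patient with one row per session; the feature names and the motor/state grouping below are illustrative, not a clinical standard:

```python
# Minimal sketch: track motor vs. state feature groups across recovery
# sessions for one patient, relative to that patient's first session.
# File, column, and feature names are hypothetical.
import pandas as pd

MOTOR = ["jitter", "shimmer", "cpp"]
STATE = ["pitch_variability", "pause_rate", "speech_rate"]

sessions = pd.read_csv("patient_sessions.csv")  # one row per session
baseline = sessions.iloc[0]

# Percent change from the patient's own baseline, averaged per group.
for name, group in [("motor", MOTOR), ("state", STATE)]:
    delta = (sessions[group] - baseline[group]) / baseline[group] * 100
    sessions[f"{name}_change_pct"] = delta.mean(axis=1)

# Diverging trajectories (motor improving while state features remain
# elevated) indicate which systems are recovering and which are lagging.
print(sessions[["session_date", "motor_change_pct", "state_change_pct"]])
```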
A Framework That Scales Beyond One Condition
The direct/indirect distinction is a simplification, but it captures something essential for building vocal biomarkers that work in the real world: the pathways from condition to voice shape how we design cohorts, evaluate models, and choose deployment contexts. TBI illustrates the value of engaging both pathway types: motor speech features provide grounding in neurological pathology, while cognitive and autonomic features provide richness and a holistic view into the full-body effects of brain damage.
This framework generalizes beyond TBI. When evaluating any vocal biomarker work, the same principles apply.
Ask about mechanism. What is the proposed pathway from condition to voice? If indirect, what other conditions engage similar pathways? A study claiming detection through purely indirect pathways should demonstrate how it distinguishes that condition from others producing overlapping signatures.
Ask about controls. A model distinguishing TBI from healthy individuals has learned something different than one distinguishing TBI from depression or other neurological conditions. The choice of control population heavily influences what the model is actually learning.
Ask about use case. Is this for screening or monitoring? What confounds matter in the deployment context? A model developed on one population may not transfer to another where the comorbidity profile differs.
At the modeling level, what separates a research finding from something ready for deployment is understanding which pathways your features engage.
Further Reading:
- All models are wrong and yours are useless: making clinical prediction models impactful for patients (Markowetz, npj Precision Oncology 2024)
Glossary of Acoustic Features
Non-exhaustive glossary of acoustic features mentioned in this article as well as those commonly used or referenced in vocal biomarker work focused on neurological damage and disease.
Motor Control and Structural Features
Jitter — Cycle-to-cycle variation in how fast the vocal folds open and close. When you sustain a pitch, your vocal folds should vibrate with regular timing. Impaired neural control disrupts that regularity, creating small perturbations measurable in the acoustic waveform.
Shimmer — Cycle-to-cycle variation in the amplitude of vocal fold vibration. Like jitter, it reflects instability in motor control of the larynx.
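A minimal sketch of both measures, assuming per-cycle periods and amplitudes have already been extracted by a pitch-period tracker (in practice a tool like Praat handles that step):

```python
# Minimal sketch of local jitter and shimmer from per-cycle measurements.
# `periods` (seconds) and `amplitudes` are hypothetical inputs from a
# pitch-period extractor.
import numpy as np

def local_jitter(periods):
    """Mean absolute difference between consecutive glottal cycle
    periods, relative to the mean period."""
    periods = np.asarray(periods, dtype=float)
    return np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def local_shimmer(amplitudes):
    """The same computation applied to per-cycle peak amplitudes."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)

# Illustrative stable phonation: a ~200 Hz voice with ~0.5% period noise.
periods = 0.005 * (1 + 0.005 * np.random.randn(200))
print(f"jitter (local): {local_jitter(periods):.4f}")
```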
Cepstral Peak Prominence (CPP) — A measure of the overall clarity and periodicity of the voice signal. Reduced CPP suggests breathiness or instability in vocal fold contact. It has become a preferred clinical measure because it is robust across different recording conditions.
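A deliberately simplified single-frame sketch of the computation, loosely following Hillenbrand's formulation; clinical implementations differ in windowing, regression range, and units:

```python
# Simplified CPP sketch: cepstral peak height above a regression-line
# baseline, searched over a plausible F0 quefrency band.
import numpy as np

def cpp(frame, sr, f0_min=60.0, f0_max=300.0):
    n = len(frame)
    spectrum = np.fft.fft(frame * np.hanning(n))
    log_mag = 20 * np.log10(np.abs(spectrum) + 1e-12)
    cepstrum = np.abs(np.fft.ifft(log_mag))   # real cepstrum
    quefrency = np.arange(n) / sr             # seconds

    # Restrict to quefrencies corresponding to plausible F0 values.
    band = (quefrency >= 1 / f0_max) & (quefrency <= 1 / f0_min)
    q_band, c_band = quefrency[band], cepstrum[band]

    peak = int(np.argmax(c_band))
    # Linear regression baseline (fit over the band here for brevity;
    # standard CPP fits over a wider quefrency range).
    slope, intercept = np.polyfit(q_band, c_band, 1)
    return c_band[peak] - (slope * q_band[peak] + intercept)
```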
Articulatory Precision / Vowel Space Area — Measures of how distinctly speech sounds are produced. Vowel space area captures the acoustic separation between vowel sounds like "ah" and "ee." When articulation degrades, vowels cluster closer together perceptually.
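A minimal sketch, with illustrative corner-vowel formant values, of computing vowel space area via the shoelace formula:

```python
# Minimal sketch: vowel space area from corner-vowel formants.
# The (F2, F1) values below are illustrative placeholders in Hz.
import numpy as np

def polygon_area(points):
    """Shoelace formula over (x, y) vertices given in order."""
    x, y = np.asarray(points, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Corner vowels /i/ ("ee"), /a/ ("ah"), /u/ ("oo") as (F2, F1) pairs.
vowel_triangle = [(2300, 300), (1200, 800), (900, 350)]
print(f"vowel space area: {polygon_area(vowel_triangle):,.0f} Hz^2")
```

A shrinking area, over time or relative to norms, is one way articulatory imprecision shows up as a single number.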
Formant-Track Coordination — A less commonly used but more detailed measure of articulatory precision. It tracks how formant frequencies (the resonances that define vowel identity) move across sounds, and how those movements are correlated at varying time delays.
State-Sensitive Features
Pitch (Fundamental Frequency) — Reflects vocal fold tension and length. Most people can feel this intuitively: speaking with a tense voice versus a relaxed one produces noticeably different pitches.
Pitch Variability — The range and pattern of pitch changes over an utterance. Emotional and cognitive states heavily influence this.
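A minimal sketch of extracting a pitch contour and summarizing its variability in semitones, using librosa's pYIN tracker on a hypothetical recording:

```python
# Minimal sketch: F0 contour and pitch variability. The file path is
# hypothetical; the F0 search range suits typical adult speech.
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=None)
f0, voiced_flag, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)

f0 = f0[voiced_flag]                          # keep voiced frames only
semitones = 12 * np.log2(f0 / np.median(f0))  # distance from median pitch

print(f"median F0: {np.median(f0):.1f} Hz")
print(f"pitch variability (semitone SD): {np.std(semitones):.2f}")
```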
Loudness and Loudness Variability — Reflect vocal effort and respiratory support. Fatigue, depression, and cognitive load can all reduce loudness and flatten its variation.
Speech Rate and Pause Patterns — How fast someone speaks, how often they pause, how long the pauses last. These index both motor planning and cognitive processing. Mental fatigue and anxiety often slow speech and increase pauses.
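A minimal sketch of pause statistics from simple energy-based silence detection; the decibel threshold is illustrative, not a clinical standard:

```python
# Minimal sketch: speech time and pause statistics. Gaps between the
# non-silent intervals returned by librosa.effects.split are treated
# as pauses. The file path and top_db threshold are illustrative.
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=None)
intervals = librosa.effects.split(y, top_db=30)  # non-silent spans

speech_time = np.sum(intervals[:, 1] - intervals[:, 0]) / sr
pauses = (intervals[1:, 0] - intervals[:-1, 1]) / sr  # gap durations

print(f"speech time: {speech_time:.1f} s, pause count: {len(pauses)}")
if len(pauses):
    print(f"mean pause: {pauses.mean():.2f} s, max: {pauses.max():.2f} s")
```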
Prosody — The melody and rhythm of speech: how pitch rises and falls, where emphasis lands, how phrases are grouped. This is one of the most state-sensitive aspects of voice, shaped by emotion, intention, and cognitive load.
Engineered Transforms
Mel-Frequency Cepstral Coefficients (MFCCs) — Mathematical transforms of the acoustic signal that capture spectral patterns related to timbre and tone. Unlike jitter or formant frequencies, they don't map directly onto a single physiological process, but as fixed, engineered equations they have known mathematical properties. They became foundational to speech analysis because they improved classification performance while remaining computationally tractable.
Mel Spectrograms — Time-frequency representations of sound using a perceptually-motivated frequency scale. Like MFCCs, they capture spectral shape but don't have a one-to-one mapping to physiology. Often used as a more compact form of input data to many modern neural networks.
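A minimal sketch of computing both transforms with librosa on a hypothetical recording:

```python
# Minimal sketch: mel spectrogram and MFCCs from the same signal.
import librosa

y, sr = librosa.load("utterance.wav", sr=None)  # hypothetical path

mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
mel_db = librosa.power_to_db(mel)                   # (n_mels, n_frames)

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # (13, n_frames)

print(mel_db.shape, mfcc.shape)
```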
Learned Representations
Deep Features / Embeddings — High-dimensional vectors generated by neural networks during training. These encode patterns the model discovered in the data. They often capture complex, nonlinear interactions that handcrafted features miss, but we typically cannot say what physiological process a given learned feature encodes. AI interpretability and explainability work aims to overcome this opacity. Many current vocal biomarker models rely primarily or entirely on these representations.
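A minimal sketch of extracting such embeddings, using wav2vec 2.0 via torchaudio as a stand-in for whatever encoder a given vocal biomarker system actually uses:

```python
# Minimal sketch: clip-level embedding from a pretrained self-supervised
# speech model. The audio path is hypothetical; assumes a mono file.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_BASE
model = bundle.get_model().eval()

waveform, sr = torchaudio.load("utterance.wav")
if sr != bundle.sample_rate:
    waveform = torchaudio.functional.resample(waveform, sr, bundle.sample_rate)

with torch.inference_mode():
    features, _ = model.extract_features(waveform)

# One vector per time frame in each layer; mean-pooling the last layer
# gives a clip-level embedding. No dimension of it maps onto a named
# physiological process.
embedding = features[-1].mean(dim=1)  # shape: (1, 768)
print(embedding.shape)
```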