Voice Biomarker · Hematology

Hearing Anemia
Before the Blood Draw

A live clinical deployment at Shrimad Rajchandra Medical and Research Institute, Gujarat — applying Sona-2's Large Acoustic Model to non-invasive anemia screening at roughly 100 assessments per day.

Amplifier Health Research · 40-day hospital pilot · Active · 2026

Modality: Voice / Acoustic AI
Condition: Anemia screening
Phase: Hospital pilot
Status: Active
Pilot snapshot · SRMD Gujarat · Active
75% Cohort anemia prevalence (WHO)
~100 Assessments per day
300 Calibration session target
6M Target assessments/yr
Languages & ground truth · Pilot
Gujarati and Hindi in production. Concurrent CBC hemoglobin (g/dL) matched by session ID. Fine-tuning and label exchange ongoing weekly.
The Challenge

1.6 Billion People.
One Diagnostic Tool.

Anemia is the most prevalent nutritional disorder on earth. The only way to confirm it is a blood draw. In rural, high-volume clinical settings, that bottleneck is the entire problem.

Across rural Gujarat — and much of low-resource India — anemia prevalence reaches 60 to 70% in some regions. It affects maternal health, child development, and workforce productivity at population scale. It is almost entirely preventable with early identification and iron supplementation.

The diagnostic gap isn't a knowledge problem. Every frontline health worker knows anemia is likely. The problem is confirmation. Without a blood draw, you cannot objectively diagnose it. In settings where lab capacity is scarce and patient volume is high, that requirement means cases are missed, under-treated, and under-counted.

"No other biomarker for anemia exists at population scale. You must do a blood draw. That is exactly what we are trying to change."

Amplifier's acoustic model detects physiological correlates of anemia (tissue oxygen deficit, altered respiratory drive, changes in phonation energy) from 20 seconds of prompted speech. No consumables. No needles. Any smartphone. The same visit that ends in a blood draw can begin with a voice triage that tells the health worker which patients need the draw most.

Partnership

Three Forces.
One Clinical Deployment.

A population health crisis, a hospital with the patient volume and infrastructure, and an acoustic model with proven signal direction. The fit was structural.

The anemia program started with a clinical reality that no technology had solved. Amplifier had Sona-2 already demonstrating meaningful sensitivity and specificity against hemoglobin levels across thousands of samples. The question was whether that signal could be deployed in a new clinical environment, with a local patient population, in new languages.

Shrimad Rajchandra Medical and Research Institute, running an active anemia outreach program at roughly 100 patient assessments per day, had the patient volume, the blood draw infrastructure, and the clinical motivation. Amplifier had Sona-2 and a B2C web application localizable to Hindi and Gujarati.

The team structured a label exchange: Amplifier provides session IDs and preliminary model scores; SRMD maps them to hemoglobin values from concurrent CBC draws. That feedback loop is now running and has markedly improved model performance on this population.

Fine-tuning on local patient data is ongoing. Blinded validation is pending until the 300-session threshold is reached. All performance data reflect pilot-phase directional findings only.
Clinical Partner
SRMD Institute · Gujarat
Active anemia outreach with ~100 assessments/day. Running the pilot — capturing voice sessions alongside daily CBC workflow, returning hemoglobin labels weekly.
AI & Model Platform
Amplifier Health
Built the Sona-2 Large Acoustic Model and India web application. Driving local fine-tuning and calibration to Hindi and Gujarati populations. Public grant application in progress.
Data Protocol
Label Exchange
Session ID matched to hemoglobin (g/dL) from concurrent CBC draw. Weekly cadence. 300 linked sessions is the inflection point for blinded validation and model lock.
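The label exchange described above amounts to a keyed join on session ID. A minimal sketch in Python, assuming hypothetical field names (`session_id`, `score`, `hb_g_dl`) and in-memory dictionaries rather than whatever file format the two teams actually exchange:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedSession:
    session_id: str
    score: float    # preliminary model score (hypothetical 0-1 scale)
    hb_g_dl: float  # hemoglobin from the concurrent CBC, in g/dL


def link_sessions(model_scores, cbc_labels):
    """Join model scores to CBC hemoglobin values by session ID.

    model_scores: dict mapping session_id -> preliminary score
    cbc_labels:   dict mapping session_id -> hemoglobin in g/dL
    Returns (linked, unmatched_ids). Sessions without a returned
    label are reported as unmatched rather than silently dropped.
    """
    linked, unmatched = [], []
    for sid in sorted(model_scores):
        if sid in cbc_labels:
            linked.append(LinkedSession(sid, model_scores[sid], cbc_labels[sid]))
        else:
            unmatched.append(sid)
    return linked, unmatched


# Illustrative values only, not pilot data.
scores = {"s-001": 0.82, "s-002": 0.31, "s-003": 0.67}
labels = {"s-001": 9.4, "s-002": 13.1}
linked, unmatched = link_sessions(scores, labels)
print(len(linked), unmatched)
```

Under these assumptions, comparing `len(linked)` against 300 is the check for whether the blinded-validation threshold has been reached.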
Pilot Signal

Real Patients.
Real Ground Truth.

Early signal from live deployment confirmed what the model was built to detect. Not a formal validation — directionally decisive enough to justify everything that followed.

Cohort Signal · SRMD Outpatient Population

45 of 60 patients confirmed anemic by WHO criteria

The model flagged elevated risk at a rate consistent with confirmed CBC anemia. Prevalence matched regional epidemiological data. Signal direction confirmed.
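To put the cohort figure in context, the arithmetic can be worked through directly. A minimal sketch that computes the 45-of-60 prevalence and a 95% Wilson score interval around it; the choice of the Wilson interval is an illustrative assumption here, not a method the study team has stated:

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


prevalence = 45 / 60
low, high = wilson_interval(45, 60)
print(f"prevalence={prevalence:.2f}, 95% interval=({low:.2f}, {high:.2f})")
```

With 60 patients the interval spans roughly 63% to 84%, which is consistent with treating this phase as directional rather than as a formal estimate.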

Fine-Tuning Effect

Model performance markedly improved post fine-tune

Weekly label exchange with SRMD has continuously improved Sona-2's performance. Calibration to Gujarati and Hindi speech patterns is ongoing as the dataset grows.

Languages · Ground Truth

Gujarati and Hindi · CBC hemoglobin (g/dL)

Both local languages deployed in production. Ground truth is concurrent CBC hemoglobin from the same clinical visit, matched by session ID — no workflow change required.

This phase was designed to establish signal direction and start the fine-tuning loop, not to report clinical performance metrics. Sensitivity and specificity will be established at the 300-session blinded evaluation threshold.

Population: Rural Gujarat outpatient
Daily volume: ~100 assessments/day
Sample type: B2C app · live capture
Ground truth: CBC hemoglobin g/dL
Languages: Gujarati · Hindi
Signal direction: Confirmed
Fine-tuning: Active · weekly
Model: Sona-2, India fine-tuned
Study Design

The Validation Study.
Now Underway.

Live deployment. Real patients. Concurrent ground truth. The calibration dataset is being assembled at roughly 100 sessions per day.

Active data collection — ongoing
Each patient using the app during a visit with a concurrent blood draw generates a linked training sample. Voice captures ~20 seconds of prompted speech. CBC captures hemoglobin in g/dL. Matched by session ID. No additional burden on staff or patients. Comorbidity context collected where available.
Blinded calibration — at 300 sessions
At 300 linked sessions the model enters formal calibration. Amplifier will be blinded to hemoglobin values, run inference, and return predictions to SRMD for performance evaluation. Sensitivity and specificity against a pre-specified hemoglobin threshold are the primary metrics.
Model lock and evidence package
Full linked dataset assembled. Fine-tuned India model locked. Performance evidence package built: sensitivity, specificity, confidence calibration, population prevalence context. Serves two purposes — public grant submission and the commercial contract conversation for scaled deployment.
Population: Rural Gujarat outpatient
Daily volume: ~100 assessments per day
Comparator: Hemoglobin g/dL, concurrent CBC
Voice modality: B2C app, hardware-agnostic
Languages: Gujarati and Hindi
Calibration target: 300 linked sessions
IRB: Not required this phase
Model: Sona-2, fine-tuned India
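The primary metrics named for the blinded step can be computed from paired model calls and hemoglobin values. A minimal sketch, assuming a placeholder cutoff of 12.0 g/dL standing in for the pre-specified threshold (the study's actual threshold is not stated here) and a hypothetical boolean `flagged` standing in for the model's positive call:

```python
def sensitivity_specificity(flagged, hb_values, hb_threshold=12.0):
    """Return (sensitivity, specificity) of model flags against CBC hemoglobin.

    flagged:      list of bools, True where the model called the session positive
    hb_values:    list of hemoglobin values in g/dL, in the same order
    hb_threshold: anemia cutoff in g/dL; 12.0 is a placeholder assumption
    """
    tp = fp = tn = fn = 0
    for flag, hb in zip(flagged, hb_values):
        anemic = hb < hb_threshold
        if flag and anemic:
            tp += 1
        elif flag and not anemic:
            fp += 1
        elif not flag and anemic:
            fn += 1
        else:
            tn += 1
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Illustrative values only, not pilot data.
flags = [True, True, False, False, True, False]
hb = [9.8, 11.2, 13.5, 10.9, 12.6, 14.0]
sens, spec = sensitivity_specificity(flags, hb)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Fixing the threshold before unblinding, as the design describes, keeps the cutoff from being tuned to the evaluation set.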
Clinical Framework

Screen. Refer. Treat.

Three stages. One pathway from first contact to confirmed intervention — with voice replacing the blood draw at the triage layer.

01 —
Screen
Frontline identification
Current frontline triage depends on symptom recognition and clinical suspicion — neither reliably catches early-stage anemia. A finger stick is the only objective tool. Sona-2 adds a passive, non-invasive acoustic layer at first contact. Any device. No consumables. No blood.
Passive capture · Hardware-agnostic · No consumables
02 —
Refer
Risk stratification
Sona-2's three-tier output — Elevated Risk, Monitor, or Clear — allows frontline workers to prioritize which patients need confirmatory lab work today. In settings where lab capacity is limited and volume is high, that stratification is operationally decisive.
Risk stratification · Lab prioritization · Workflow integration
03 —
Treat
Confirmation & intervention
Elevated Risk patients are referred for confirmatory hemoglobin measurement. A confirmed diagnosis triggers the appropriate intervention: iron supplementation, dietary counseling, or escalation. Long-term, the aim is for the voice screen to replace the blood draw at the population triage level.
Confirmatory CBC · Targeted intervention · Population impact
Technology

How Sona-2 Reads
Anemia in Voice

Five stages from raw audio to clinically actionable triage output. No blood required at any point in the pipeline.

1
Audio capture
Any recording device — smartphone, tablet, basic handset. Hardware-agnostic, designed for low-resource environments. Patients respond to structured prompts in Gujarati or Hindi, generating approximately 20 seconds of continuous speech.
Hardware-agnostic · Gujarati · Hindi · ~20 seconds
2
Feature extraction
Audio decomposed into 88 acoustic features — frequency, energy, spectral, and temporal parameters — using the extended Geneva Minimalistic Acoustic Parameter Set. Captures the acoustic correlates of anemia: tissue oxygen deficit, fatigue, altered respiratory drive.
88 acoustic features · eGeMAPS extended
3
Encoder
Features are passed through a Contrastive Language-Audio Pretraining (CLAP) encoder built on a Hierarchical Token-Semantic Audio Transformer (HTS-AT) backbone, which maps them into the LAM's learned embedding space.
CLAP encoder · HTSAT backbone
4
Sona-2 LAM inference
The Large Acoustic Model runs inference against its trained anemia phenotype, fine-tuned on local patient data from the India deployment. Calibrated specifically for this population, clinical environment, and languages. Continuously improving via weekly label exchange.
India fine-tuned · Continuous improvement
5
3-tier triage output
Model returns one of three outputs: Elevated Risk, Monitor, or Clear — each with a confidence score and feature-level explainability. Designed for frontline clinical workflows. Tells the health worker which patients need a blood draw today.
Elevated Risk · Monitor · Clear · Confidence score
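The three-tier output can be pictured as thresholding a risk score. A minimal sketch, assuming a hypothetical 0-1 risk score and hypothetical cutoffs of 0.35 and 0.65; Sona-2's actual scoring and tier boundaries are not described in this document:

```python
from enum import Enum


class Tier(Enum):
    ELEVATED_RISK = "Elevated Risk"
    MONITOR = "Monitor"
    CLEAR = "Clear"


def triage(risk_score, monitor_cutoff=0.35, elevated_cutoff=0.65):
    """Map a 0-1 risk score to a tier. Both cutoffs are placeholder assumptions."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= elevated_cutoff:
        return Tier.ELEVATED_RISK
    if risk_score >= monitor_cutoff:
        return Tier.MONITOR
    return Tier.CLEAR


def blood_draw_queue(scores):
    """List Elevated Risk session IDs, highest score first.

    scores: dict mapping session_id -> risk score
    """
    elevated = [sid for sid, s in scores.items() if triage(s) is Tier.ELEVATED_RISK]
    return sorted(elevated, key=lambda sid: scores[sid], reverse=True)


# Illustrative values only, not pilot data.
day_scores = {"s-101": 0.72, "s-102": 0.20, "s-103": 0.91, "s-104": 0.50}
queue = blood_draw_queue(day_scores)
print(queue)
```

Sorting the Elevated Risk group by score is one way to express "which patients need a blood draw today" when lab capacity is smaller than the flagged group.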
Scale & Commercial

One Playbook.
Every High-Prevalence Geography.

This pilot is the proof point that unlocks a recurring screening contract, a public grant, and a replicable deployment template for any population on earth.

Affected globally: 1.6B (most prevalent nutritional deficiency on earth)
India prevalence: 60–70% (some states and regions, per NFHS data)
Grant status: Public (in progress · validation is the trigger)
Post-validation target: 6M (assessments per year · existing partner)
Language expansion: Any (same pipeline · new fine-tune · not a new build)

Why this market is structurally different

Every competing approach requires a blood draw, a finger stick, or specialized hardware. Sona-2 requires none of these. The barrier to deployment is a smartphone and 20 seconds of speech. That asymmetry doesn't erode — it compounds with each new language and geography.

What a validated pilot unlocks

A direct screening contract at approximately 6 million assessments annually with the existing clinical partner. A public grant that funds continued development. And a deployment template — the same fine-tuning pipeline and label exchange protocol, replicable to any high-prevalence population in any language.

The commercial model is per-organization screening contracts — cloud-based, hardware-agnostic, no EHR required. De-identified, federated. No raw audio leaves the facility.

What's Next

Three Milestones.
One Clear Path.

The pilot is active. The label exchange is running. These are the near-term gates from calibration to deployment at scale.

MILESTONE 01
300-Session Calibration Dataset Locked
Complete the linked dataset: 300 voice sessions with matched hemoglobin values from concurrent CBC draws. At current velocity this milestone is within reach. Once locked, the model enters formal blinded evaluation — sensitivity and specificity against a pre-specified hemoglobin threshold.
MILESTONE 02
Sona-2 Anemia Model Locked for India
Once blinded evaluation clears, the fine-tuned India model is locked. Performance evidence package assembled: sensitivity, specificity, confidence calibration, population-level prevalence context. Serves two purposes — public grant submission and the commercial contract conversation.
MILESTONE 03
6 Million Assessments Annually
A validated model opens the direct path to scaled deployment at ~6 million assessments per year with the existing clinical partner. The same architecture, fine-tuning pipeline, and label exchange protocol then become the template for the next high-prevalence geography.