
End-to-end pipeline

From the clinician's voice to a structured clinical encounter in six stages, all on-device. The structured output feeds directly into the billing engine for automated ICD-10 to CPT/HCPCS claim generation and SOAP note production.


Omnilingual speech recognition

Meta Omnilingual ASR — 1600+ languages, CTC architecture, fully offline via ONNX Runtime. Supports all ChartLite target languages (Zulu, Xhosa, Amharic, Chichewa, English, and more).

Tier | Model | Quantization | Size | Target
LITE | Omnilingual ASR 300M | INT8 | 365 MB | Galaxy A03/A04 (<4 GB RAM)
STANDARD | Omnilingual ASR 1B | INT8 | 1.03 GB | Mid-range+ (4+ GB RAM)

Dual-mode ASR: Omnilingual ONNX on-device when offline, Google Speech-to-Text when connected. Automatic fallback ensures voice capture always works. Apache 2.0 licensed.
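The fallback logic can be sketched as follows; `DualModeAsr`, the engine wrappers, and the `is_online` probe are illustrative names for this sketch, not the actual ChartLite API:

```python
class AsrError(Exception):
    """Raised by an engine when transcription fails (illustrative)."""
    pass

class DualModeAsr:
    def __init__(self, on_device, cloud, is_online):
        self.on_device = on_device  # Omnilingual ONNX, always available
        self.cloud = cloud          # Google Speech-to-Text
        self.is_online = is_online  # callable: () -> bool

    def transcribe(self, audio):
        # Prefer the cloud path when connected; fall back to the
        # on-device model on any failure so voice capture never
        # blocks on the network.
        if self.is_online():
            try:
                return self.cloud.transcribe(audio)
            except AsrError:
                pass
        return self.on_device.transcribe(audio)
```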


Retrieval-augmented extraction

Instead of stuffing 815 reference entries into every prompt, we retrieve only what's relevant.

01

Index

At app startup, a TF-IDF vector store indexes 300 ICD-10 codes + 515 formulary drugs (~20–50 ms).

02

Retrieve

Per transcript, cosine similarity retrieves the 10–15 most relevant codes and drugs.

03

Prompt

A compact prompt: instructions + retrieved references + transcript.

04

Generate

Qwen 3.5 runs extraction with roughly 80% more of the context window free for the transcript and output.

Component | Before (static prompt) | After (RAG pipeline)
Reference data | ~6,000 tokens (815 entries) | ~400–800 tokens (15–25 entries)
Available for transcript | ~700 tokens | ~5,000+ tokens
Available for generation | ~1,000 tokens | ~2,000+ tokens
Disambiguation quality | Low (no keywords/aliases) | High (retrieved entries include keywords + local terms)
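The index-and-retrieve steps can be sketched with a hand-rolled TF-IDF index and cosine similarity; `TfidfIndex` and the sample entries are illustrative, and the production vector store may differ:

```python
import math
from collections import Counter

def tokenize(text):
    return [t for t in text.lower().split() if t.isalnum()]

class TfidfIndex:
    """Tiny TF-IDF index over reference entries (illustrative sketch)."""
    def __init__(self, entries):
        self.entries = entries  # list of (code, description) pairs
        docs = [Counter(tokenize(text)) for _, text in entries]
        n = len(docs)
        df = Counter(term for doc in docs for term in doc)
        self.idf = {t: math.log(n / df[t]) for t in df}
        self.vecs = [self._vec(doc) for doc in docs]

    def _vec(self, counts):
        # TF-IDF weights, L2-normalized so a dot product is cosine similarity.
        v = {t: c * self.idf.get(t, 0.0) for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in v.values())) or 1.0
        return {t: w / norm for t, w in v.items()}

    def retrieve(self, transcript, k=15):
        q = self._vec(Counter(tokenize(transcript)))
        scored = [
            (sum(q.get(t, 0.0) * w for t, w in vec.items()), code)
            for vec, (code, _) in zip(self.vecs, self.entries)
        ]
        return [code for s, code in sorted(scored, reverse=True)[:k] if s > 0]
```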

Unified JSON extraction format

A single benchmark JSON schema shared by both cloud (Claude) and on-device (Qwen 3.5) extractors — consistent output regardless of inference path.

Benchmark JSON
{
  "diagnoses": [
    {
      "icd10Code": "J06.9",
      "description": "Upper resp. infection",
      "isPrimary": true,
      "confidence": 0.9
    }
  ],
  "medications": [
    {
      "formularyCode": "0097",
      "name": "Paracetamol",
      "dose": 500,
      "unit": "mg",
      "frequency": "TDS"
    }
  ]
}

One schema, every model. Hallucination guards and field validation run identically on cloud and on-device results before the JSON flows into automated claim generation and SOAP note production.
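A sketch of what such guards can look like: drop any diagnosis or medication the model emitted that is not in the reference data. The `validate_encounter` helper, the ICD-10 regex, and the field checks are assumptions for illustration, not the production rules:

```python
import re

# Shape of an ICD-10 code: letter, two digits, optional decimal part.
ICD10_RE = re.compile(r"^[A-Z]\d{2}(?:\.\d{1,2})?$")

def validate_encounter(data, known_icd10, formulary_codes):
    """Filter extracted JSON against the reference sets (illustrative)."""
    diagnoses = [
        d for d in data.get("diagnoses", [])
        if ICD10_RE.match(d.get("icd10Code", ""))
        and d["icd10Code"] in known_icd10
        and 0.0 <= d.get("confidence", 0.0) <= 1.0
    ]
    medications = [
        m for m in data.get("medications", [])
        if m.get("formularyCode") in formulary_codes
        and isinstance(m.get("dose"), (int, float))
    ]
    return {"diagnoses": diagnoses, "medications": medications}
```

The same function runs on output from either inference path, which is the point of the shared schema.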


Dictation-first for on-device

Short structured snippets instead of full conversation recording — optimized for small language models.

1

Clinician presses mic

Dictates "BP 168 over 98, pulse 92" (~5–30 seconds)

2

On-device ASR transcribes instantly

Meta Omnilingual ASR (1600+ languages) runs in real time on-device via ONNX

3

Regex extraction provides immediate preview

No model load required — instant structured feedback

4

Snippets accumulate throughout consultation

Each dictation adds to the encounter transcript

5

Single LLM pass at finalization

All snippets processed together for a coherent structured encounter
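The instant regex preview in step 3 can be sketched like this; the patterns below cover the example dictation only and are assumptions, not the production rule set:

```python
import re

# Vitals patterns for snippets like "BP 168 over 98, pulse 92".
BP_RE = re.compile(r"\bBP\s+(\d{2,3})\s*(?:/|over)\s*(\d{2,3})\b", re.I)
PULSE_RE = re.compile(r"\bpulse\s+(\d{2,3})\b", re.I)

def preview_vitals(snippet):
    """Structured preview without loading any model (illustrative)."""
    vitals = {}
    if m := BP_RE.search(snippet):
        vitals["bp"] = {"systolic": int(m.group(1)),
                        "diastolic": int(m.group(2))}
    if m := PULSE_RE.search(snippet):
        vitals["pulse"] = int(m.group(1))
    return vitals
```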

Why dictation mode?


Battery-conscious processing

Model loads once for N patients instead of N times.

Patient 1 transcript, Patient 2 transcript, Patient 3 transcript → Queue → Load model once → Process batch → Unload & free RAM
Trigger | Behavior
Manual | Clinician taps "Process Queue" during a break
Urgent | Immediate single extraction for referral/emergency
End of session | Process remaining queue before closing
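The load-once / process-batch / unload cycle can be sketched as follows, assuming a hypothetical `LlmSession` wrapper around llama.cpp (`extract` and `unload` are illustrative names):

```python
class QueueProcessor:
    def __init__(self, session_factory):
        self.session_factory = session_factory  # () -> loaded LLM session
        self.queue = []                         # accumulated transcripts

    def enqueue(self, transcript):
        self.queue.append(transcript)

    def process_queue(self):
        """One model load for N patients instead of N loads."""
        if not self.queue:
            return []
        session = self.session_factory()        # single model load
        try:
            results = [session.extract(t) for t in self.queue]
            self.queue.clear()
            return results
        finally:
            session.unload()                    # free RAM between batches

    def process_urgent(self, transcript):
        # Referral/emergency path: immediate single extraction.
        session = self.session_factory()
        try:
            return session.extract(transcript)
        finally:
            session.unload()
```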

Quantized for the edge

Tier | Model | Quantization | Size | Context Window
SMALL | Qwen 3.5 0.8B | Q4_K_M | 560 MB | 32,768 tokens
LARGE | Qwen 3.5 2B | Q4_K_M | 1.5 GB | 32,768 tokens

Hardware-aware selection: 0.8B for 2GB devices, 2B for 4GB+. Both run via llama.cpp built from source.
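Hardware-aware selection can be sketched as a simple RAM-threshold lookup. Tier names and sizes follow the tables in this section; the helper itself is illustrative, and the summed sizes approximate the footprint figures before runtime overhead:

```python
# Thresholds and sizes taken from the tier tables; names are illustrative.
ASR_TIERS = [            # (min RAM in GB, tier, size in MB)
    (4.0, "STANDARD", 1030),
    (0.0, "LITE", 365),
]
LLM_TIERS = [
    (4.0, "LARGE", 1500),  # Qwen 3.5 2B Q4_K_M
    (0.0, "SMALL", 560),   # Qwen 3.5 0.8B Q4_K_M
]

def select_tiers(ram_gb):
    """Pick the largest ASR and LLM tiers that fit the device's RAM."""
    asr = next(t for t in ASR_TIERS if ram_gb >= t[0])
    llm = next(t for t in LLM_TIERS if ram_gb >= t[0])
    return {"asr": asr[1], "llm": llm[1], "footprint_mb": asr[2] + llm[2]}
```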


Fits on a budget phone

Device Class | Example | RAM | ASR | LLM | Total Footprint
Budget | Galaxy A03 | 2 GB | LITE (365 MB) | Qwen 0.8B (560 MB) | ~950 MB
Mid-range | Galaxy A14 | 4 GB | STANDARD (1.03 GB) | Qwen 2B (1.5 GB) | ~2.6 GB
High-end | Galaxy A54 | 6+ GB | STANDARD (1.03 GB) | Qwen 2B (1.5 GB) | ~2.6 GB