Dictation benchmark

Speed and accuracy for Mac dictation apps.

TapTalk, Wispr Flow, Superwhisper, Apple Dictation, and Aqua Voice are measured on the same fixed audio set. Results are published only after the run data is complete.

Current benchmark
Status: Collecting
Dataset: 40 samples
Runs: 3x target
Updated: 2026-05-16
Benchmark status

Leaderboard ready. Results pending.

Same fixed 40-sample dictation set for every app. Accuracy is scored with WER and CER against manually written references. Speed uses median latency from recording end to final visible text. No estimated accuracy or speed numbers are published before the run set is complete.

40 samples, 3 runs per sample, updated 2026-05-16
Accuracy: WER and CER against hand-written references.
Speed: Median latency after speech ends.
Privacy: Upload, offline, and account posture.
Coverage: 40 samples, 3 target runs each.
App                                     Index    Accuracy (WER)  Speed after release  Uploads         Coverage
TapTalk (On-device)                     Pending  Pending         Pending              No              0
Wispr Flow (Cloud-first)                Pending  Pending         Pending              Yes             0
Superwhisper (Mixed local/cloud modes)  Pending  Pending         Pending              Mode-dependent  0
Apple Dictation (Apple system service)  Pending  Pending         Pending              Mode-dependent  0
Aqua Voice (Cloud-first)                Pending  Pending         Pending              Yes             0

Scores stay pending until every app has measured runs. The table is visible now so the public benchmark shape is clear before numbers are published.
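
The publication rule can be expressed as a small completeness check. This is only a sketch: it assumes "complete" means every app has all 3 target runs for each of the 40 samples, and the function names and the `runs_by_app` structure are hypothetical, not taken from the benchmark tooling.

```python
SAMPLES = 40       # fixed dataset size stated on this page
TARGET_RUNS = 3    # target runs per app and sample

def coverage(runs_per_sample):
    """Count measured runs toward the target, capped at TARGET_RUNS per sample.

    runs_per_sample maps a sample id (0..39, an assumed numbering) to the
    number of measured runs recorded for it.
    """
    return sum(min(runs_per_sample.get(s, 0), TARGET_RUNS) for s in range(SAMPLES))

def is_complete(runs_by_app):
    """True only if every app has reached full coverage (assumed rule)."""
    required = SAMPLES * TARGET_RUNS
    return all(coverage(runs) == required for runs in runs_by_app.values())
```

Under this assumption a single missing run for a single app keeps the whole leaderboard pending, which matches the intent of publishing nothing partial.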

What gets measured.

Each metric is narrow on purpose. The page should make the result easy to audit later.

Accuracy

WER and CER against manually written references. Lower is better.

Reference checked
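
For readers who want to audit the accuracy numbers later, WER and CER can both be computed from an edit distance between the reference and the app's output. The sketch below uses the standard definitions (edits divided by reference length). It applies no text normalization such as lowercasing or punctuation stripping, because this page does not say which normalization the benchmark uses. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from output
                curr[j - 1] + 1,     # insertion: extra token in output
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character error rate: character-level edits (spaces included)
    divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `wer("send the report by friday", "send a report friday")` counts one substitution and one deletion against five reference words, giving 0.4.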

Speed

Median latency from recording end to final visible text in the target field.

Median only
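
A minimal sketch of the speed metric, assuming each run is logged as two timestamps in milliseconds: when recording ended and when the final text became visible in the target field. The function name and the tuple layout are assumptions for illustration.

```python
from statistics import median

def median_latency_ms(runs):
    """Median of (final_text_ms - recording_end_ms) across measured runs.

    runs: list of (recording_end_ms, final_text_ms) pairs (assumed format).
    """
    latencies = [final - end for end, final in runs]
    if not latencies:
        raise ValueError("no measured runs")
    if any(latency < 0 for latency in latencies):
        raise ValueError("final text cannot appear before recording ends")
    return median(latencies)
```

With made-up timestamps such as `[(1000, 1420), (2000, 2380), (3000, 3900)]` the latencies are 420, 380, and 900 ms and the median is 420 ms. The single slow run does not pull the result upward, which is one reason to report a median rather than a mean.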

Privacy

On-device status, offline behavior, account requirement, and audio upload posture.

Fact-based

Scoring method

The overall index is intentionally simple: accuracy is the main factor, speed matters for daily writing, and privacy is treated as a product property rather than a model metric.

Accuracy: 50% of index. Uses median WER across the fixed sample set.
Speed: 30% of index. Uses median latency after recording ends.
Privacy: 20% of index. Rewards no audio uploads and offline-capable dictation.
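
The weighting above can be sketched as a weighted sum. The 50/30/20 weights come from this page. How raw WER, latency, and privacy facts are converted into sub-scores is not specified here, so the sketch assumes each sub-score has already been normalized to the range 0 to 1 with higher meaning better. The names are hypothetical.

```python
# Weights as stated in the scoring method.
WEIGHTS = {"accuracy": 0.5, "speed": 0.3, "privacy": 0.2}

def overall_index(subscores):
    """Weighted sum of normalized sub-scores (each assumed to be in [0, 1])."""
    missing = set(WEIGHTS) - set(subscores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= subscores[name] <= 1.0:
            raise ValueError(f"{name} sub-score must be between 0 and 1")
    return sum(WEIGHTS[name] * subscores[name] for name in WEIGHTS)
```

Under these assumptions, sub-scores of 0.8 for accuracy, 0.5 for speed, and 1.0 for privacy give 0.4 + 0.15 + 0.2 = 0.75.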

Dataset shape

V1 covers short practical dictation tasks: emails, Slack-style replies, technical notes, support text, mixed German/English terms, names, URLs, and product copy.

Languages: German and English.
Runs: Three target runs per app and sample.
Output: Only measured medians are published.

Apps in v1.

The first published run covers the competitors already used on the TapTalk compare pages.

TapTalk: On-device. Audio uploads: No.
Wispr Flow: Cloud-first. Audio uploads: Yes.
Superwhisper: Mixed local/cloud modes. Audio uploads: Mode-dependent.
Apple Dictation: Apple system service. Audio uploads: Mode-dependent.
Aqua Voice: Cloud-first. Audio uploads: Yes.

Compare TapTalk with other dictation apps.

The product-level comparisons are available now and are kept separate from the benchmark dataset.

Open comparisons