Dictation benchmark

Speed and accuracy for Mac dictation apps.

TapTalk, Wispr Flow, Superwhisper, Apple Dictation, and Aqua Voice measured on the same fixed audio set. Results publish only after the run data is complete.

Current benchmark

Status: Collecting

Dataset: 40 samples

Runs: 3x target

Updated: 2026-05-16

Benchmark status

Leaderboard ready. Results pending.

Same fixed 40-sample dictation set for every app. Accuracy is scored with WER and CER against manually written references. Speed uses median latency from recording end to final visible text. No estimated accuracy or speed numbers are published before the run set is complete.

40 samples, 3 runs per sample, updated 2026-05-16

Accuracy: WER and CER against hand-written references.

Speed: Median latency after speech ends.

Privacy: Upload, offline, and account posture.

Coverage: 40 samples, 3 target runs each.

Columns: App, Index, Accuracy (WER), Speed after release, Uploads, Coverage.

TapTalk (On-device): Pending. Uploads: No.

Wispr Flow (Cloud-first): Pending. Uploads: Yes.

Superwhisper (Mixed local/cloud modes): Pending. Uploads: Mode-dependent.

Apple Dictation (Apple system service): Pending. Uploads: Mode-dependent.

Aqua Voice (Cloud-first): Pending. Uploads: Yes.

Scores stay pending until every app has measured runs. The table is visible now so the public benchmark shape is clear before numbers are published.

What gets measured.

Each metric is narrow on purpose. The page should make the result easy to audit later.

Accuracy

WER and CER against manually written references. Lower is better.

Reference checked
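As a rough illustration of the accuracy metric, the sketch below computes WER and CER with a plain Levenshtein distance. It is not the benchmark's own scoring code. The function names are hypothetical, and the page does not say how text is normalized (case, punctuation, whitespace), so this sketch compares strings exactly as given.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For a made-up reference "send the report to anna today" and a transcript "send a report to anna", there is one substitution and one deletion against six reference words, so the WER is 2/6. Lower is better for both metrics.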

Speed

Median latency from recording end to final visible text in the target field.

Median only
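A minimal sketch of the speed metric, assuming each run is logged as a pair of millisecond timestamps (recording end, final visible text). The logging format and the function name are assumptions for illustration, and so is the choice to pool all runs for an app before taking the median. The page only states that the median is used.

```python
from statistics import median


def median_latency_ms(runs):
    """Median latency over runs.

    runs: list of (recording_end_ms, final_text_ms) timestamp pairs.
    """
    if not runs:
        raise ValueError("at least one measured run is required")
    latencies = [final_ms - end_ms for end_ms, final_ms in runs]
    if any(latency < 0 for latency in latencies):
        raise ValueError("final text cannot appear before recording ends")
    return median(latencies)


# Made-up timestamps for three runs of one sample, not measured results.
example_runs = [(10_000, 10_800), (10_000, 10_500), (10_000, 11_900)]
print(median_latency_ms(example_runs))  # 800
```

Using the median rather than the mean keeps a single slow run (1,900 ms in the made-up data) from dominating the reported figure.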

Privacy

On-device status, offline behavior, account requirement, and audio upload posture.

Fact-based

Scoring method

The overall index is intentionally simple: accuracy is the main factor, speed matters for daily writing, and privacy is treated as a product property rather than a model metric.

Accuracy: 50% of index. Uses median WER across the fixed sample set.

Speed: 30% of index. Uses median latency after recording ends.

Privacy: 20% of index. Rewards no audio uploads and offline-capable dictation.
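The weighting can be sketched as below. Only the 50/30/20 weights come from this page. The page does not specify how each metric is mapped onto a common scale, so every normalization here is a placeholder assumption. That includes the 5,000 ms latency cap, the equal split between the two privacy properties, and the 0 to 100 output range.

```python
WEIGHTS = {"accuracy": 0.5, "speed": 0.3, "privacy": 0.2}  # from the page

MAX_LATENCY_MS = 5000  # hypothetical cap: this latency or worse scores 0


def accuracy_score(median_wer):
    """Assumed mapping: WER 0.0 scores 1, WER 1.0 or higher scores 0."""
    return min(1.0, max(0.0, 1.0 - median_wer))


def speed_score(median_latency_ms):
    """Assumed mapping: linear from 1 at 0 ms down to 0 at the cap."""
    return min(1.0, max(0.0, 1.0 - median_latency_ms / MAX_LATENCY_MS))


def privacy_score(uploads_audio, works_offline):
    """Assumed mapping: half credit each for no uploads and offline use."""
    return ((0 if uploads_audio else 1) + (1 if works_offline else 0)) / 2


def overall_index(median_wer, median_latency_ms, uploads_audio, works_offline):
    """Weighted sum of the three component scores, scaled to 0..100."""
    parts = {
        "accuracy": accuracy_score(median_wer),
        "speed": speed_score(median_latency_ms),
        "privacy": privacy_score(uploads_audio, works_offline),
    }
    return 100 * sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)
```

Under these assumptions, a hypothetical app with a median WER of 0.10, a median latency of 1,000 ms, audio uploads, and offline support would score about 79. The real index may use different normalizations and produce different values.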

Dataset shape

V1 covers short practical dictation tasks: emails, Slack-style replies, technical notes, support text, mixed German/English terms, names, URLs, and product copy.

Languages: German and English.

Runs: Three target runs per app and sample.

Output: Only measured medians are published.
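The publish rule (no numbers until every app has its full run set) amounts to a completeness check. The sketch below assumes run counts are stored in a dictionary keyed by app and sample index. That storage shape and the function names are hypothetical. The app list, sample count, and run target come from this page.

```python
APPS = ["TapTalk", "Wispr Flow", "Superwhisper", "Apple Dictation", "Aqua Voice"]
SAMPLE_COUNT = 40
TARGET_RUNS = 3


def missing_runs(run_counts):
    """Return {(app, sample_index): runs_still_needed} for incomplete cells.

    run_counts: dict mapping (app, sample_index) -> number of measured runs.
    """
    missing = {}
    for app in APPS:
        for sample in range(SAMPLE_COUNT):
            have = run_counts.get((app, sample), 0)
            if have < TARGET_RUNS:
                missing[(app, sample)] = TARGET_RUNS - have
    return missing


def ready_to_publish(run_counts):
    """True only when every app has the target runs for every sample."""
    return not missing_runs(run_counts)
```

With five apps, 40 samples, and three runs each, a complete run set contains 600 measured runs.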

Apps in v1.

The first published run covers the competitors already used on the TapTalk compare pages.

TapTalk: On-device. Audio uploads: No.

Wispr Flow: Cloud-first. Audio uploads: Yes.

Superwhisper: Mixed local/cloud modes. Audio uploads: Mode-dependent.

Apple Dictation: Apple system service. Audio uploads: Mode-dependent.

Aqua Voice: Cloud-first. Audio uploads: Yes.

Compare TapTalk with other dictation apps.

Read the product-level comparisons while the benchmark dataset stays separate.
