Starting with version 1.3.2, Whisper Notes for Mac ships with NVIDIA Parakeet TDT 0.6B v3 as the default speech engine. It's about 10x faster than Whisper Large V3 Turbo for English, and more accurate. Whisper models are still available if you need other languages.
Why We Switched the Default
Whisper is great, but it was designed as a general-purpose model. It handles 100+ languages, translates, generates timestamps — a Swiss Army knife. The trade-off is speed. For English dictation, where you just want words on screen fast, it's overkill.
Here's the thing that bugged me: when using Fn-key system-wide dictation with Whisper, finishing a ~1 minute utterance meant waiting 3–5 seconds for the transcript to appear. That pause breaks the flow. You stop talking, you wait, you stare at the cursor — it kills the magic of voice typing.
Parakeet changed that completely. It is fast enough that the transcript appears the instant you stop speaking. Speak, and the words are just there. Once you've experienced that seamless, zero-wait flow, it's really hard to go back to Whisper.
How Fast Is Parakeet v3?
Numbers speak louder than words. Here's a real-world comparison using a 35-minute audio file on the same Mac:
| Model | Time to transcribe 35 min of audio |
|---|---|
| Whisper Large V3 Turbo | 3 minutes |
| Parakeet TDT 0.6B v3 | 18 seconds |
That's 10x faster. And because the model is smaller (600M vs 800M parameters), it uses less memory and less battery too.
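If you want to check that arithmetic yourself, here is a small sketch. It uses only the two timings from the table above; the variable and function names are made up for illustration, and the inputs are rounded the same way the table rounds them.

```python
# Timings from the comparison table (same 35-minute file, same Mac).
audio_seconds = 35 * 60       # 2100 s of audio
whisper_seconds = 3 * 60      # Whisper Large V3 Turbo: about 3 minutes
parakeet_seconds = 18         # Parakeet TDT 0.6B v3: about 18 seconds


def realtime_factor(audio_s: float, processing_s: float) -> float:
    """Seconds of audio transcribed per second of processing."""
    return audio_s / processing_s


# How much faster one engine finished the same file than the other.
speedup = whisper_seconds / parakeet_seconds

whisper_rtf = realtime_factor(audio_seconds, whisper_seconds)
parakeet_rtf = realtime_factor(audio_seconds, parakeet_seconds)

print(f"speedup: {speedup:.0f}x")
print(f"Whisper:  {whisper_rtf:.1f}x real time")
print(f"Parakeet: {parakeet_rtf:.1f}x real time")
```

On this file that works out to roughly 12x real time for Whisper and roughly 117x for Parakeet. Those figures describe one Mac and one recording, so don't expect them to line up with RTFx values measured under other conditions.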
What Makes Parakeet v3 So Fast
Whisper works through audio the way you'd read a book out loud: one fixed 30-second window at a time, writing the transcript token by token, with every token waiting on the one before it. Even during silence, it's still processing, still guessing what comes next. That's thorough, but slow.
Parakeet takes a fundamentally different approach. Its encoder compresses the audio signal 8x before decoding, so the decoder has far fewer frames to look at. Then, instead of grinding through every single frame, it predicts not just which token you said but how many frames that token lasts, and jumps ahead by that amount. Silence? Crossed in a few jumps. A long vowel? One prediction instead of dozens.
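To make the "predict a duration, then jump" idea concrete, here is a toy sketch. It is not NVIDIA's implementation: `toy_predict` is a stand-in that reads the answer off hand-written labels instead of running a model, and the 10 ms feature hop and the maximum jump of 4 frames are assumptions chosen for illustration. The only number taken from this post is the 8x compression.

```python
from itertools import groupby

SUBSAMPLING = 8       # from the post: audio is compressed 8x before decoding
FEATURE_HOP_MS = 10   # assumption: 10 ms feature frames (not stated in the post)
MAX_SKIP = 4          # assumption: largest jump the toy model may predict
BLANK = "_"           # stands for silence / no new token


def toy_predict(frames, t):
    """Stand-in for the model: return (token, duration) at frame t.

    Duration is how many frames the current label lasts, capped at MAX_SKIP.
    A real model learns both outputs; here we read them off the labels.
    """
    token = frames[t]
    run = 1
    while t + run < len(frames) and frames[t + run] == token and run < MAX_SKIP:
        run += 1
    return token, run


def decode_every_frame(frames):
    """Baseline: one prediction per frame, collapsing repeats and blanks."""
    tokens = [tok for tok, _ in groupby(frames) if tok != BLANK]
    return tokens, len(frames)


def decode_with_durations(frames):
    """Duration-aware decoding: predict (token, duration), then jump ahead."""
    tokens, steps, t, prev = [], 0, 0, None
    while t < len(frames):
        token, duration = toy_predict(frames, t)
        steps += 1
        # Emit a token only when it is new, so a long word split across
        # two jumps is not written twice.
        if token != BLANK and token != prev:
            tokens.append(token)
        prev = token
        t += duration
    return tokens, steps


# Hand-written encoder frames: silence, "hello", a short gap, "world", silence.
frames = [BLANK] * 6 + ["hello"] * 3 + [BLANK] * 2 + ["world"] * 5 + [BLANK] * 8
frame_ms = SUBSAMPLING * FEATURE_HOP_MS  # 80 ms per encoder frame here

baseline_tokens, baseline_steps = decode_every_frame(frames)
skip_tokens, skip_steps = decode_with_durations(frames)

print(f"{len(frames) * frame_ms} ms of audio")
print("every frame:   ", baseline_tokens, baseline_steps, "steps")
print("with durations:", skip_tokens, skip_steps, "steps")
```

Both decoders recover the same two words, but the duration-aware one needs a third of the prediction steps on this input. The exact ratio depends entirely on the made-up labels, so treat it as an illustration of where the savings come from, not as a benchmark.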
The result is a model that processes speech the way your brain does — focusing on the words, ignoring the gaps. That's why it's 10x faster with fewer parameters and higher accuracy.
Benchmarks: Parakeet v3 vs Whisper
Parakeet v3 matches or beats models 2-4x its size across FLEURS, CoVoST, and MLS benchmarks
On the Hugging Face Open ASR Leaderboard, Parakeet v3 tops the chart with only 600M parameters — less than half of Whisper Large V3's 1.55B:
| Model | Parameters | Avg WER | Speed (RTFx) |
|---|---|---|---|
| Parakeet TDT 0.6B v3 | 0.6B | 6.32% | 3,333x |
| Canary 1B v2 | 1.0B | 7.15% | 749x |
| Whisper Large V3 | 1.55B | 7.44% | 146x |
| Whisper Large V3 Turbo | 0.8B | 7.6% | 350x |
Lower WER = fewer errors. Higher RTFx = faster. Parakeet wins on both. With 600M parameters, it's also the smallest model on that list — which means it runs beautifully on Apple Silicon with minimal memory and battery drain.
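If WER is new to you: it is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. Here is a minimal sketch of that calculation. The function name and the example sentences are mine, and the only normalization is lowercasing and splitting on whitespace, so results will not match leaderboard scores exactly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level edit distance, keeping only the previous row of the table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}")
```

In this example the transcript has one wrong word ("jumped") and one missing word ("the") against a nine-word reference, so the WER is 2/9, or about 22%.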
No More Hallucinations
If you've used Whisper for dictation, you've probably seen it hallucinate during silence — repeating phrases, inventing words, or spitting out "Subtitles by Amara.org" from nowhere. This happens because Whisper's autoregressive decoder always expects to produce text, even when there's nothing to transcribe.
NVIDIA trained Parakeet on 36,000 hours of pure non-speech audio (background noise, coughs, silence) paired with empty string targets. The model learned what silence sounds like and stays quiet. For "always-on" system-wide dictation, this is a game-changer — no more garbage text appearing when you pause to think.
Languages Parakeet Supports
Parakeet v3 supports 25 languages: Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, and Ukrainian.
That covers most of Europe, but it does not support Chinese, Japanese, Korean, Arabic, or Hindi. That's why we kept the Whisper models as downloadable options. If you dictate in Japanese or Mandarin, pick Whisper Large V3 Turbo from the model picker. For English and European languages, Parakeet v3 is simply the better engine.
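The rule of thumb above is easy to write down as code. This is only an illustration of the recommendation, not how Whisper Notes chooses a model (the app leaves that choice to you): `pick_engine` is a hypothetical helper, and the two-letter ISO 639-1 codes are my mapping of the 25 languages listed above.

```python
# ISO 639-1 codes for the 25 languages listed for Parakeet v3 in this post.
PARAKEET_LANGUAGES = frozenset({
    "bg", "hr", "cs", "da", "nl", "en", "et", "fi", "fr", "de",
    "el", "hu", "it", "lv", "lt", "mt", "pl", "pt", "ro", "ru",
    "sk", "sl", "es", "sv", "uk",
})


def pick_engine(language_code: str) -> str:
    """Suggest an engine for a language tag such as 'en', 'en-US', or 'ja_JP'."""
    # Keep only the primary language subtag: 'en-US' -> 'en', 'ja_JP' -> 'ja'.
    primary = language_code.strip().lower().replace("_", "-").split("-")[0]
    if primary in PARAKEET_LANGUAGES:
        return "Parakeet V3"
    return "Whisper Large V3 Turbo"


for tag in ("en-US", "de", "ja", "zh-CN"):
    print(tag, "->", pick_engine(tag))
```

English and German come back as Parakeet V3, while Japanese and Chinese fall through to Whisper Large V3 Turbo.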
Model picker: Parakeet V3 (default), Whisper Small, and Whisper Large V3 Turbo — all running locally
Model Picker in Whisper Notes
Open Settings to switch between models:
- Parakeet V3 (default) — Fastest, best for English & European languages
- Whisper Small — Lightweight, 100+ languages
- Whisper Large V3 Turbo — Most accurate multi-language model
All models run 100% locally on your Mac. No internet, no cloud, no data leaves your device.
Try It
Parakeet v3 is available now in the Mac version — just download the latest DMG. If the feedback is positive, we'll bring Parakeet to the iOS version in a future update.
Questions or feedback? Email support@whispernotes.app.