Offline Meeting Transcription on Mac: Record Zoom, Teams & Meet Locally

May 13, 2026
·
8 min read
·Whisper Notes Team

We built offline meeting transcription for Mac. It records Zoom, Teams, and Google Meet calls, transcribes them locally with Parakeet V3, and summarizes them with Gemma 4. No cloud, no bot in the call. $6.99 once.

Whisper Notes recording a Zoom meeting on Mac with real-time transcription showing Me and Others speaker labels

Recording a Zoom call in Whisper Notes — "Me" and "Others" are labeled by audio source

A Typical Monday

10 AM, Zoom call with a client. You open Whisper Notes, click record. The app captures system audio and your microphone simultaneously — nobody in the meeting sees a bot, nobody gets a notification, nothing shows up in the participant list.

An hour later, the call ends. You stop recording. Parakeet V3 transcribes 60 minutes of audio in about a minute, entirely on your Mac's Neural Engine. You tap Summarize — Gemma 4 extracts the key points. You tap Action Items — it pulls out every task and deadline mentioned. You send the meeting notes to the client. The audio never left your machine.

That's the whole workflow. Record, transcribe, summarize. All local.

What It Does

Recording

Whisper Notes captures system audio — the sound coming out of your speakers or headphones. If you can hear it on your Mac, we can transcribe it. Zoom, Teams, Google Meet, Webex, GoTo, Whereby, Jitsi, YouTube, podcasts, or any other app. It also records your microphone at the same time, so both sides of the conversation are captured.

No bot joins the call. This matters more than it sounds. If you've ever seen "Otter.ai Notetaker has joined the meeting" pop up in a Zoom call, you know what happens next — someone asks what it is, someone else gets uncomfortable, and the conversation shifts. With system audio capture, nobody knows you're recording except you.

Transcription

Parakeet V3 runs on Apple Silicon via CoreML. It processes English and 24 European languages at roughly 60× real-time — a 60-minute meeting finishes in about a minute. For Chinese, Japanese, or Korean, SenseVoice handles CJK at 52× speed. Pyannote VAD strips silence before transcription, so the model only processes actual speech.

Whisper Notes transcript view on Mac showing inline text editing with timestamps and audio waveform

Transcript with timestamps and inline editing — click any segment to jump to that moment in the audio

AI Features — All Local

Gemma 4 runs on your Mac. No API key, no cloud call, no usage limits. After transcription:

  • Summarize — main points of a 60-minute meeting, in seconds
  • Action Items — tasks and deadlines, extracted automatically
  • Translate — Apple Intelligence translates the transcript into another language
  • Chat — ask "what did we agree on pricing?" and get an answer grounded in the transcript
Whisper Notes AI Assistant sidebar with Summarize, Action Items, Translate buttons and chat interface

Gemma 4 AI sidebar — Summarize, Action Items, Translate, and free-form chat, all running locally

Why We Built It This Way

Meeting audio is some of the most sensitive data a company produces. Client negotiations, HR reviews, board discussions, legal consultations — the kind of conversations where the wrong leak ends careers.

Most transcription tools upload this audio to cloud servers, process it there, and store it under their data retention policies. Some add a bot to the call that everyone can see. Some keep your recordings indefinitely for "model improvement."

We took a different approach: everything runs on your Mac. The ASR model, the LLM, the audio storage — all local. There's no server to breach, no data retention policy to read, no third-party subpoena risk. For teams under GDPR, HIPAA, or attorney-client privilege, this architecture is the point.

How It Compares

Whisper Notes Otter.ai Fireflies Jamie
Processing 100% on-device Cloud Cloud Hybrid
Bot in call No Yes Yes No
Price $6.99 once $16.99/mo (Pro) from $18/mo $24/mo
Works offline Yes No No Partial
AI summary Local (Gemma 4) Cloud Cloud Cloud
Speaker diarization Not yet Yes Yes Yes

Different Meetings, Different Languages

Pick the model that matches your meeting language:

English / European Parakeet V3 — ~60× real-time, 6.32% WER, zero hallucinations on silence
Chinese / Japanese / Korean SenseVoice — 52× speed, handles Cantonese, GPU-accelerated via MLX
Other languages Whisper Large V3 Turbo — 99 languages, high accuracy, slower

What's Missing

We don't have speaker diarization yet. Right now, Whisper Notes labels audio as "Me" (your microphone) and "Others" (system audio) — which covers most one-on-one and small group meetings. But for a 10-person call where you need to know who said what, that's not enough.

It's the obvious next step and we're working on it. The goal is local diarization that runs alongside Parakeet V3 and SenseVoice, without sending audio anywhere.