Guide · May 3, 2026

Transcribe Audio to Text on Mac: 3 Free Ways (2026)

You have a recording — a meeting, interview, lecture, or voice memo — and you need it as text. macOS doesn't have a built-in file transcription tool, but there are three free options that run locally on your Mac.

Option 1: macOS Dictation (workaround)

macOS Dictation is designed for live speech, not file transcription. The workaround — playing audio through speakers and running dictation simultaneously — is unreliable and inaccurate.

Why it doesn't work well

  • 30–60 second timeout — you'd need to restart it constantly through a long recording.
  • Picks up ambient noise — transcribing from speakers means room noise degrades accuracy.
  • No file input — there is no way to feed an audio file directly to Apple Dictation.
  • No timestamps or speaker labels — just raw text.

Option 2: Whisper via command line

OpenAI's Whisper is an open-source speech recognition model. You can run it locally on your Mac for free.

Setup

  1. Install Python 3.8+ (via Homebrew: brew install python).
  2. Install Whisper: pip install openai-whisper
  3. Install ffmpeg: brew install ffmpeg
  4. Transcribe: whisper audio.mp3 --model medium
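
The same package can also be called from Python instead of the command line. The following is a minimal sketch, assuming openai-whisper and ffmpeg are installed as in the steps above; the helper names `transcribe_file` and `save_transcript` are illustrative and not part of Whisper.

```python
from pathlib import Path


def transcribe_file(audio_path: str, model_name: str = "medium") -> dict:
    """Transcribe one audio file with the openai-whisper package."""
    # Imported here so the rest of the script loads even where Whisper is absent.
    import whisper

    model = whisper.load_model(model_name)  # downloads the weights on first use
    return model.transcribe(audio_path)     # dict with "text", "segments", "language"


def save_transcript(result: dict, audio_path: str) -> Path:
    """Write the plain-text transcript next to the audio (audio.mp3 -> audio.txt)."""
    out_path = Path(audio_path).with_suffix(".txt")
    out_path.write_text(result["text"].strip() + "\n", encoding="utf-8")
    return out_path
```

Calling `save_transcript(transcribe_file("audio.mp3"), "audio.mp3")` should leave an `audio.txt` beside the recording. The first run is slower because the model weights have to be downloaded.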

What it's good at

High accuracy, especially with the medium or large models. Supports roughly 100 languages. Outputs TXT, SRT, VTT, TSV, and JSON. Fully local: no data is sent anywhere.
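
To make the subtitle output concrete, here is a small sketch of how timestamped segments map onto SRT cues. It assumes segments shaped like the ones Whisper returns (dicts with `start`, `end`, and `text`); the function names are illustrative, and Whisper's own writer may differ in details such as line wrapping.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Render segments with "start", "end", and "text" keys as numbered SRT cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Cues are separated by one blank line.
    return "\n".join(cues)
```

VTT is close to the same layout, with a `WEBVTT` header and a period instead of a comma before the milliseconds.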

Where it falls short

  • Technical setup — requires Python, pip, and command line comfort.
  • Slower without optimization: the vanilla Python Whisper runs through PyTorch and does not use the Apple Neural Engine, so transcription can be slow compared with a CoreML-compiled model.
  • No GUI — command line only. No drag-and-drop.
  • No speaker labels out of the box (needs additional tools for diarization).
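
The missing drag-and-drop can be partly offset with a short wrapper script. This is a sketch under assumptions rather than a recommended workflow: it assumes the `whisper` command is on your PATH, the folder layout is hypothetical, and the extension list is taken from the comparison table in this guide.

```python
import subprocess
from pathlib import Path
from typing import List

# Formats listed for the Whisper CLI in this guide's comparison table.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4"}


def whisper_command(audio: Path, out_dir: Path, model: str = "medium") -> List[str]:
    """Build the CLI call for one file, writing an SRT file into out_dir."""
    return [
        "whisper", str(audio),
        "--model", model,
        "--output_format", "srt",
        "--output_dir", str(out_dir),
    ]


def transcribe_folder(folder: Path, out_dir: Path) -> None:
    """Run Whisper once for every supported file in folder, in name order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for audio in sorted(folder.iterdir()):
        if audio.suffix.lower() in AUDIO_EXTENSIONS:
            # check=True stops the batch if Whisper exits with an error.
            subprocess.run(whisper_command(audio, out_dir), check=True)
```

For example, `transcribe_folder(Path("recordings"), Path("transcripts"))` would process a hypothetical `recordings` folder one file at a time. Each call reloads the model, so for large batches the Python API is likely the better fit.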

Option 3: Resonant

Resonant combines modern on-device transcription accuracy with the convenience of a native Mac app. Drag and drop an audio file, get a transcript.

What it's good at

  • Drag and drop — no command line, no Python, no setup.
  • Fast — speech models compiled to CoreML for Apple Neural Engine. 50–150x realtime on M-series chips.
  • Parakeet on Apple Silicon — NVIDIA's state-of-the-art STT model running on the Neural Engine.
  • Speaker labels — identifies who said what.
  • Timestamps — word and segment-level timing.
  • Export — TXT, Markdown, SRT, VTT.
  • Fully offline — your audio never leaves your Mac.

Setup

  1. Download Resonant and open it.
  2. Drag an audio or video file onto the app.
  3. Choose your model and language.
  4. The transcript appears in seconds to minutes depending on file length.

Side-by-side comparison

Here's how the three options compare for audio file transcription.

| Feature | macOS Dictation | Whisper CLI | Resonant |
|---|---|---|---|
| Transcribes audio files | No (live speech only) | Yes (any audio/video file) | Yes (drag and drop) |
| Setup difficulty | None (built in) | High (Python, pip, command line) | None (download and run) |
| Accuracy | Low for playback transcription | High (Whisper medium/large) | High (Parakeet) |
| Speed on Apple Silicon | N/A | ~10–30x realtime (depends on setup) | ~50–150x realtime (CoreML optimized) |
| Supported formats | N/A | MP3, WAV, M4A, FLAC, MP4, etc. | MP3, WAV, M4A, FLAC, MP4, MOV, etc. |
| Speaker labels | No | Basic (with extensions) | Yes |
| Timestamps | No | Yes (SRT/VTT output) | Yes |
| Export formats | N/A | TXT, SRT, VTT, JSON | TXT, Markdown, SRT, VTT |
| Internet required | Partially | No (fully local) | No (fully local) |
| Cost | Free | Free (open source) | Free |

Which one should you use?

If you're comfortable with the command line: Whisper is excellent. Free, accurate, and local. Just know the setup takes some effort.

If you want the same accuracy without the setup: Resonant. Drag a file, get a transcript. CoreML-optimized for speed on Apple Silicon.

Don't use macOS Dictation for file transcription. It's not designed for it and the results are poor.

Ready to transcribe recordings locally?

Download Resonant for Mac

Free · macOS 14+ · Apple Silicon

Frequently asked questions

Can macOS transcribe audio files for free?

Not directly. macOS Dictation is for live speech. For file transcription, use Whisper (command line) or Resonant (drag and drop).

Is Whisper transcription free?

Yes. The Whisper model is open-source and runs locally. No subscription, no API cost.

What audio formats can be transcribed?

MP3, M4A, WAV, FLAC, OGG, MP4, MOV — most common audio and video formats work with both Whisper and Resonant.

How long does transcription take?

On Apple Silicon, 10–150x realtime. A 1-hour recording takes roughly 24 seconds at 150x and about 6 minutes at 10x, depending on model and chip.
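
These figures follow from dividing the recording length by the realtime factor. A quick sketch of that arithmetic, using the factors quoted in this guide (actual speed depends on model, chip, and setup):

```python
def transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Estimated processing time for a recording at a given realtime factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


one_hour = 60 * 60  # seconds of audio
for factor in (10, 30, 50, 150):
    estimate = transcription_seconds(one_hour, factor)
    print(f"{factor}x realtime: about {estimate:.0f} s")
```

At 10x the hour of audio takes 360 seconds (6 minutes), and at 150x it takes 24 seconds.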

Does transcription require internet?

Not with Whisper or Resonant. Both run speech models locally. Cloud services (Otter.ai, Rev) require internet.

What Resonant offers beyond dictation

Resonant isn't just a faster way to type. It's a voice workspace with capabilities no other dictation tool provides.

MCP server for AI tools

Resonant exposes 11 MCP tools that let any AI agent (Claude, Codex, and more) query your entire voice workspace: meetings, dictations, memos, ambient context, and daily journal. Your AI assistant knows what you said this morning.

Meeting transcription with speaker labels

Dual-channel recording: your mic and system audio on separate channels. NVIDIA Sortformer diarization identifies who said what. No bot joins the call. No audio leaves your Mac.

Ambient context capture

Passively records which apps you use, window titles, URLs, and dwell time, all locally. This makes dictation context-aware and gives your AI tools a queryable work timeline.

Two on-device speech models

NVIDIA Parakeet TDT v3 (0.6B, 25 languages) and Qwen3 ASR (0.6B, 30+ languages), both compiled to CoreML and running on Apple Neural Engine. Under 4% WER on English benchmarks.
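
For readers unfamiliar with the metric: word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of reference words. Under 4% means fewer than 4 such errors per 100 reference words. The sketch below is a simplified illustration, not the scoring used for the benchmark figure above; published benchmarks usually normalize punctuation and numbers first, which this version skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra word in transcript)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

One wrong word in a six-word reference, for instance, gives a WER of 1/6, or about 17%.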

Cloud cleanup with hallucination detection

Optional AI post-processing fixes STT errors and adapts to context (email, message, code). Guardrails detect when the LLM rewrites your meaning instead of cleaning your grammar.


Start with private Mac dictation

Local speech recognition is free and runs on your Mac. Pro adds cloud cleanup, rewrites, summaries, and sharing when you want the full workflow.