Moonshine v2 Medium: Fast, Lightweight English Dictation on Mac
Not every English dictation setup needs the largest model. Moonshine v2 Medium is a 200 MB English speech-to-text model designed for edge devices, where efficiency matters more than raw scale. Here is what it is, when it makes sense, and how to run it on Mac.
What Moonshine is
Moonshine was built by Useful Sensors, a company focused on running powerful models on constrained hardware. The v2 Medium model has 245 million parameters and achieves a 6.65% word error rate on standard English benchmarks — a strong result for a model this compact.
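For readers unfamiliar with the metric, word error rate is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The following is a minimal sketch of that calculation, not the scoring script behind the 6.65% figure. The function name `word_error_rate` is illustrative, and the sketch applies no text normalization (casing, punctuation), which published benchmarks usually do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A 6.65% rate on this definition means roughly one error for every 15 reference words.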
The architecture makes a key choice that distinguishes it from Whisper-based models: it doesn't pad audio to fixed 30-second chunks. Whisper pads or trims every input to a 30-second window, so a 3-second utterance costs roughly as much encoder work as a 30-second one. Moonshine processes each segment at its actual length, which keeps responses fast for short bursts such as a sentence, a quick note, or a voice command.
On Apple Silicon, that efficiency translates directly to responsiveness for short-form input.
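The padding cost described above can be put into rough numbers. The sketch below compares how many audio samples a fixed 30-second window consumes against the clip's actual length. It assumes 16 kHz input (Whisper's input rate, and assumed here for Moonshine as well), the function names are illustrative, and the ratio describes input length only, not measured latency.

```python
import math

SAMPLE_RATE_HZ = 16_000      # assumed input sample rate
FIXED_WINDOW_S = 30.0        # Whisper-style fixed window length


def samples_fixed_window(duration_s: float) -> int:
    """Samples processed when the clip is padded up to whole 30-second windows."""
    windows = max(1, math.ceil(duration_s / FIXED_WINDOW_S))
    return int(windows * FIXED_WINDOW_S * SAMPLE_RATE_HZ)


def samples_variable_length(duration_s: float) -> int:
    """Samples processed when the clip is handled at its actual length."""
    return int(round(duration_s * SAMPLE_RATE_HZ))


def padding_overhead(duration_s: float) -> float:
    """How many times more input the fixed-window approach has to process."""
    return samples_fixed_window(duration_s) / samples_variable_length(duration_s)


if __name__ == "__main__":
    for seconds in (1.0, 3.0, 10.0, 30.0):
        print(f"{seconds:>5.1f} s clip -> {padding_overhead(seconds):.1f}x input")
```

Under these assumptions a 3-second voice command carries ten times as much input in a fixed-window model as it does at its actual length, and the gap closes as clips approach 30 seconds.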
Moonshine vs. Parakeet for English
The two are worth comparing directly. NVIDIA's Parakeet TDT 0.6B v3 is also an excellent English model and is more accurate on most content, at the cost of being roughly 3x larger. Moonshine's strength is footprint and short-utterance latency. Parakeet's strength is raw accuracy and multilingual coverage.
Moonshine is the better fit when:
- Storage is limited. If you can only budget 200 MB for a dictation model, Moonshine delivers strong quality for the size.
- You only ever dictate in English. Moonshine is trained purely on English and optimized accordingly.
- You want the smallest viable footprint. Moonshine v2 Medium at 200 MB is a compact, capable choice.
How to run it on Mac
Moonshine is published as ONNX and PyTorch checkpoints by Useful Sensors. To run it locally on Apple Silicon:
- Moonshine ONNX runtime: the official path; runs on Mac via ONNX Runtime with the CoreML execution provider.
- moonshine-cpp: a community C++ implementation, similar in spirit to whisper.cpp.
- HuggingFace Transformers: works on Mac via MPS for batch inference and experimentation.
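For the ONNX Runtime path, one common pattern is to request the CoreML execution provider and fall back to CPU when the installed build does not include it. The sketch below shows that pattern only. `pick_providers` and `open_session` are hypothetical helper names, `model_path` is a placeholder, and the actual Moonshine ONNX export may ship as more than one file and needs its own audio preprocessing and decoding loop, which is not shown.

```python
# Provider names as reported by onnxruntime.get_available_providers().
PREFERRED_PROVIDERS = ["CoreMLExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available):
    """Return the preferred providers this build offers, in priority order."""
    return [name for name in PREFERRED_PROVIDERS if name in available]


def open_session(model_path):
    """Open an ONNX model, preferring CoreML and falling back to CPU."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


if __name__ == "__main__":
    # Example with a build that offers both providers.
    print(pick_providers(["CPUExecutionProvider", "CoreMLExecutionProvider"]))
```

Listing the CPU provider second keeps the same code usable on machines or builds without CoreML support.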
Live dictation on Mac
If you want press-a-key, speak, clean-text-in-any-app dictation, Resonant runs Parakeet locally on Apple Silicon's Neural Engine — tuned for live latency, with audio that never leaves your Mac. Free to use.
If you specifically want Moonshine for its tiny footprint or edge-device profile, the runtimes above are the right path. For low-friction live English dictation, download Resonant.