Models · Mar 3, 2026

Zipformer Japanese: Offline Japanese Dictation on Mac

Japanese is one of the harder languages to serve well with a general-purpose speech model. The phonology is distinct, the script requires careful handling of kanji/hiragana/katakana transitions, and many available models were trained primarily on English and treat Japanese as a secondary language.

Resonant includes a dedicated model built specifically for Japanese dictation. The Zipformer Japanese model was trained on 35,000 hours of ReazonSpeech v2.0 data, one of the largest and highest-quality Japanese speech datasets publicly available. It runs entirely offline on your Mac.

What ReazonSpeech is

ReazonSpeech is a Japanese speech corpus compiled by Reazon Human Interaction Lab. Version 2.0 contains approximately 35,000 hours of natural Japanese speech, drawn from diverse sources including broadcast audio and conversational content. Training on this volume of native Japanese data exposes the model to a wide range of natural speech patterns, not just textbook pronunciation.

The result is a model that handles real-world Japanese speech well: connected speech, informal register, names, and the natural rhythm of conversational dictation.

Fast enough to feel instant

Zipformer Japanese runs at a real-time factor of 0.08: processing takes about 8% of the audio's duration. An eight-second clip therefore transcribes in roughly 0.64 seconds (8 s × 0.08) on Apple Silicon. For practical dictation (sentences, paragraphs, notes), transcription finishes before you notice any delay.
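To make the real-time factor concrete, here is a minimal sketch of the arithmetic. It assumes the quoted factor of 0.08 holds uniformly across clip lengths, which real hardware will only approximate. The function names are illustrative and are not part of Resonant.

```python
RTF = 0.08  # real-time factor quoted in this article


def transcription_seconds(audio_seconds: float, rtf: float = RTF) -> float:
    """Estimate processing time as real-time factor times audio duration."""
    if audio_seconds < 0 or rtf <= 0:
        raise ValueError("audio_seconds must be >= 0 and rtf must be > 0")
    return audio_seconds * rtf


def speedup(rtf: float = RTF) -> float:
    """How many times faster than real time the model runs."""
    if rtf <= 0:
        raise ValueError("rtf must be > 0")
    return 1.0 / rtf


if __name__ == "__main__":
    # Hypothetical clip lengths, chosen only for illustration.
    for clip in (8, 30, 60):
        estimate = transcription_seconds(clip)
        print(f"{clip:>3} s of audio -> ~{estimate:.2f} s to transcribe")
    print(f"Roughly {speedup():.1f}x faster than real time")
```

By this estimate, a full minute of speech would take a little under five seconds to process. Actual times depend on the specific Mac and on what else is running.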

At 148 MB, it's also compact. The combination of small size and high speed makes it practical even on MacBooks where storage and processing headroom are limited.

Zipformer Japanese vs. SenseVoice for Japanese

Resonant also includes SenseVoice Small, which covers Japanese as one of its five supported languages. SenseVoice is faster and handles multilingual content across East Asian languages. If you regularly switch between Japanese and other languages in a single session, SenseVoice may be more practical.

But if Japanese is your primary or only dictation language, Zipformer Japanese is likely to outperform SenseVoice on Japanese-only content. A dedicated model trained on 35,000 hours of native Japanese data has advantages that a multilingual model trading off across five languages can't fully match.

How to enable it

Open Resonant Settings → Transcription and select “Zipformer Japanese”. At 148 MB, the download is small and completes quickly. No language configuration is needed; the model outputs Japanese script natively.
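For a rough sense of how long the download takes, the sketch below converts the 148 MB size into seconds at a few link speeds. The speeds are hypothetical examples, the size is treated as decimal megabytes, and protocol overhead and server throughput are ignored, so real downloads will take somewhat longer.

```python
MODEL_SIZE_MB = 148  # model size quoted in this article (assumed decimal megabytes)


def download_seconds(size_mb: float, link_mbps: float) -> float:
    """Ideal transfer time: megabytes * 8 bits per byte / megabits per second."""
    if size_mb < 0 or link_mbps <= 0:
        raise ValueError("size_mb must be >= 0 and link_mbps must be > 0")
    return size_mb * 8 / link_mbps


if __name__ == "__main__":
    # Hypothetical connection speeds, for illustration only.
    for mbps in (25, 100, 500):
        seconds = download_seconds(MODEL_SIZE_MB, mbps)
        print(f"{mbps:>3} Mbps -> ~{seconds:.1f} s")
```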

Your audio stays on your machine. Everything is processed locally on Apple Silicon, and nothing is transmitted.

Download Resonant to try Japanese dictation locally on your Mac.
