Speak, Don't Type.
Sumi transforms your speech into context-aware text in real time. Choose local or cloud: your voice, your rules.
Your Voice, Your Rules
Total flexibility — choose the setup that works best for you.
Offline Freedom
Run Locally
Use Whisper with Metal GPU acceleration for on-device transcription. No internet, no uploads.
AI on Your Machine
Polish text with a local LLM (Llama 3 / Qwen 2.5). All processing stays on your device.
Zero Data Leaves
Your voice recordings and text never leave your device. Complete privacy by design.
Full power, fully offline.
Cloud Flexibility
Bring Your Own API
Connect to Groq, OpenAI, Deepgram, Azure, or any custom endpoint with your own keys.
Faster Processing
Leverage cloud APIs for faster transcription and more powerful AI polishing when speed matters.
Switch Anytime
Seamlessly switch between local and cloud modes. No lock-in, no subscriptions required.
Maximum speed, maximum choice.
Works Everywhere You Do
Sumi works in any app on your Mac. Context-aware AI adapts your text automatically.
If you can type in it, Sumi works there.
Why Sumi?
Premium speed and flow, with maximum flexibility.
Local & Private
Run Whisper entirely on-device with Metal GPU acceleration. No internet required; data stays local.
AI Refinement
Automatically removes fillers like 'umm' and 'uhh', corrects grammar, and adapts tone to match the app you're using.
100+ Languages
Supports fast recognition for 100+ languages with automatic code-switching detection.
Open Source Power
Built on Whisper and open LLMs. Connect to free cloud providers or run fully offline — your choice.
See How Sumi Compares
The only free, open-source voice tool that runs AI polishing entirely on-device.
| Feature | Sumi | Wispr Flow | VoiceInk | SuperWhisper |
|---|---|---|---|---|
| Price | Free | $12–15/mo | $39.99 | $10/mo |
| Local STT | Whisper + Metal | Cloud only | | |
| Cloud STT | BYOK | Optional | | |
| AI Text Polish | | | | |
| Local LLM Polish | Only Sumi | | | |
| Custom Prompts | Per-app rules | Custom modes | | |
| Context-Aware | App + URL | App | Manual modes | |
Based on publicly available information. Features may have changed.
Loved by Early Adopters
Hear from developers and creators who've made the switch.
“I dictate code comments and Slack messages while debugging. The local LLM polish is a game-changer — my ramblings become clean text without touching the cloud.”
“Sumi cut my drafting time in half. The context-aware feature auto-adjusts my tone for emails vs. chat — like having a personal editor.”
“Being fully open source was the deciding factor. I can verify my audio never leaves my device. No other voice tool gives me that level of trust.”
“I write my thesis entirely by voice now. Switching between English and Mandarin mid-sentence just works. Saved me weeks of typing.”
“As a PM, I dictate meeting summaries right after standup. By the time I reach my desk, polished notes are already pasted in Notion.”
“The multilingual support is incredible. I dictate translations in three languages and Sumi handles the code-switching perfectly.”
“I draft all my show notes by voice during my commute. What used to take an hour now takes 10 minutes.”
“Responding to 50+ emails a day used to drain me. Now I just speak naturally and Sumi gives me polished, professional responses.”
“I document design decisions by voice while working in Figma. The context-aware polish adjusts the formality perfectly for our wiki.”
“Running analyses and writing reports simultaneously was impossible before. Now I dictate findings while coding — Sumi handles the rest.”
Simple & Transparent
Choose to use it for free or support the community.
Open Source Free
For individuals and developers
Pro Sponsor
Support our development and unlock advanced features
Frequently Asked Questions
Everything you need to know about Sumi.
Is Sumi really free?
Yes. Sumi is 100% free and open source under a permissive license. All core features (local transcription, AI polishing, 100+ languages) work without paying anything. The optional Pro Sponsor tier is for users who want to support development and get priority access to advanced features.
Does Sumi work offline?
Sumi can run entirely offline. Speech recognition uses Whisper locally on your device with Metal GPU acceleration, and AI text polishing can use a local LLM (Llama 3 / Qwen 2.5). Your voice recordings and text never leave your machine unless you explicitly choose to use cloud APIs with your own keys.
What are the system requirements?
Sumi requires macOS 13 (Ventura) or later on an Apple Silicon (M1/M2/M3/M4) or Intel Mac with at least 8 GB of RAM. Apple Silicon is recommended for the best local Whisper performance via Metal GPU acceleration.
How does AI polishing work?
After transcription, Sumi automatically removes filler words (umm, uhh), corrects grammar, and adjusts tone based on the app you're using. You can run polishing locally with an on-device LLM, or connect cloud providers like OpenAI or Groq using your own API keys for faster results.
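For readers curious what the filler-removal step amounts to, here is a minimal sketch. It is an illustration only: Sumi does its polishing with an LLM, while this example swaps in a plain regular expression, and the function name `strip_fillers` and the filler list are assumptions, not part of Sumi.

```python
import re

# Hypothetical filler list for illustration. Sumi's actual polishing is done
# by an LLM, not by a regex like this one.
FILLER_PATTERN = re.compile(r"\b(?:u+m+|u+h+)\b[,.]?\s*", re.IGNORECASE)


def strip_fillers(transcript: str) -> str:
    """Remove standalone 'um'/'uh' style fillers and tidy leftover spacing."""
    cleaned = FILLER_PATTERN.sub("", transcript)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    # Re-capitalize the first letter in case a leading filler was removed.
    return cleaned[:1].upper() + cleaned[1:]


print(strip_fillers("Umm, I think we should uhh ship it on Friday."))
```

A regex like this only catches the simplest fillers and cannot fix grammar or tone, which is why the polishing step relies on a language model.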
Which apps does Sumi work with?
Sumi works in any application where you can type: Slack, Notion, VS Code, Gmail, Chrome, Safari, Discord, and hundreds more. It is a system-wide tool whose context-aware AI polishing adapts to the app, and even the URL, you are currently using.
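One way to picture app- and URL-based adaptation is a simple rule lookup. The sketch below is hypothetical: the `PolishRule` type, the rule tables, and the `pick_rule` function are invented for illustration and do not describe Sumi's real configuration format.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class PolishRule:
    tone: str
    instructions: str


# Hypothetical rule tables; the names and values are assumptions.
APP_RULES = {
    "Slack": PolishRule("casual", "Keep it short and conversational."),
    "VS Code": PolishRule("technical", "Format as a concise code comment."),
}
URL_RULES = {
    "mail.google.com": PolishRule("professional", "Write complete, polite sentences."),
}
DEFAULT_RULE = PolishRule("neutral", "Fix grammar and remove filler words.")


def pick_rule(app_name: str, url: Optional[str] = None) -> PolishRule:
    """Prefer a URL-specific rule, then an app rule, then the default."""
    if url:
        host = urlparse(url).hostname  # None when the string has no host part
        if host in URL_RULES:
            return URL_RULES[host]
    return APP_RULES.get(app_name, DEFAULT_RULE)


print(pick_rule("Chrome", "https://mail.google.com/mail/u/0/").tone)
print(pick_rule("Slack").tone)
```

Checking the URL before the app name lets a browser tab running Gmail get an email-appropriate tone even though the frontmost app is a general-purpose browser.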
Can I contribute to Sumi?
Absolutely! Sumi is fully open source on GitHub. We welcome contributions, whether code, bug reports, translations, or documentation improvements. Check out our GitHub repository to get started.
Ready to speak your mind?
Download Sumi for free and start typing at the speed of thought.