Voice for thinking with AI: why your keyboard is the wrong tool

A founder note on why spoken prompts often preserve the idea the keyboard edits away.

Shuran Zhou, Founder · 2026-05-08 · 5 min · Updated 2026-05-08

TL;DR

Voice for thinking is not about typing faster. Loqua is a Mac-native voice typing tool that helps you get half-formed ideas into AI tools before the keyboard compresses them. When working with an LLM, the bottleneck is often preserving nuance, not producing perfectly typed words.

I used to think voice was an accessibility interface or a convenience feature. I changed my mind after using AI tools every day. The keyboard is excellent for precision, but it is a poor instrument for thinking with AI because it forces the idea through a narrow channel too early.

The keyboard bottleneck

A fast typist can peak around 70 words per minute. Conversational speech is often closer to 150 words per minute, and internal thought can move faster than either. The exact numbers matter less than the shape: the keyboard makes you serialize thinking into polished fragments before the idea is ready.
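To make that shape concrete, here is a back-of-the-envelope sketch in Python. It assumes the rough rates quoted above and a hypothetical 210-word brief; the function and constant names are illustrative only. It measures raw word output and nothing else, so it says nothing about the editing pressure that typing adds.

```python
# Rough rates taken from the paragraph above; both are approximations.
TYPING_WPM = 70     # fast typist
SPEAKING_WPM = 150  # conversational speech


def seconds_to_produce(word_count: int, words_per_minute: float) -> float:
    """Return the seconds needed to produce word_count words at a given rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return word_count / words_per_minute * 60


def compare(word_count: int) -> dict:
    """Compare typed and spoken time for a brief of word_count words."""
    typed = seconds_to_produce(word_count, TYPING_WPM)
    spoken = seconds_to_produce(word_count, SPEAKING_WPM)
    return {"typed_s": typed, "spoken_s": spoken, "ratio": typed / spoken}


if __name__ == "__main__":
    # Hypothetical brief length, chosen only for illustration.
    result = compare(210)
    print(f"typed: {result['typed_s']:.0f} s, spoken: {result['spoken_s']:.0f} s")
    print(f"typing takes about {result['ratio']:.1f}x as long")
```

At those assumed rates, a 210-word brief takes about three minutes to type and under a minute and a half to say. The gap is what tempts you to cut the brief down before the model ever sees it.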

That is the keyboard bottleneck. It is not that typing is slow in an absolute sense. It is that typing pushes you to edit while forming the thought. With AI, that early compression often removes the useful ambiguity: the caveat, the alternative, the thing you are unsure about but want the model to consider.

I notice it most when I am tired. Late in the day, my typed prompts get shorter and the model gets correspondingly less useful. The same evening, dictating the same intent into the same tool produces a better answer because the spoken version still carries the context I would have edited out by hand. The keyboard does not just slow me down; it makes me a worse collaborator with the model after a certain hour.

AI changes the prompt shape

Working with an LLM is closer to briefing a collaborator than issuing a command. The best prompts often include context, motive, constraints, examples, and uncertainty. Voice works better for prompting AI tools when the problem is still fuzzy, because you can speak the surrounding context without stopping to make it elegant.

This is why voice for thinking matters. You can say, "I think the bug is in the cache key, but I'm not sure if the user locale is part of it. Inspect that path first and tell me if I'm wrong." Typed, that often becomes "check cache bug." The shorter prompt loses the thought.

The shape of a prompt is now part of the work, not a preamble to it. Treat the prompt as the artifact you are producing, and voice becomes the natural authoring tool: it preserves the structure of how you actually understand the problem, including the parts you are unsure about. A model that gets the half-formed shape often returns a better answer than a model that gets a confident but partial command.

Three moments that changed my mind

The first was a debugging session. I typed a short prompt into an agent asking it to inspect a regression. It went down the wrong path. Then I dictated the messy version: what changed, what I suspected, what I doubted, and what would disprove my theory. The agent found the issue faster because I had finally given it the shape of my uncertainty.

The second was writing. I typed a crisp paragraph about our model stack and it sounded correct but dead. I spoke the same idea while pacing, including the frustration that led us to the architecture. The dictated version had the actual argument. I still edited it, but I edited from a living draft rather than a sterile outline.

The third was a long, awkward customer reply. The customer had asked a question that did not have a clean answer; the honest response involved tradeoffs and a small apology. Typed, my reply went through six edits and still felt stiff. Dictated, the first take was warmer, more direct, and only needed a one-word fix. I shipped that version and the conversation moved on. I no longer trust typed replies for messages that need any tone at all.

How I use voice now

I use voice for first-pass thinking, not for final precision. I dictate the messy brief into Claude Code, Cursor, Obsidian, or a plain Markdown file. Then I switch to keyboard for exact edits. That division keeps each tool in its lane: voice for context, keyboard for surgery.

Before coding: I dictate the change, risk, and test path. The dictated version usually surfaces a risk I would have skipped if I were typing.
Before writing: I speak the argument out loud before outlining. If I cannot say the argument in two minutes, I do not yet know what I think.
Before meetings: I dictate the decision I need from the call. Walking into a meeting with a named decision changes the conversation.
After failures: I dictate what surprised me before the memory fades. By the next morning, the lesson is gone if it was not captured.

For outside context on speech speed and dictation patterns, the Nielsen Norman Group's speech-recognition writing and words-per-minute references are useful starting points.

The objections I keep hearing

"I work in shared spaces." Fair, and that is a real constraint. My answer is that even ten quiet minutes a day spent dictating the hard prompts is more useful than a full day of typed ones. Voice does not need to dominate the workflow to change it.

"I can think while I type." Some people genuinely can. The test is not whether you can produce text by typing; it is whether the text you produce by typing has the same shape as the thought you would have spoken. For most of us, including me, the typed version is consistently less complete.

"I sound rambling when I dictate." The first week is rough. The second is much better. The skill being learned is not speaking; it is shaping a spoken thought into something a reader (or a model) can use. It comes back faster than expected because everyone has used it before, just in conversation.

Where Loqua fits

We wrote Loqua because I wanted voice for thinking without having to clean up raw transcripts afterward. It removes false starts, keeps technical names, and formats output for the app I am in. The soft pitch is this: use Loqua when the idea is too large or too fragile to squeeze through the keyboard first.

For the practical version of this argument, see our post on the voice-first workday. That post shows when voice works, when it fails, and when I still reach for the keyboard. The point of this post is the why; that one is the how.

Frequently asked questions

What does voice for thinking mean?

Voice for thinking means using speech to capture the shape of an idea before polishing it. The point is not perfect transcription. The point is preserving context, uncertainty, examples, and motivation so an AI tool or future-you can work with the full thought.

Is voice actually faster than typing?

For first-pass capture, usually yes. Speech can carry more context per minute than typing. For exact editing, typing and keyboard shortcuts are still better. The useful workflow is voice for exploration and keyboard for precision.

Why does this matter more with AI tools?

AI tools respond to context. A terse typed prompt may omit the assumptions and uncertainty that would steer the model correctly. Spoken prompts make it easier to include the full situation, which often matters more than clever prompt wording.

Will dictated prompts be too rambling?

They can be if the tool outputs a raw transcript. Loqua cleans filler and false starts while preserving the substance. You should still edit important prompts, but the starting point is usually richer than a compressed typed command.

When should I not use voice?

Do not use voice for precise code edits or small navigation actions, and do not use it in public spaces where speaking sensitive context out loud is inappropriate. Use voice when the work benefits from explanation, nuance, or rapid first-pass capture.

Is this just for developers?

No. Developers feel it because prompts and code reviews are context-heavy, but the same pattern applies to founders, writers, researchers, support teams, and anyone who works with AI tools through natural-language instructions.

I work in an open office — does this still apply?

Yes, with a smaller surface. Even ten quiet minutes a day spent dictating the hardest prompts changes the quality of those prompts. Voice does not need to take over your workflow to be valuable; it needs to take over the moments where typed compression hurts most.
