How-to

Hands-free dictation for writers: how to draft 3000 words of novel, essay, or long-form in one sitting

Walking, pacing, dishes-doing — long-form drafting doesn't need your hands. Filler cleanup, character-name dictionaries, app-aware formatting in your favorite writing app, and five worked examples.

Shuran Zhou, Founder · 2026-03-10 · 9 min · Updated 2026-03-10

TL;DR

Long-form drafting (novels, essays, blog posts, academic chapters) does not need your hands for the first pass. With Loqua's filler-word cleanup, personal dictionary, and app-aware formatting for Scrivener, Ulysses, Notes, and Obsidian, you can turn spoken thought into editable prose. This dictation workflow for writers is strongest when you need a real draft quickly and plan to polish it afterward with keyboard and trackpad.

Loqua is a context-aware voice typing tool for Mac that strips fillers and false starts as you talk, learns the proper nouns you use most, and formats output for the writing surface you're in. Where most dictation tools stop at "transcribe what I said," Loqua's language intelligence layer rewrites mid-sentence corrections into clean prose, so you can think out loud without producing a transcript you'd have to clean up later.

We've used it to draft fiction, essays, internal memos, and one academic paper this year. The pattern that works for all of them is the same: talk, walk, pause, talk more, let Loqua produce clean text. The polish pass at the end is shorter than the one that follows typing.

This is a practical dictation guide for writers, not a promise that every session produces publishable prose. Voice helps most when it removes the blank-page delay and gets a scene, essay, or argument into editable form quickly.

Why voice for long-form

Three reasons. First, you talk faster than you type, by a margin that widens for continuous prose. Second, your body benefits from not being chained to the desk for the drafting phase, and many writers report that walking improves the prose itself. Third, the editing pass after voice drafting tends to be lighter because you produced more sentences per session: you are cutting down from surplus material rather than expanding a thin draft.

The trap with most dictation tools is that you spend the time you saved on cleanup: removing "um," "so," "you know," and the mid-sentence "actually wait let me start over." Loqua's language intelligence layer removes those as it goes. What lands in your draft is cleaner than what comes out of your mouth.

The cleanup model

The cleanup is not aggressive editorialization. Loqua doesn't reshape your voice or impose a style. (The conservative bias here is intentional — see Apple's speech-input pipeline for the on-device foundations our cleanup layer sits on top of.) It removes the three things any human editor would silently delete:

Filler words: um, uh, you know, like (as filler), I mean, so.
Self-corrections: when you say "so basically the thing is — actually wait, let me start over — we should cache this," you get "We should cache this." The mid-sentence correction is honored; the throat-clearing is gone.
Restarts: false starts before the real sentence begins.

What it keeps: your voice, your sentence shape, your idiom, your pacing as paragraph breaks. If your style includes intentional hesitation or em dashes, they survive — the cleanup is targeted.

You say

"um so basically the thing is we need to uh... actually wait let me start over — we should cache this for like fifteen minutes"

Loqua writes

We should cache this for fifteen minutes.
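To make the three cleanup categories concrete, here is a minimal rule-based sketch in Python. It is an illustration only and not how Loqua's language intelligence layer works: the filler list, the restart markers (including the hypothetical "scratch that"), and the function name `clean_utterance` are all assumptions made for this example. A real system needs context to tell filler "like" from the verb "like", which simple word matching cannot do.

```python
import re

# Hypothetical filler list based on the examples in this article.
# "so" is left out on purpose: deleting it everywhere would damage real sentences.
FILLERS = ["you know", "i mean", "um", "uh", "like"]

# Hypothetical restart markers: everything spoken before the last one is dropped.
RESTART_MARKERS = ["let me start over", "scratch that"]


def clean_utterance(raw: str) -> str:
    """Toy cleanup: drop text before a restart, delete fillers, tidy the result."""
    text = raw

    # 1. Restarts and self-corrections: keep only what follows the last marker.
    for marker in RESTART_MARKERS:
        text = re.split(re.escape(marker), text, flags=re.IGNORECASE)[-1]

    # 2. Fillers: remove whole-word matches only, so "umbrella" keeps its "um".
    filler_pattern = r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b"
    text = re.sub(filler_pattern, " ", text, flags=re.IGNORECASE)

    # 3. Tidy: collapse whitespace, capitalize, add a final period if missing.
    text = " ".join(text.split())
    if not text:
        return ""
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


spoken = (
    "um so basically the thing is we need to uh actually wait "
    "let me start over we should cache this for like fifteen minutes"
)
print(clean_utterance(spoken))
```

The sketch deletes words and nothing else; it never rephrases, which is the conservative behavior the section describes.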

Character and term dictionary

The biggest pain in long-form fiction dictation is character names. Your protagonist's name might be "Beatrix Ó hUiginn," and a general-purpose transcription model is unlikely to get that right without help. Loqua has a Personal Dictionary in Settings where you add proper nouns one per line. Once added, the names are spelled exactly every time, with the correct diacritics.

Same for setting names, invented technologies, magic-system terms, philosophical jargon, or any uncommon word that recurs in your work. The dictionary is per-document if you want it to be — useful if you write multiple novels in different universes.

Tip: load the dictionary at the start of a project. You'll discover the gaps in the first hour; add them, and the rest of the draft is friction-free.
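One way to picture what a proper-noun dictionary does is as a post-processing pass that maps a plain transcription back to the canonical spelling. The sketch below assumes the recognizer emits unaccented, lowercased text for unfamiliar names; the helper names `fold` and `apply_dictionary` are hypothetical, and Loqua's actual mechanism may differ entirely.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Strip diacritics and case so 'Ó hUiginn' and 'o huiginn' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


def apply_dictionary(transcript: str, entries: list[str]) -> str:
    """Replace plain-text occurrences of each entry with its canonical spelling."""
    canonical = {fold(entry): entry for entry in entries}
    if not canonical:
        return transcript

    # Longest keys first, so a full name wins over a shorter entry inside it.
    keys = sorted(canonical, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(key) for key in keys) + r")\b",
        re.IGNORECASE,
    )
    # Fall back to the matched text if folding produces an unknown key.
    return pattern.sub(
        lambda m: canonical.get(fold(m.group(0)), m.group(0)), transcript
    )


entries = ["Beatrix Ó hUiginn"]  # one proper noun per line, as in the dictionary
print(apply_dictionary("beatrix o huiginn opened the letter", entries))
```

Matching on a folded form is what lets one dictionary entry cover every casing and accent variant a recognizer might produce.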

App-specific formatting

Loqua reads which app you're dictating into and adjusts output accordingly. This is the same multimodal-context engine that handles code formatting (see the dictate code on Mac guide), pointed at writing apps:

App	What Loqua does differently
Scrivener	Sentence-per-line draft format; section breaks honored; respects the binder's chapter context
Ulysses	Markdown-clean output; uses Ulysses' first-class markdown syntax
Notes (Apple)	Standard paragraph format; respects bullet-list mode if active
Obsidian	Markdown with frontmatter awareness; if a YAML frontmatter block is open, dictation goes into it correctly
Pages / Word	Clean prose paragraphs, no markdown leakage
iA Writer	Markdown; respects the app's focus-mode conventions

You don't configure this per-app. Loqua detects the active app and adjusts. This is one of those features you only notice when you switch tools mid-draft and the output continues to make sense.
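Conceptually, app-aware formatting is a lookup from the active app to a formatting rule. The sketch below shows that idea with two rules from the table; the function names and the `FORMATTERS` mapping are assumptions for illustration, and it does not attempt the app detection itself, which the article does not describe.

```python
import re


def sentence_per_line(text: str) -> str:
    """One sentence per line, the draft format the table lists for Scrivener."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return "\n".join(s for s in sentences if s)


def plain_paragraph(text: str) -> str:
    """A single clean prose paragraph with normalized whitespace."""
    return " ".join(text.split())


# Hypothetical mapping from app name to formatter; unknown apps get plain prose.
FORMATTERS = {
    "Scrivener": sentence_per_line,
    "Pages": plain_paragraph,
}


def format_for_app(text: str, app_name: str) -> str:
    formatter = FORMATTERS.get(app_name, plain_paragraph)
    return formatter(text)


draft = "Beatrix gets the letter. She sits with it for a day."
print(format_for_app(draft, "Scrivener"))
print(format_for_app(draft, "Pages"))
```

Keeping the rules in one table means the same dictated text can be re-rendered when you switch apps, without any per-app setup by the writer.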

Five worked examples

You say (novel outline, dictating while pacing)

"okay chapter three — Beatrix gets the letter and we have to decide whether she opens it right away or sits with it for a day I think she sits with it that's more her"

Loqua writes (in Scrivener)

Chapter 3 — Beatrix receives the letter.

Decision point: opens immediately vs. sits with it for a day. Sits with it: more in character.

You say (scene description for fiction)

"the kitchen at six in the morning the kind of kitchen where the light through the window has been the same for thirty years and the kettle on the stove is somehow always almost boiling"

Loqua writes (in Ulysses)

The kitchen at six in the morning — the kind of kitchen where the light through the window has been the same for thirty years, and the kettle on the stove is somehow always almost boiling.

You say (academic abstract, between meetings)

"this paper presents a framework for measuring multimodal context integration in voice typing systems we propose three benchmarks and apply them to four production tools showing that on-device multimodal models meaningfully reduce error on app-aware formatting tasks"

Loqua writes (in Pages)

This paper presents a framework for measuring multimodal context integration in voice typing systems. We propose three benchmarks and apply them to four production tools, showing that on-device multimodal models meaningfully reduce error on app-aware formatting tasks.

You say (blog draft, walking the dog)

"the thing about voice typing for writers is that the cleanup model matters more than the speech recognition model — once recognition is good enough most of the difference is in what gets removed and how the sentences land"

Loqua writes (in Obsidian)

The thing about voice typing for writers is that the cleanup model matters more than the speech recognition model. Once recognition is good enough, most of the difference is in what gets removed — and how the sentences land.

You say (editorial annotation on a draft)

"this section is fine but it's three hundred words too long the metaphor about the kettle should move into chapter four where it actually pays off"

Loqua writes (in Scrivener, comment)

Section is fine but ~300 words too long. The kettle metaphor should move into Chapter 4 where it pays off.

Editing by voice

You can dictate edits directly: "replace the last paragraph with...," "add a comma after the word kettle," "strike the second sentence." Loqua reads the current selection (if any) and your phrasing to decide whether to insert, replace, or annotate. For heavier edits, most writers we know switch back to the keyboard, but for many adjustments, voice is faster.

The honest version: voice editing is good for additions and deletions, decent for substitutions, and slow for complex restructuring. Restructuring is when you reach for the trackpad.

Multilingual long-form

Many of the writers we've talked to embed quotations or terminology from another language in their long-form. Loqua handles this without a mode toggle:

You say (essay drafting)

"the closest English equivalent is roughly contentment but the Chinese term 安然 carries connotations of resignation that don't quite map"

Loqua writes (in iA Writer)

The closest English equivalent is roughly "contentment," but the Chinese term 安然 carries connotations of resignation that don't quite map.

This is the same architecture that handles code-switching in coding contexts (see voice typing for AI coding): Loqua's language intelligence layer is trained directly on code-switched data rather than built to work around it.

The workflow we actually use

This is the workflow we've converged on after months of long-form voice drafting:

Open the writing app first. Cursor in the right place; Loqua reads the cursor position to decide format.
Load proper nouns into Personal Dictionary at session start. Five minutes here saves an hour later.
Talk in paragraphs, not sentences. Pause for paragraph breaks, not after every sentence. Loqua honors the pause.
Let yourself wander. The cleanup model strips the wandering. What lands is the line you were reaching for.
Walk if you can. Most novelists who voice-type say the prose changes when they're moving — usually for the better.
Quick polish at the end, not as you go. Voice drafts read differently from typed ones. Read aloud once; it usually flows already.
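The "talk in paragraphs" step relies on pauses becoming paragraph breaks. A minimal sketch of that idea follows, assuming the recognizer provides timestamped segments; the `Segment` type, the `group_paragraphs` function, and the 1.5-second threshold are all hypothetical choices for this example, not documented Loqua behavior.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str     # transcribed words for this stretch of speech
    start: float  # seconds from the start of the session
    end: float


def group_paragraphs(segments: list[Segment], pause_threshold: float = 1.5) -> list[str]:
    """Start a new paragraph whenever the silence between segments is long enough."""
    paragraphs: list[str] = []
    current: list[str] = []
    previous_end = None

    for segment in segments:
        long_pause = (
            previous_end is not None
            and segment.start - previous_end >= pause_threshold
        )
        if long_pause and current:
            paragraphs.append(" ".join(current))
            current = []
        current.append(segment.text)
        previous_end = segment.end

    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


session = [
    Segment("The kitchen at six in the morning.", 0.0, 2.0),
    Segment("The kettle is almost boiling.", 2.3, 4.0),   # short gap: same paragraph
    Segment("Beatrix reads the letter.", 6.5, 8.0),       # long gap: new paragraph
]
for paragraph in group_paragraphs(session):
    print(paragraph)
    print()
```

This is why pausing after every sentence works against you: under a rule like this, each long pause would split the draft into one-sentence paragraphs.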

The thing voice changes most is the relationship between thinking and writing. You think first, write second — and the writing is what you thought, not what you could keep up with at the keyboard. For long-form work where the bottleneck is thinking, that's the entire game. See the broader research on drafting speed for one outside view on why this matters.

If you want the technical detail on how the cleanup and context layers work, the three-model architecture note goes deeper.

Frequently asked questions

Will Loqua mess with my voice as a writer?

No. Loqua removes filler words and false starts only. Your sentence shapes, idioms, em dashes, intentional repetitions, and pacing all survive. If your style includes "..." pauses or fragments, they survive. We deliberately tuned the cleanup model to be conservative because writers reacted negatively to anything heavier.

How well does the proper-noun dictionary work?

Add the term once in Settings → Personal Dictionary, and it spells correctly thereafter, including diacritics and hyphenation. For long-running projects (a novel series with 50+ named characters), build the dictionary at session start — five minutes upfront saves an hour over the draft.

Does Loqua support Scrivener specifically?

Yes. Loqua reads Scrivener's binder state and outputs sentence-per-line draft format with section breaks honored. The same dictation drops cleanly into the current Scrivener document at the cursor.

Can I dictate while walking?

Yes, with AirPods Pro or a similar mic. The speech-input pipeline on Apple Silicon handles the variable acoustic conditions of walking (footfalls, distance from mic, occasional wind) reasonably well. Word error rate climbs slightly versus a quiet desk, but usually stays under 5%.
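For readers who want to check a figure like this on their own recordings: word error rate is conventionally (substitutions + deletions + insertions) divided by the number of words in the reference text. The sketch below computes it with a standard word-level edit distance; the function name `word_error_rate` is our own, and the sample sentences are made up for illustration rather than taken from any measurement.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of words in the reference."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current

    return previous[-1] / len(ref_words)


what_you_said = "the kettle on the stove is almost boiling"
what_was_typed = "the kettle on a stove is almost boiling"
print(word_error_rate(what_you_said, what_was_typed))  # one wrong word out of eight
```

At a 5% rate, a 3,000-word session would contain roughly 150 word-level errors, which is why the dictionary and a final read-through still matter.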

What about dialect or accent in fiction?

If your characters speak in dialect, Loqua transcribes the dialect (it doesn't "correct" to standard English) as long as you pronounce it. For written dialect in dialogue, dictating with the right vocal register usually produces the desired text.

Can I edit a draft by voice?

Light edits, yes — substitutions, additions, deletions. "Replace 'kettle' with 'pot' in the last paragraph" works. Heavy restructuring (move scenes around, rework a chapter's arc) is faster with mouse and keyboard. Voice is best for drafting and light revision; keyboard is best for structural editing.

Try Loqua today

Free to start. Mac native. Built by algorithm researchers who use it every day.
