Konubinix' opinionated web of thoughts

How to Do Literate Programming

Prose is the primary artifact

A literate program is first and foremost a piece of writing. The prose carries the full reasoning — problem, solution, and motivation. Code blocks are inline translations of the prose’s solution; they render the technical form of what the prose already stated.

Because readers’ attention is narrow, a code block must sit a glance away from the prose it translates. Ideally, the reader sees the motivating sentence and the whole associated block without scrolling. Even long code can be split with noweb to show only the fragment relevant to the sentence at hand.

Prose is richer than code: the why is never translated, and a code comment is not where it lives. A comment carrying a why — a system state, a design constraint, a justification of structure — is prose stranded in the wrong file: invisible to the narrative reader, out of context for the code reader. The reasoning belongs in the paragraph that motivates the block. What remains in code is at most a one-liner annotating the line below.

Sanity check: mentally remove all code blocks. The remaining text must form a coherent and complete narrative. If a gap appears, either a paragraph is missing, or an existing paragraph was only paraphrasing a block rather than advancing the reasoning. In either case, rewrite the section accordingly. This is what makes the document literate: code and prose are interleaved, not inserted side-by-side. Tests give a binary signal, but prose coherence doesn’t — a neutral agent reading the prose-only version and assessing whether the narrative fulfills the stated need provides a stronger signal than code-level tests alone.
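The removal step of this sanity check can be mechanized, leaving only the judgment to the reader. A minimal sketch, assuming the document is an org-mode file whose code lives in #+begin_src ... #+end_src blocks; prose_only is a hypothetical helper, not part of any existing tool:

```python
import re

# Matches an org-mode source block, from its #+begin_src line to #+end_src.
SRC_BLOCK = re.compile(
    r"^[ \t]*#\+begin_src\b.*?^[ \t]*#\+end_src[ \t]*\n?",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def prose_only(org_text: str) -> str:
    """Return the document with every source block removed."""
    stripped = SRC_BLOCK.sub("", org_text)
    # Collapse the runs of blank lines left behind by the removed blocks.
    return re.sub(r"\n{3,}", "\n\n", stripped)
```

Whether the result forms a complete narrative is still a matter for a reader to assess; the sketch only produces the text to assess.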

The reader’s flow is sacred

The reader should glide through the prose uninterrupted, glancing at a code block whenever they want to see how a sentence is concretely coded. Everything else — detours, meta-commentary, secondary justifications — is noise and must be removed. If something is valuable but breaks the flow — a technical derivation, a self-contained construction whose result the narrative will consume — move it to an annex and link to it from the main narrative at the moment its result is used.

Brevity serves flow. The longer a sentence, the more its message dilutes — human attention is narrow, and a convoluted clause already costs a glance back. Where an image, a diagram, or a bulleted breakdown carries the idea more directly than a paragraph, prefer it. Lists in particular: prefer bullets when items have parallel structure — they are easier to read.

Push this hard: most things are better expressed as diagrams than as prose. The default is not “a paragraph, with a diagram when it helps” but “a diagram, with prose for what only words can carry.” A sequence of interactions between objects, a state machine, a dependency graph, a layout — each of these a diagram delivers in one glance, where a paragraph hands it over clause by clause and the reader has to reassemble it. So before writing a paragraph, ask whether a picture would say it better; usually it would. Reach for one: a plantuml (or equivalent) block renders a sequence or interaction diagram inline at export, and the rendered picture — not its source — is what the reader sees. An image is worth far more than a pile of words, almost every time. This does not unseat prose as the primary artifact — prose still carries the reasoning a diagram cannot state: the motivation, the texture, the judgment. Even an argument has a diagrammable skeleton (Argdown maps it — see the why/what/how principle); what stays in prose is what no diagram captures. The prose simply stops re-describing what a picture states better, and shrinks to the work only words can do.

Flow is more than absence of friction. The reader should feel the writer’s thrill, pulled forward by the chain of reasoning, eager to find out how it resolves. Style matters: a document that is technically clear but emotionally flat loses readers as surely as one that breaks flow with detours.

Link generously. Link out to external sites for external definitions and references, and link internally within the note for easy navigation between its sections. If the reader wants to follow a concept that is described elsewhere, they must be able to simply click and go there. But never link from exported content to non-exported content.

Once the document has grown enough chapters that a reader could lose the overall shape, add a table of contents so the whole structure is visible at a glance and every section is one click away. In org-mode, that is a #+TOC: headlines N directive placed where the overview belongs, or #+OPTIONS: toc:t to let the exporter render one. A short note with two or three sections does not need it; the directive earns its keep only when the outline has become too long to hold in the head.

A new term enters the document where the document has just laid out what motivates it — not earlier. Referring to a concept the reader has not yet met forces them to suspend reading, which breaks flow as surely as a detour. The fix is almost always reordering — introduce the grounding sentence first, then the term that names what was just established. When the structure constrains the sequence, defer the reference itself: name a more general concept the reader already has, and let the specific term arrive when its time comes.

This applies everywhere the reader’s eyes go — prose, comments, docstrings, function names, variable names, symbols, any identifier surfacing in any form. The reading flow doesn’t care which fence a name sits behind; an ungrounded name breaks it wherever it appears.

When the ungrounded name is a code-level term — a field, a function, a class — the forward-reference often hides a readback: the prose was reaching for the implementation when it should have been naming the need. Removing the term then fixes both at once.

Content hidden from the published document — by tag, by build switch, by any mechanism that keeps it out of what the reader will read — does not exist for that reader. A reference surviving in the exported flow whose grounding lives only in hidden content stays ungrounded forever. Either move the grounding into the export, or remove the reference from the exported flow. Hiding is not a substitute for an annex.

A block that cannot be taken in at a glance breaks the flow just as surely as a detour. Around 30 lines is a reasonable ceiling; past that, the reader loses the thread between the sentence that motivated the block and the code that is supposed to translate it. The fix is almost always to split the block along the seams the prose already has — each responsibility that was introduced as a distinct idea becomes its own block, with a sentence of motivation just before. The noweb composition reassembles the pieces at tangle time. A block that resists splitting signals that the prose itself has collapsed several ideas into one — expand the prose first, and the block will split naturally.
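The ceiling is easy to watch for automatically. A minimal sketch, again assuming org-mode #+begin_src ... #+end_src delimiters; oversized_blocks is a hypothetical helper and the default of 30 simply mirrors the ceiling suggested above:

```python
def oversized_blocks(org_text: str, ceiling: int = 30) -> list[tuple[int, int]]:
    """Return (start_line, body_length) for each source block over the ceiling."""
    found = []
    start = None
    for number, line in enumerate(org_text.splitlines(), start=1):
        marker = line.strip().lower()
        if marker.startswith("#+begin_src"):
            start = number
        elif marker.startswith("#+end_src") and start is not None:
            # Body length excludes the two delimiter lines.
            length = number - start - 1
            if length > ceiling:
                found.append((start, length))
            start = None
    return found
```

A reported block is a prompt to look for the seam in the prose, not a verdict: the number is a rule of thumb.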

Not every short conversational line is noise — many are encouraged. A single sentence like “let’s see what this gives” just before a block that produces something visible — an image, an exported model, a plot — synchronizes the prose with the reader’s impatience to leave abstraction and see a concrete result. It does not interrupt the flow; it paces it. The test is whether the line names something the reader is already feeling: if it does, keep it. If it instead introduces new content, it is no longer a pacing line but a regular paragraph in disguise — expand it or cut it.

More broadly, a document is a story, not a recitation. Stakes, a tension building toward a result, a payoff at the end — these are what carry rigorous content through. Substance without narrative rarely lands; style is what makes the logic count.

Code blocks are not repeated in the export

A direct consequence of the reader’s-flow principle: a code block appears in the export only when the prose talks about it. Re-showing a block that the reader has already seen, just because the next block depends on it, adds nothing to the reading and becomes noise. The dependency is resolved at tangle time, invisibly to the reader. Repetition is only justified when seeing the surrounding code genuinely helps the reading.

Prose states the what, the how, and the why

Every code block is motivated by what precedes it in the prose:

  • The why (the reason for the choice) → stays in prose, untranslated
  • The what (the desired behavior) → translated into a test
  • The how (the chosen technical approach) → translated into code

The why is the one thing here that prose alone carries, and the easiest to fake: a plausible-sounding justification reads almost exactly like a real one. Never invent it. When the reason behind a choice is not known, do not manufacture one — say so, ask, or discuss it with whoever can settle it. A gap in the why is an honest signal that something still needs establishing; a fabricated why hides that signal and ships a lie in its place. No bullshit: a sentence whose only job is to make a paragraph sound complete is removed, not polished. This is the auditor’s sharpest test — tracing every why back to a real constraint, measurement, or decision someone actually made.

A why with structure need not stay a wall of prose. When the reasoning is an argument — a thesis carried by reasons, met by objections, resting on assumptions — Argdown maps that skeleton into an argument diagram, and the logical shape the reader would otherwise reconstruct from “because X, yet Y, therefore Z” becomes visible at a glance. Reach for it whenever the argument has enough joints to be worth seeing. The prose then carries what the map leaves out — the motivation, the texture, the nuance — not the bare structure the diagram already shows.

No block should exist unless the prose has made it necessary. If you find yourself adding a block without the prose having led to it, a piece of reasoning is missing. The inverse also holds: prose that paraphrases the block — restating what the code does — is duplication, not justification. Prose states the problem or need, not a readback of the code: never a what without a why. The document never lists “this does X, that does Y” — facts no one asked for. It moves in a single register — I need to meet this objective, so I will do this, and so I will do it this way — need, then intent, then technique. The exception is the translation aside (see below): when the gap between intent and form is wide enough that a short readback is what bridges it, the readback earns its place.

The rule applies to the reading, not to the source. A document’s tangle scaffolding lives invisibly, usually in sections tagged :noexport: or in blocks marked :exports none. These blocks need no motivating prose because they are not read; they exist only to make tangling work.

How prose and code are sliced against each other is arbitrary: one block per symbol, several symbols per block, several blocks per symbol — all are legitimate. The only criterion is the reading flow. If the reader pauses on a function and asks “wait, what is this for? I was reading about something abstract and I don’t see the connection,” the slicing has failed. Boilerplate — every getter and setter of a Java bean, say — does not earn a sentence each; that would drown the reader in motivation no one needed. Conversely, a block dropping several unrelated functions under a single vague paragraph leaves the reader unmoored, and calls either for more sentences or for splitting the block. The fix is whichever restores the flow.

Vocabulary follows the same rule. The states a field can take, the thresholds that govern behavior, the constants that pace the system are named in prose, not in a type comment listing valid values, a magic number setting a tempo, or a literal fixing a bound.

The ordering falls out of this split: the test (the what) comes before the code (the how), because that is how we talk to each other — we state the desired behavior first, then the technique used to achieve it. A test that stands apart as a separate paragraph, not flowing from an explicit assertion in the prose, is a sign that a sentence is missing from the narrative. This is why the prose never talks about “the test”: it states what is needed and what we are going to do — “we need the list to reject a duplicate,” “so we block creation” — and the block just below is that need made visible, code that happens to be a test. The word test barely surfaces in the reading; the reader meets a promised behavior and, right under it, the code that pins it down. Test-specific machinery — a tricky fixture, an unusual harness step — earns a sentence only when it is genuinely hard to follow, which is rare.

Three rules govern test-and-prose weaving:

  1. Always split the blocks in the document. Each prose paragraph promises one what; the test block below it checks that promise. A test block checking several whats under a single paragraph breaks the prose flow — the reader cannot trace which sentence promised what the test is checking.

  2. Always fuse the runtime when possible. A single function runs all the whats together, sharing fixture and state across steps, saving setup time.

  3. “Possible” means you can prove the chaining doesn’t dilute any test’s check. Each prose paragraph has promised a behaviour; the test below it must still fail observably when that behaviour breaks, even after being chained behind earlier steps. When an earlier step happens to mask the failure mode a downstream test should catch, the test no longer checks what the prose promised — the runtime splits too.
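One way to approach the proof asked for by the third rule is to break each promised behaviour on purpose and check that the fused test still fails. A minimal sketch under that assumption; remove_row, fused_test and the two deliberately broken variants are hypothetical names, and the two parallel lists stand in for whatever state the real steps share:

```python
def remove_row(labels, values, index):
    """Remove one row from two parallel lists, keeping them in lockstep."""
    del labels[index]
    del values[index]


def fused_test(remove):
    # Shared fixture, built once for both steps.
    labels, values = ["a", "b", "c"], [1, 2, 3]
    remove(labels, values, 1)
    # Step 1: the row is gone from the labels.
    assert labels == ["a", "c"]
    # Step 2: the values stayed in lockstep with the labels.
    assert values == [1, 3]


def fails(test, implementation):
    """Return True when the test rejects the given implementation."""
    try:
        test(implementation)
    except AssertionError:
        return True
    return False


def forgets_values(labels, values, index):
    # Broken on purpose: only the labels are spliced.
    del labels[index]


def forgets_labels(labels, values, index):
    # Broken on purpose: only the values are spliced.
    del values[index]
```

If fails returned False for one of the broken variants, an earlier step would be masking that behaviour, and the runtime would have to split.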

The two layers compose via noweb: each what lives in its own visible block (with prose), all sharing the same :noweb-ref, say test-X-body. A hidden wrapper (:exports none) references <<test-X-body>> to recompose the body into a single runtime function.

Prose: "row removal splices both arrays in lockstep"

  [block — :noweb-ref row-removal-body
     setup + modal-side assertions]

Prose: "the post-splice state propagates through the doc to the ballot"

  [block — :noweb-ref row-removal-body
     click "Créer", advance, bulletin-side assertions]

  [hidden wrapper — :noweb-ref tests :exports none
     @testcase
     def test_row_removal(page):
         <<row-removal-body>>]

Prose: "a duplicate in the list would make the tally incoherent,
        so we block creation and display an error message"

  [test — translates "we block and display a message"]
  [code — translates "we validate the list and expose a createError getter"]
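The test-then-code ordering of this last schematic might look as follows once written out. This is a sketch only: BallotForm, create and create_error are hypothetical names, the error wording is invented, and the test calls the class directly on the assumption that the final user here is a caller of that class.

```python
def test_duplicate_blocks_creation():
    # The what: a list with a duplicate is refused, with a message.
    form = BallotForm(["Alice", "Bob", "Alice"])
    assert form.create() is False
    assert form.create_error == "duplicate entry: Alice"


class BallotForm:
    """The how: validate the list and expose the error to the caller."""

    def __init__(self, candidates):
        self.candidates = list(candidates)

    @property
    def create_error(self):
        # Report the first name that appears more than once, else None.
        seen = set()
        for name in self.candidates:
            if name in seen:
                return f"duplicate entry: {name}"
            seen.add(name)
        return None

    def create(self):
        # Creation is blocked whenever validation reports an error.
        return self.create_error is None
```

The test is defined first and only run once the class exists, which is what lets the document state the promised behavior before the technique.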

One constraint binds every test, however it is woven: it must exercise the program the way the final user does — same surfaces, same gestures, same outputs. The closer the test’s path matches the user’s, the more its green bar means the promised behavior actually holds. This forbids, absolutely, bending the code to make testing convenient: a data-testid, a hidden hook, a selector or attribute that exists only so a test can reach an element is a path no human will ever take — the final user never reads a test id. Reach instead for the handles the user has: the visible label, the role, the rendered text. When the program turns out hard to test that way, the difficulty is real information — usually the interface is just as unclear to the user, and the fix belongs in the design, not in a test-only affordance bolted onto the code. The code carries nothing that exists solely for the test.

Translation asides

Sometimes the translation is not obvious: the intent is clear but its shape in code is surprising. In that case a short paragraph can act as a translator’s note: “we want X, but in code this is written as Y because Z”.

This is not code description — it is an explanation of the gap between intent and form. These asides should be rare and clearly identifiable as such (e.g. introduced by “In practice,” or “Under the hood,”).

When an aside grows beyond its role — explaining more than the translation gap — it is better moved to an annex to avoid breaking the reader’s flow. The prose then points to it with a short link, e.g. “in order to understand X, see annex” (with a link).

Feeling the need for an aside is also a diagnostic. Sometimes it reflects a genuine gap between what human language and code can express — that is what the aside is for. But it can also reveal that the code itself is poorly organized: the surprising shape that needs a translator’s note may simply be the wrong shape. The structure of the code is not bound to the structure of the prose — noweb reassembles the pieces at tangle time, so the code can be carved along its own joints. A poorly written block is never excused by the fact that the prose happened to flow in another direction; in that case, refactor the code rather than paper over it with an aside.

Prose: "we want the getters to behave as real properties"

  Under the hood, this means using defineProperty rather than Object.assign,
  which does not copy accessors.

  [code — translates this technical constraint]

Rewrite, don’t patch

When something forces the document to change — a bug, an oversight, a change of mind, a breaking change in an upstream library, a refactor that obsoletes an earlier choice — the temptation is to add a test and an explanatory paragraph describing the mitigation. This does not restore the narrative — it leaves a visible scar. Instead, rewrite the affected prose section so it reads as if the issue never existed: not a narrative that fixes something, but a narrative that was always correct. The reader’s flow is sacred, and a scar interrupts it just as surely as a detour does — the final artifact should betray no trace of the path that led to it.

The scar is retrospective — a trace of a revision already applied, a debate already settled. Prospective language is different: honest framing of a value still to validate empirically, of an explicit test plan, or of a guard-rail against a tempting wrong path. These are not scars. The distinguishing test is tense: does the sentence describe a fix already made (scar — rewrite), or an open question the reader should know about (caveat — keep)?

Scar — rewrite:
  "initially at 120 mm/s, since revised to 60 mm/s for volumetric flow"

Caveat — keep:
  "clearance at 0.5 mm — to adjust by trial if the real fit is too loose"

Code carries scars too. When reasoning leaves the code — a justification moved to prose, a constraint absorbed into a cleaner design — the code that only carried it leaves as well. An empty body, a one-armed conditional, a vestigial helper: each is a fragment whose work now lives elsewhere. The artifact reads as if neither prose nor code remembered the revision.