Project Documentation
v2 — March 12, 2026

AIX Zettelkasten
Architecture & Workflow

Personal knowledge refinery for AI conversations — multi-platform, lazy-rendered, model-augmented

Status: Architecture revised
Scope: ~1000 conversations / 3 platforms
Supersedes: v1, March 3, 2026
What changed from v1

Core Distinction — Distillation vs Synthesis

These are two different operations that happen at different times and at different costs.

Operation: Distillation
What it does: Breaks conversations into atomic tiddlers, one concept, method, or insight per node. Removes scaffolding, retains essence.
When / cost: Upfront, once per conversation. Costs API tokens.

Operation: Synthesis
What it does: Combines tiddlers into new understanding (patterns, connections, arguments). Operates on the distilled store.
When / cost: On demand, any time. Cheap, because it queries the tiddler store, not raw conversations.

The tiddler corpus is the RAG index. Synthesis happens against atoms, not against conversation tokens, which is why distillation is worth the upfront cost.

Three-Layer Tiddler Architecture

Not all tiddlers are generated at the same time or in the same way.

1 Upfront
Idea Tiddlers
Atomic concepts extracted by the distillation pipeline. One claim, method, or insight per tiddler. Generated in batch from the conversation corpus via API. These are the permanent nodes of the knowledge network.
source-conversation + exchange number as provenance fields
2 Lazy
Conversation Tiddlers
Generated on demand when you follow a provenance link from an idea tiddler. TW Node.js reads the source JSON and renders the conversation — or just the relevant exchange — without pre-generating anything. You only pay the cost when you actually need to trace back.
not pre-stored — rendered from JSON on request
3 Permanent
JSON Archive
The full conversation exports. Ground truth. Never modified. All three platforms normalized to a common schema. The distillation pipeline reads from here; the lazy renderer reads from here.
Claude JSON · ChatGPT export · Gemini export → normalized
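
A minimal sketch of the lazy layer, in Python for illustration only (the renderer described here runs inside TW Node.js). It assumes one normalized JSON file per conversation named by its ID, a `role`/`text` shape for each turn, and that `source-exchange` is a zero-based index into `turns[]`. None of these details are fixed by this document, and `render_conversation` is a hypothetical name.

```python
import json
from pathlib import Path
from typing import Optional


def render_conversation(archive_dir: Path, conversation_id: str,
                        exchange: Optional[int] = None) -> str:
    """Render a conversation, or a single exchange, from the archive as markdown.

    The archive is only read; nothing is cached or written back.
    """
    # Assumption: one normalized file per conversation, named by its ID.
    path = archive_dir / f"{conversation_id}.json"
    conversation = json.loads(path.read_text(encoding="utf-8"))
    turns = conversation["turns"]

    if exchange is not None:
        # Assumption: source-exchange is a zero-based index into turns[].
        turns = turns[exchange:exchange + 1]

    # Assumption: each turn looks like {"role": ..., "text": ...}.
    blocks = [f"**{turn['role']}**\n\n{turn['text']}" for turn in turns]
    return "\n\n---\n\n".join(blocks)
```

Following a provenance link would then amount to calling `render_conversation(archive_dir, source_conversation, source_exchange)` with the values stored on the idea tiddler.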

Full Pipeline

Claude / ChatGPT / Gemini
3 platforms, ~1000 conversations
↓ export (JSON / markdown)
Normalization script
common schema: uuid, platform, timestamp, turns[]
↓
JSON Archive
permanent ground truth, never modified
↓ distillation (Claude API, batch)
Idea Tiddlers (.tid files)
sparse yield: most cycles are scaffolding, not ideas
↓
Vocabulary normalization script
post-hoc synonym consolidation, run after corpus stabilizes
↓
TiddlyWiki Node.js
display · probe · lazy conversation render · live model access
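
The common schema named in the pipeline could be pinned down as a small data model. This is a sketch under assumptions: the four top-level fields come from the pipeline above, while the `Turn` shape (`role`, `text`), the ISO 8601 timestamp, and the names `Conversation` and `to_archive_record` are hypothetical. The per-platform parsing that fills this model is left out, since the export formats are not described here.

```python
from dataclasses import asdict, dataclass, field
from typing import List

PLATFORMS = {"claude", "chatgpt", "gemini"}


@dataclass
class Turn:
    role: str  # assumption: "user" or "assistant"
    text: str


@dataclass
class Conversation:
    uuid: str
    platform: str
    timestamp: str  # assumption: ISO 8601 string
    turns: List[Turn] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject anything outside the three supported platforms early.
        if self.platform not in PLATFORMS:
            raise ValueError(f"unknown platform: {self.platform!r}")


def to_archive_record(conv: Conversation) -> dict:
    """Return the JSON-serializable record to write to the archive."""
    return asdict(conv)
```

Validating the platform at construction time keeps malformed records out of the archive, which matters because the archive is never modified afterwards.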

Tiddler Schema (v2)

identity
  title: CamelCaseConceptPhrase
  created: YYYYMMDDHHMMSS
  modified: YYYYMMDDHHMMSS
classification
  tags: [platform] [domain] [semantic]
  type: text/x-markdown
provenance (new in v2)
  source-platform: claude | chatgpt | gemini
  source-conversation: [normalized conversation ID]
  source-exchange: [turn index within conversation]
content
  text: markdown with [[wikilinks]] inline

Platform tags:

claude chatgpt gemini

Domain tags:

MICA DAC-AC M-DUN ICA PKM TiddlyWiki AILiteracy Prompting RTW

Semantic tags:

concept method insight question revision finding
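
As an illustration of how this schema maps onto a .tid file, the sketch below writes the header fields as `name: value` lines, then a blank line, then the text body, and wraps any tag containing spaces in double square brackets. The function names `format_tags` and `to_tid` are hypothetical, and the field values passed in are placeholders rather than real pipeline output.

```python
from typing import Dict, List


def format_tags(tags: List[str]) -> str:
    """Join tags TiddlyWiki-style: tags containing spaces are wrapped in [[ ]]."""
    return " ".join(f"[[{t}]]" if " " in t else t for t in tags)


def to_tid(fields: Dict[str, str], tags: List[str], text: str) -> str:
    """Serialize one tiddler as .tid: header lines, a blank line, then the body."""
    header = dict(fields)  # copy so the caller's dict is left untouched
    header["tags"] = format_tags(tags)
    lines = [f"{name}: {value}" for name, value in header.items()]
    return "\n".join(lines) + "\n\n" + text
```

A call such as `to_tid({"title": "PairedArtifact", "type": "text/x-markdown"}, ["claude", "PKM", "concept"], "Body with [[M-DUN]].")` yields a three-line header followed by the markdown body.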

Vocabulary Normalization Strategy

Extract now, normalize later. Taxonomy anxiety should not block distillation. The corpus itself surfaces what the canonical terms are — synonym clusters appear once you have a full title list.

  1. Distill all conversations — messy vocabulary is fine at this stage
  2. Export full tiddler title list — one pass over the corpus
  3. Feed title list to Claude — ask it to identify synonym clusters
  4. Build normalization map — e.g. {"MDUN": "M-DUN", "inference cycle": "ICA"}
  5. Run normalization script — find-replace across all tiddler titles and text fields in JSON export
  6. Reimport to TW — coherent corpus from this point forward
The domain vocabulary is largely already known — M-DUN, ICA, DAC-AC, context contribution assay, paired artifact. Normalization here is enforcement, not discovery.
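
Step 5 could look roughly like the sketch below. It assumes the JSON export is a list of objects with `title` and `text` fields, and that matching is case-sensitive and whole-term only; `build_pattern` and `normalize_tiddlers` are hypothetical names. Lookarounds are used in place of `\b` so that a variant is not matched inside a hyphenated term.

```python
import re
from typing import Dict, List


def build_pattern(mapping: Dict[str, str]) -> re.Pattern:
    """Compile one regex matching any variant as a whole term, longest first."""
    variants = sorted(mapping, key=len, reverse=True)
    alternation = "|".join(re.escape(v) for v in variants)
    # No word character or hyphen may touch the match on either side.
    return re.compile(rf"(?<![\w-])(?:{alternation})(?![\w-])")


def normalize_tiddlers(tiddlers: List[dict], mapping: Dict[str, str]) -> List[dict]:
    """Return copies of the tiddlers with variants replaced in title and text."""
    pattern = build_pattern(mapping)

    def replace(match: re.Match) -> str:
        return mapping[match.group(0)]

    normalized = []
    for tiddler in tiddlers:
        fixed = dict(tiddler)  # shallow copy; other fields pass through unchanged
        for name in ("title", "text"):
            fixed[name] = pattern.sub(replace, tiddler.get(name, ""))
        normalized.append(fixed)
    return normalized
```

With the map `{"MDUN": "M-DUN", "inference cycle": "ICA"}`, a title `MDUN` becomes `M-DUN` and `[[MDUN]]` links in text fields are rewritten to match. Renaming can make two titles collide, so a duplicate-title check before reimport is worth adding.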

TiddlyWiki as Interface, Not Refinery

TW's job is display, navigation, and probe — not storage or reasoning. The model lives behind it as an on-demand reasoning engine.

What TW does

Renders idea tiddlers. Generates conversation views lazily from JSON when you follow a provenance link. Maintains forward and backlinks automatically via [[wikilinks]]. Provides the navigation surface for browsing the distilled corpus.

What the model does (live)

When you're browsing a cluster of tiddlers and want to understand what connects them — ask the model in real time. It queries against the tiddler store and synthesizes across it. This is the AIX Workbench pattern applied to the personal knowledge base: TW as navigation layer, model as on-demand reasoning, tiddler corpus as the RAG index it queries against.
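
One way to picture a live synthesis query is as two steps: select the relevant atoms, then hand them to the model with a question. The sketch below uses a plain tag filter as a stand-in for retrieval and stops at building the prompt string; the model call itself is omitted. The names `select_by_tags` and `build_synthesis_prompt`, the prompt layout, and the assumption that tags are stored as a list are all hypothetical.

```python
from typing import Dict, List


def select_by_tags(tiddlers: List[Dict], required: List[str]) -> List[Dict]:
    """Return the tiddlers that carry every required tag."""
    wanted = set(required)
    return [t for t in tiddlers if wanted <= set(t["tags"])]


def build_synthesis_prompt(tiddlers: List[Dict], question: str) -> str:
    """Assemble the selected atoms and a question into one prompt string."""
    atoms = "\n\n".join(f"## {t['title']}\n{t['text']}" for t in tiddlers)
    return f"Tiddlers:\n\n{atoms}\n\nQuestion: {question}"
```

Because the prompt is built from distilled tiddlers rather than raw conversation turns, its size scales with the number of atoms selected, not with the length of the source conversations.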

What the JSON archive does

Sits underneath everything as permanent ground truth. Never modified by TW or the model. The distillation pipeline reads from it once; the lazy renderer reads from it on demand.

Build Order

  1. Normalization script — flatten Claude JSON, ChatGPT exports, and Gemini exports to common schema
  2. Distillation prompt design — get this right on 2-3 sample conversations before running batch; the prompt is the intellectual core
  3. Batch distillation — run all ~1000 conversations through Claude API; collect .tid output
  4. Vocabulary normalization — title list → Claude → normalization map → script → reimport
  5. TW Node.js setup — GitHub repo, server config, lazy conversation renderer
  6. Live model integration — synthesis queries against tiddler store from within TW