AIX Zettelkasten v2 — Architecture & Workflow

What changed from v1

Scale: 27 DAC-AC conversations → ~1000 conversations across Claude, ChatGPT, Gemini
Normalization layer added: all platforms → common schema before distillation
Tiddler architecture split into three layers: atoms (upfront), conversation views (lazy), JSON archive (permanent)
Distillation vs synthesis formally distinguished: synthesis is now cheap and post-hoc, and never re-reads raw conversation tokens
Inference cycle retained as provenance pointer only, not extraction unit
TW repositioned as display/probe interface with live model access
Vocabulary normalization deferred and scripted, not manual
RAG role of tiddler corpus made explicit

Core Distinction — Distillation vs Synthesis

These are two different operations that happen at different times and at different costs.

Operation	What it does	When / cost
Distillation	Breaks conversations into atomic tiddlers — one concept, method, or insight per node. Removes scaffolding, retains essence.	Upfront, once per conversation. Costs API tokens.
Synthesis	Combines tiddlers into new understanding — patterns, connections, arguments. Operates on the distilled store.	On demand, any time. Cheap — queries the tiddler store, not raw conversations.

The tiddler corpus is the RAG index. Synthesis happens against atoms, not against conversation tokens — which is why distillation is worth the upfront cost.
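As a rough illustration of synthesis querying atoms rather than raw conversations, the sketch below ranks tiddlers by simple word overlap with a question. The `Tiddler` class, the `retrieve` function, and the scoring rule are assumptions made for this example; a real setup might use embeddings or TiddlyWiki's own search instead.

```python
import re
from dataclasses import dataclass


@dataclass
class Tiddler:
    # Hypothetical minimal view of an idea tiddler: title and body only.
    title: str
    text: str


def tokenize(s: str) -> set:
    """Lowercase the string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def retrieve(query: str, store: list, k: int = 3) -> list:
    """Return up to k tiddlers sharing at least one word with the query,
    ordered by how many distinct words they share (most first)."""
    q = tokenize(query)
    scored = []
    for t in store:
        score = len(q & tokenize(t.title + " " + t.text))
        if score > 0:
            scored.append((score, t))
    # Sort on the score only, so tiddlers themselves are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:k]]
```

The retrieved tiddlers, not the source conversations, are what would be handed to the model as context, which is where the cost saving comes from.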

Three-Layer Tiddler Architecture

Not all tiddlers are generated at the same time or in the same way.

Layer 1 (upfront): Idea Tiddlers
Atomic concepts extracted by the distillation pipeline. One claim, method, or insight per tiddler. Generated in batch from the conversation corpus via API. These are the permanent nodes of the knowledge network.
source-conversation + exchange number as provenance fields

Layer 2 (lazy): Conversation Tiddlers
Generated on demand when you follow a provenance link from an idea tiddler. TW Node.js reads the source JSON and renders the conversation — or just the relevant exchange — without pre-generating anything. You only pay the cost when you actually need to trace back.
not pre-stored — rendered from JSON on request

Layer 3 (permanent): JSON Archive
The full conversation exports. Ground truth. Never modified. All three platforms normalized to a common schema. The distillation pipeline reads from here; the lazy renderer reads from here.
Claude JSON · ChatGPT export · Gemini export → normalized
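The lazy conversation view can be pictured as a pure function from an archived conversation to display text. The sketch below is written in Python for illustration only (the actual renderer would live in TW Node.js), and it assumes each turn carries hypothetical `role` and `text` fields; only `uuid`, `platform`, and `turns` come from the common schema.

```python
from typing import Optional


def render_conversation(conv: dict, exchange: Optional[int] = None) -> str:
    """Render a normalized conversation as plain text.

    If `exchange` is given, only that turn is rendered, which matches
    following a provenance link from an idea tiddler.
    """
    turns = list(enumerate(conv["turns"]))
    if exchange is not None:
        if not 0 <= exchange < len(turns):
            raise IndexError(f"no exchange {exchange} in {conv['uuid']}")
        turns = [turns[exchange]]
    lines = [f"# {conv['platform']} conversation {conv['uuid']}"]
    for index, turn in turns:
        # Keep the turn index visible so it can be matched to source-exchange.
        lines.append(f"[{index}] {turn['role']}: {turn['text']}")
    return "\n".join(lines)
```

Nothing is written back to the archive: the function only reads the conversation it is given, so the permanent layer stays unmodified.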

Full Pipeline

Claude / ChatGPT / Gemini

3 platforms, ~1000 conversations

↓ export (JSON / markdown)

Normalization script

common schema: uuid, platform, timestamp, turns[]

↓

JSON Archive

permanent ground truth — never modified

↓ distillation (Claude API, batch)

Idea Tiddlers (.tid files)

sparse yield — most cycles are scaffolding, not ideas

↓

Vocabulary normalization script

post-hoc synonym consolidation — run after corpus stabilizes

↓

TiddlyWiki Node.js

display · probe · lazy conversation render · live model access
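A minimal sketch of the normalization step follows. The target fields (`uuid`, `platform`, `timestamp`, `turns`) are the common schema named above; everything else is an assumption for illustration, including the `Turn` fields, the `ROLE_ALIASES` table, and the raw input shape (`id`, `created`, `messages` with `author` and `content`). Real platform exports differ and would each need their own adapter.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str  # hypothetical: "user" or "assistant"
    text: str


@dataclass
class Conversation:
    uuid: str
    platform: str
    timestamp: str
    turns: list


# Hypothetical role names that different exports might use.
ROLE_ALIASES = {"human": "user", "model": "assistant"}


def normalize(raw: dict, platform: str) -> Conversation:
    """Map one raw export record (assumed shape) onto the common schema."""
    turns = []
    for message in raw["messages"]:
        text = message["content"].strip()
        if not text:
            continue  # skip empty turns
        role = ROLE_ALIASES.get(message["author"], message["author"])
        turns.append(Turn(role=role, text=text))
    return Conversation(
        uuid=str(raw["id"]),
        platform=platform,
        timestamp=raw["created"],
        turns=turns,
    )
```

Because the output is a plain dataclass, `dataclasses.asdict` plus `json.dumps` is enough to write it into the archive.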

Tiddler Schema (v2)

identity
title: CamelCaseConceptPhrase
created: YYYYMMDDHHMMSS
modified: YYYYMMDDHHMMSS

classification
tags: [platform] [domain] [semantic]
type: text/x-markdown

provenance (new in v2)
source-platform: claude | chatgpt | gemini
source-conversation: [normalized conversation ID]
source-exchange: [turn index within conversation]

content
text: markdown with [[wikilinks]] inline

Platform tags:

claude chatgpt gemini

Domain tags:

MICA DAC-AC M-DUN ICA PKM TiddlyWiki AILiteracy Prompting RTW

Semantic tags:

concept method insight question revision finding
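To make the schema concrete, here is a sketch that serializes one idea tiddler into TiddlyWiki's `.tid` layout: `field: value` header lines, a blank line, then the text. The function names and argument list are assumptions for this example; the field names and the 14-digit timestamp follow the schema above.

```python
from datetime import datetime, timezone


def format_tags(tags: list) -> str:
    """Join tags with spaces, wrapping any tag containing a space in [[ ]]."""
    return " ".join(f"[[{t}]]" if " " in t else t for t in tags)


def tw_timestamp(dt: datetime) -> str:
    """Format a timezone-aware datetime as UTC YYYYMMDDHHMMSS."""
    return dt.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")


def to_tid(title: str, text: str, tags: list, platform: str,
           conversation_id: str, exchange: int, created: datetime) -> str:
    """Build the contents of a .tid file for one idea tiddler."""
    stamp = tw_timestamp(created)
    fields = {
        "title": title,
        "created": stamp,
        "modified": stamp,
        "tags": format_tags(tags),
        "type": "text/x-markdown",
        "source-platform": platform,
        "source-conversation": conversation_id,
        "source-exchange": str(exchange),
    }
    header = "\n".join(f"{name}: {value}" for name, value in fields.items())
    return f"{header}\n\n{text}\n"
```

Keeping the provenance fields as ordinary tiddler fields means they can be filtered on like any other field inside TiddlyWiki.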

Vocabulary Normalization Strategy

Extract now, normalize later. Taxonomy anxiety should not block distillation. The corpus itself surfaces what the canonical terms are — synonym clusters appear once you have a full title list.

1. Distill all conversations: messy vocabulary is fine at this stage
2. Export full tiddler title list: one pass over the corpus
3. Feed title list to Claude: ask it to identify synonym clusters
4. Build normalization map, e.g. {"MDUN": "M-DUN", "inference cycle": "ICA"}
5. Run normalization script: find-replace across all tiddler titles and text fields in JSON export
6. Reimport to TW: coherent corpus from this point forward

The domain vocabulary is largely already known — M-DUN, ICA, DAC-AC, context contribution assay, paired artifact. Normalization here is enforcement, not discovery.
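The find-replace step could look something like the sketch below. It assumes the JSON export is a list of objects with `title` and `text` keys, and it matches whole words only, case-sensitively; both the function names and those matching rules are illustrative choices, not a settled design.

```python
import re


def build_pattern(mapping: dict) -> re.Pattern:
    """Compile one regex matching any variant term as a whole word.
    Longer variants come first so they win over shorter overlapping ones."""
    variants = sorted(mapping, key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(re.escape(v) for v in variants) + r")\b")


def normalize_tiddlers(tiddlers: list, mapping: dict) -> list:
    """Return copies of the tiddlers with variant terms replaced by their
    canonical forms in both the title and the text field."""
    if not mapping:
        return [dict(t) for t in tiddlers]
    pattern = build_pattern(mapping)

    def fix(s: str) -> str:
        return pattern.sub(lambda m: mapping[m.group(1)], s)

    # Other fields (tags, provenance) are carried over unchanged.
    return [{**t, "title": fix(t["title"]), "text": fix(t["text"])}
            for t in tiddlers]
```

Because `[[wikilinks]]` live inside the text field, a renamed title and the links pointing at it are updated in the same pass. Whole-word matching means a variant embedded in a CamelCase title would not be touched, so such titles would need their own map entries.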

TiddlyWiki as Interface, Not Refinery

TW's job is display, navigation, and probe — not storage or reasoning. The model lives behind it as an on-demand reasoning engine.

What TW does

Renders idea tiddlers. Generates conversation views lazily from JSON when you follow a provenance link. Maintains forward and backlinks automatically via [[wikilinks]]. Provides the navigation surface for browsing the distilled corpus.

What the model does (live)

When you're browsing a cluster of tiddlers and want to understand what connects them — ask the model in real time. It queries against the tiddler store and synthesizes across it. This is the AIX Workbench pattern applied to the personal knowledge base: TW as navigation layer, model as on-demand reasoning, tiddler corpus as the RAG index it queries against.

What the JSON archive does

Sits underneath everything as permanent ground truth. Never modified by TW or the model. The distillation pipeline reads from it once per conversation; the lazy renderer reads from it on demand.

Build Order

1. Normalization script: flatten Claude JSON, ChatGPT exports, and Gemini exports to common schema
2. Distillation prompt design: get this right on 2-3 sample conversations before running batch; the prompt is the intellectual core
3. Batch distillation: run all ~1000 conversations through Claude API; collect .tid output
4. Vocabulary normalization: title list → Claude → normalization map → script → reimport
5. TW Node.js setup: GitHub repo, server config, lazy conversation renderer
6. Live model integration: synthesis queries against tiddler store from within TW
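One way to structure the distillation driver so that the same code serves both the small sample run and the full batch is sketched below. The model call is passed in as a plain function, so no particular API client is assumed; `build_prompt`, `distill_batch`, the `done` set, and the placeholder prompt text are all hypothetical, and the real prompt is the part still to be designed.

```python
from typing import Callable, Iterable, Optional


def build_prompt(conv: dict) -> str:
    """Placeholder prompt: an instruction line plus an indexed transcript."""
    transcript = "\n".join(
        f"[{i}] {turn['role']}: {turn['text']}"
        for i, turn in enumerate(conv["turns"])
    )
    return "Extract atomic idea tiddlers from this conversation.\n\n" + transcript


def distill_batch(conversations: Iterable, call_model: Callable[[str], str],
                  done: set, limit: Optional[int] = None) -> dict:
    """Distill each conversation at most once.

    `done` holds uuids already processed and is updated in place.
    `limit` caps how many new conversations are processed, which suits a
    trial run on a few samples before the full batch.
    """
    results = {}
    for conv in conversations:
        if limit is not None and len(results) >= limit:
            break
        if conv["uuid"] in done:
            continue  # already paid for; never distill twice
        results[conv["uuid"]] = call_model(build_prompt(conv))
        done.add(conv["uuid"])
    return results
```

Persisting the `done` set between runs would let an interrupted batch resume without paying for the same conversation again.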