# Artifact Platform

## Mission
Your AI work, remembered, retrievable, and — optionally — shareable.
A platform where AI conversations, prompts, specs, essays, and codebases become structured, searchable, shareable units — artifacts — authored under pseudonyms, curated by Stoka, and circulated through the messaging layer.
Read README.md for platform mission, principles, and phase map.
## Solo-first thesis
Before this is a social platform, it is a personal knowledge infrastructure for people who work with AI every day. The target user is someone whose AI conversations are the primary IP of their work — engineers, prompt engineers, PMs, researchers, writers. For them, losing a conversation = losing intellectual capital.
Phase 1 ships that tool alone. Zero social features. Just:
- Paste a conversation → Stoka structures it → it's in your library
- Natural-language retrieval: "that thing about caching embeddings" finds it in seconds
- Lateral linking: viewing one artifact surfaces adjacent ones
- Export as markdown + JSON — your library is yours, always
## The artifact primitive
Artifact = permalinked + author-signed + share-button-able unit of content. Universal across the platform. Everything with those three properties is an artifact.
### Kinds
The artifact kind determines rendering, metadata schema, and editorial policy. Stoka applies kind-aware ranking.

| Kind | What it is |
|---|---|
| `conversation` | Bon Appétit conversation extract — recipe + method + source turns + raw transcript (opt-in) |
| `spec` | A design document (this file, for example) |
| `essay` | Long-form writing. Existing blog posts become this kind on migration. |
| `prompt` | A standalone prompt with model target, params, sample output. No conversation context. |
| `note` | Short observation. Tweet-length, Bon Appétit-grade craft. |
| `codebase` | Anonymous codebase share (Tsukuyomi — see memory: Codebase Sharing Vision) |
| `link` | Annotated external link with commentary |
New kinds get added by contract: they must have a permalink, an author pseudonym, a share button, a Stoka-indexable body, and a defined visibility model.
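That contract could be expressed as a structural check. A hypothetical Python sketch follows; the names (`ArtifactKind`, `validate`) are illustrative, not part of any existing codebase:

```python
from dataclasses import dataclass
from typing import Literal

Visibility = Literal["private", "link_only", "public_anonymous"]

@dataclass(frozen=True)
class ArtifactKind:
    """Contract every new kind must satisfy before it ships."""
    name: str                 # e.g. "conversation"
    permalink_pattern: str    # every artifact gets a stable URL
    # The remaining clauses are booleans because they are non-negotiable:
    signed_by_pseudonym: bool = True
    has_share_button: bool = True
    stoka_indexable: bool = True

    def validate(self, allowed_visibilities: set[Visibility]) -> bool:
        """A kind is admissible only if all contract clauses hold
        and it defines at least one visibility tier."""
        return (
            self.signed_by_pseudonym
            and self.has_share_button
            and self.stoka_indexable
            and len(allowed_visibilities) > 0
        )
```

A kind with no defined visibility model fails the check, which is the point: the contract is enforceable, not aspirational.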
## The Bon Appétit granularity ladder
Every conversation artifact (and optionally other kinds) is rendered in layers. The reader drills as deep as they care.

1. **Recipe card** (what you scan): Headline · extracted prompt/technique · why it works · variations · tags
2. **Method notes** (expand once): What was tried · what failed · the turn it clicked · Stoka's framing of why this matters
3. **Source exchanges** (expand twice): Specific turns Stoka pulled from, in conversational form · speakers + order preserved
4. **Full transcript** (opt-in by author): Raw unparsed conversation, PII-scrubbed by Stoka. Only present if author explicitly publishes this layer.
This maps onto the Bon Appétit magazine layout — top of page is the dish, scroll down is method, further is origin story / variations. Same content primitive, different reader appetite.
## Visibility tiers
Author picks per artifact. Default is private.
| Tier | Who sees it | Directory? | Stoka index? | Share-to-DM? |
|---|---|---|---|---|
| Private | Only the author (under any of their pseudonyms) | No | Personal library only | Yes, share implies grant |
| Link-only | Anyone with the URL | No | Personal library only | Yes, share implies grant |
| Public anonymous | Everyone, signed by a pseudonym | Yes | Public index | Yes |
Permission cascade: sharing a link-only or private artifact to a recipient's pseudonym via DM grants that pseudonym access automatically. The share IS the permission. No separate access-control flow.
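A minimal in-memory sketch of that cascade; the `Artifact` shape and function names here are illustrative, and real enforcement would live at the data layer:

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    id: str
    visibility: str                                     # private | link_only | public_anonymous
    granted_to: set[str] = field(default_factory=set)   # recipient pseudonym ids

def share_to_dm(artifact: Artifact, recipient_pseudonym: str) -> None:
    """Permission cascade: the act of sharing grants access.
    There is no separate access-control flow."""
    if artifact.visibility in ("private", "link_only"):
        artifact.granted_to.add(recipient_pseudonym)

def can_view(artifact: Artifact, pseudonym: str, author_pseudonyms: set[str]) -> bool:
    if artifact.visibility == "public_anonymous":
        return True
    if pseudonym in author_pseudonyms:  # author sees it under any of their pseudonyms
        return True
    return pseudonym in artifact.granted_to
```

The asymmetry is deliberate: public artifacts need no grants, while private and link-only artifacts accumulate grants only through shares.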
## Pseudonym attribution & bodies of work
Every artifact is signed by one of the author's pseudonyms. The user→pseudonym link is stored but never user-visible. See messaging.md for the full pseudonym model.
Consequence: a pseudonym becomes a body of work. Browse @ghost-7c0a's artifacts like browsing a chef's recipes on Bon Appétit. Over time, a pseudonym earns recognition through craft — not follower counts, not engagement metrics, not post-frequency.
Pseudonym profile page:
- Bio + topics (opt-in)
- List of public artifacts authored
- Inbox mode (closed / allowlist / open / public)
- "DM" button (if inbox allows)
## Capture flows

### Phase 1 (MVP)
- Paste textarea — paste anything (Claude Code JSON export, ChatGPT text, raw markdown). Stoka detects format, structures.
- File upload — Claude Code session JSON, ChatGPT export JSON, .md, .txt.
Both flows: one click to "capture," Stoka runs extraction, user lands on a draft artifact for review.
### Phase 2+
- Browser extension — captures from open AI chat tabs directly. Highest-UX flow, biggest build. Deferred.
- API endpoint — programmatic capture from other tools (for power users wiring up workflows).
## Stoka's role in the artifact pipeline
Same brain across surfaces (see stoka-bot.md → Stoka Across Surfaces). For artifacts specifically:
| Step | Stoka's job |
|---|---|
| Ingest | Detect source format (Claude Code JSON vs raw text vs ChatGPT export), parse into turns |
| Extract | Identify the "recipe" — the prompt/technique at the core of what worked. Write the why-it-works framing. Surface method notes. |
| Scrub | PII pass (conservative default — names, emails, URLs, file paths flagged). Author reviews + un-redacts selectively before publish. |
| Structure | Build the layered representation (recipe / method / source). Fill kind-specific metadata. |
| Index | Chunk + embed for retrieval. Tag for editorial policy. |
| Retrieve | Natural-language queries against the user's library (solo) or the public corpus (Phase 3). |
| Surface | "You solved this before" lateral links. Time-based surfacing ("what was I working on in March"). |
Stoka's voice applies throughout: terse, editorial, opinionated. The extracted framing in a recipe card is written in Stoka's voice, not a generic summary.
## Schema

**identity.artifacts** — the universal artifact record: kind-discriminated, pseudonym-attributed, visibility-tiered.
- `id` UUID, PK
- `kind` TEXT, NOT NULL (enum: conversation · spec · essay · prompt · note · codebase · link)
- `pseudonym_id` UUID, FK, NOT NULL → `identity.pseudonyms.id`
- `status` TEXT, NOT NULL, default `draft` (enum: draft · published · archived)
- `visibility` TEXT, NOT NULL, default `private` (enum: private · link_only · public_anonymous)
- `title` TEXT, NOT NULL
- `summary` TEXT
- `body_md` TEXT (universal rendered markdown; recipe card for conversations)
- `metadata` JSONB, default `'{}'` (kind-specific schema)
- `created_at` TIMESTAMPTZ, default `now()`
- `updated_at` TIMESTAMPTZ, default `now()`
- `published_at` TIMESTAMPTZ
**identity.artifact_layers** — granularity layers (recipe/method/source/raw), each with its own visibility cap.
- `id` UUID, PK
- `artifact_id` UUID, FK → `identity.artifacts.id`
- `layer` TEXT, NOT NULL (enum: recipe · method · source · raw)
- `content_md` TEXT
- `content_json` JSONB
- `visibility` TEXT (can be stricter than the parent artifact)
- `sort_order` INT, default `0`
**identity.artifact_chunks** — retrieval index, chunked + embedded for Stoka natural-language search.
- `id` UUID, PK
- `artifact_id` UUID, FK → `identity.artifacts.id`
- `layer` TEXT, NOT NULL
- `chunk_index` INT, NOT NULL
- `content` TEXT, NOT NULL
- `embedding` VECTOR(1024)
- `metadata` JSONB, default `'{}'`
**identity.artifact_versions** — edit history; each save snapshots body + metadata + diff summary.
- `id` UUID, PK
- `artifact_id` UUID, FK → `identity.artifacts.id`
- `version` INT, NOT NULL
- `body_md_snapshot` TEXT
- `metadata_snapshot` JSONB
- `diff_summary` TEXT
- `edited_at` TIMESTAMPTZ, default `now()`

Relationships:
- `artifact_layers.artifact_id` → `artifacts.id` (composes)
- `artifact_chunks.artifact_id` → `artifacts.id` (indexes)
- `artifact_versions.artifact_id` → `artifacts.id` (snapshots)
- `artifacts.pseudonym_id` → `pseudonyms.id` (authored by)
### Raw DDL (portable)

```sql
CREATE TABLE identity.artifacts (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  kind TEXT NOT NULL,                          -- conversation | spec | essay | prompt | note | codebase | link
  pseudonym_id UUID NOT NULL REFERENCES identity.pseudonyms(id),
  status TEXT NOT NULL DEFAULT 'draft',        -- draft | published | archived
  visibility TEXT NOT NULL DEFAULT 'private',  -- private | link_only | public_anonymous
  title TEXT NOT NULL,
  summary TEXT,
  body_md TEXT,                 -- universal rendered markdown (recipe card for conversations, full body for essays)
  metadata JSONB DEFAULT '{}',  -- kind-specific schema
  created_at TIMESTAMPTZ DEFAULT now(),
  updated_at TIMESTAMPTZ DEFAULT now(),
  published_at TIMESTAMPTZ
);

CREATE INDEX ON identity.artifacts (pseudonym_id, status);
CREATE INDEX ON identity.artifacts (kind, visibility, published_at DESC);

CREATE TABLE identity.artifact_layers (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  layer TEXT NOT NULL,  -- recipe | method | source | raw
  content_md TEXT,
  content_json JSONB,
  visibility TEXT,      -- can be more restrictive than parent artifact
  sort_order INT DEFAULT 0,
  UNIQUE (artifact_id, layer)
);

CREATE TABLE identity.artifact_chunks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  layer TEXT NOT NULL,
  chunk_index INT NOT NULL,
  content TEXT NOT NULL,
  embedding VECTOR(1024),
  metadata JSONB DEFAULT '{}'
);

CREATE INDEX ON identity.artifact_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 40);

CREATE TABLE identity.artifact_versions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  version INT NOT NULL,
  body_md_snapshot TEXT,
  metadata_snapshot JSONB,
  diff_summary TEXT,
  edited_at TIMESTAMPTZ DEFAULT now()
);
```
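Given that ivfflat index, the solo-mode retrieval query Stoka might run could look roughly like this. A hedged sketch: the parameter order and driver are assumptions, and `<=>` is pgvector's cosine-distance operator, which `vector_cosine_ops` indexes accelerate:

```python
# Nearest-neighbour search over the user's own library (solo mode).
# Placeholders are for (query_embedding, pseudonym_ids, limit) with a
# driver such as psycopg; the query shape is illustrative, not fixed API.
RETRIEVAL_SQL = """
SELECT c.artifact_id, c.layer, c.chunk_index, c.content,
       c.embedding <=> %s::vector AS distance
FROM identity.artifact_chunks AS c
JOIN identity.artifacts AS a ON a.id = c.artifact_id
WHERE a.pseudonym_id = ANY(%s)   -- restrict to the author's pseudonyms
ORDER BY distance
LIMIT %s;
"""
```

The Phase 3 public-corpus variant would swap the pseudonym filter for a `visibility = 'public_anonymous'` predicate.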
### Kind-specific metadata

Conversations:

```json
{
  "recipe": "{extracted prompt/technique}",
  "why_it_works": "{Stoka's framing}",
  "variations": ["...", "..."],
  "tags": ["rag", "prompt-engineering"],
  "source_format": "claude-code-json",
  "source_turns": 42,
  "scrub_report": {
    "emails_redacted": 2,
    "paths_redacted": 5,
    "user_approved": true
  }
}
```

Specs:

```json
{
  "phase": "1-3",
  "status": "draft",
  "supersedes": ["spec-id"],
  "depends_on": ["spec-id"]
}
```

Prompts:

```json
{
  "model_target": "claude-opus-4-7",
  "params": { "temperature": 0.7, "max_tokens": 4000 },
  "sample_output": "...",
  "use_case": "long-form technical writing"
}
```
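A lightweight completeness check over these schemas might look like the sketch below; the required-key sets are illustrative guesses, not settled policy:

```python
# Required metadata keys per kind; everything else stays free-form JSONB.
# (Key choices here are assumptions drawn from the examples above.)
REQUIRED_KEYS: dict[str, set[str]] = {
    "conversation": {"recipe", "why_it_works", "tags", "source_format"},
    "spec": {"phase", "status"},
    "prompt": {"model_target", "params"},
}

def missing_metadata(kind: str, metadata: dict) -> set[str]:
    """Return required keys absent from the record; an empty set
    means the record is structurally complete for its kind."""
    return REQUIRED_KEYS.get(kind, set()) - metadata.keys()
```

Kinds with no entry (note, link) impose no requirements, which matches their free-form nature.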
## Capture flow — technical detail

```
User pastes conversation into /library/new
  ↓
POST /api/artifacts/capture
  body: { source_text, detected_format? }
  ↓
FastAPI handler:
  1. Detect format (Claude Code JSON | ChatGPT export | raw markdown | text)
  2. Parse into normalized turn array [{speaker, text, ts?}]
  3. Call Stoka extraction pipeline (see below)
  4. Write artifact (status=draft) + layers + chunks
  5. Return artifact_id
  ↓
User lands on /library/<id>?mode=review
  ↓
Review UI:
  - Recipe card (editable)
  - Method notes (editable)
  - Source turns (selectable — which to include)
  - Scrub report (review redactions)
  - Visibility picker (default: private)
  ↓
User hits "Save to library"
  → status=published (if visibility=private/link-only)
  → or → Stoka final review pass → status=published (if public_anonymous)
```
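Step 1 of the handler could be sketched as below. The JSON discriminators (such as a top-level `mapping` key) are assumptions about vendor export shapes and would need testing against real corpora before being relied on:

```python
import json

def detect_format(source_text: str) -> str:
    """Best-effort source sniffing for the capture endpoint.
    Heuristics only; Stoka re-validates during parsing."""
    stripped = source_text.lstrip()
    if stripped.startswith(("{", "[")):
        try:
            payload = json.loads(source_text)
        except json.JSONDecodeError:
            return "text"
        # Hypothetical discriminator: real exports would be sniffed
        # against known shapes of each vendor's schema.
        if isinstance(payload, dict) and "mapping" in payload:
            return "chatgpt-export"
        return "claude-code-json"
    if stripped.startswith("#") or "```" in source_text:
        return "raw-markdown"
    return "text"
```

Because detection is fallible, the capture API accepts an optional `detected_format` override from the client.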
## Stoka extraction pipeline

```python
async def extract_conversation(turns: list[Turn]) -> ConversationArtifact:
    # 1. Coarse pass: identify the 'crux' — the turn(s) where
    #    the useful technique emerged
    crux_indices = await stoka.identify_crux(turns)

    # 2. Extract the recipe: what prompt/technique worked?
    recipe = await stoka.extract_recipe([turns[i] for i in crux_indices])

    # 3. Write the 'why it works' framing in Stoka's voice
    why_it_works = await stoka.frame(recipe, turns)

    # 4. Method notes: what was tried before the crux?
    method = await stoka.extract_method_notes(turns[:min(crux_indices)])

    # 5. PII scrub — conservative default
    scrubbed_turns = await stoka.scrub_pii(turns)

    # 6. Variations: suggest adjacent uses
    variations = await stoka.suggest_variations(recipe, why_it_works)

    # 7. Tags: topical auto-tagging for retrieval
    tags = await stoka.auto_tag(recipe, why_it_works)

    return ConversationArtifact(
        recipe=recipe,
        why_it_works=why_it_works,
        method=method,
        source_turns=scrubbed_turns,
        raw_transcript=None,  # opt-in separately
        variations=variations,
        tags=tags,
    )
```
The pipeline uses the same underlying Qwen3-VL-8B (port 8050) as Stoka v1, with kind-specific prompts.
## Personal library UI
Main interface for solo users. Primary surface of the product.
- Search bar at top — natural-language query powered by Stoka's retrieval engine
- Recency shelf — recently captured / recently viewed artifacts
- Topic clusters — auto-grouped by Stoka's tagging
- Time navigation — "March 2026", "last week", "this month"
- Pseudonym filter — view artifacts by which pseudonym authored them (personal org tool)
- Lateral surfacing — "you solved this before" widget when viewing any artifact
Solo users only see their own artifacts. Public directory is a separate surface (Phase 3).
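The "you solved this before" widget reduces to nearest-neighbour search over artifact embeddings. A pure-Python sketch; the threshold, `k`, and function names are assumptions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def laterally_related(current_id: str,
                      embeddings: dict[str, list[float]],
                      k: int = 3,
                      threshold: float = 0.6) -> list[str]:
    """Rank other artifacts in the user's library by similarity to the
    one being viewed; only close neighbours surface in the widget."""
    query = embeddings[current_id]
    scored = [
        (cosine_similarity(query, emb), aid)
        for aid, emb in embeddings.items() if aid != current_id
    ]
    return [aid for score, aid in sorted(scored, reverse=True)[:k] if score >= threshold]
```

In production this would be the pgvector index doing the ranking; the threshold keeps the widget quiet rather than surfacing weak matches.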
## Export
First-class Phase 1 deliverable.
- Markdown export — one file per artifact, preserving layer structure via headers
- JSON export — full structured dump with metadata, suitable for re-import elsewhere
- Batch export — whole library as a zip
Anti-lock-in is the trust story, not a retention problem. A user who knows they can leave at any time is a user who stays for the right reasons.
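The layer-preserving markdown export could be sketched like this; heading labels mirror the granularity ladder, and the function shape is an assumption:

```python
def export_markdown(title: str, layers: list[tuple[str, str]]) -> str:
    """One file per artifact; layer structure preserved via headers.
    `layers` is (layer_name, content_md) in sort order; layers the
    author never published (e.g. raw) are simply absent."""
    parts = [f"# {title}"]
    headings = {"recipe": "Recipe card", "method": "Method notes",
                "source": "Source exchanges", "raw": "Full transcript"}
    for layer, content_md in layers:
        parts.append(f"\n## {headings.get(layer, layer)}\n\n{content_md}")
    return "\n".join(parts) + "\n"
```

Because unpublished layers never enter the list, the export honours per-layer visibility caps by construction.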
## Phase plan

### Phase 1 — Solo core (the must-ship)
Ship these or nothing:
- `identity.artifacts` + `artifact_layers` + `artifact_chunks` + `artifact_versions` tables
- `identity.pseudonyms` table (for authorship — see messaging.md)
- Stoka extraction pipeline (conversation kind)
- Paste-in + file-upload capture flows
- Review UI (recipe/method/source layers editable)
- Personal library page with Stoka-powered search
- Bon Appétit single-artifact page rendering
- Markdown + JSON export
- Conservative PII scrub (TBD on exact aggressiveness)
### Phase 2 — Sharing core (after invite round)
- Share-to-handle primitive on artifact pages
- Rich artifact-card embed in DM threads
- Permission cascade via shared links
- Recipient picker (recent / search / paste handle)
- Conversation permalinks
- (Messaging/DM infrastructure: see messaging.md)
### Phase 3 — Public layer
- `public_anonymous` visibility tier live
- Stoka final-review pass on public flip
- Public artifact directory (kind-filterable: `/specs`, `/prompts`, `/conversations`)
- Pseudonym profile pages with published artifact lists
- Public Stoka discovery across the published corpus
- Existing blog posts migrated to `kind=essay` artifacts (or rendered via legacy view — decide at migration time)
### Phase N (deferred)
- Browser extension capture
- Programmatic API capture
- `kind=codebase` (Tsukuyomi return — see memory)
- `kind=link`, `kind=note`
## Open questions (TBDs)
- PII scrub aggressiveness. Conservative (names/emails/URLs/paths) vs permissive (just credentials). Default conservative + manual un-redact in review UI. Decide after testing on real conversations.
- Source ingestion quality. How well can Stoka parse Claude Code JSON vs ChatGPT exports vs random pasted text? Test against a corpus before committing to formats.
- Distilio port opportunity. Distilio may have distillation-pipeline / extraction-UX code that ports to the artifact extraction flow. Status: TBD. Scope once Phase 1 design is more concrete.
- Existing blog posts migration. Option A: migrate to `kind=essay` artifacts now. Option B: keep blog table as-is, migrate in Phase 2/3. Lean B — less risk, faster ship.
- Non-conversation kinds in Phase 1. Does `kind=spec` ship in Phase 1 or Phase 3? Leaning: Phase 1 gets it free because this repo's specs become the first test corpus.
- Artifact versioning UX. Diffs between versions — shown as Git-style diffs, or as semantic "what changed in this edit"? TBD after Phase 1 ships.
## Related specs
- README.md — platform mission + phases
- stoka-bot.md — Stoka identity, voice, RAG pipeline, and the Across-Surfaces pattern that makes artifact extraction share a brain with discovery
- messaging.md — pseudonyms, DM, share fabric