Artifact Platform

Mission

Your AI work, remembered, retrievable, and — optionally — shareable.

A platform where AI conversations, prompts, specs, essays, and codebases become structured, searchable, shareable units — artifacts — authored under pseudonyms, curated by Stoka, and circulated through the messaging layer.

Read README.md for platform mission, principles, and phase map.

Solo-first thesis

Before this is a social platform, it is a personal knowledge infrastructure for people who work with AI every day. The target user is someone whose AI conversations are the primary IP of their work — engineers, prompt engineers, PMs, researchers, writers. For them, losing a conversation = losing intellectual capital.

Phase 1 ships that tool alone. Zero social features. Just:

Paste a conversation → Stoka structures it → it's in your library
Natural-language retrieval: "that thing about caching embeddings" finds it in seconds
Lateral linking: viewing one artifact surfaces adjacent ones
Export as markdown + JSON — your library is yours, always

The artifact primitive

Artifact = permalinked + author-signed + share-button-able unit of content. Universal across the platform. Everything with those three properties is an artifact.

Kinds

The artifact kind determines rendering, metadata schema, and editorial policy. Stoka applies kind-aware ranking.

Kind	What it is
`conversation`	Bon Appétit conversation extract — recipe + method + source turns + raw transcript (opt-in)
`spec`	A design document (this file, for example)
`essay`	Long-form writing. Existing blog posts become this kind on migration.
`prompt`	A standalone prompt with model target, params, sample output. No conversation context.
`note`	Short observation. Tweet-length, Bon Appétit-grade craft.
`codebase`	Anonymous codebase share (Tsukuyomi — see memory: Codebase Sharing Vision)
`link`	Annotated external link with commentary

New kinds get added by contract: they must have a permalink, an author pseudonym, a share button, a Stoka-indexable body, and a defined visibility model.
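The contract above could be checked mechanically before a kind is accepted. A minimal sketch, assuming a hypothetical `KindSpec` record and `register_kind` helper (neither name appears elsewhere in this spec):

```python
from dataclasses import dataclass

# Hypothetical record: one flag per clause of the kind contract.
@dataclass(frozen=True)
class KindSpec:
    name: str
    has_permalink: bool
    has_author_pseudonym: bool
    has_share_button: bool
    has_indexable_body: bool
    visibility_tiers: tuple[str, ...]  # empty tuple = no defined visibility model

KIND_REGISTRY: dict[str, KindSpec] = {}

def contract_violations(spec: KindSpec) -> list[str]:
    """Return the contract clauses this kind fails; an empty list means it qualifies."""
    checks = {
        "permalink": spec.has_permalink,
        "author pseudonym": spec.has_author_pseudonym,
        "share button": spec.has_share_button,
        "Stoka-indexable body": spec.has_indexable_body,
        "visibility model": bool(spec.visibility_tiers),
    }
    return [clause for clause, ok in checks.items() if not ok]

def register_kind(spec: KindSpec) -> None:
    """Add a kind to the registry, refusing any kind that breaks the contract."""
    problems = contract_violations(spec)
    if problems:
        raise ValueError(f"kind {spec.name!r} is missing: {', '.join(problems)}")
    KIND_REGISTRY[spec.name] = spec
```

A kind that lacks any one clause is rejected with the missing clauses named, which keeps the "by contract" rule from drifting as kinds are added.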

The Bon Appétit granularity ladder

Every conversation artifact (and optionally other kinds) is rendered in layers. The reader drills as deep as they care.

Layer	Name	Reader action	Contents
01	Recipe card	what you scan	Headline · extracted prompt/technique · why it works · variations · tags
02	Method notes	expand once	What was tried · what failed · the turn it clicked · Stoka's framing of why this matters
03	Source exchanges	expand twice	Specific turns Stoka pulled from, in conversational form · speakers + order preserved
04	Full transcript	opt-in by author	Raw unparsed conversation, PII-scrubbed by Stoka. Only present if author explicitly publishes this layer.

This maps onto the Bon Appétit magazine layout — top of page is the dish, scroll down is method, further is origin story / variations. Same content primitive, different reader appetite.
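The drill-down behaviour can be sketched as a function from "how many times the reader has expanded" to the layers that render. This is an illustration only: `layers_to_render` and its arguments are hypothetical names, and the layer names are taken from the schema enum.

```python
# Ladder order, shallowest first (layer names from the artifact_layers enum).
LADDER = ["recipe", "method", "source", "raw"]

def layers_to_render(expansions: int, published_layers: set[str]) -> list[str]:
    """Layers shown after `expansions` expand clicks (0 = recipe card only).

    A layer renders only if the author actually published it, so the raw
    transcript stays hidden unless it was explicitly opted in.
    """
    depth = max(0, min(expansions, len(LADDER) - 1))
    return [name for name in LADDER[: depth + 1] if name in published_layers]
```

For example, a reader who expands twice on an artifact without a published transcript sees the recipe, method, and source layers and nothing deeper.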

Visibility tiers

Author picks per artifact. Default is private.

Tier	Who sees it	Directory?	Stoka index?	Share-to-DM?
Private	Only the author (under any of their pseudonyms)	No	Personal library only	Yes, share implies grant
Link-only	Anyone with the URL	No	Personal library only	Yes, share implies grant
Public anonymous	Everyone, signed by a pseudonym	Yes	Public index	Yes

Permission cascade: sharing a link-only or private artifact to a recipient's pseudonym via DM grants that pseudonym access automatically. The share IS the permission. No separate access-control flow.
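A minimal sketch of the cascade, assuming grants are stored as a set of recipient pseudonyms on the artifact. The `Artifact` class, `share_to_dm`, and `can_view` are hypothetical names for illustration; the real storage model is not specified here.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Artifact:
    artifact_id: str
    author_pseudonyms: set[str]                    # every pseudonym owned by the author
    visibility: str = "private"                    # private | link_only | public_anonymous
    grants: set[str] = field(default_factory=set)  # pseudonyms granted access via DM share

def share_to_dm(artifact: Artifact, recipient_pseudonym: str) -> None:
    """Sharing is the grant: no separate access-control step."""
    if artifact.visibility != "public_anonymous":
        artifact.grants.add(recipient_pseudonym)

def can_view(artifact: Artifact, viewer: Optional[str], has_url: bool = False) -> bool:
    """Apply the visibility tiers plus any grants created by sharing."""
    if artifact.visibility == "public_anonymous":
        return True
    if viewer in artifact.author_pseudonyms or viewer in artifact.grants:
        return True
    return artifact.visibility == "link_only" and has_url
```

Usage: after `share_to_dm(artifact, "ghost-7c0a")`, `can_view(artifact, "ghost-7c0a")` is true even though the artifact stays private to everyone else.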

Pseudonym attribution & bodies of work

Every artifact is signed by one of the author's pseudonyms. The user→pseudonym link is stored but never user-visible. See messaging.md for the full pseudonym model.

Consequence: a pseudonym becomes a body of work. Browse @ghost-7c0a's artifacts like browsing a chef's recipes on Bon Appétit. Over time, a pseudonym earns recognition through craft — not follower counts, not engagement metrics, not post-frequency.

Pseudonym profile page:

Bio + topics (opt-in)
List of public artifacts authored
Inbox mode (closed / allowlist / open / public)
"DM" button (if inbox allows)

Capture flows

Phase 1 (MVP)

Paste textarea — paste anything (Claude Code JSON export, ChatGPT text, raw markdown). Stoka detects format, structures.
File upload — Claude Code session JSON, ChatGPT export JSON, .md, .txt.

Both flows: one click to "capture," Stoka runs extraction, user lands on a draft artifact for review.

Phase 2+

Browser extension — captures from open AI chat tabs directly. Highest-UX flow, biggest build. Deferred.
API endpoint — programmatic capture from other tools (for power users wiring up workflows).

Stoka's role in the artifact pipeline

Same brain across surfaces (see stoka-bot.md → Stoka Across Surfaces). For artifacts specifically:

Step	Stoka's job
Ingest	Detect source format (Claude Code JSON vs raw text vs ChatGPT export), parse into turns
Extract	Identify the "recipe" — the prompt/technique at the core of what worked. Write the why-it-works framing. Surface method notes.
Scrub	PII pass (conservative default — names, emails, URLs, file paths flagged). Author reviews + un-redacts selectively before publish.
Structure	Build the layered representation (recipe / method / source). Fill kind-specific metadata.
Index	Chunk + embed for retrieval. Tag for editorial policy.
Retrieve	Natural-language queries against the user's library (solo) or the public corpus (Phase 3).
Surface	"You solved this before" lateral links. Time-based surfacing ("what was I working on in March").

Stoka's voice applies throughout: terse, editorial, opinionated. The extracted framing in a recipe card is written in Stoka's voice, not a generic summary.
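The Scrub step in the table above might start from pattern-based redaction that records every hit for the author to review. The sketch below is an assumption, not Stoka's scrubber: the regexes are deliberately simple, personal names are not handled at all, and `scrub` is a hypothetical helper. The report keys follow the `scrub_report` shape used in the conversation metadata.

```python
import re

# Illustrative patterns only. Order matters: URLs are redacted before paths
# so that the path portion of a URL is not matched a second time.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "url": re.compile(r"https?://\S+"),
    "path": re.compile(r"(?<![\w/])(?:~/|/)(?:[\w.-]+/)+[\w.-]+"),
}

def scrub(text: str) -> tuple[str, dict, list[dict]]:
    """Redact matches and return (scrubbed_text, report, redactions).

    `redactions` keeps the original strings so the author can un-redact
    selectively before publishing.
    """
    report: dict = {}
    redactions: list[dict] = []
    for label, pattern in PATTERNS.items():
        def _replace(match: re.Match, label: str = label) -> str:
            redactions.append({"kind": label, "original": match.group(0)})
            return f"[{label.upper()}_REDACTED]"
        text, count = pattern.subn(_replace, text)
        report[f"{label}s_redacted"] = count
    report["user_approved"] = False  # flips only after the author reviews
    return text, report, redactions
```

Keeping the originals alongside the counts is what makes "author reviews + un-redacts selectively" possible without re-running extraction.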

Schema

identity schema — artifacts

identity.artifacts (12 columns)

The universal artifact record — kind-discriminated, pseudonym-attributed, visibility-tiered.

Column	Type	Constraints	Notes
id	UUID	PK
kind	TEXT	NN	enum: conversation · spec · essay · prompt · note · codebase · link
pseudonym_id	UUID	FK, NN	→ identity.pseudonyms.id
status	TEXT	NN	default draft; enum: draft · published · archived
visibility	TEXT	NN	default private; enum: private · link_only · public_anonymous
title	TEXT	NN
summary	TEXT
body_md	TEXT		universal rendered markdown (recipe card for conversations)
metadata	JSONB		default '{}'; kind-specific schema
created_at	TIMESTAMPTZ		default now()
updated_at	TIMESTAMPTZ		default now()
published_at	TIMESTAMPTZ

identity.artifact_layers (7 columns)

Granularity layers — recipe/method/source/raw — each with own visibility cap.

Column	Type	Constraints	Notes
id	UUID	PK
artifact_id	UUID	FK	→ identity.artifacts.id
layer	TEXT	NN	enum: recipe · method · source · raw
content_md	TEXT
content_json	JSONB
visibility	TEXT		can be stricter than parent artifact
sort_order	INT		default 0

identity.artifact_chunks (7 columns)

Retrieval index — chunked + embedded for Stoka natural-language search.

Column	Type	Constraints	Notes
id	UUID	PK
artifact_id	UUID	FK	→ identity.artifacts.id
layer	TEXT	NN
chunk_index	INT	NN
content	TEXT	NN
embedding	VECTOR(1024)
metadata	JSONB		default '{}'
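To show how rows in this table might be produced, here is a sketch of splitting one layer's text into overlapping windows numbered by `chunk_index`. The window size and overlap are placeholder values, `chunk_layer` is a hypothetical helper, and the embedding call is left out because the embedding service is not specified here.

```python
def chunk_layer(content: str, max_chars: int = 800, overlap: int = 100) -> list[dict]:
    """Split a layer's text into overlapping character windows.

    Each dict carries the `chunk_index` and `content` fields of an
    artifact_chunks row; `embedding` would be filled in by a later step.
    """
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks: list[dict] = []
    start = 0
    while start < len(content):
        chunks.append({
            "chunk_index": len(chunks),
            "content": content[start:start + max_chars],
        })
        if start + max_chars >= len(content):
            break  # the window reached the end of the text
        start += max_chars - overlap
    return chunks
```

Character windows are the simplest option; splitting on turn or paragraph boundaries would likely retrieve better and is worth testing against real conversations.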

identity.artifact_versions (7 columns)

Edit history — each save snapshots body + metadata + diff summary.

Column	Type	Constraints	Notes
id	UUID	PK
artifact_id	UUID	FK	→ identity.artifacts.id
version	INT	NN
body_md_snapshot	TEXT
metadata_snapshot	JSONB
diff_summary	TEXT
edited_at	TIMESTAMPTZ		default now()
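A save-time snapshot could be built as below. This is a sketch under assumptions: the line-count form of `diff_summary` is only one option (the versioning UX is still an open question), and `diff_summary` / `next_version` are hypothetical helpers operating on in-memory dicts rather than database rows.

```python
import difflib

def diff_summary(old_md: str, new_md: str) -> str:
    """Summarize an edit as '+N / -M lines' using a line-level diff."""
    added = removed = 0
    for line in difflib.ndiff(old_md.splitlines(), new_md.splitlines()):
        if line.startswith("+ "):
            added += 1
        elif line.startswith("- "):
            removed += 1
    return f"+{added} / -{removed} lines"

def next_version(history: list[dict], body_md: str, metadata: dict) -> dict:
    """Append a snapshot of the current body and metadata to `history`."""
    previous = history[-1] if history else None
    snapshot = {
        "version": previous["version"] + 1 if previous else 1,
        "body_md_snapshot": body_md,
        "metadata_snapshot": dict(metadata),  # copy so later edits do not mutate history
        "diff_summary": diff_summary(
            previous["body_md_snapshot"] if previous else "", body_md
        ),
    }
    history.append(snapshot)
    return snapshot
```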

relationships

From	To	Relationship
artifact_layers.artifact_id	artifacts.id	composes
artifact_chunks.artifact_id	artifacts.id	indexes
artifact_versions.artifact_id	artifacts.id	snapshots
artifacts.pseudonym_id	pseudonyms.id	authored by

Raw DDL (portable)

CREATE TABLE identity.artifacts (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  kind            TEXT NOT NULL,          -- conversation | spec | essay | prompt | note | codebase | link
  pseudonym_id    UUID NOT NULL REFERENCES identity.pseudonyms(id),  -- every artifact is signed by a pseudonym
  status          TEXT NOT NULL DEFAULT 'draft',  -- draft | published | archived
  visibility      TEXT NOT NULL DEFAULT 'private', -- private | link_only | public_anonymous
  title           TEXT NOT NULL,
  summary         TEXT,
  body_md         TEXT,                   -- universal rendered markdown (recipe card for conversations, full body for essays)
  metadata        JSONB DEFAULT '{}',     -- kind-specific schema
  created_at      TIMESTAMPTZ DEFAULT now(),
  updated_at      TIMESTAMPTZ DEFAULT now(),
  published_at    TIMESTAMPTZ
);

CREATE INDEX ON identity.artifacts (pseudonym_id, status);
CREATE INDEX ON identity.artifacts (kind, visibility, published_at DESC);

CREATE TABLE identity.artifact_layers (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id     UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  layer           TEXT NOT NULL,          -- recipe | method | source | raw
  content_md      TEXT,
  content_json    JSONB,
  visibility      TEXT,                   -- can be more restrictive than parent artifact
  sort_order      INT DEFAULT 0,
  UNIQUE(artifact_id, layer)
);

CREATE TABLE identity.artifact_chunks (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id     UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  layer           TEXT NOT NULL,
  chunk_index     INT NOT NULL,
  content         TEXT NOT NULL,
  embedding       VECTOR(1024),
  metadata        JSONB DEFAULT '{}'
);

CREATE INDEX ON identity.artifact_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 40);

CREATE TABLE identity.artifact_versions (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id     UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  version         INT NOT NULL,
  body_md_snapshot TEXT,
  metadata_snapshot JSONB,
  diff_summary    TEXT,
  edited_at       TIMESTAMPTZ DEFAULT now()
);
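The `artifact_layers.visibility` column is described as "can be more restrictive than parent artifact". One way to resolve the tier a layer actually gets is to take the stricter of the two values. This is a sketch; treating a NULL layer visibility as "inherit from the artifact" is an assumption, and `effective_visibility` is a hypothetical helper.

```python
from typing import Optional

# Ordered from most to least restrictive, using the visibility enum from the DDL.
TIERS = ["private", "link_only", "public_anonymous"]

def effective_visibility(artifact_visibility: str, layer_visibility: Optional[str]) -> str:
    """A layer is never more visible than its artifact.

    Assumption: a NULL (None) layer visibility inherits the artifact's tier.
    """
    if layer_visibility is None:
        return artifact_visibility
    return min(artifact_visibility, layer_visibility, key=TIERS.index)
```

Under this rule, setting a layer to `public_anonymous` on a private artifact has no effect, while setting the raw layer to `private` on a public artifact keeps the transcript hidden.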

Kind-specific metadata

Conversations:

{
  "recipe": "{extracted prompt/technique}",
  "why_it_works": "{Stoka's framing}",
  "variations": ["...", "..."],
  "tags": ["rag", "prompt-engineering"],
  "source_format": "claude-code-json",
  "source_turns": 42,
  "scrub_report": {
    "emails_redacted": 2,
    "paths_redacted": 5,
    "user_approved": true
  }
}

Specs:

{
  "phase": "1-3",
  "status": "draft",
  "supersedes": ["spec-id"],
  "depends_on": ["spec-id"]
}

Prompts:

{
  "model_target": "claude-opus-4-7",
  "params": { "temperature": 0.7, "max_tokens": 4000 },
  "sample_output": "...",
  "use_case": "long-form technical writing"
}
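Because `metadata` is free-form JSONB, a light check at write time can catch records that drift from these shapes. The sketch below assumes every key shown in the examples above is required, which the spec does not state; `REQUIRED_KEYS` and `missing_metadata_keys` are hypothetical names.

```python
# Assumption: the keys shown in the examples above are all required for their kind.
REQUIRED_KEYS: dict[str, set[str]] = {
    "conversation": {
        "recipe", "why_it_works", "variations", "tags",
        "source_format", "source_turns", "scrub_report",
    },
    "spec": {"phase", "status", "supersedes", "depends_on"},
    "prompt": {"model_target", "params", "sample_output", "use_case"},
}

def missing_metadata_keys(kind: str, metadata: dict) -> set[str]:
    """Keys expected for this kind that are absent from `metadata`.

    Kinds without a documented shape (essay, note, codebase, link) pass
    with no requirements.
    """
    return REQUIRED_KEYS.get(kind, set()) - set(metadata)
```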

Capture flow — technical detail

User pastes conversation into /library/new
  ↓
POST /api/artifacts/capture
  body: { source_text, detected_format? }
  ↓
FastAPI handler:
  1. Detect format (Claude Code JSON | ChatGPT export | raw markdown | text)
  2. Parse into normalized turn array [{speaker, text, ts?}]
  3. Call Stoka extraction pipeline (see below)
  4. Write artifact (status=draft) + layers + chunks
  5. Return artifact_id
  ↓
User lands on /library/<id>?mode=review
  ↓
Review UI:
  - Recipe card (editable)
  - Method notes (editable)
  - Source turns (selectable — which to include)
  - Scrub report (review redactions)
  - Visibility picker (default: private)
  ↓
User hits "Save to library"
  → status=published (if visibility=private or link_only)
  → or → Stoka final review pass → status=published (if public_anonymous)
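Handler steps 1 and 2 (detect format, parse into turns) might look like the sketch below. It is intentionally naive and rests on assumptions: it only separates JSON from markdown from plain text, it does not try to tell a Claude Code session from a ChatGPT export (that needs real sample files), and `detect_format` / `parse_speaker_lines` are hypothetical helpers.

```python
import json

def detect_format(source_text: str) -> str:
    """Best-effort guess at the pasted format; falls back to plain text."""
    stripped = source_text.lstrip()
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"  # which JSON export it is must be decided from real samples
        except json.JSONDecodeError:
            pass
    if any(line.startswith(("#", "```", "- ", "> ")) for line in stripped.splitlines()):
        return "markdown"
    return "text"

def parse_speaker_lines(source_text: str) -> list[dict]:
    """Normalize 'Speaker: text' lines into [{speaker, text}] turns.

    Lines without a one-word speaker label are treated as continuations
    of the previous turn.
    """
    turns: list[dict] = []
    for line in source_text.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() and " " not in speaker.strip():
            turns.append({"speaker": speaker.strip().lower(), "text": text.strip()})
        elif turns and line.strip():
            turns[-1]["text"] += "\n" + line.strip()
    return turns
```

The optional `detected_format` field in the request body would let the client override this guess when the user knows the source.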

Stoka extraction pipeline

async def extract_conversation(turns: list[Turn]) -> ConversationArtifact:
    # 1. Coarse-pass: identify the 'crux' — the turn(s) where
    #    the useful technique emerged
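    crux_indices = await stoka.identify_crux(turns)
    if not crux_indices:
        raise ValueError("no crux turn identified; cannot extract a recipe")

    # 2. Extract the recipe: what prompt/technique worked?
    #    crux_indices is a list of ints, so select those turns explicitly
    #    (a Python list cannot be indexed by another list)
    crux_turns = [turns[i] for i in crux_indices]
    recipe = await stoka.extract_recipe(crux_turns)

    # 3. Write the 'why it works' framing in Stoka's voice
    why_it_works = await stoka.frame(recipe, turns)

    # 4. Method notes: what was tried before the first crux turn?
    method = await stoka.extract_method_notes(turns[: min(crux_indices)])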

    # 5. PII scrub — conservative default
    scrubbed_turns = await stoka.scrub_pii(turns)

    # 6. Variations: suggest adjacent uses
    variations = await stoka.suggest_variations(recipe, why_it_works)

    # 7. Tags: topical auto-tagging for retrieval
    tags = await stoka.auto_tag(recipe, why_it_works)

    return ConversationArtifact(
        recipe=recipe,
        why_it_works=why_it_works,
        method=method,
        source_turns=scrubbed_turns,
        raw_transcript=None,  # opt-in separately
        variations=variations,
        tags=tags,
    )

The pipeline uses the same underlying Qwen3-VL-8B model (served on port 8050) as Stoka v1, with kind-specific prompts.

Personal library UI

Main interface for solo users. Primary surface of the product.

Search bar at top — natural-language query powered by Stoka's retrieval engine
Recency shelf — recently captured / recently viewed artifacts
Topic clusters — auto-grouped by Stoka's tagging
Time navigation — "March 2026", "last week", "this month"
Pseudonym filter — view artifacts by which pseudonym authored them (personal org tool)
Lateral surfacing — "you solved this before" widget when viewing any artifact

Solo users only see their own artifacts. Public directory is a separate surface (Phase 3).
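The search bar and the own-artifacts-only rule can be illustrated with a small in-memory ranking. This is a sketch, not the retrieval engine: it assumes query and chunk embeddings are already computed, scores by cosine similarity (matching the `vector_cosine_ops` index in the DDL), and uses hypothetical dict fields and a hypothetical `search_library` helper.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search_library(query_embedding, chunks, owner_pseudonyms, top_k=5):
    """Rank the caller's own artifacts by their best-matching chunk.

    Returns (artifact_id, score) pairs, highest score first.
    """
    best: dict[str, float] = {}
    for chunk in chunks:
        if chunk["pseudonym_id"] not in owner_pseudonyms:
            continue  # solo users only see their own artifacts
        score = cosine(query_embedding, chunk["embedding"])
        if score > best.get(chunk["artifact_id"], float("-inf")):
            best[chunk["artifact_id"]] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:top_k]
```

Collapsing to one hit per artifact keeps a long conversation with many matching chunks from crowding out the rest of the library.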

Export

First-class Phase 1 deliverable.

Markdown export — one file per artifact, preserving layer structure via headers
JSON export — full structured dump with metadata, suitable for re-import elsewhere
Batch export — whole library as a zip

Anti-lock-in is the trust story, not a retention problem. A user who knows they can leave at any time is a user who stays for the right reasons.
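A sketch of the two single-artifact exports, assuming an artifact is available as a dict with a `title` and a list of layer dicts (`layer`, `content_md`, `sort_order`). The section titles reuse the ladder names; `export_markdown` and `export_json` are hypothetical helpers, and the real export format is still to be defined.

```python
import json

# Header text per layer, following the granularity ladder.
LAYER_TITLES = {
    "recipe": "Recipe card",
    "method": "Method notes",
    "source": "Source exchanges",
    "raw": "Full transcript",
}

def export_markdown(artifact: dict) -> str:
    """One markdown document per artifact, one header per layer."""
    parts = [f"# {artifact['title']}"]
    for layer in sorted(artifact["layers"], key=lambda item: item["sort_order"]):
        parts.append(f"## {LAYER_TITLES.get(layer['layer'], layer['layer'])}")
        parts.append(layer["content_md"].strip())
    return "\n\n".join(parts) + "\n"

def export_json(artifact: dict) -> str:
    """Full structured dump, stable key order so exports diff cleanly."""
    return json.dumps(artifact, indent=2, ensure_ascii=False, sort_keys=True)
```

A batch export would then be these two functions applied per artifact and written into a zip archive.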

Phase plan

Phase 1 — Solo core (the must-ship)

Ship these or nothing:

identity.artifacts + artifact_layers + artifact_chunks + artifact_versions tables
identity.pseudonyms table (for authorship — see messaging.md)
Stoka extraction pipeline (conversation kind)
Paste-in + file-upload capture flows
Review UI (recipe/method/source layers editable)
Personal library page with Stoka-powered search
Bon Appétit single-artifact page rendering
Markdown + JSON export
Conservative PII scrub (TBD on exact aggressiveness)

Phase 2 — Sharing core (after invite round)

Share-to-handle primitive on artifact pages
Rich artifact-card embed in DM threads
Permission cascade via shared links
Recipient picker (recent / search / paste handle)
Conversation permalinks
(Messaging/DM infrastructure: see messaging.md)

Phase 3 — Public layer

public_anonymous visibility tier live
Stoka final-review pass on public flip
Public artifact directory (kind-filterable: /specs, /prompts, /conversations)
Pseudonym profile pages with published artifact lists
Public Stoka discovery across the published corpus
Existing blog posts migrated to kind=essay artifacts (or rendered via legacy view — decide at migration time)

Phase N (deferred)

Browser extension capture
Programmatic API capture
kind=codebase (Tsukuyomi return — see memory)
kind=link, kind=note

Open questions (TBDs)

PII scrub aggressiveness. Conservative (names/emails/URLs/paths) vs permissive (just credentials). Default conservative + manual un-redact in review UI. Decide after testing on real conversations.
Source ingestion quality. How well can Stoka parse Claude Code JSON vs ChatGPT exports vs random pasted text? Test against a corpus before committing to formats.
Distilio port opportunity. Distilio may have distillation-pipeline / extraction-UX code that ports to the artifact extraction flow. Status: TBD. Scope once Phase 1 design is more concrete.
Existing blog posts migration. Option A: migrate to kind=essay artifacts now. Option B: keep blog table as-is, migrate in Phase 2/3. Lean B — less risk, faster ship.
Non-conversation kinds in Phase 1. Does kind=spec ship in Phase 1 or Phase 3? Leaning: Phase 1 gets it free because this repo's specs become the first test corpus.
Artifact versioning UX. Diffs between versions — shown as Git-style diffs, or as semantic "what changed in this edit"? TBD after Phase 1 ships.

Related specs

README.md — platform mission + phases
stoka-bot.md — Stoka identity, voice, RAG pipeline, and the Across-Surfaces pattern that makes artifact extraction share a brain with discovery
messaging.md — pseudonyms, DM, share fabric
