Phase 1-3 draft

Artifact Platform

Your AI work, remembered, retrievable, and — optionally — shareable. The universal content primitive and its Bon Appétit granularity ladder.

Mission

Your AI work, remembered, retrievable, and — optionally — shareable.

A platform where AI conversations, prompts, specs, essays, and codebases become structured, searchable, shareable units — artifacts — authored under pseudonyms, curated by Stoka, and circulated through the messaging layer.

Read README.md for platform mission, principles, and phase map.

Solo-first thesis

Before this is a social platform, it is a personal knowledge infrastructure for people who work with AI every day. The target user is someone whose AI conversations are the primary IP of their work — engineers, prompt engineers, PMs, researchers, writers. For them, losing a conversation = losing intellectual capital.

Phase 1 ships that tool alone. Zero social features. Just:

  • Paste a conversation → Stoka structures it → it's in your library
  • Natural-language retrieval: "that thing about caching embeddings" finds it in seconds
  • Lateral linking: viewing one artifact surfaces adjacent ones
  • Export as markdown + JSON — your library is yours, always

The artifact primitive

Artifact = permalinked + author-signed + share-button-able unit of content. Universal across the platform. Everything with those three properties is an artifact.

Kinds

The artifact kind determines rendering, metadata schema, and editorial policy. Stoka applies kind-aware ranking.

| Kind | What it is |
| --- | --- |
| conversation | Bon Appétit conversation extract: recipe + method + source turns + raw transcript (opt-in) |
| spec | A design document (this file, for example) |
| essay | Long-form writing. Existing blog posts become this kind on migration. |
| prompt | A standalone prompt with model target, params, sample output. No conversation context. |
| note | Short observation. Tweet-length, Bon Appétit-grade craft. |
| codebase | Anonymous codebase share (Tsukuyomi; see memory: Codebase Sharing Vision) |
| link | Annotated external link with commentary |

New kinds get added by contract: they must have a permalink, an author pseudonym, a share button, a Stoka-indexable body, and a defined visibility model.
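
The contract above can be made mechanically checkable. A minimal sketch, assuming a hypothetical `KindSpec` registry entry; the class and its field names are illustrative and not part of the schema in this spec:

```python
from dataclasses import dataclass
from typing import Optional


# One entry per proposed kind. All field names here are hypothetical.
@dataclass(frozen=True)
class KindSpec:
    name: str
    has_permalink: bool
    has_pseudonym_author: bool
    has_share_button: bool
    stoka_indexable_body: bool
    visibility_model: Optional[str]  # None means "not defined yet"


def contract_violations(spec: KindSpec) -> list[str]:
    """Return the contract clauses a proposed kind fails (empty list = admissible)."""
    checks = {
        "permalink": spec.has_permalink,
        "author pseudonym": spec.has_pseudonym_author,
        "share button": spec.has_share_button,
        "Stoka-indexable body": spec.stoka_indexable_body,
        "defined visibility model": spec.visibility_model is not None,
    }
    return [clause for clause, ok in checks.items() if not ok]
```

A proposed kind would be admitted only when `contract_violations` returns an empty list.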

The Bon Appétit granularity ladder

Every conversation artifact (and optionally other kinds) is rendered in layers. The reader drills as deep as they care.

  1. Recipe card (what you scan): Headline · extracted prompt/technique · why it works · variations · tags
  2. Method notes (expand once): What was tried · what failed · the turn it clicked · Stoka's framing of why this matters
  3. Source exchanges (expand twice): Specific turns Stoka pulled from, in conversational form · speakers + order preserved
  4. Full transcript (opt-in by author): Raw unparsed conversation, PII-scrubbed by Stoka. Only present if author explicitly publishes this layer.

This maps onto the Bon Appétit magazine layout — top of page is the dish, scroll down is method, further is origin story / variations. Same content primitive, different reader appetite.
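
The drill-down can be pictured as a prefix of the ladder. A minimal sketch, assuming the layer names from the `artifact_layers` enum and a hypothetical `layers_for_depth` helper; how the real renderer selects layers is not specified here:

```python
LADDER = ["recipe", "method", "source", "raw"]  # top of the page first


def layers_for_depth(available: dict[str, str], depth: int) -> list[tuple[str, str]]:
    """Return (layer, markdown) pairs from the recipe card down to `depth`.

    `available` maps layer name to content. A layer the author did not
    publish (typically `raw`) is simply absent and gets skipped.
    """
    wanted = LADDER[: max(1, min(depth, len(LADDER)))]
    return [(name, available[name]) for name in wanted if name in available]
```

A reader who never expands sees depth 1 (the recipe card); each expansion increments the depth by one.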

Visibility tiers

Author picks per artifact. Default is private.

| Tier | Who sees it | Directory? | Stoka index? | Share-to-DM? |
| --- | --- | --- | --- | --- |
| Private | Only the author (under any of their pseudonyms) | No | Personal library only | Yes, share implies grant |
| Link-only | Anyone with the URL | No | Personal library only | Yes, share implies grant |
| Public anonymous | Everyone, signed by a pseudonym | Yes | Public index | Yes |

Permission cascade: sharing a link-only or private artifact to a recipient's pseudonym via DM grants that pseudonym access automatically. The share IS the permission. No separate access-control flow.
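
One way to read the cascade is as a grant set attached to each artifact. The following is a sketch under that assumption; the `Artifact` shape, `share_via_dm`, and `can_view` are hypothetical names, and the real storage for grants is not defined in this spec:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Artifact:
    id: str
    author_pseudonyms: set[str]          # every pseudonym of the owning user
    visibility: str = "private"          # private | link_only | public_anonymous
    grants: set[str] = field(default_factory=set)  # pseudonyms granted by a DM share


def share_via_dm(artifact: Artifact, recipient: str) -> None:
    """The share is the permission: record the recipient pseudonym as a grant."""
    if artifact.visibility != "public_anonymous":  # public needs no grant
        artifact.grants.add(recipient)


def can_view(artifact: Artifact, viewer: Optional[str], has_url: bool = False) -> bool:
    if artifact.visibility == "public_anonymous":
        return True
    if viewer in artifact.author_pseudonyms or viewer in artifact.grants:
        return True
    # Link-only artifacts are readable by anyone holding the URL.
    return artifact.visibility == "link_only" and has_url
```

Because the grant is written at share time, no separate access-control step is needed when the recipient opens the DM.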

Pseudonym attribution & bodies of work

Every artifact is signed by one of the author's pseudonyms. The user→pseudonym link is stored but never user-visible. See messaging.md for the full pseudonym model.

Consequence: a pseudonym becomes a body of work. Browse @ghost-7c0a's artifacts like browsing a chef's recipes on Bon Appétit. Over time, a pseudonym earns recognition through craft — not follower counts, not engagement metrics, not post-frequency.

Pseudonym profile page:

  • Bio + topics (opt-in)
  • List of public artifacts authored
  • Inbox mode (closed / allowlist / open / public)
  • "DM" button (if inbox allows)

Capture flows

Phase 1 (MVP)

  1. Paste textarea — paste anything (Claude Code JSON export, ChatGPT text, raw markdown). Stoka detects format, structures.
  2. File upload — Claude Code session JSON, ChatGPT export JSON, .md, .txt.

Both flows: one click to "capture," Stoka runs extraction, user lands on a draft artifact for review.
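
Format detection for both flows might start from a few cheap heuristics. A rough sketch: the JSON marker keys (`mapping`, `message`) are assumptions for illustration, not verified export schemas, and the returned labels other than `claude-code-json` are hypothetical:

```python
import json


def detect_format(source_text: str) -> str:
    """Classify pasted or uploaded text into one of the capture formats.

    The key checks below are illustrative guesses; real exports should be
    validated against a test corpus before these heuristics are trusted.
    """
    text = source_text.strip()
    try:
        data = json.loads(text)
    except ValueError:
        data = None
    if isinstance(data, list):
        data = data[0] if data else None  # inspect the first record
    if isinstance(data, dict):
        if "mapping" in data:
            return "chatgpt-export"       # assumed marker key
        if "message" in data:
            return "claude-code-json"     # assumed marker key
        return "json-unknown"
    markdown_starts = ("#", "```", "- ", "* ")
    if any(line.startswith(markdown_starts) for line in text.splitlines()):
        return "raw-markdown"
    return "text"
```

An unrecognised JSON shape is reported rather than guessed at, so the review UI can ask the user instead of mis-parsing.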

Phase 2+

  1. Browser extension — captures from open AI chat tabs directly. Highest-UX flow, biggest build. Deferred.
  2. API endpoint — programmatic capture from other tools (for power users wiring up workflows).

Stoka's role in the artifact pipeline

Same brain across surfaces (see stoka-bot.md → Stoka Across Surfaces). For artifacts specifically:

| Step | Stoka's job |
| --- | --- |
| Ingest | Detect source format (Claude Code JSON vs raw text vs ChatGPT export), parse into turns |
| Extract | Identify the "recipe": the prompt/technique at the core of what worked. Write the why-it-works framing. Surface method notes. |
| Scrub | PII pass (conservative default: names, emails, URLs, file paths flagged). Author reviews + un-redacts selectively before publish. |
| Structure | Build the layered representation (recipe / method / source). Fill kind-specific metadata. |
| Index | Chunk + embed for retrieval. Tag for editorial policy. |
| Retrieve | Natural-language queries against the user's library (solo) or the public corpus (Phase 3). |
| Surface | "You solved this before" lateral links. Time-based surfacing ("what was I working on in March"). |

Stoka's voice applies throughout: terse, editorial, opinionated. The extracted framing in a recipe card is written in Stoka's voice, not a generic summary.
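
The Scrub step's conservative default could begin with pattern-based redaction that also produces a scrub report. A minimal sketch: the regular expressions and placeholder format are assumptions and deliberately simple, and name detection is omitted because it needs entity recognition rather than patterns:

```python
import re

# Illustrative patterns only; not exhaustive and not the production rule set.
PATTERNS = {
    "emails": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "urls": re.compile(r"https?://\S+"),
    "paths": re.compile(r"(?<![\w/])(?:~/|/)(?:[\w.-]+/)+[\w.-]+"),
}


def scrub(text: str) -> tuple[str, dict[str, int]]:
    """Redact emails, URLs, and file paths; return the text plus per-type counts."""
    report = {}
    for name, pattern in PATTERNS.items():
        placeholder = f"[{name[:-1].upper()} REDACTED]"  # e.g. [EMAIL REDACTED]
        text, count = pattern.subn(placeholder, text)
        report[f"{name}_redacted"] = count
    return text, report
```

The count keys mirror the `scrub_report` fields used in the conversation metadata, so the review UI can show what was flagged before the author un-redacts anything.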

Schema

identity schema: artifacts

identity.artifacts (12 columns)

The universal artifact record: kind-discriminated, pseudonym-attributed, visibility-tiered.

  • id: UUID, PK
  • kind: TEXT, NOT NULL. Enum: conversation · spec · essay · prompt · note · codebase · link
  • pseudonym_id: UUID, FK, NOT NULL → identity.pseudonyms.id
  • status: TEXT, NOT NULL. Default: draft. Enum: draft · published · archived
  • visibility: TEXT, NOT NULL. Default: private. Enum: private · link_only · public_anonymous
  • title: TEXT, NOT NULL
  • summary: TEXT
  • body_md: TEXT. Universal rendered markdown (recipe card for conversations)
  • metadata: JSONB. Default: '{}'. Kind-specific schema
  • created_at: TIMESTAMPTZ. Default: now()
  • updated_at: TIMESTAMPTZ. Default: now()
  • published_at: TIMESTAMPTZ

Indexes:
  • (pseudonym_id, status)
  • (kind, visibility, published_at)

identity.artifact_layers (7 columns)

Granularity layers (recipe/method/source/raw), each with its own visibility cap.

  • id: UUID, PK
  • artifact_id: UUID, FK → identity.artifacts.id
  • layer: TEXT, NOT NULL. Enum: recipe · method · source · raw
  • content_md: TEXT
  • content_json: JSONB
  • visibility: TEXT. Can be stricter than parent artifact
  • sort_order: INT. Default: 0

Indexes:
  • unique (artifact_id, layer)

identity.artifact_chunks (7 columns)

Retrieval index: chunked + embedded for Stoka natural-language search.

  • id: UUID, PK
  • artifact_id: UUID, FK → identity.artifacts.id
  • layer: TEXT, NOT NULL
  • chunk_index: INT, NOT NULL
  • content: TEXT, NOT NULL
  • embedding: VECTOR(1024)
  • metadata: JSONB. Default: '{}'

Indexes:
  • ivfflat (embedding)

identity.artifact_versions (7 columns)

Edit history: each save snapshots body + metadata + diff summary.

  • id: UUID, PK
  • artifact_id: UUID, FK → identity.artifacts.id
  • version: INT, NOT NULL
  • body_md_snapshot: TEXT
  • metadata_snapshot: JSONB
  • diff_summary: TEXT
  • edited_at: TIMESTAMPTZ. Default: now()

Relationships:
  • artifact_layers.artifact_id → artifacts.id (composes)
  • artifact_chunks.artifact_id → artifacts.id (indexes)
  • artifact_versions.artifact_id → artifacts.id (snapshots)
  • artifacts.pseudonym_id → pseudonyms.id (authored by)
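
The snapshot-on-save behaviour of `artifact_versions` can be sketched in memory. This assumes a line-count `diff_summary` built with the standard-library `difflib`; the actual summary format is still an open question in this spec, and the `Version` class and `snapshot` helper are hypothetical:

```python
import difflib
from dataclasses import dataclass


@dataclass
class Version:
    version: int
    body_md_snapshot: str
    diff_summary: str


def snapshot(history: list[Version], new_body: str) -> Version:
    """Append a version whose diff_summary counts lines changed since the last save."""
    previous = history[-1].body_md_snapshot if history else ""
    diff = list(difflib.ndiff(previous.splitlines(), new_body.splitlines()))
    added = sum(1 for line in diff if line.startswith("+ "))
    removed = sum(1 for line in diff if line.startswith("- "))
    row = Version(len(history) + 1, new_body, f"+{added} -{removed} lines")
    history.append(row)
    return row
```

Storing the full body per version keeps restores trivial; the summary exists only to label the edit in a history view.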

Raw DDL (portable)

CREATE TABLE identity.artifacts (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  kind            TEXT NOT NULL,          -- conversation | spec | essay | prompt | note | codebase | link
  pseudonym_id    UUID NOT NULL REFERENCES identity.pseudonyms(id),  -- every artifact is signed by a pseudonym
  status          TEXT NOT NULL DEFAULT 'draft',  -- draft | published | archived
  visibility      TEXT NOT NULL DEFAULT 'private', -- private | link_only | public_anonymous
  title           TEXT NOT NULL,
  summary         TEXT,
  body_md         TEXT,                   -- universal rendered markdown (recipe card for conversations, full body for essays)
  metadata        JSONB DEFAULT '{}',     -- kind-specific schema
  created_at      TIMESTAMPTZ DEFAULT now(),
  updated_at      TIMESTAMPTZ DEFAULT now(),
  published_at    TIMESTAMPTZ
);

CREATE INDEX ON identity.artifacts (pseudonym_id, status);
CREATE INDEX ON identity.artifacts (kind, visibility, published_at DESC);

CREATE TABLE identity.artifact_layers (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id     UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  layer           TEXT NOT NULL,          -- recipe | method | source | raw
  content_md      TEXT,
  content_json    JSONB,
  visibility      TEXT,                   -- can be more restrictive than parent artifact
  sort_order      INT DEFAULT 0,
  UNIQUE(artifact_id, layer)
);

CREATE TABLE identity.artifact_chunks (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id     UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  layer           TEXT NOT NULL,
  chunk_index     INT NOT NULL,
  content         TEXT NOT NULL,
  embedding       VECTOR(1024),
  metadata        JSONB DEFAULT '{}'
);

CREATE INDEX ON identity.artifact_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 40);

CREATE TABLE identity.artifact_versions (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  artifact_id     UUID REFERENCES identity.artifacts(id) ON DELETE CASCADE,
  version         INT NOT NULL,
  body_md_snapshot TEXT,
  metadata_snapshot JSONB,
  diff_summary    TEXT,
  edited_at       TIMESTAMPTZ DEFAULT now()
);
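
Rows for `identity.artifact_chunks` have to come from somewhere. A minimal chunking sketch, assuming a greedy paragraph packer with a hypothetical `max_chars` budget; the embedding is left as `None` because the embedding call is not specified in this spec:

```python
def chunk_layer(layer: str, content_md: str, max_chars: int = 800) -> list[dict]:
    """Split one layer's markdown into rows shaped like identity.artifact_chunks.

    Paragraphs are packed greedily up to `max_chars`. A single oversized
    paragraph becomes its own chunk rather than being cut mid-sentence.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in content_md.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)   # budget exceeded: close the current chunk
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return [
        {"layer": layer, "chunk_index": i, "content": text, "embedding": None}
        for i, text in enumerate(chunks)
    ]
```

Chunking per layer keeps the `layer` column meaningful, so a search hit can point the reader at the right rung of the ladder.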

Kind-specific metadata

Conversations:

{
  "recipe": "{extracted prompt/technique}",
  "why_it_works": "{Stoka's framing}",
  "variations": ["...", "..."],
  "tags": ["rag", "prompt-engineering"],
  "source_format": "claude-code-json",
  "source_turns": 42,
  "scrub_report": {
    "emails_redacted": 2,
    "paths_redacted": 5,
    "user_approved": true
  }
}

Specs:

{
  "phase": "1-3",
  "status": "draft",
  "supersedes": ["spec-id"],
  "depends_on": ["spec-id"]
}

Prompts:

{
  "model_target": "claude-opus-4-7",
  "params": { "temperature": 0.7, "max_tokens": 4000 },
  "sample_output": "...",
  "use_case": "long-form technical writing"
}
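
Since `metadata` is free-form JSONB, a per-kind check at write time may help keep it consistent. A sketch only: which keys count as required is an assumption drawn from the examples above, not a rule stated in this spec:

```python
# Assumed required keys per kind; kinds not listed are left unconstrained.
REQUIRED_KEYS = {
    "conversation": {"recipe", "why_it_works", "tags", "source_format"},
    "spec": {"phase", "status"},
    "prompt": {"model_target", "params"},
}


def missing_metadata(kind: str, metadata: dict) -> set[str]:
    """Return the assumed-required keys for `kind` that `metadata` lacks."""
    return REQUIRED_KEYS.get(kind, set()) - set(metadata)
```

An empty result means the record can be written; anything else can be surfaced in the review UI before save.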

Capture flow — technical detail

User pastes conversation into /library/new
  ↓
POST /api/artifacts/capture
  body: { source_text, detected_format? }
  ↓
FastAPI handler:
  1. Detect format (Claude Code JSON | ChatGPT export | raw markdown | text)
  2. Parse into normalized turn array [{speaker, text, ts?}]
  3. Call Stoka extraction pipeline (see below)
  4. Write artifact (status=draft) + layers + chunks
  5. Return artifact_id
  ↓
User lands on /library/<id>?mode=review
  ↓
Review UI:
  - Recipe card (editable)
  - Method notes (editable)
  - Source turns (selectable — which to include)
  - Scrub report (review redactions)
  - Visibility picker (default: private)
  ↓
User hits "Save to library"
  → status=published (if visibility is private or link_only)
  → or → Stoka final review pass → status=published (if public_anonymous)
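
Step 2 of the handler normalizes input into a turn array. For the raw-text case, a sketch might look like the following; the speaker labels recognised by the pattern are assumptions, and `parse_raw_text` is a hypothetical helper name:

```python
import re
from typing import TypedDict


class Turn(TypedDict, total=False):
    speaker: str
    text: str
    ts: str  # optional; raw pasted text usually carries no timestamps


# Assumed speaker prefixes; extend after testing against real pastes.
SPEAKER_LINE = re.compile(
    r"^(user|human|assistant|ai|claude|chatgpt)\s*:\s*(.*)$", re.IGNORECASE
)


def parse_raw_text(source_text: str) -> list[Turn]:
    """Group lines into turns; unlabelled lines continue the previous turn."""
    turns: list[Turn] = []
    for line in source_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are dropped in this sketch
        match = SPEAKER_LINE.match(stripped)
        if match:
            turns.append({"speaker": match.group(1).lower(), "text": match.group(2)})
        elif turns:
            prev = turns[-1]["text"]
            turns[-1]["text"] = f"{prev}\n{stripped}" if prev else stripped
        else:
            turns.append({"speaker": "unknown", "text": stripped})
    return turns
```

Text with no recognisable speaker labels collapses into a single `unknown` turn, which the review UI can then ask the user to split.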

Stoka extraction pipeline

async def extract_conversation(turns: list[Turn]) -> ConversationArtifact:
    # 1. Coarse-pass: identify the 'crux' — the turn(s) where
    #    the useful technique emerged
    crux_indices = await stoka.identify_crux(turns)

    # 2. Extract the recipe: what prompt/technique worked?
    #    (a list cannot be indexed by a list of positions, so select turn by turn)
    recipe = await stoka.extract_recipe([turns[i] for i in crux_indices])

    # 3. Write the 'why it works' framing in Stoka's voice
    why_it_works = await stoka.frame(recipe, turns)

    # 4. Method notes: what was tried before the first crux turn?
    #    (default=0 yields no method turns when no crux was identified)
    method = await stoka.extract_method_notes(turns[: min(crux_indices, default=0)])

    # 5. PII scrub — conservative default
    scrubbed_turns = await stoka.scrub_pii(turns)

    # 6. Variations: suggest adjacent uses
    variations = await stoka.suggest_variations(recipe, why_it_works)

    # 7. Tags: topical auto-tagging for retrieval
    tags = await stoka.auto_tag(recipe, why_it_works)

    return ConversationArtifact(
        recipe=recipe,
        why_it_works=why_it_works,
        method=method,
        source_turns=scrubbed_turns,
        raw_transcript=None,  # opt-in separately
        variations=variations,
        tags=tags,
    )

The pipeline uses the same underlying Qwen3-VL-8B (port 8050) as Stoka v1, with kind-specific prompts.

Personal library UI

Main interface for solo users. Primary surface of the product.

  • Search bar at top — natural-language query powered by Stoka's retrieval engine
  • Recency shelf — recently captured / recently viewed artifacts
  • Topic clusters — auto-grouped by Stoka's tagging
  • Time navigation — "March 2026", "last week", "this month"
  • Pseudonym filter — view artifacts by which pseudonym authored them (personal org tool)
  • Lateral surfacing — "you solved this before" widget when viewing any artifact

Solo users only see their own artifacts. Public directory is a separate surface (Phase 3).
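
The search bar's behaviour can be approximated in memory. A sketch, assuming the query has already been embedded and that each artifact is ranked by its best-matching chunk; both the aggregation rule and the `search` signature are assumptions rather than the production retrieval path:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, chunks, owner_artifact_ids, top_k=3):
    """Rank artifacts by their best chunk, restricted to the user's own library."""
    best = {}
    for chunk in chunks:
        artifact_id = chunk["artifact_id"]
        if artifact_id not in owner_artifact_ids:
            continue  # solo users only ever see their own artifacts
        score = cosine(query_vec, chunk["embedding"])
        best[artifact_id] = max(score, best.get(artifact_id, -1.0))
    return sorted(best, key=best.get, reverse=True)[:top_k]
```

Applying the ownership filter before scoring means another user's chunk can never leak into a solo result, whatever its similarity.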

Export

First-class Phase 1 deliverable.

  • Markdown export — one file per artifact, preserving layer structure via headers
  • JSON export — full structured dump with metadata, suitable for re-import elsewhere
  • Batch export — whole library as a zip

Anti-lock-in is the trust story, not a retention problem. A user who knows they can leave at any time is a user who stays for the right reasons.
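
The three export modes compose naturally. A minimal sketch using only the standard library, assuming an artifact is available as a plain dict with `id`, `title`, and `layers`; that shape and the heading text per layer are illustrative choices:

```python
import io
import json
import zipfile

HEADINGS = {
    "recipe": "Recipe card",
    "method": "Method notes",
    "source": "Source exchanges",
    "raw": "Full transcript",
}


def to_markdown(artifact: dict) -> str:
    """One markdown file per artifact, with each layer under its own header."""
    parts = [f"# {artifact['title']}"]
    for layer in ("recipe", "method", "source", "raw"):
        if layer in artifact["layers"]:
            parts.append(f"## {HEADINGS[layer]}\n\n{artifact['layers'][layer]}")
    return "\n\n".join(parts) + "\n"


def export_library(artifacts: list[dict]) -> bytes:
    """Batch export: a zip with one .md and one .json file per artifact."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for artifact in artifacts:
            archive.writestr(f"{artifact['id']}.md", to_markdown(artifact))
            archive.writestr(
                f"{artifact['id']}.json",
                json.dumps(artifact, indent=2, ensure_ascii=False),
            )
    return buffer.getvalue()
```

Writing the JSON alongside the markdown keeps the archive both human-readable and re-importable.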

Phase plan

Phase 1 — Solo core (the must-ship)

Ship these or nothing:

  • identity.artifacts + artifact_layers + artifact_chunks + artifact_versions tables
  • identity.pseudonyms table (for authorship — see messaging.md)
  • Stoka extraction pipeline (conversation kind)
  • Paste-in + file-upload capture flows
  • Review UI (recipe/method/source layers editable)
  • Personal library page with Stoka-powered search
  • Bon Appétit single-artifact page rendering
  • Markdown + JSON export
  • Conservative PII scrub (TBD on exact aggressiveness)

Phase 2 — Sharing core (after invite round)

  • Share-to-handle primitive on artifact pages
  • Rich artifact-card embed in DM threads
  • Permission cascade via shared links
  • Recipient picker (recent / search / paste handle)
  • Conversation permalinks
  • (Messaging/DM infrastructure: see messaging.md)

Phase 3 — Public layer

  • public_anonymous visibility tier live
  • Stoka final-review pass on public flip
  • Public artifact directory (kind-filterable: /specs, /prompts, /conversations)
  • Pseudonym profile pages with published artifact lists
  • Public Stoka discovery across the published corpus
  • Existing blog posts migrated to kind=essay artifacts (or rendered via legacy view — decide at migration time)

Phase N (deferred)

  • Browser extension capture
  • Programmatic API capture
  • kind=codebase (Tsukuyomi return — see memory)
  • kind=link, kind=note

Open questions (TBDs)

  1. PII scrub aggressiveness. Conservative (names/emails/URLs/paths) vs permissive (just credentials). Default conservative + manual un-redact in review UI. Decide after testing on real conversations.
  2. Source ingestion quality. How well can Stoka parse Claude Code JSON vs ChatGPT exports vs random pasted text? Test against a corpus before committing to formats.
  3. Distilio port opportunity. Distilio may have distillation-pipeline / extraction-UX code that ports to the artifact extraction flow. Status: TBD. Scope once Phase 1 design is more concrete.
  4. Existing blog posts migration. Option A: migrate to kind=essay artifacts now. Option B: keep blog table as-is, migrate in Phase 2/3. Lean B — less risk, faster ship.
  5. Non-conversation kinds in Phase 1. Does kind=spec ship in Phase 1 or Phase 3? Leaning: Phase 1 gets it free because this repo's specs become the first test corpus.
  6. Artifact versioning UX. Diffs between versions — shown as Git-style diffs, or as semantic "what changed in this edit"? TBD after Phase 1 ships.

Related specs

  • README.md — platform mission + phases
  • stoka-bot.md — Stoka identity, voice, RAG pipeline, and the Across-Surfaces pattern that makes artifact extraction share a brain with discovery
  • messaging.md — pseudonyms, DM, share fabric