diff --git a/mythosforge/SKILL.md b/mythosforge/SKILL.md new file mode 100644 index 0000000000..16e428749d --- /dev/null +++ b/mythosforge/SKILL.md @@ -0,0 +1,112 @@ +--- +name: mythosforge +description: MythosForge image generation — create an image from a text prompt via Replicate. Use when the user wants to generate an image, illustration, artwork, avatar, banner, scene, concept art, or pixel-art / game asset from a description — e.g. "make an image of a neon cyberpunk fox", "generate a banner for my token", "pixel-art sprite of a knight". Choose a model with --model: flux-schnell (default, fast/general), nano-banana-2 (high-quality, general), or retro-diffusion (pixel-art / game assets). Supports square / landscape / portrait and webp/png/jpg. Returns an image file. Requires a Replicate API token (REPLICATE_API_TOKEN). +metadata: + { + "clawdbot": + { + "emoji": "🖼️", + "homepage": "https://www.mythosforge.xyz", + "requires": + { "bins": ["bash", "curl", "jq"], "env": ["REPLICATE_API_TOKEN"] }, + }, + } +--- + +## Overview + +This is **MythosForge's image-generation skill** — turn a text prompt into an +image. Its default engine is the exact one MythosForge +([mythosforge.xyz](https://www.mythosforge.xyz)) uses in production: **Flux +Schnell** on [Replicate](https://replicate.com). 
It also exposes two more +Replicate models through one `--model` flag, so you can pick the right engine +for the job: + +| `--model` | Replicate model | Best for | Formats | +| ----------------- | --------------------------------- | --------------------------------- | --------- | +| `flux-schnell` ⭐ | `black-forest-labs/flux-schnell` | fast, general (default) | webp/png/jpg | +| `nano-banana-2` | `google/nano-banana-2` | high-quality, general (2K) | png/jpg | +| `retro-diffusion` | `retro-diffusion/rd-plus` | pixel-art / game assets | png | + +Bankr's own LLM gateway is text-only, so image generation runs through this +dedicated provider, mirroring MythosForge's production call pattern. + +## Getting Started + +### 1. Get a Replicate API token + +Create one at +[replicate.com/account/api-tokens](https://replicate.com/account/api-tokens). +Replicate uses Bearer auth and bills per prediction. + +### 2. Export the token + +```bash +export REPLICATE_API_TOKEN="r8_your_token" +``` + +Keep it in an environment variable — **never hardcode it**, and add your +`.env` to `.gitignore`. + +### 3. Generate an image + +```bash +./scripts/generate.sh "a neon cyberpunk fox, glowing eyes, rain" +# → ✓ wrote mythosforge-20260605-201500.webp (model: flux-schnell, square, webp, prompt: "a neon cyberpunk fox, ...") +``` + +## Usage + +``` +scripts/generate.sh "" [options] + +Options: + -m, --model NAME flux-schnell | nano-banana-2 | retro-diffusion + (default: flux-schnell — MythosForge's production default) + -o, --out FILE Output image path (default: mythosforge-.) 
+ -a, --aspect RATIO square | landscape | portrait (default: square) + -f, --format FMT webp | png | jpg (default: webp; coerced per model) + -h, --help Show help +``` + +### Examples + +```bash +# Default (Flux Schnell): landscape banner as PNG +./scripts/generate.sh "epic fantasy castle at sunset" -a landscape -f png -o banner.png + +# High-quality general image (Nano Banana 2) +./scripts/generate.sh "portrait of a cyber-samurai" -m nano-banana-2 -a portrait -o avatar.png + +# Pixel-art game asset (Retro Diffusion) +./scripts/generate.sh "pixel-art knight with a sword" -m retro-diffusion -o knight.png +``` + +## How it works + +`generate.sh` selects the chosen model's Replicate endpoint +(`https://api.replicate.com/v1/models///predictions`), builds that +model's input body (each model has its own schema — see references), POSTs with +`Prefer: wait` (so the prediction usually completes in one call), polls +`urls.get` if it's still processing, then downloads the resulting image URL to a +file. Output is a URL array (flux / retro-diffusion) or a single URL +(nano-banana-2) — both are handled. It is **fail-closed**: a missing token, a +non-2xx status, a `failed` / `canceled` prediction, a timeout, or an empty +download all exit non-zero with a clear message and leave no partial file. + +See [`references/image-gen-api.md`](references/image-gen-api.md) for +the exact request/response shape and the prediction lifecycle. + +## Notes & limits + +- **Pick the model for the job.** `flux-schnell` / `nano-banana-2` for general + images; `retro-diffusion` for pixel-art and game assets. +- **Aspect handling.** flux & nano-banana-2 use `aspect_ratio`; retro-diffusion + has no aspect ratio, so square/landscape/portrait map to pixel canvas sizes + (256×256 / 384×256 / 256×384). +- **Format.** `nano-banana-2` outputs png/jpg only (webp is coerced to png); + `retro-diffusion` always outputs png. 
+- **Cost.** Each generation is a billed Replicate prediction (nano-banana-2 at + 2K costs more than flux-schnell). +- **Determinism.** Output varies per run; refine the prompt for control. +- **Requires** `bash`, `curl`, and `jq` on PATH. diff --git a/mythosforge/references/image-gen-api.md b/mythosforge/references/image-gen-api.md new file mode 100644 index 0000000000..349a08bc75 --- /dev/null +++ b/mythosforge/references/image-gen-api.md @@ -0,0 +1,132 @@ +# MythosForge image gen — Replicate API reference + +Auth: `Authorization: Bearer $REPLICATE_API_TOKEN` on every request. + +This skill uses Replicate's **official-model predictions** endpoint +(`POST /v1/models///predictions`), which runs a model by name +without pinning a version hash. The prediction lifecycle, headers, polling, and +error handling are identical across models — only the `input` schema and the +output shape differ. The three supported models and their schemas are below; +all were verified from each model's `llms.txt` on Replicate. + +## flux-schnell — `POST /v1/models/black-forest-labs/flux-schnell/predictions` + +Create a prediction (one image). + +Full URL: +`https://api.replicate.com/v1/models/black-forest-labs/flux-schnell/predictions` + +### Headers + +| Header | Value | Notes | +| --------------- | ------------------------------ | ----------------------------------------------------------- | +| `Authorization` | `Bearer $REPLICATE_API_TOKEN` | Required. | +| `Content-Type` | `application/json` | Required. | +| `Prefer` | `wait` | Block up to ~60s so the response is usually already done. | + +### Request body — `input` fields + +| Field | Type | Notes | +| ---------------- | ------- | ------------------------------------------------------ | +| `prompt` | string | The text description (required). | +| `num_outputs` | integer | Number of images (this skill uses `1`). | +| `aspect_ratio` | string | `1:1` (square), `16:9` (landscape), `3:4` (portrait). 
| +| `output_format` | string | `webp` (default), `png`, or `jpg`. | +| `output_quality` | integer | 0–100 (this skill uses `85`). | +| `go_fast` | boolean | `true` → fastest path for Schnell. | + +### Example + +```bash +curl -sS -X POST \ + "https://api.replicate.com/v1/models/black-forest-labs/flux-schnell/predictions" \ + -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ + -H "Content-Type: application/json" \ + -H "Prefer: wait" \ + -d '{ + "input": { + "prompt": "a neon cyberpunk fox, glowing eyes, rain", + "num_outputs": 1, + "aspect_ratio": "1:1", + "output_format": "webp", + "output_quality": 85, + "go_fast": true + } + }' +``` + +### Response (prediction object) + +```json +{ + "id": "abc123", + "status": "succeeded", + "output": ["https://replicate.delivery/.../out-0.webp"], + "urls": { "get": "https://api.replicate.com/v1/predictions/abc123", "cancel": "..." }, + "error": null +} +``` + +- `output` is an **array of image URLs** (take `output[0]`). On some models it + can be a single string — handle both. +- `status` is one of `starting`, `processing`, `succeeded`, `failed`, + `canceled`. + +## nano-banana-2 — `POST /v1/models/google/nano-banana-2/predictions` + +High-quality general model (Google). Same headers/lifecycle as above. + +### Request body — `input` fields + +| Field | Type | Notes | +| --------------- | ------- | --------------------------------------------------------------------- | +| `prompt` | string | The text description (required). | +| `aspect_ratio` | string | `1:1`, `3:4`, `4:3`, `16:9`, `9:16`, … (`match_input_image` also valid). | +| `resolution` | string | `512px`, `1K`, `2K`, `4K` (this skill uses `2K`). | +| `output_format` | string | `jpg` (default) or `png` — **no webp** (skill coerces webp→png). | +| `image_input` | array | Optional reference images (unused here). | + +> Output is a **single URL string** in `output` (not an array). `generate.sh` +> handles both array and string. 
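
The format rule in the table above (nano-banana-2 has no webp, so the skill falls back to png) can be sketched in plain bash. This is only an illustration: the function name `coerce_format` is hypothetical and is not part of `generate.sh`, which applies the same rule inline in its per-model adapter.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of generate.sh): given a model name and the
# format the caller asked for, print the format the model will actually emit.
coerce_format() {
  local model="$1" fmt="$2"
  case "$model" in
    nano-banana-2)
      # nano-banana-2 accepts png/jpg only, so a webp request falls back to png.
      if [ "$fmt" = "webp" ]; then fmt="png"; fi
      ;;
    retro-diffusion)
      # retro-diffusion always returns png, whatever was requested.
      fmt="png"
      ;;
  esac
  printf '%s\n' "$fmt"
}

coerce_format nano-banana-2 webp   # prints: png
coerce_format nano-banana-2 jpg    # prints: jpg
coerce_format flux-schnell webp    # prints: webp
```

Resolving the format before building the request body means the default output filename extension and the `output_format` field cannot disagree.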
+ +## retro-diffusion — `POST /v1/models/retro-diffusion/rd-plus/predictions` + +Pixel-art / game-asset model. Same headers/lifecycle as above. + +### Request body — `input` fields + +| Field | Type | Notes | +| ---------- | ------- | --------------------------------------------------------------------------- | +| `prompt` | string | The text description (required). | +| `style` | string | enum incl. `default`, `retro`, `cartoon`, `item_sheet`, `isometric`, `topdown_map`, … (skill uses `default`). | +| `width` | integer | Pixel width (skill maps aspect → 256 / 384). | +| `height` | integer | Pixel height (skill maps aspect → 256 / 384). | +| `remove_bg`| boolean | Optional transparent background (unused here). | +| `tile_x` / `tile_y` | boolean | Optional seamless tiling (unused here). | + +> No `aspect_ratio` field — set `width`/`height` directly. Output is an **array +> of URLs**; always PNG. + +## Prediction lifecycle (polling) + +With `Prefer: wait` the first response is usually `succeeded`. If it's still +`starting`/`processing`, poll `urls.get` until terminal: + +```bash +curl -sS "https://api.replicate.com/v1/predictions/" \ + -H "Authorization: Bearer $REPLICATE_API_TOKEN" +``` + +Stop on `succeeded` (read the output URL), or fail on `failed` / `canceled` +(read `error`). `generate.sh` polls up to 30× at 2s intervals, then treats it as +a timeout and exits non-zero. + +> ⚠️ Output URLs on `replicate.delivery` are **temporary** (they expire, often +> within ~1h). Download the bytes to a file or re-host them immediately — don't +> store the raw URL as if it were permanent. `generate.sh` downloads on success. + +### Common errors + +Non-2xx responses carry a JSON body with `detail` / `title` — e.g. `401` +(bad/missing token), `402` (payment/credits), `422` (invalid input). +`generate.sh` treats any non-2xx as a hard failure and writes no file. 
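
As a rough illustration of the error codes listed above, a caller could map the HTTP status to a short hint before giving up. The helper name `explain_status` and the hint wording are assumptions made for this sketch; `generate.sh` itself prints the `detail` / `title` field from the response body instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of generate.sh): map an HTTP status code from
# the Replicate API to a one-line hint, covering the codes named above.
explain_status() {
  case "$1" in
    2[0-9][0-9]) echo "ok" ;;
    401) echo "bad or missing token: check REPLICATE_API_TOKEN" ;;
    402) echo "payment or credits problem on the Replicate account" ;;
    422) echo "invalid input: compare the request body with the model's input fields" ;;
    *)   echo "unexpected HTTP status $1" ;;
  esac
}

explain_status 401   # prints: bad or missing token: check REPLICATE_API_TOKEN
explain_status 500   # prints: unexpected HTTP status 500
```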
diff --git a/mythosforge/scripts/generate.sh b/mythosforge/scripts/generate.sh new file mode 100755 index 0000000000..03d2c2faec --- /dev/null +++ b/mythosforge/scripts/generate.sh @@ -0,0 +1,120 @@ +#!/usr/bin/env bash +# generate.sh — MythosForge image generation: text prompt → image, via Replicate. +# Mirrors MythosForge's production image pipeline; supports multiple Replicate models. +# +# Usage: +# REPLICATE_API_TOKEN=... ./generate.sh "" [options] +# +# Options: +# -m, --model NAME flux-schnell | nano-banana-2 | retro-diffusion +# (default: flux-schnell — MythosForge's production default) +# -o, --out FILE Output image path (default: mythosforge-.) +# -a, --aspect RATIO square | landscape | portrait (default: square) +# -f, --format FMT webp | png | jpg (default: webp; coerced per model) +# -h, --help Show this help +# +# Models: +# flux-schnell fast, general (black-forest-labs/flux-schnell) — webp/png/jpg +# nano-banana-2 high-quality, general (google/nano-banana-2) — png/jpg, 2K +# retro-diffusion pixel-art / game assets (retro-diffusion/rd-plus) — png +# +# Requires: bash, curl, jq. Exits non-zero on any error (fail-closed). 
+ +set -euo pipefail + +die() { echo "error: $*" >&2; exit 1; } +usage() { sed -n '5,22p' "$0" | sed 's/^# \{0,1\}//'; } + +# --- args -------------------------------------------------------------------- +PROMPT=""; OUT=""; ASPECT="square"; FORMAT="webp"; MODEL="flux-schnell" +while [ $# -gt 0 ]; do + case "$1" in + -m|--model) MODEL="${2:?--model needs a value}"; shift 2;; + -o|--out) OUT="${2:?--out needs a path}"; shift 2;; + -a|--aspect) ASPECT="${2:?--aspect needs a value}"; shift 2;; + -f|--format) FORMAT="${2:?--format needs a value}"; shift 2;; + -h|--help) usage; exit 0;; + --) shift; break;; + -*) die "unknown option: $1";; + *) if [ -z "$PROMPT" ]; then PROMPT="$1"; else PROMPT="$PROMPT $1"; fi; shift;; + esac +done + +# --- preconditions (after parsing so --help never needs them) ---------------- +command -v curl >/dev/null 2>&1 || die "curl is required" +command -v jq >/dev/null 2>&1 || die "jq is required" +[ -n "${REPLICATE_API_TOKEN:-}" ] || \ + die "REPLICATE_API_TOKEN is not set (get one at https://replicate.com/account/api-tokens)" +[ -n "$PROMPT" ] || die "a text prompt is required (e.g. \"a neon cyberpunk fox\")" +case "$ASPECT" in square|landscape|portrait) :;; *) die "invalid --aspect: $ASPECT (square|landscape|portrait)";; esac +case "$FORMAT" in webp|png|jpg) :;; *) die "invalid --format: $FORMAT (webp|png|jpg)";; esac + +# --- per-model adapter: sets SLUG, BODY, EXT --------------------------------- +# Each Replicate model has its own input schema; we map the shared +# prompt/aspect/format args onto each model's fields (verified from their llms.txt). 
+case "$MODEL" in + flux-schnell) + SLUG="black-forest-labs/flux-schnell" + case "$ASPECT" in square) R="1:1";; landscape) R="16:9";; portrait) R="3:4";; esac + EXT="$FORMAT" # webp/png/jpg all supported + BODY=$(jq -n --arg p "$PROMPT" --arg r "$R" --arg f "$EXT" \ + '{input:{prompt:$p, num_outputs:1, aspect_ratio:$r, output_format:$f, output_quality:85, go_fast:true}}') + ;; + nano-banana-2) + SLUG="google/nano-banana-2" + case "$ASPECT" in square) R="1:1";; landscape) R="16:9";; portrait) R="3:4";; esac + EXT="$FORMAT"; [ "$EXT" = "webp" ] && EXT="png" # nano-banana-2 supports png/jpg only + BODY=$(jq -n --arg p "$PROMPT" --arg r "$R" --arg f "$EXT" \ + '{input:{prompt:$p, aspect_ratio:$r, resolution:"2K", output_format:$f}}') + ;; + retro-diffusion) + SLUG="retro-diffusion/rd-plus" + # rd-plus has no aspect_ratio — map aspect to a pixel-art canvas; output is PNG. + case "$ASPECT" in square) W=256; H=256;; landscape) W=384; H=256;; portrait) W=256; H=384;; esac + EXT="png" + BODY=$(jq -n --arg p "$PROMPT" --argjson w "$W" --argjson h "$H" \ + '{input:{prompt:$p, style:"default", width:$w, height:$h}}') + ;; + *) die "invalid --model: $MODEL (flux-schnell|nano-banana-2|retro-diffusion)";; +esac + +API="https://api.replicate.com/v1/models/$SLUG/predictions" +[ -n "$OUT" ] || OUT="mythosforge-$(date +%Y%m%d-%H%M%S).$EXT" + +# --- submit (Prefer: wait often returns succeeded in one call) --------------- +RESP=$(curl -sS -w $'\n%{http_code}' -X POST "$API" \ + -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ + -H "Content-Type: application/json" \ + -H "Prefer: wait" \ + -d "$BODY") || die "request to Replicate failed (network/curl)" +CODE=$(printf '%s' "$RESP" | tail -n1) +JSON=$(printf '%s' "$RESP" | sed '$d') +case "$CODE" in + 200|201) :;; + *) die "Replicate ($MODEL) returned HTTP $CODE: $(printf '%s' "$JSON" | jq -r '.detail // .title // .' 
2>/dev/null || printf '%s' "$JSON")";; +esac + +st() { printf '%s' "$1" | jq -r '.status // empty'; } +# output is an array of URLs (flux, rd-plus) or a single URL string (nano-banana-2) +url() { printf '%s' "$1" | jq -r 'if (.output|type)=="array" then (.output[0] // empty) else (.output // empty) end'; } + +# --- poll until terminal ----------------------------------------------------- +ST=$(st "$JSON"); URL=$(url "$JSON"); i=0 +while [ "$ST" != "succeeded" ] && [ "$i" -lt 30 ]; do + case "$ST" in + failed|canceled) die "Replicate prediction $ST: $(printf '%s' "$JSON" | jq -r '.error // empty')";; + esac + GET=$(printf '%s' "$JSON" | jq -r '.urls.get // empty') + [ -n "$GET" ] || die "prediction pending but no poll URL (status: ${ST:-unknown})" + sleep 2 + JSON=$(curl -fsS "$GET" -H "Authorization: Bearer $REPLICATE_API_TOKEN") || die "poll request failed" + ST=$(st "$JSON"); URL=$(url "$JSON"); i=$((i+1)) +done +[ "$ST" = "succeeded" ] || die "Replicate prediction did not succeed (status: ${ST:-timeout})" +[ -n "$URL" ] || die "no output URL in succeeded prediction" + +# --- download (-f: an HTTP error fails instead of saving an error page) ------ +curl -fsSL -o "$OUT" "$URL" || { rm -f "$OUT"; die "failed to download image"; } +[ -s "$OUT" ] || { rm -f "$OUT"; die "downloaded image was empty"; } + +echo "✓ wrote $OUT (model: $MODEL, $ASPECT, $EXT, prompt: \"$PROMPT\")"