112 changes: 112 additions & 0 deletions mythosforge/SKILL.md
@@ -0,0 +1,112 @@
---
name: mythosforge
description: MythosForge image generation — create an image from a text prompt via Replicate. Use when the user wants to generate an image, illustration, artwork, avatar, banner, scene, concept art, or pixel-art / game asset from a description — e.g. "make an image of a neon cyberpunk fox", "generate a banner for my token", "pixel-art sprite of a knight". Choose a model with --model: flux-schnell (default, fast/general), nano-banana-2 (high-quality, general), or retro-diffusion (pixel-art / game assets). Supports square / landscape / portrait and webp/png/jpg. Returns an image file. Requires a Replicate API token (REPLICATE_API_TOKEN).
metadata:
  {
    "clawdbot":
      {
        "emoji": "🖼️",
        "homepage": "https://www.mythosforge.xyz",
        "requires":
          { "bins": ["bash", "curl", "jq"], "env": ["REPLICATE_API_TOKEN"] },
      },
  }
---

## Overview

This is **MythosForge's image-generation skill** — turn a text prompt into an
image. Its default engine is the exact one MythosForge
([mythosforge.xyz](https://www.mythosforge.xyz)) uses in production: **Flux
Schnell** on [Replicate](https://replicate.com). It also exposes two more
Replicate models through one `--model` flag, so you can pick the right engine
for the job:

| `--model` | Replicate model | Best for | Formats |
| ----------------- | --------------------------------- | --------------------------------- | --------- |
| `flux-schnell` ⭐ | `black-forest-labs/flux-schnell` | fast, general (default) | webp/png/jpg |
| `nano-banana-2` | `google/nano-banana-2` | high-quality, general (2K) | png/jpg |
| `retro-diffusion` | `retro-diffusion/rd-plus` | pixel-art / game assets | png |

Bankr's own LLM gateway is text-only, so image generation runs through
Replicate as a dedicated provider, mirroring MythosForge's production call
pattern.

## Getting Started

### 1. Get a Replicate API token

Create one at
[replicate.com/account/api-tokens](https://replicate.com/account/api-tokens).
Replicate uses Bearer auth and bills per prediction.

### 2. Export the token

```bash
export REPLICATE_API_TOKEN="r8_your_token"
```

Keep it in an environment variable — **never hardcode it**, and add your
`.env` to `.gitignore`.
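
As a minimal sketch of a pre-flight check, the function below fails early when the variable is unset and reports only the token's length, never its value. The name `require_token` is hypothetical and not part of `generate.sh`, which performs its own check.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of generate.sh): fail early if the token is
# missing, and never print the secret itself.
require_token() {
  if [ -z "${REPLICATE_API_TOKEN:-}" ]; then
    echo "error: REPLICATE_API_TOKEN is not set" >&2
    return 1
  fi
  # Print only the length so logs confirm presence without leaking the value.
  echo "token present (${#REPLICATE_API_TOKEN} chars)"
}
```

Calling `require_token || exit 1` at the top of a wrapper script is one way to use it.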

### 3. Generate an image

```bash
./scripts/generate.sh "a neon cyberpunk fox, glowing eyes, rain"
# → ✓ wrote mythosforge-20260605-201500.webp (model: flux-schnell, square, webp, prompt: "a neon cyberpunk fox, ...")
```

## Usage

```
scripts/generate.sh "<prompt>" [options]

Options:
  -m, --model NAME     flux-schnell | nano-banana-2 | retro-diffusion
                       (default: flux-schnell, MythosForge's production default)
  -o, --out FILE       Output image path (default: mythosforge-<timestamp>.<ext>)
  -a, --aspect RATIO   square | landscape | portrait (default: square)
  -f, --format FMT     webp | png | jpg (default: webp; coerced per model)
  -h, --help           Show help
```

### Examples

```bash
# Default (Flux Schnell): landscape banner as PNG
./scripts/generate.sh "epic fantasy castle at sunset" -a landscape -f png -o banner.png

# High-quality general image (Nano Banana 2)
./scripts/generate.sh "portrait of a cyber-samurai" -m nano-banana-2 -a portrait -o avatar.png

# Pixel-art game asset (Retro Diffusion)
./scripts/generate.sh "pixel-art knight with a sword" -m retro-diffusion -o knight.png
```

## How it works

`generate.sh` selects the chosen model's Replicate endpoint
(`https://api.replicate.com/v1/models/<owner>/<model>/predictions`), builds that
model's input body (each model has its own schema — see references), POSTs with
`Prefer: wait` (so the prediction usually completes in one call), polls
`urls.get` if it's still processing, then downloads the resulting image URL to a
file. Output is a URL array (flux / retro-diffusion) or a single URL
(nano-banana-2) — both are handled. It is **fail-closed**: a missing token, a
non-2xx status, a `failed` / `canceled` prediction, a timeout, or an empty
download all exit non-zero with a clear message and leave no partial file.

See [`references/image-gen-api.md`](references/image-gen-api.md) for
the exact request/response shape and the prediction lifecycle.
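
The first step, choosing the endpoint, can be sketched as a small lookup. The function name `endpoint_for` is hypothetical; the model slugs and URL pattern are the ones listed in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a --model name to its Replicate predictions endpoint.
endpoint_for() {
  local slug
  case "$1" in
    flux-schnell)    slug="black-forest-labs/flux-schnell" ;;
    nano-banana-2)   slug="google/nano-banana-2" ;;
    retro-diffusion) slug="retro-diffusion/rd-plus" ;;
    *) echo "error: invalid model: $1" >&2; return 1 ;;
  esac
  echo "https://api.replicate.com/v1/models/$slug/predictions"
}
```

Rejecting unknown names with a non-zero status keeps the lookup in line with the fail-closed behavior described above.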

## Notes & limits

- **Pick the model for the job.** `flux-schnell` / `nano-banana-2` for general
images; `retro-diffusion` for pixel-art and game assets.
- **Aspect handling.** flux & nano-banana-2 use `aspect_ratio`; retro-diffusion
has no aspect ratio, so square/landscape/portrait map to pixel canvas sizes
(256×256 / 384×256 / 256×384).
- **Format.** `nano-banana-2` outputs png/jpg only (webp is coerced to png);
`retro-diffusion` always outputs png.
- **Cost.** Each generation is a billed Replicate prediction (nano-banana-2 at
2K costs more than flux-schnell).
- **Determinism.** The script sets no seed, so output varies per run; refine
the prompt for control.
- **Requires** `bash`, `curl`, and `jq` on PATH.
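
The format and aspect rules above can be sketched as two small lookups. Both function names (`effective_format`, `retro_canvas`) are hypothetical illustrations of the rules and are not functions defined in `generate.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers that restate the per-model rules from the notes above.

# Print the format a model will actually emit for a requested format.
effective_format() {
  local model="$1" requested="$2"
  case "$model" in
    retro-diffusion)
      echo "png" ;;                       # always PNG
    nano-banana-2)
      # png/jpg only: a webp request is coerced to png
      if [ "$requested" = "webp" ]; then echo "png"; else echo "$requested"; fi ;;
    *)
      echo "$requested" ;;                # flux-schnell accepts webp/png/jpg
  esac
}

# Print "WIDTH HEIGHT" for retro-diffusion, which takes pixel sizes, not a ratio.
retro_canvas() {
  case "$1" in
    square)    echo "256 256" ;;
    landscape) echo "384 256" ;;
    portrait)  echo "256 384" ;;
    *) echo "error: invalid aspect: $1" >&2; return 1 ;;
  esac
}
```

For example, `read -r W H <<<"$(retro_canvas portrait)"` would set `W=256` and `H=384`.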
132 changes: 132 additions & 0 deletions mythosforge/references/image-gen-api.md
@@ -0,0 +1,132 @@
# MythosForge image gen — Replicate API reference

Auth: `Authorization: Bearer $REPLICATE_API_TOKEN` on every request.

This skill uses Replicate's **official-model predictions** endpoint
(`POST /v1/models/<owner>/<model>/predictions`), which runs a model by name
without pinning a version hash. The prediction lifecycle, headers, polling, and
error handling are identical across models — only the `input` schema and the
output shape differ. The three supported models and their schemas are below;
all were verified from each model's `llms.txt` on Replicate.

## flux-schnell — `POST /v1/models/black-forest-labs/flux-schnell/predictions`

Create a prediction (one image).

Full URL:
`https://api.replicate.com/v1/models/black-forest-labs/flux-schnell/predictions`

### Headers

| Header | Value | Notes |
| --------------- | ------------------------------ | ----------------------------------------------------------- |
| `Authorization` | `Bearer $REPLICATE_API_TOKEN` | Required. |
| `Content-Type` | `application/json` | Required. |
| `Prefer` | `wait` | Block up to ~60s so the response is usually already done. |

### Request body — `input` fields

| Field | Type | Notes |
| ---------------- | ------- | ------------------------------------------------------ |
| `prompt` | string | The text description (required). |
| `num_outputs` | integer | Number of images (this skill uses `1`). |
| `aspect_ratio` | string | This skill sends `1:1` (square), `16:9` (landscape), or `3:4` (portrait). |
| `output_format` | string | `webp` (default), `png`, or `jpg`. |
| `output_quality` | integer | 0–100 (this skill uses `85`). |
| `go_fast` | boolean | `true` → fastest path for Schnell. |

### Example

```bash
curl -sS -X POST \
  "https://api.replicate.com/v1/models/black-forest-labs/flux-schnell/predictions" \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d '{
    "input": {
      "prompt": "a neon cyberpunk fox, glowing eyes, rain",
      "num_outputs": 1,
      "aspect_ratio": "1:1",
      "output_format": "webp",
      "output_quality": 85,
      "go_fast": true
    }
  }'
```

### Response (prediction object)

```json
{
  "id": "abc123",
  "status": "succeeded",
  "output": ["https://replicate.delivery/.../out-0.webp"],
  "urls": { "get": "https://api.replicate.com/v1/predictions/abc123", "cancel": "..." },
  "error": null
}
```

- `output` is an **array of image URLs** (take `output[0]`). On some models it
can be a single string — handle both.
- `status` is one of `starting`, `processing`, `succeeded`, `failed`,
`canceled`.

## nano-banana-2 — `POST /v1/models/google/nano-banana-2/predictions`

High-quality general model (Google). Same headers/lifecycle as above.

### Request body — `input` fields

| Field | Type | Notes |
| --------------- | ------- | --------------------------------------------------------------------- |
| `prompt` | string | The text description (required). |
| `aspect_ratio` | string | `1:1`, `3:4`, `4:3`, `16:9`, `9:16`, … (`match_input_image` also valid). |
| `resolution` | string | `512px`, `1K`, `2K`, `4K` (this skill uses `2K`). |
| `output_format` | string | `jpg` (default) or `png` — **no webp** (skill coerces webp→png). |
| `image_input` | array | Optional reference images (unused here). |

> Output is a **single URL string** in `output` (not an array). `generate.sh`
> handles both array and string.

## retro-diffusion — `POST /v1/models/retro-diffusion/rd-plus/predictions`

Pixel-art / game-asset model. Same headers/lifecycle as above.

### Request body — `input` fields

| Field | Type | Notes |
| ---------- | ------- | --------------------------------------------------------------------------- |
| `prompt` | string | The text description (required). |
| `style` | string | enum incl. `default`, `retro`, `cartoon`, `item_sheet`, `isometric`, `topdown_map`, … (skill uses `default`). |
| `width` | integer | Pixel width (skill maps aspect → 256 / 384). |
| `height` | integer | Pixel height (skill maps aspect → 256 / 384). |
| `remove_bg`| boolean | Optional transparent background (unused here). |
| `tile_x` / `tile_y` | boolean | Optional seamless tiling (unused here). |

> No `aspect_ratio` field — set `width`/`height` directly. Output is an **array
> of URLs**; always PNG.

## Prediction lifecycle (polling)

With `Prefer: wait` the first response is usually `succeeded`. If it's still
`starting`/`processing`, poll `urls.get` until terminal:

```bash
curl -sS "https://api.replicate.com/v1/predictions/<id>" \
-H "Authorization: Bearer $REPLICATE_API_TOKEN"
```

Stop on `succeeded` (read the output URL), or fail on `failed` / `canceled`
(read `error`). `generate.sh` polls up to 30× at 2s intervals, then treats it as
a timeout and exits non-zero.
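
The bounded poll can be sketched independently of the network. In this sketch `poll_until_terminal` and its first argument are hypothetical: the argument names any command that prints the current status, standing in for "GET `urls.get` and read `.status`". The return codes are an illustrative convention, not what `generate.sh` uses.

```bash
#!/usr/bin/env bash
# Sketch of a bounded poll loop. "$1" is a hypothetical stand-in for
# "GET urls.get and extract .status": any command that prints a status string.
poll_until_terminal() {
  local fetch="$1" max="${2:-30}" interval="${3:-2}" i=0 status
  while [ "$i" -lt "$max" ]; do
    status=$("$fetch") || return 2           # transport error
    case "$status" in
      succeeded)       echo "$status"; return 0 ;;
      failed|canceled) echo "$status"; return 1 ;;
    esac
    sleep "$interval"                        # still starting/processing
    i=$((i + 1))
  done
  echo "timeout"                             # attempt budget exhausted
  return 3
}
```

With the defaults of 30 attempts and a 2s interval, the loop gives up after roughly a minute, matching the budget described above.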

> ⚠️ Output URLs on `replicate.delivery` are **temporary** (they expire, often
> within ~1h). Download the bytes to a file or re-host them immediately — don't
> store the raw URL as if it were permanent. `generate.sh` downloads on success.

### Common errors

Non-2xx responses carry a JSON body with `detail` / `title` — e.g. `401`
(bad/missing token), `402` (payment/credits), `422` (invalid input).
`generate.sh` treats any non-2xx as a hard failure and writes no file.
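
A small lookup like the following could turn those status codes into hints for the user. `explain_http_code` is a hypothetical helper, and the wording of each hint is an assumption based on the codes listed above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn a Replicate HTTP status code into a short hint.
explain_http_code() {
  case "$1" in
    2??) echo "ok" ;;
    401) echo "bad or missing token" ;;
    402) echo "payment or credits required" ;;
    422) echo "invalid input" ;;
    *)   echo "unexpected HTTP $1" ;;
  esac
}
```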
120 changes: 120 additions & 0 deletions mythosforge/scripts/generate.sh
@@ -0,0 +1,120 @@
#!/usr/bin/env bash
# generate.sh — MythosForge image generation: text prompt → image, via Replicate.
# Mirrors MythosForge's production image pipeline; supports multiple Replicate models.
#
# Usage:
#   REPLICATE_API_TOKEN=... ./generate.sh "<prompt>" [options]
#
# Options:
#   -m, --model NAME     flux-schnell | nano-banana-2 | retro-diffusion
#                        (default: flux-schnell, MythosForge's production default)
#   -o, --out FILE       Output image path (default: mythosforge-<timestamp>.<ext>)
#   -a, --aspect RATIO   square | landscape | portrait (default: square)
#   -f, --format FMT     webp | png | jpg (default: webp; coerced per model)
#   -h, --help           Show this help
#
# Models:
#   flux-schnell     fast, general (black-forest-labs/flux-schnell): webp/png/jpg
#   nano-banana-2    high-quality, general (google/nano-banana-2): png/jpg, 2K
#   retro-diffusion  pixel-art / game assets (retro-diffusion/rd-plus): png
#
# Requires: bash, curl, jq. Exits non-zero on any error (fail-closed).

set -euo pipefail

die() { echo "error: $*" >&2; exit 1; }
usage() { sed -n '5,22p' "$0" | sed 's/^# \{0,1\}//'; }

# --- args --------------------------------------------------------------------
PROMPT=""; OUT=""; ASPECT="square"; FORMAT="webp"; MODEL="flux-schnell"
while [ $# -gt 0 ]; do
  case "$1" in
    -m|--model)  MODEL="${2:?--model needs a value}"; shift 2;;
    -o|--out)    OUT="${2:?--out needs a path}"; shift 2;;
    -a|--aspect) ASPECT="${2:?--aspect needs a value}"; shift 2;;
    -f|--format) FORMAT="${2:?--format needs a value}"; shift 2;;
    -h|--help)   usage; exit 0;;
    # After "--", treat everything that remains as prompt text (it was dropped before).
    --) shift; PROMPT="${PROMPT:+$PROMPT }$*"; break;;
    -*) die "unknown option: $1";;
    *)  if [ -z "$PROMPT" ]; then PROMPT="$1"; else PROMPT="$PROMPT $1"; fi; shift;;
  esac
done

# --- preconditions (after parsing so --help never needs them) ----------------
command -v curl >/dev/null 2>&1 || die "curl is required"
command -v jq >/dev/null 2>&1 || die "jq is required"
[ -n "${REPLICATE_API_TOKEN:-}" ] || \
die "REPLICATE_API_TOKEN is not set (get one at https://replicate.com/account/api-tokens)"
[ -n "$PROMPT" ] || die "a text prompt is required (e.g. \"a neon cyberpunk fox\")"
case "$ASPECT" in square|landscape|portrait) :;; *) die "invalid --aspect: $ASPECT (square|landscape|portrait)";; esac
case "$FORMAT" in webp|png|jpg) :;; *) die "invalid --format: $FORMAT (webp|png|jpg)";; esac

# --- per-model adapter: sets SLUG, BODY, EXT ---------------------------------
# Each Replicate model has its own input schema; we map the shared
# prompt/aspect/format args onto each model's fields (verified from their llms.txt).
case "$MODEL" in
  flux-schnell)
    SLUG="black-forest-labs/flux-schnell"
    case "$ASPECT" in square) R="1:1";; landscape) R="16:9";; portrait) R="3:4";; esac
    EXT="$FORMAT" # webp/png/jpg all supported
    BODY=$(jq -n --arg p "$PROMPT" --arg r "$R" --arg f "$EXT" \
      '{input:{prompt:$p, num_outputs:1, aspect_ratio:$r, output_format:$f, output_quality:85, go_fast:true}}')
    ;;
  nano-banana-2)
    SLUG="google/nano-banana-2"
    case "$ASPECT" in square) R="1:1";; landscape) R="16:9";; portrait) R="3:4";; esac
    EXT="$FORMAT"; [ "$EXT" = "webp" ] && EXT="png" # nano-banana-2 supports png/jpg only
    BODY=$(jq -n --arg p "$PROMPT" --arg r "$R" --arg f "$EXT" \
      '{input:{prompt:$p, aspect_ratio:$r, resolution:"2K", output_format:$f}}')
    ;;
  retro-diffusion)
    SLUG="retro-diffusion/rd-plus"
    # rd-plus has no aspect_ratio; map aspect to a pixel-art canvas. Output is PNG.
    case "$ASPECT" in square) W=256; H=256;; landscape) W=384; H=256;; portrait) W=256; H=384;; esac
    EXT="png"
    BODY=$(jq -n --arg p "$PROMPT" --argjson w "$W" --argjson h "$H" \
      '{input:{prompt:$p, style:"default", width:$w, height:$h}}')
    ;;
  *) die "invalid --model: $MODEL (flux-schnell|nano-banana-2|retro-diffusion)";;
esac

API="https://api.replicate.com/v1/models/$SLUG/predictions"
[ -n "$OUT" ] || OUT="mythosforge-$(date +%Y%m%d-%H%M%S).$EXT"

# --- submit (Prefer: wait often returns succeeded in one call) ---------------
RESP=$(curl -sS -w $'\n%{http_code}' -X POST "$API" \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d "$BODY") || die "request to Replicate failed (network/curl)"
CODE=$(printf '%s' "$RESP" | tail -n1)
JSON=$(printf '%s' "$RESP" | sed '$d')
case "$CODE" in
  # Accept any 2xx (e.g. a 202 for a prediction still running); the poll loop handles it.
  2??) :;;
  *) die "Replicate ($MODEL) returned HTTP $CODE: $(printf '%s' "$JSON" | jq -r '.detail // .title // .' 2>/dev/null || printf '%s' "$JSON")";;
esac

st() { printf '%s' "$1" | jq -r '.status // empty'; }
# output is an array of URLs (flux, rd-plus) or a single URL string (nano-banana-2)
url() { printf '%s' "$1" | jq -r 'if (.output|type)=="array" then (.output[0] // empty) else (.output // empty) end'; }

# --- poll until terminal -----------------------------------------------------
ST=$(st "$JSON"); URL=$(url "$JSON"); i=0
while [ "$ST" != "succeeded" ] && [ "$i" -lt 30 ]; do
  case "$ST" in
    failed|canceled) die "Replicate prediction $ST: $(printf '%s' "$JSON" | jq -r '.error // empty')";;
  esac
  GET=$(printf '%s' "$JSON" | jq -r '.urls.get // empty')
  [ -n "$GET" ] || die "prediction pending but no poll URL (status: ${ST:-unknown})"
  sleep 2
  JSON=$(curl -sS "$GET" -H "Authorization: Bearer $REPLICATE_API_TOKEN") || die "poll request failed"
  ST=$(st "$JSON"); URL=$(url "$JSON"); i=$((i+1))
done
# Distinguish a terminal failure on the last poll from a genuine timeout.
case "$ST" in
  succeeded) :;;
  failed|canceled) die "Replicate prediction $ST: $(printf '%s' "$JSON" | jq -r '.error // empty')";;
  *) die "timed out waiting for Replicate prediction (last status: ${ST:-unknown})";;
esac
[ -n "$URL" ] || die "no output URL in succeeded prediction"

# --- download ----------------------------------------------------------------
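# -f makes curl fail on HTTP errors instead of saving the error body as the image;
# remove any partial file so a failed download leaves nothing behind.
curl -fsSL -o "$OUT" "$URL" || { rm -f "$OUT"; die "failed to download image"; }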
[ -s "$OUT" ] || { rm -f "$OUT"; die "downloaded image was empty"; }

echo "✓ wrote $OUT (model: $MODEL, $ASPECT, $EXT, prompt: \"$PROMPT\")"