GGUF Metadata Reader (Browser)

A single-file, static web app to read GGUF model metadata directly in the browser and estimate memory usage (RAM/VRAM) for a chosen context window and KV cache quantization.

Works with remote URLs that support HTTP Range requests (e.g., many Hugging Face files)
Works with local .gguf files (drag-and-drop via file picker)
Detects sharded models (e.g., -00001-of-00013) and sums total size
No server required; everything runs client-side

Quick Start

Option A: Open the file directly

Open index.html in a modern browser (Chrome, Edge, Safari).
Paste a GGUF URL or choose a local .gguf file and click the corresponding button.

Option B: Serve locally (helps with some CORS setups)

cd path/to/model-memory-calculator
python -m http.server 8000
# Then open http://localhost:8000 in your browser

Usage

GGUF URL: Paste a direct link to a .gguf (e.g., a Hugging Face “resolve/main” URL). Many hosts allow partial download via HTTP Range.
Or choose a local GGUF file: Uses the browser’s File API; no upload leaves your machine.
Context size (tokens): Select a preset (e.g., 4K, 16K, 128K) or choose "Custom…" and enter any positive integer token length.
KV cache quantization: Choose how keys/values are stored in memory. Options show approximate bytes per value.
Verbose: Prints debug logs of what’s read and how size is determined.

Click “Read URL” or “Read File”. If successful, you’ll see:

Extracted params: attention_heads, kv_heads, hidden_layers, hidden_size, split_count (if present)
Memory estimate: model size + KV cache size at your chosen context/quantization

How It Works

GGUF parsing: Reads just enough of the GGUF header to extract:
- .attention.head_count
- .attention.head_count_kv
- .block_count
- .embedding_length
- split.count (if present)
Remote file size:
- Tries HEAD to get Content-Length.
- Falls back to a Range: bytes=0-0 request and reads Content-Range.
Sharded models:
- Detects -00001-of-000NN style patterns in URLs or uses split.count metadata.
- Sums sizes across parts (remote) or estimates total from a single shard (local) when possible.
KV cache estimate:
- Uses a simplified formula: bytes_per_value × hidden_size × hidden_layers × context_tokens.
- Shows total as: Model + KV (MB/GB). Actual usage can vary by backend/implementation.

Notes & Limitations

GGUF versions: Supports GGUF v1–v3 headers for the fields listed above.
CORS & Range: Remote hosts must allow cross-origin requests and HTTP Range. If not, size detection may fail; download the file and use the local option instead.
Range ignored: Some servers respond 200 without honoring Range. The app avoids downloading the full body for size only; estimates can fail in this case.
Sanity limits: Very long strings/arrays in metadata are bounded to avoid huge reads.
Estimates only: KV cache math is intentionally simplified. Different runtimes store KV differently (e.g., layout, precision, per-head factors).

Troubleshooting

“Failed to read params.”
- The file may not be GGUF or uses unsupported/unexpected metadata. Try another file or update the URL.
“Could not determine file size or compute usage (CORS/Range?).”
- The remote host may block CORS or not report size via HEAD/Range. Try serving the page locally, a different host, or the local file picker.
Split detection issues
- Ensure URLs use a stable pattern (e.g., -00001-of-000xx) or that split.count is present in metadata.

Privacy

The local file option never uploads your file; parsing happens entirely in your browser.
For remote URLs, the app performs small range requests to read the header and determine file size. It aborts early once required metadata is read.

Development

No build step required. The app is a single page:
- index.html — All logic and UI
Open in a browser or serve with any static server.

License

This project is licensed under the terms in LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.gitignore		.gitignore
LICENSE		LICENSE
MEMORY_CALCULATION.md		MEMORY_CALCULATION.md
README.md		README.md
index.html		index.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

Repository files navigation

GGUF Metadata Reader (Browser)

Quick Start

Usage

How It Works

Notes & Limitations

Troubleshooting

Privacy

Development

License

About

Uh oh!

Releases

Packages

Contributors 3

Uh oh!

Languages

Uh oh!

License

Uh oh!

KolosalAI/model-memory-calculator

Folders and files

Latest commit

History

Repository files navigation

GGUF Metadata Reader (Browser)

Quick Start

Usage

How It Works

Notes & Limitations

Troubleshooting

Privacy

Development

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Uh oh!

Languages

Packages