feat: add JAX/Flax reference implementation for inference #217

atveit · 2025-10-22T09:54:14Z

Add JAX backend for GPT-OSS. This implementation uses BF16 precision with non-quantized KV caching.

Key features:

Clean, simplified implementation without experimental optimizations
Support for conversion from SafeTensors to Orbax checkpoint formats
Automatic checkpoint format detection
Integrated with existing gpt_oss.generate interface
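The automatic format detection listed above could look roughly like the sketch below. This is not the PR's code: the helper name `detect_checkpoint_format` and the heuristic (any top-level `*.safetensors` file means SafeTensors, any other non-empty directory is treated as Orbax) are assumptions made for illustration.

```python
from pathlib import Path


def detect_checkpoint_format(checkpoint_dir: str) -> str:
    """Hypothetical helper: guess the checkpoint format from directory contents."""
    path = Path(checkpoint_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"not a directory: {checkpoint_dir}")

    # Assumption: a SafeTensors checkpoint has *.safetensors shards at top level.
    if any(path.glob("*.safetensors")):
        return "safetensors"

    # Assumption: any other non-empty directory is treated as an Orbax checkpoint.
    if any(path.iterdir()):
        return "orbax"

    raise ValueError(f"empty checkpoint directory: {checkpoint_dir}")
```

A loader would then dispatch on the returned string to pick either the SafeTensors or the Orbax code path.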

Structure:

gpt_oss/jax/: Core model and inference files
gpt_oss/jax/scripts/: Checkpoint conversion utilities
Clear file naming: loader_safetensors.py, loader_orbax.py

Usage:
# Convert checkpoint (optional, for faster loading)
python -m gpt_oss.jax --input gpt-oss-20b/original/ --output gpt-oss-20b-orbax/

# Run inference
python -m gpt_oss.generate --backend jax gpt-oss-20b-orbax/ -p "your prompt"

Files: 136 KB across 14 Python files in gpt_oss/jax/

This follows the convention of:

Using feat: prefix for new features
Clear, concise summary line
Detailed description of what and why
Key features listed
Usage examples
Metrics (file size/count)

Add JAX backend for CPU-based inference on Apple Silicon and x86-64 platforms. This implementation uses BF16 precision with non-quantized KV caching for efficient autoregressive generation.

Key features:
- Clean, simplified implementation without experimental optimizations
- Support for both SafeTensors and Orbax checkpoint formats
- Fast Orbax loading (~18x speedup: 5s vs 90s)
- Automatic checkpoint format detection
- Integrated with existing gpt_oss.generate interface

Structure:
- gpt_oss/jax/: Core model and inference files
- gpt_oss/jax/scripts/: Checkpoint conversion utilities
- Clear file naming: loader_safetensors.py, loader_orbax.py

Usage:
# Convert checkpoint (optional, for faster loading)
python -m gpt_oss.jax --input gpt-oss-20b/original/ --output gpt-oss-20b-orbax/
# Run inference
python -m gpt_oss.generate --backend jax gpt-oss-20b-orbax/ -p "your prompt"

Files: 136 KB across 14 Python files in gpt_oss/jax/

This follows the convention of:
- Using feat: prefix for new features
- Clear, concise summary line
- Detailed description of what and why
- Key features listed
- Usage examples
- Metrics (file size/count)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

gpt_oss/jax/token_generator.py

gpt_oss/jax/inference.py

….lax.cond for token sampling and jax.lax.dynamic_update_slice for KV cache updates. Add @jax.jit decorators to performance-critical functions (token sampling, cache extension, RoPE, SDPA) while removing 40+ assert statements that prevent JIT compilation.
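As a rough illustration of the two primitives named in this commit message, the sketch below uses `jax.lax.dynamic_update_slice` to write new keys/values into a preallocated BF16 cache and `jax.lax.cond` to choose between greedy and stochastic sampling inside a jitted function. The function names, argument layout, and tensor shapes are assumptions, not the PR's actual code.

```python
import jax
import jax.numpy as jnp


@jax.jit
def update_kv_cache(cache, new_kv, offset):
    """Write `new_kv` into `cache` starting at sequence position `offset`.

    Assumed shapes: cache (max_len, n_heads, head_dim),
    new_kv (n_new, n_heads, head_dim). `offset` may be a traced value.
    """
    return jax.lax.dynamic_update_slice(cache, new_kv, (offset, 0, 0))


@jax.jit
def sample_token(logits, key, temperature):
    """Greedy decoding when temperature <= 0, categorical sampling otherwise."""

    def greedy(logits, key, temperature):
        return jnp.argmax(logits, axis=-1).astype(jnp.int32)

    def stochastic(logits, key, temperature):
        scaled = logits / temperature
        return jax.random.categorical(key, scaled, axis=-1).astype(jnp.int32)

    # Both branches must return the same shape and dtype; lax.cond keeps the
    # choice inside the traced computation instead of using a Python `if`.
    return jax.lax.cond(
        temperature <= 0.0, greedy, stochastic, logits, key, temperature
    )
```

Because neither function branches in Python on a traced value, both can be compiled once and reused across decoding steps.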

… KVCache as PyTree
- Detect Orbax vs SafeTensors format before attempting to load config.json
- Use load_config_from_orbax fallback for Orbax checkpoints without config files
- Register KVCache as JAX PyTree to enable JIT compilation with KV caching
- Fixes TypeError when using experimental jit_generate_loop=True mode

Addresses code review feedback on commit c85ba18
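Registering a cache class as a PyTree is what lets it cross a `jax.jit` boundary; an unregistered Python object passed to a jitted function raises a TypeError. The sketch below shows one way to do this with `jax.tree_util.register_pytree_node_class`. The field names (`k`, `v`, `offset`) and the `append_token` helper are assumptions for illustration and may differ from the PR's `KVCache`.

```python
from dataclasses import dataclass

import jax
import jax.numpy as jnp


@jax.tree_util.register_pytree_node_class
@dataclass
class KVCache:
    k: jax.Array       # assumed shape: (max_len, n_heads, head_dim)
    v: jax.Array       # same shape as k
    offset: jax.Array  # scalar int32: number of positions already filled

    def tree_flatten(self):
        # All three fields are array leaves; there is no static aux data.
        return (self.k, self.v, self.offset), None

    @classmethod
    def tree_unflatten(cls, aux_data, children):
        return cls(*children)


@jax.jit
def append_token(cache: KVCache, new_k, new_v) -> KVCache:
    """Store one token's keys/values at the current offset (hypothetical helper)."""
    k = cache.k.at[cache.offset].set(new_k)
    v = cache.v.at[cache.offset].set(new_v)
    return KVCache(k, v, cache.offset + 1)
```

Keeping `offset` as an array leaf rather than static metadata means the jitted function is not recompiled at every decoding step.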

atsentia · 2025-10-22T19:13:10Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

gpt_oss/jax/loader_orbax.py

…fig.json instead of hardcoding values to support both gpt-oss-20B and gpt-oss-120B models
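Reading model hyperparameters from the checkpoint directory rather than hardcoding them might be sketched as follows. The `ModelConfig` fields and the `load_model_config` helper are hypothetical; the PR's real config keys are not shown on this page.

```python
import json
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class ModelConfig:
    # Hypothetical field names, chosen only for illustration.
    num_hidden_layers: int
    num_experts: int
    hidden_size: int


def load_model_config(checkpoint_dir: str) -> ModelConfig:
    """Build a ModelConfig from <checkpoint_dir>/config.json (hypothetical helper)."""
    config_path = Path(checkpoint_dir) / "config.json"
    with config_path.open() as handle:
        raw = json.load(handle)

    wanted = [field.name for field in fields(ModelConfig)]
    missing = [name for name in wanted if name not in raw]
    if missing:
        raise KeyError(f"config.json is missing keys: {missing}")

    # Ignore keys the dataclass does not know about.
    return ModelConfig(**{name: raw[name] for name in wanted})
```

With this shape, the same loader serves differently sized models, since the values come from each checkpoint's own `config.json`.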

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

gpt_oss/jax/loader_orbax.py

atveit · 2025-10-23T09:09:05Z

@codex review - believe all issues have been resolved now?

chatgpt-codex-connector · 2025-10-23T09:13:24Z

Codex Review: Didn't find any major issues. Hooray!

atveit changed the title ~~feat: add JAX/Flax reference implementation for CPU inference~~ feat: add JAX/Flax reference implementation for inference Oct 22, 2025

chatgpt-codex-connector bot reviewed Oct 22, 2025

gpt_oss/jax/token_generator.py

gpt_oss/jax/inference.py

Amund Tveit added 2 commits October 22, 2025 13:52

chatgpt-codex-connector bot reviewed Oct 22, 2025

gpt_oss/jax/loader_orbax.py (outdated)

atveit and others added 2 commits October 23, 2025 10:18

Merge branch 'openai:main' into main

c29bedb

fix(jax): load model config from Orbax checkpoint directory. Read config.json instead of hardcoding values to support both gpt-oss-20B and gpt-oss-120B models

71e9d86

chatgpt-codex-connector bot reviewed Oct 23, 2025

gpt_oss/jax/loader_orbax.py
