[SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT] Seed 42 log contains ttt_hash_embed/ttt_hash_buckets not present in submitted train_gpt.py #1594

@GabrieleCirillo

ttt_hash_embed and ttt_hash_buckets present in seed 42 log but not in other seeds or train_gpt.py

Record: records/track_10min_16mb/2026-04-09_SP8192_3LayerRecur_ParResid_QK525_LegalTTT

While running experiments based on this record, I noticed a discrepancy across the three seed logs. Seed 42 includes two hyperparameters — ttt_hash_embed: True and ttt_hash_buckets: 16384 — that don't appear in seeds 314 or 999, and don't appear anywhere in the committed train_gpt.py:

| Field | Seed 42 | Seed 314 | Seed 999 |
| --- | --- | --- | --- |
| ttt_hash_embed | True | not present | not present |
| ttt_hash_buckets | 16384 | not present | not present |
| Code size | 16630 bytes | 16594 bytes | 16594 bytes |
| quantized_ttt val_bpb | 1.08079 | 1.08103 | 1.08118 |

Seeds 314 and 999 both report Code size: 16594 bytes, consistent with the submitted code. Seed 42 reports 16630 bytes, 36 bytes more, which suggests it was run with a different version of train_gpt.py that includes the ttt_hash_embed / ttt_hash_buckets feature.
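For anyone who wants to repeat the comparison, here is a minimal sketch of how the per-seed fields could be diffed. It assumes the logs print settings as `name: value` lines; the helper names (`parse_fields`, `diff_fields`) and the `SAMPLES` excerpts are hypothetical, built only from the values quoted in this issue, and are not the actual log contents.

```python
import re

# Assumption: settings appear in the logs as "name: value" lines,
# e.g. "ttt_hash_embed: True" or "Code size: 16630 bytes".
KV_LINE = re.compile(r"^\s*([A-Za-z_][\w ]*?)\s*:\s*(.+?)\s*$")


def parse_fields(log_text):
    """Return {name: value} for every 'name: value' line in a log."""
    fields = {}
    for line in log_text.splitlines():
        match = KV_LINE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def diff_fields(logs):
    """Return fields whose value differs, or is missing, in at least one log.

    `logs` maps a label (e.g. "seed42") to the raw log text.
    A missing field is reported as None for that log.
    """
    parsed = {label: parse_fields(text) for label, text in logs.items()}
    all_keys = set().union(*(p.keys() for p in parsed.values()))
    differences = {}
    for key in sorted(all_keys):
        values = {label: p.get(key) for label, p in parsed.items()}
        if len(set(values.values())) > 1:
            differences[key] = values
    return differences


# Hypothetical excerpts using only the values reported in this issue.
SAMPLES = {
    "seed42": "ttt_hash_embed: True\nttt_hash_buckets: 16384\nCode size: 16630 bytes",
    "seed314": "Code size: 16594 bytes",
    "seed999": "Code size: 16594 bytes",
}

for field, values in diff_fields(SAMPLES).items():
    print(field, values)
```

In practice, `SAMPLES` would be replaced by the text of the three seed logs from the record directory. Fields that are identical across all logs are omitted, so only the discrepancies remain.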

A question:
Could the version of train_gpt.py used for seed 42 be shared? I'm also curious about ttt_hash_embed and ttt_hash_buckets as techniques.
