Official implementation of the ICLR 2026 paper “Spectral Attention Steering for Prompt Highlighting” (SEKA & AdaSEKA) by Weixian Waylon Li, Yuchen Niu, Yongxin Yang, Keshuang Li, Tiejun Ma, and Shay B. Cohen.
- [Feb 2026] Initial codebase and datasets released.
- [Jan 2026] We're excited to announce that our paper has been accepted to ICLR 2026! 🎉
- SEKA – Learn low-rank projections from synthetic QA pairs and inject them into attention keys during inference (via `SEKALLM`).
- AdaSEKA – Maintain a small ensemble of SEKA experts plus an expert selector (`AdaptiveSEKALLM`) that chooses the best projection for each input.
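In spirit, SEKA's steering amplifies the component of each attention key that lies in a learned low-rank subspace. The NumPy sketch below is illustrative only — the variable names (`U`, `P`, `amplify`) and the exact update rule are our assumptions, not the repo's `SEKALLM` implementation in `src/`:

```python
import numpy as np

rng = np.random.default_rng(0)
d_head, rank, seq_len = 64, 8, 16

# Hypothetical learned low-rank basis for the "highlight" direction(s).
U, _ = np.linalg.qr(rng.normal(size=(d_head, rank)))  # orthonormal columns
P = U @ U.T                                           # rank-`rank` projection matrix

K = rng.normal(size=(seq_len, d_head))                # per-token attention keys
amplify = 1.5                                         # cf. the --amplify_* flags
K_steered = K + amplify * (K @ P)                     # boost the steered subspace

assert K_steered.shape == K.shape
```

Everything outside the low-rank subspace passes through unchanged, which is what makes the intervention cheap and targeted.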
```
SEKA/
├── adaptive-seka-config/     # Example AdaSEKA expert configs (JSON templates)
├── benchmarks/               # Evaluation drivers + utilities (see benchmarks/README.md)
├── notebook/                 # Analysis / plotting notebooks
├── pastalib/                 # PASTA steering library (used for comparisons)
├── scripts/                  # Bash helpers for projection building & evaluation
├── src/                      # Core SEKA / AdaSEKA implementation
├── requirements.txt          # Minimal dependencies
└── requirements-complete.txt # Full environment (paper reproduction)
```
```bash
conda create -n seka python=3.10 -y
conda activate seka
pip install -r requirements.txt
```

Grab the preprocessed datasets bundle from Hugging Face (https://huggingface.co/datasets/waylonli/SEKA-datasets), create a `data/` folder if needed, and unpack the archive into that directory.
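If you prefer to script the unpacking step, a small helper along these lines works. This is a hypothetical convenience function, and it assumes the bundle is a tar archive (e.g. `.tar.gz`) — check the actual release format on Hugging Face:

```python
import tarfile
from pathlib import Path

def unpack_bundle(archive_path: str, data_dir: str = "data") -> list[str]:
    """Unpack the datasets archive into data/, creating the folder if missing."""
    dest = Path(data_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        # Only extract archives from sources you trust.
        tar.extractall(dest)
    return sorted(p.name for p in dest.iterdir())
```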
- SEKA projections – Generate with the builders in `src/custom_builders/`, e.g.

  ```bash
  python src/custom_builders/synthetic_qa_builder.py \
      --model pretrained/Qwen3-4B-Base \
      --data data/synthetic/pair_qa_new.jsonl \
      --output_dir seka_projections/biasbios/Qwen3-4B-Base \
      --max_samples 200 \
      --min_diff 0.20 \
      --top_pct 0.90
  ```

- AdaSEKA projections & config
  - Build one expert projection per steering expert with the adaptive builder (save the SVD components that AdaSEKA consumes):

    ```bash
    python src/custom_builders/adaptive/synthetic_qa_builder_adaptive.py \
        --model pretrained/Qwen3-8B-Base \
        --data data/synthetic/pair_qa_new.jsonl \
        --output_dir projections/adaptive/biasbios \
        --max_samples 200 \
        --min_diff 0.20 \
        --top_pct 0.90 \
        --save-svd --svd-only
    ```

    Repeat the same command for each expert you need (e.g. change `--output_dir` to `projections/adaptive/counterfact`, `.../pronchange`, etc.).
  - Copy a template from `adaptive-seka-config/` and edit it to reference your tensors:

    ```bash
    cp adaptive-seka-config/Qwen3-8B/Qwen3-8B-mindiff-0.2.json adaseka_config.json
    ```

    The config is a flat `{expert: projection_path}` mapping, e.g.

    ```json
    {
      "biasbios": "projections/adaptive/biasbios/Qwen3-8B-Base_0.2mindiff_pos_svd.pt",
      "counterfact": "projections/adaptive/counterfact/Qwen3-8B-Base_0.2mindiff_pos_svd.pt",
      "pronchange": "projections/adaptive/pronchange/Qwen3-8B-Base_0.2mindiff_pos_svd.pt",
      "synthetic": "projections/adaptive/synthetic/Qwen3-8B-Base_0.2mindiff_pos_svd.pt"
    }
    ```

    These keys must match the experts you supply at inference time.
  - Enable AdaSEKA with `--adaptive-seka` and point to the config via `--adaptive-expert-path adaseka_config.json` when you launch an evaluation.
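Before launching an evaluation, it can help to sanity-check the config file against the contract above. This is a minimal, hypothetical validator (`load_adaseka_config` is our own name, not part of the repo's API) for the flat `{expert: projection_path}` mapping:

```python
import json

def load_adaseka_config(path: str) -> dict[str, str]:
    """Load and sanity-check the mapping passed via --adaptive-expert-path."""
    with open(path) as f:
        config = json.load(f)
    if not (isinstance(config, dict) and config):
        raise ValueError("AdaSEKA config must be a non-empty JSON object")
    for expert, proj_path in config.items():
        if not isinstance(proj_path, str) or not proj_path.endswith(".pt"):
            raise ValueError(f"expert {expert!r} must map to a .pt projection file")
    return config
```

Catching a bad path or a mistyped expert key here is cheaper than discovering it mid-run.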
Bracketed parameters in the examples below are placeholders: set `[AMP_POS]`, `[AMP_NEG]`, and `[ADAPTIVE_FACTOR]` to your chosen amplification strengths before running.
```bash
python benchmarks/eval_bias_gen.py \
    --model pretrained/Qwen3-4B-Base \
    --data_path data/biasbios/biasbios.json \
    --output_dir benchmarks/biasbios/results/seka-qwen3-4b \
    --overwrite_output_dir \
    --batch_size 32 \
    --max_new_tokens 64 \
    --seka \
    --pos seka_projections/biasbios/Qwen3-4B-Base_pos_proj.pt \
    --neg seka_projections/biasbios/Qwen3-4B-Base_neg_proj.pt \
    --amplify_pos [AMP_POS] \
    --amplify_neg [AMP_NEG] \
    --layers last10
```

```bash
python benchmarks/eval_fact_gen.py \
    --model pretrained/Qwen3-8B-Base \
    --data_path data/counterfact \
    --output_dir benchmarks/counterfact/results/adaseka-qwen3-8b \
    --overwrite_output_dir \
    --benchmarks efficacy paraphrase \
    --add_unmediated_fact True \
    --batch_size 32 \
    --max_new_tokens 64 \
    --adaptive-seka \
    --adaptive-expert-path adaseka_config.json \
    --adaptive_amplify_factor [ADAPTIVE_FACTOR] \
    --layers last10
```

Detailed instructions live in `benchmarks/README.md`:
- Prepare the BiasBios / CounterFact / PronChange datasets.
- Download the SEKA and AdaSEKA projection packs (links above) or regenerate them.
- Run the provided bash commands to reproduce the tables from the paper.
Each run writes `metric_result.json` and `sweep_config.json` for traceability.
```bibtex
@inproceedings{
li2026spectral,
title={Spectral Attention Steering for Prompt Highlighting},
author={Weixian Waylon Li and Yuchen Niu and Yongxin Yang and Keshuang Li and Tiejun Ma and Shay B Cohen},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=XfLvGIFmAN}
}
```

We build upon open-source contributions from prior steering and editing work:
- SEA-LLM for the Spectral Editing of LLM Activations algorithm that inspired SEKA’s projection formulation.
- PASTA for post-hoc editing baselines and evaluation utilities.
- Selective Prompt Anchoring for anchor-style baselines we reproduce alongside SEKA.
