The Potential of Diffusion Large Language Model in Controllable Generation
Zhen Xiong¹, Yujun Cai², Zhecheng Li³, Yiwei Wang⁴
¹USC, ²UQ, ³UCSD, ⁴UC Merced
This toolbox implements Self-adaptive Schema Scaffolding (S3) for controllable generation with diffusion large language models (dLLMs).
- Build schema-aware scaffolds and prompts that warm-start diffusion LLM decoding.
- Run S3’s top-K remasking denoiser for reliable structured outputs.
- Inspect denoising traces and evaluation metrics with a few lines of Python.
- Customize the output schema (fields, token budgets, null tokens) without touching core code.
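To give a feel for what a top-K remasking loop does, here is a toy sketch in plain Python. It is not the repository's denoiser: `toy_predict`, `topk_denoise`, the `<mask>` string, and the `top_k` parameter are all hypothetical stand-ins, and the "model" simply proposes random tokens with random confidences. The only idea it illustrates is that each step commits the K most confident proposals and leaves every other position masked for the next step.

```python
import random

MASK = "<mask>"  # hypothetical mask string; a real tokenizer defines its own


def toy_predict(tokens, rng):
    """Stand-in for a dLLM forward pass: propose (token, confidence) per masked slot."""
    vocab = ["Ulm", "Germany", "1879", "Albert"]
    return {
        i: (rng.choice(vocab), rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def topk_denoise(tokens, steps, top_k, seed=0):
    """Fill masked slots over at most `steps` rounds, committing `top_k` slots per round."""
    rng = random.Random(seed)
    tokens = list(tokens)  # work on a copy so the caller's template is untouched
    trace = []
    for _ in range(steps):
        proposals = toy_predict(tokens, rng)
        if not proposals:
            break  # nothing left to denoise, stop early
        # Rank proposals by confidence and commit only the best top_k;
        # all remaining positions stay masked (i.e. are remasked) for the next round.
        ranked = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (tok, _conf) in ranked[:top_k]:
            tokens[pos] = tok
        trace.append(list(tokens))
    return tokens, trace


template = ["{", MASK, MASK, MASK, MASK, "}"]
final, trace = topk_denoise(template, steps=16, top_k=2)
print(final)
print(len(trace), "steps executed")
```

Scaffold tokens such as the braces are never touched, because only masked positions receive proposals.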
git clone https://github.com/eric2i/dLLM-CtrlGen.git
cd dLLM-CtrlGen

from scaffolding import SelfAdaptiveSchemaScaffolder, SelfAdaptiveSchemaConfig
schema_cfg = SelfAdaptiveSchemaConfig(
    fields=("name", "birth_place", "birth_date"),
)
scaffolder = SelfAdaptiveSchemaScaffolder(schema_cfg)

Each field receives a 16-token mask budget by default; override specific fields via
token_budgets when a field needs a larger or smaller budget.
The null token defaults to <none>; set null_token if your
pipeline expects a different placeholder.
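The following standalone sketch illustrates how per-field budgets could resolve and what a masked scaffold might look like. It does not import the toolbox: `resolve_budgets` and `build_scaffold` are hypothetical helpers, and it assumes that `token_budgets` is a mapping from field name to an integer and that the mask token is the literal string `<mask>`. Only the default of 16 comes from the description above.

```python
import json

DEFAULT_BUDGET = 16  # default per-field mask budget stated above


def resolve_budgets(fields, token_budgets=None, default=DEFAULT_BUDGET):
    """Return the mask budget for every field, applying per-field overrides."""
    overrides = token_budgets or {}
    unknown = set(overrides) - set(fields)
    if unknown:
        raise KeyError(f"budget given for unknown field(s): {sorted(unknown)}")
    return {name: overrides.get(name, default) for name in fields}


def build_scaffold(fields, token_budgets=None, mask_token="<mask>"):
    """Render a JSON skeleton whose values are runs of mask tokens (assumed layout)."""
    budgets = resolve_budgets(fields, token_budgets)
    lines = ["{"]
    for i, name in enumerate(fields):
        comma = "," if i < len(fields) - 1 else ""
        value = mask_token * budgets[name]
        lines.append(f'  {json.dumps(name)}: "{value}"{comma}')
    lines.append("}")
    return "\n".join(lines)


fields = ("name", "birth_place", "birth_date")
# Dates are short, so this example gives birth_date a smaller budget.
scaffold = build_scaffold(fields, token_budgets={"birth_date": 8})
print(resolve_budgets(fields, {"birth_date": 8}))
print(scaffold.count("<mask>"), "mask tokens in total")
```

Because the keys and punctuation are fixed up front, the model only has to fill the masked value slots, which is what keeps the output structurally valid.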
from models import load_diffusion_llm
from decoding import SelfAdaptiveGenerator, GenerationConfig
model, tokenizer, device = load_diffusion_llm()
template = scaffolder.build_template(tokenizer)
text = "Albert Einstein was born on March 14, 1879, in Ulm, Germany..."
prompt = scaffolder.make_prompt(text)
generator = SelfAdaptiveGenerator(model, tokenizer, device)
result = generator.generate(prompt, template, config=GenerationConfig(steps=16), trace=True)
print(result.text) # JSON-formatted string
print(result.steps_executed)

- Override denoising hyperparameters through GenerationConfig.
- Modify scaffold templates (code fences, indentation, mask budgets) by subclassing or configuring the scaffolder.
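Since `result.text` is a JSON-formatted string, downstream code will usually want a dictionary with real nulls. The sketch below is one way to do that. `parse_structured_output` is a hypothetical helper, not part of the toolbox, and it assumes that fields the model could not fill come back as the null token (`<none>` by default) and that the payload may be wrapped in a Markdown code fence.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the example readable


def parse_structured_output(text, fields, null_token="<none>"):
    """Parse a JSON-formatted generation and map null-token values to None."""
    body = text.strip()
    if body.startswith(FENCE):
        # Drop the opening fence line (e.g. a fence followed by "json") and the closing fence.
        body = body.split("\n", 1)[1] if "\n" in body else ""
        body = body.rsplit(FENCE, 1)[0]
    record = json.loads(body)
    parsed = {}
    for name in fields:
        value = record.get(name, null_token)  # treat a missing key like the null token
        is_null = isinstance(value, str) and value.strip() == null_token
        parsed[name] = None if is_null else value
    return parsed


raw = '{"name": "Albert Einstein", "birth_place": "<none>", "birth_date": "March 14, 1879"}'
print(parse_structured_output(raw, ("name", "birth_place", "birth_date")))
```

If the scaffolder is configured with a different `null_token`, pass the same value here so both sides agree on the placeholder.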
Please cite the accompanying paper when using this implementation:
@article{xiong2025unveiling,
title={Unveiling the Potential of Diffusion Large Language Model in Controllable Generation},
author={Xiong, Zhen and Cai, Yujun and Li, Zhecheng and Wang, Yiwei},
journal={arXiv preprint arXiv:2507.04504},
year={2025}
}