Recommender Engine
engine/recommender/core.py is the flagship. Given a brief, it returns a complete recommended design system in well under 100ms, with full rationale, deterministic outputs, and the 100 anti-pattern rules pre-loaded as active guardrails. No LLM call. Same input, same output bytes.
This page is the deep dive. For the broader engine map, see Architecture. For the slash command that calls it, see All 22 Commands.
The recommender runs an industry lookup first to bias everything downstream, then a style lane, then fans out across five auxiliary searches in parallel:
| # | Lane | Manifest | Entries | Execution |
|---|---|---|---|---|
| 1 | INDUSTRY | industries.json | 184 | sequential |
| 2 | STYLE | styles.json | 84 | sequential; biases palette/type/motion |
| 3 | PALETTE | palettes.json | 176 | parallel (5-worker pool) |
| 4 | TYPE | type-pairs.json | 70 | parallel |
| 5 | MOTION | motion-presets.json | 57 | parallel |
| | COMPONENTS | components.json | 148 | parallel |
| | BRANDS | brands/*.json | 110 | parallel |
| | GUARDRAILS | anti-patterns.json | 100 | parallel (starts with style) |
The first three stages (industry, style, then palette / type / motion) cover the five numbered lanes behind the "5-parallel-search" name. Components and brand exemplars are auxiliary; they ride the same threads to keep latency flat. Guardrails are always on: they're not searched, they're loaded.
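The staging above can be sketched with a thread pool. This is a minimal sketch, not the code in core.py: the lane functions are stand-in stubs with a simplified (brief, style) signature, and recommend_sketch and PARALLEL_LANES are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


# Stand-in lanes (hypothetical): the real lanes read manifests and score entries.
def lane_industry(brief: Dict[str, Any]) -> Dict[str, Any]:
    return {"id": brief.get("industry", "")}


def lane_style(brief: Dict[str, Any], industry: Dict[str, Any]) -> Dict[str, Any]:
    return {"id": "style-for-" + industry["id"]}


def make_parallel_lane(name: str) -> Callable[..., Dict[str, Any]]:
    def lane(brief: Dict[str, Any], style: Dict[str, Any]) -> Dict[str, Any]:
        return {"lane": name, "style": style["id"]}
    return lane


PARALLEL_LANES = ("palette", "type", "motion", "components", "brands")


def recommend_sketch(brief: Dict[str, Any]) -> Dict[str, Any]:
    # Stages 1 and 2 run sequentially because each feeds the next.
    industry = lane_industry(brief)
    style = lane_style(brief, industry)

    # Stage 3: fan out; each parallel lane only needs the brief and the style.
    with ThreadPoolExecutor(max_workers=5) as pool:
        futures = {
            name: pool.submit(make_parallel_lane(name), brief, style)
            for name in PARALLEL_LANES
        }
        # Collect by lane name, so completion order never affects the result.
        results = {name: fut.result() for name, fut in futures.items()}

    return {"industry": industry, "style": style, **results}
```

Collecting results by submission key rather than by completion order is what keeps the parallel stage compatible with deterministic output.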
```python
def _lane_industry(brief: Brief) -> Dict[str, Any]:
    target = (brief.industry or "").strip().lower()
    # Strategy 1: exact id match
    # Strategy 2: fuzzy substring match on id / name / category
    # Strategy 3: tag-score fallback
```

The 3-strategy fallback is the v2.1 fix for the most common production bug: a user typed "fintech-neobank", the manifest only had "fintech", and the lane fell straight through to score-all, which routinely returned an unrelated industry like "editorial". Now it tries exact first, fuzzy second, score-all only as a last resort.
The returned industry entry carries recommended_styles (boost list) and avoid_styles (penalize list) that the style lane reads next.
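The three strategies can be sketched as follows. This is an illustration under assumptions, not the body of _lane_industry: the miniature INDUSTRIES list, the find_industry name, the tags field, and the choice to test substrings in both directions are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical miniature manifest; the real one is industries.json.
INDUSTRIES: List[Dict[str, Any]] = [
    {"id": "editorial", "name": "Editorial", "category": "Media", "tags": ["calm"]},
    {"id": "fintech", "name": "Fintech", "category": "Finance", "tags": ["precise"]},
]


def find_industry(target: str, tone: List[str]) -> Dict[str, Any]:
    target = (target or "").strip().lower()

    # Strategy 1: exact id match.
    for entry in INDUSTRIES:
        if entry["id"] == target:
            return entry

    # Strategy 2: fuzzy substring match, in either direction,
    # on id / name / category (the direction is an assumption).
    if target:
        for entry in INDUSTRIES:
            fields = [entry["id"], entry["name"].lower(), entry["category"].lower()]
            if any(f in target or target in f for f in fields):
                return entry

    # Strategy 3: tag-score fallback; max() keeps the first entry on ties.
    return max(INDUSTRIES, key=lambda e: sum(1 for t in e["tags"] if t in tone))
```

With this toy manifest, "fintech-neobank" resolves to fintech at Strategy 2 instead of falling through to the score-all fallback.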
```python
def style_score(e):
    sid = e.get("id", "")
    if sid in avoid:  # industry avoid + brief.forbidden
        return -100.0
    s = _score(e, brief)
    if sid in recommended:  # industry recommended_styles
        s += 10.0
    return s
```

Avoid first, score second, +10 bias for industry recommendations third. A "fintech-neobank" brief with recommended_styles: ["fintech-precise", "swiss-grid"] and avoid_styles: ["brutalism", "neon-cyberpunk"] will return fintech-precise over a generic clean-saas even if both score similarly on tone tags.
The returned style entry carries compatible_palettes, compatible_type_pairs, compatible_components, and motion hints in its tokens block. Every downstream lane reads from it.
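A small worked example of the avoid / score / bias order, assuming a toy scorer. The tone_overlap and pick_style helpers and the STYLES entries are hypothetical stand-ins, not the engine's own code or data.

```python
from typing import Any, Dict, List, Set


def tone_overlap(entry: Dict[str, Any], tone: Set[str]) -> float:
    # Stand-in for _score(): count shared tone tags.
    return float(sum(1 for t in entry.get("tone", []) if t in tone))


def pick_style(
    styles: List[Dict[str, Any]],
    tone: Set[str],
    recommended: Set[str],
    avoid: Set[str],
) -> Dict[str, Any]:
    def style_score(e: Dict[str, Any]) -> float:
        sid = e.get("id", "")
        if sid in avoid:
            return -100.0
        s = tone_overlap(e, tone)
        if sid in recommended:
            s += 10.0
        return s

    # max() returns the first entry on ties, so list order breaks ties.
    return max(styles, key=style_score)


STYLES = [
    {"id": "clean-saas", "tone": ["precise", "serious"]},
    {"id": "fintech-precise", "tone": ["precise", "serious"]},
    {"id": "brutalism", "tone": ["precise", "serious", "bold"]},
]

best = pick_style(
    STYLES,
    tone={"precise", "serious"},
    recommended={"fintech-precise", "swiss-grid"},
    avoid={"brutalism", "neon-cyberpunk"},
)
```

Here clean-saas and fintech-precise tie on tone, and the +10 bias is what separates them.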
```python
def _lane_palette(brief, style):
    compatible = set(style.get("compatible_palettes", []))
    scored = [(e, _score(e, brief)) for e in entries]
    valid = [(e, s) for (e, s) in scored if s > -50]  # drop forbidden
    ranked = sorted(
        valid,
        key=lambda es: (es[0].get("id") in compatible, es[1]),
        reverse=True,
    )
    return ranked[0][0] if ranked else {}
```

Two-key sort: compatibility-first, then score. The v2.2 fix (task #57) is the forbidden drop before the sort. Without it, a compatible-but-forbidden palette would beat a non-compatible-but-acceptable palette, because the tuple sort puts (True, -100) ahead of (False, 5). Now forbidden entries are dropped before they enter the rank.
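The tuple-sort problem is easy to reproduce in isolation. A minimal sketch with made-up palette ids and scores (violet-glow and slate-neutral are hypothetical):

```python
from typing import List, Tuple

# (palette id, score) pairs; ids and scores are invented for illustration.
scored: List[Tuple[str, float]] = [
    ("violet-glow", -100.0),  # compatible with the style, but forbidden by the brief
    ("slate-neutral", 5.0),   # not compatible, but acceptable
]
compatible = {"violet-glow"}


def rank(pairs: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    # Same two-key sort as the palette lane: compatibility first, then score.
    return sorted(pairs, key=lambda p: (p[0] in compatible, p[1]), reverse=True)


# Without the drop, (True, -100.0) outranks (False, 5.0).
before_fix = rank(scored)[0][0]

# With the drop, forbidden entries never reach the sort.
after_fix = rank([p for p in scored if p[1] > -50])[0][0]
```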
The type lane has the same shape as the palette lane: two-key sort, same v2.2 forbidden-drop fix. The forbidden filter is the most aggressive here because typography is the strongest AI fingerprint. Inter as display and Cormorant Garamond as display have to fail closed.
```python
def _lane_motion(brief, style):
    scored = [(e, _score(e, brief)) for e in entries]
    valid = [e for (e, s) in scored if s > -50]
    ranked = sorted(valid, key=lambda e: _score(e, brief), reverse=True)
    return ranked[:5]
```

Returns the top 5, not just 1. A landing page has different motion needs across entry, hover, scroll, and exit; the recommender hands back a kit, not a single preset.
The components lane filters by compatible_styles containing the chosen style id, drops forbidden entries, and returns the top 12. Twelve is enough for a landing page (nav, hero, feature bento, pricing card, dashboard mockup, testimonial card, footer, etc.) without bloating the recommendation payload.
```python
def _lane_brands(style, industry):
    brands = load_brands()
    bias = set(industry.get("exemplars", []))
    # False sorts before True in an ascending sort, so negate membership
    # to put the industry's exemplars first, then order by name.
    ranked = sorted(brands, key=lambda b: (b.get("id") not in bias, b.get("name", "")))
    return ranked[:5]
```

Industry-biased, with alphabetical order as a stable secondary sort key. The LLM gets 5 brand exemplars with their full design language; see Brand Library 110.
```python
def _lane_guardrails() -> List[Dict[str, Any]]:
    """Always-on. The anti-pattern rules are non-negotiable."""
    data = load("anti-patterns")
    return data.get("entries", [])
```

All 100 anti-pattern rules attached to every recommendation. The LLM treats them as hard constraints during generation. See Linter Rules for how they're also enforced post-generation.
```python
def _score(entry: Dict[str, Any], brief: Brief) -> float:
    score = 0.0
    tags = set(brief.tone) | set(brief.audience)
    for field_name in ("tone", "character", "characteristics", "tokens"):
        value = entry.get(field_name)
        if isinstance(value, list):
            score += sum(1.0 for t in value if t in tags)
        elif isinstance(value, dict):
            score += sum(1.0 for v in value.values() if isinstance(v, str) and v in tags)
    # Forbidden filter: apply -100 penalty across id / category / name / typography family
    ...
```

Two halves: positive scoring on tone | audience tag overlap, negative -100 penalty for anything matching brief.forbidden.
The positive half is intentionally naive. No learned weights, no fancy embedding lookup, no temperature. Just count how many of the brief's tone + audience tags appear in the entry's tone, character, characteristics, or tokens fields. The recommender prefers transparent scoring over clever scoring because explainability matters more than the last 5% of fit accuracy.
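A worked count makes the positive half concrete. The loop below copies the excerpt's field walk into a standalone function; the positive_score name and the entry contents are hypothetical.

```python
from typing import Any, Dict, Set


def positive_score(entry: Dict[str, Any], tags: Set[str]) -> float:
    score = 0.0
    for field_name in ("tone", "character", "characteristics", "tokens"):
        value = entry.get(field_name)
        if isinstance(value, list):
            score += sum(1.0 for t in value if t in tags)
        elif isinstance(value, dict):
            score += sum(1.0 for v in value.values() if isinstance(v, str) and v in tags)
    return score


# Hypothetical style entry; field contents are invented for the example.
entry = {
    "id": "fintech-precise",
    "tone": ["precise", "serious", "calm"],
    "character": ["institutional"],
    "tokens": {"motion": "restrained", "density": "precise"},
}
tags = {"precise", "serious"} | {"procurement", "institutional"}

# tone contributes 2, character 1, tokens 1; characteristics is absent.
total = positive_score(entry, tags)
```

Every point in the total traces back to one named tag in one named field, which is the explainability the naive scorer is buying.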
brief.forbidden is a list of strings. Each entry passes a four-place substring match:
- The entry's id. "brutalism" in forbidden kills the style with id brutalism.
- The entry's category. "crypto" in forbidden kills every brand with category Fintech / Crypto.
- The entry's name (lower-cased). "cormorant" in forbidden kills any type pair named "Cormorant + ..." or "... + Cormorant".
- Every typography family inside display, body, or mono blocks, in both space-form and hyphen-form. "cormorant-garamond" and "cormorant garamond" both match.
A single match returns -100 immediately. There's no partial credit — forbidden means dropped.
```python
for forbidden in brief.forbidden:
    fbd = forbidden.lower()
    if fbd == entry_id.lower() or fbd == entry_cat.lower():
        return -100.0
    if fbd in name_lower:
        return -100.0
    for fam in type_fams:
        if fbd in fam:
            return -100.0
```

This is the layer that makes "don't use purple gradients" or "no Inter on display" actually stick. Without it, the LLM is free to ignore the avoid-list inside the brief because it's just another string in a prompt.
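The excerpt uses entry_id, entry_cat, name_lower, and type_fams without showing how they are built. One way they might be derived, as a sketch: the family key inside each typography block, and the family_forms and is_forbidden helpers, are assumptions rather than the engine's actual code.

```python
from typing import Any, Dict, List


def family_forms(entry: Dict[str, Any]) -> List[str]:
    # Collect each typography family in both space-form and hyphen-form.
    forms: List[str] = []
    for block in ("display", "body", "mono"):
        family = (entry.get(block) or {}).get("family", "")  # "family" key is assumed
        if family:
            lowered = family.lower()
            forms += [lowered, lowered.replace(" ", "-")]
    return forms


def is_forbidden(entry: Dict[str, Any], forbidden: List[str]) -> bool:
    entry_id = entry.get("id", "").lower()
    entry_cat = entry.get("category", "").lower()
    name_lower = entry.get("name", "").lower()
    type_fams = family_forms(entry)
    for item in forbidden:
        fbd = item.lower()
        # Mirrors the excerpt: equality on id / category, substring on name / family.
        if fbd == entry_id or fbd == entry_cat:
            return True
        if fbd in name_lower:
            return True
        if any(fbd in fam for fam in type_fams):
            return True
    return False


# Hypothetical type pair entry.
pair = {
    "id": "editorial-serif-pair",
    "name": "Editorial Serif",
    "display": {"family": "Cormorant Garamond"},
    "body": {"family": "Source Sans 3"},
}
```

Storing both forms of each family name up front means the brief can say "cormorant-garamond" or "cormorant garamond" and hit the same entry.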
Same Brief → same Recommendation. Byte-identical. Verified by the smoke tests.
Three properties enforce determinism:
- No LLM call inside the recommender. Pure dict-to-dict over local manifests. No network, no model, no temperature.
- Stable sort across all lanes. Python's sorted() is stable; ties resolve by manifest order, which is fixed at JSON write time.
- Manifests are versioned. _meta.version on every manifest. A user can pin the engine to a manifest version and expect identical output across runs.
Determinism is the contract that lets MASTER.md be committed. If two developers on the same team run ux recommend with the same brief, they get the same recommendation. The design system is reproducible the way the build is reproducible.
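A byte-identity check along these lines is one way to test the contract. This is a sketch, not the project's smoke test: ENTRIES, recommend_ids, and fingerprint are hypothetical, and hashing canonical JSON is an assumed choice.

```python
import hashlib
import json
from typing import Any, Dict, List

# Hypothetical manifest slice; "a" and "b" tie on score.
ENTRIES: List[Dict[str, Any]] = [
    {"id": "a", "tone": ["precise"]},
    {"id": "b", "tone": ["precise"]},
    {"id": "c", "tone": []},
]


def recommend_ids(tone: List[str]) -> List[str]:
    # sorted() stays stable with reverse=True, so ties keep manifest order.
    ranked = sorted(
        ENTRIES,
        key=lambda e: sum(1 for t in e["tone"] if t in tone),
        reverse=True,
    )
    return [e["id"] for e in ranked]


def fingerprint(payload: Any) -> str:
    # Canonical JSON (sorted keys, fixed separators) gives stable bytes to hash.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


first = fingerprint({"picks": recommend_ids(["precise"])})
second = fingerprint({"picks": recommend_ids(["precise"])})
```

Comparing digests rather than parsed objects catches differences in key order or whitespace that an equality check on dicts would hide.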
Adding a lane is mechanical. Say you want a "layout pattern" lane that returns the top three landing-page scaffolds:

```python
def _lane_landing(brief, style):
    data = load("landing-patterns")
    entries = data.get("entries", [])
    compatible = set(style.get("compatible_layouts", []))
    scored = [(e, _score(e, brief)) for e in entries]
    valid = [(e, s) for (e, s) in scored if s > -50]
    ranked = sorted(
        valid,
        key=lambda es: (es[0].get("id") in compatible, es[1]),
        reverse=True,
    )
    # Return the entries themselves, not (entry, score) pairs, to match List[Dict].
    return [e for (e, _) in ranked[:3]]
```

Wire it into recommend():

```python
with ThreadPoolExecutor(max_workers=6) as pool:
    style_future = pool.submit(_lane_style, brief, industry)
    ...
    landing_future = pool.submit(_lane_landing, brief, style)
    landing = landing_future.result()
```

Add a landing: List[Dict] = field(default_factory=list) slot to Recommendation. Bump the worker pool to 6. Done.
Replace _score() with your own implementation. The contract is _score(entry: Dict, brief: Brief) -> float. Anything > 0 is a positive signal, anything <= -50 is dropped before the rank, anything <= -100 should mean "never pick this." Keep it explainable — every lane uses the same scorer, so a one-line change ripples through the whole recommender.
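As an illustration of that contract, here is a sketch of a replacement scorer. The reduced Brief dataclass, the weighted_score name, and the 2:1 weighting are all hypothetical; only the signature and the -100 convention come from the text above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Brief:
    # Reduced stand-in for the real Brief; only the fields this sketch reads.
    tone: List[str] = field(default_factory=list)
    audience: List[str] = field(default_factory=list)
    forbidden: List[str] = field(default_factory=list)


def weighted_score(entry: Dict[str, Any], brief: Brief) -> float:
    # Keep the contract: forbidden entries return -100 so every lane drops them.
    if entry.get("id", "").lower() in {f.lower() for f in brief.forbidden}:
        return -100.0
    entry_tags = set(entry.get("tone", []))
    # Hypothetical weighting: tone matches count double, audience matches once.
    tone_hits = len(entry_tags & set(brief.tone))
    audience_hits = len(entry_tags & set(brief.audience))
    return 2.0 * tone_hits + 1.0 * audience_hits
```

The weights stay as two visible constants, so a score can still be explained by listing which tags matched.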
Drop a new file under data/, add its name to MANIFESTS in engine/data_loader.py, write a lane that reads it. The data loader caches it on first call.
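The cache-on-first-call behavior could look like the sketch below. It is not the contents of engine/data_loader.py: make_loader is a hypothetical helper, and treating MANIFESTS as a collection of manifest names that map to name.json files is an assumption.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, Iterable


def make_loader(data_dir: Path, manifests: Iterable[str]) -> Callable[[str], Dict[str, Any]]:
    allowed = set(manifests)
    cache: Dict[str, Dict[str, Any]] = {}

    def load(name: str) -> Dict[str, Any]:
        if name not in allowed:
            raise KeyError(f"unknown manifest: {name}")
        if name not in cache:
            # First call reads and parses the file; later calls reuse the dict.
            with open(data_dir / f"{name}.json", encoding="utf-8") as fh:
                cache[name] = json.load(fh)
        return cache[name]

    return load
```

Because the cached dict is shared between callers, lanes should treat manifest entries as read-only.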
Input:

```json
{
  "project_type": "landing",
  "industry": "fintech-neobank",
  "audience": ["procurement", "institutional"],
  "tone": ["precise", "serious"],
  "must_have": ["a11y-AA", "rtl"],
  "forbidden": ["purple-to-blue-gradient", "cormorant-garamond", "brutalism"],
  "region": "mena"
}
```

What each lane returns (illustrative; exact picks depend on current manifest contents):
| Lane | Pick | Why |
|---|---|---|
| Industry | fintech-neobank | exact id match (Strategy 1) |
| Style | fintech-precise | in industry recommended_styles (+10 bias), tone tags precise and serious both score |
| Palette | saudi-institutional-cool | in style.compatible_palettes, RTL-compatible, region tag mena matches |
| Type pair | ibm-plex-sans-pair | in style.compatible_type_pairs, cormorant-garamond forbidden filter drops the editorial option |
| Motion | fade-up-12px, tab-slide, route-crossfade, accordion-reveal, skeleton-shimmer | matches precise + serious, no overshoot, no spring |
| Components | fintech-rate-card, table-with-stripe-hover, comparison-matrix, pricing-tier-pill, ... +7 more | compatible with fintech-precise |
| Brand exemplars | stripe, ramp, monzo, mercury, n26 | fintech-neobank.exemplars boost |
| Guardrails | all 100 anti-pattern rules | always on |
Plus a rationale array of seven strings the LLM can render verbatim:

```json
[
  "Industry: Fintech — Neobank",
  "Style: Fintech Precise (matches 'precise' + 'serious' tone)",
  "Palette: Saudi institutional cool (RTL-compatible, no purple-blue)",
  "Type pair: IBM Plex Sans (geometric, neutral, Cormorant excluded)",
  "Motion presets considered: 5",
  "Compatible components: 12",
  "Guardrails active: 100 anti-pattern rules"
]
```
The whole recommendation is one JSON document. The LLM doesn't have to guess — it reads the rationale, picks tokens, generates code.
Three taste calls that don't show up in the code but matter:
- One scoring function for every lane. Easier to reason about, easier to debug, easier to extend. Every lane is a one-page Python function; if you can read one, you can read all eight.
- Forbidden as -100, not as filter. Filtering first would mean an empty result set is possible; the -100 penalty lets the sort still rank everything in case the threshold has to be relaxed for debugging. The > -50 drop happens at sort time, not at score time.
- Industry as bias, not as gate. The industry lane could have hard-coded "if industry is X, return style Y." Instead it returns a boost list, so a brief with strong tone signals against the industry's typical picks can still steer the recommender. The user is the boss; the industry is a hint.
See also: Architecture · Linter Rules · Brand Library 110 · All 22 Commands

Source: github.com/Laith0003/ux-skill/blob/main/engine/recommender/core.py