
Recommender Engine

Laith0003 edited this page May 28, 2026 · 1 revision

Recommender Engine — the 5-parallel-search reasoning core

engine/recommender/core.py is the flagship. Given a brief, it returns a complete recommended design system in well under 100ms, with full rationale, deterministic outputs, and the 100 anti-pattern rules pre-loaded as active guardrails. No LLM call. Same input, same output bytes.

This page is the deep dive. For the broader engine map, see Architecture. For the slash command that calls it, see All 22 Commands.


The five domain lanes plus three auxiliary lanes

The recommender runs an industry lookup first to bias everything downstream, then a style lane, then fans out across the remaining searches in parallel:

1. INDUSTRY     industries.json (184)       sequential
2. STYLE        styles.json (84)            sequential — biases palette/type/motion
3. PALETTE      palettes.json (176)         parallel (5-worker pool)
4. TYPE         type-pairs.json (70)        parallel
5. MOTION       motion-presets.json (57)    parallel
   COMPONENTS   components.json (148)       parallel
   BRANDS       brands/*.json (110)         parallel
   GUARDRAILS   anti-patterns.json (100)    parallel (starts with style)

The first five lanes (industry, style, palette, type, motion) are where the "5-parallel-search" name comes from. Components and brand exemplars are auxiliary: they ride the same threads to keep latency flat. Guardrails are always on; they're not searched, they're loaded.

Lane 1 — Industry

def _lane_industry(brief: Brief) -> Dict[str, Any]:
    target = (brief.industry or "").strip().lower()
    # Strategy 1: exact id match
    # Strategy 2: fuzzy substring match on id / name / category
    # Strategy 3: tag-score fallback

The 3-strategy fallback is the v2.1 fix for the most common production bug: a user typed "fintech-neobank", the manifest only had "fintech", and the lane fell straight through to score-all — which routinely returned an unrelated industry like "editorial". Now it tries exact first, fuzzy second, score-all only as a last resort.

The returned industry entry carries recommended_styles (boost list) and avoid_styles (penalize list) that the style lane reads next.
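A minimal sketch of the three-strategy lookup, assuming each industry entry is a dict with id, name, category, and tags fields. The helper name find_industry, the tags field, and the sample entries are hypothetical; the real lane in core.py may score the fallback differently.

```python
from typing import Any, Dict, List, Set

# Hypothetical sample entries; the real manifest is industries.json.
INDUSTRIES: List[Dict[str, Any]] = [
    {"id": "editorial", "name": "Editorial", "category": "Media", "tags": ["serious", "calm"]},
    {"id": "fintech", "name": "Fintech", "category": "Finance", "tags": ["precise"]},
]


def find_industry(target: str, tags: Set[str], entries: List[Dict[str, Any]]) -> Dict[str, Any]:
    target = (target or "").strip().lower()
    if target:
        # Strategy 1: exact id match.
        for e in entries:
            if str(e.get("id", "")).lower() == target:
                return e
        # Strategy 2: fuzzy substring match in either direction,
        # so "fintech-neobank" still finds "fintech".
        for e in entries:
            for field_name in ("id", "name", "category"):
                value = str(e.get(field_name, "")).lower()
                if value and (value in target or target in value):
                    return e
    # Strategy 3: tag-score fallback. max() keeps the first entry on ties,
    # so the result follows manifest order.
    return max(entries, key=lambda e: len(tags & set(e.get("tags", []))), default={})


exact = find_industry("fintech", set(), INDUSTRIES)
fuzzy = find_industry("fintech-neobank", {"serious"}, INDUSTRIES)
fallback = find_industry("", {"serious"}, INDUSTRIES)
```

Here fuzzy resolves to the fintech entry even though the brief's tags alone would have favored editorial, which is the failure mode the fallback order is meant to prevent.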

Lane 2 — Style

def style_score(e):
    sid = e.get("id", "")
    if sid in avoid:        # industry avoid + brief.forbidden
        return -100.0
    s = _score(e, brief)
    if sid in recommended:  # industry recommended_styles
        s += 10.0
    return s

Avoid first, score second, +10 bias for industry recommendations third. A "fintech-neobank" brief with recommended_styles: ["fintech-precise", "swiss-grid"] and avoid_styles: ["brutalism", "neon-cyberpunk"] will return fintech-precise over a generic clean-saas even if both score similarly on tone tags.
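To see the bias at work, here is a self-contained sketch. tone_overlap is a hypothetical stand-in for _score, pick_style is an illustrative wrapper, and the style entries are made up for the example.

```python
from typing import Any, Dict, List, Set


def tone_overlap(entry: Dict[str, Any], tags: Set[str]) -> float:
    # Stand-in for _score: one point per shared tone tag.
    return float(sum(1 for t in entry.get("tone", []) if t in tags))


def pick_style(
    styles: List[Dict[str, Any]],
    tags: Set[str],
    recommended: Set[str],
    avoid: Set[str],
) -> Dict[str, Any]:
    def style_score(e: Dict[str, Any]) -> float:
        sid = e.get("id", "")
        if sid in avoid:          # avoid wins over everything else
            return -100.0
        s = tone_overlap(e, tags)
        if sid in recommended:    # industry boost
            s += 10.0
        return s

    return max(styles, key=style_score)


STYLES = [
    {"id": "clean-saas", "tone": ["precise", "serious"]},
    {"id": "fintech-precise", "tone": ["precise", "serious"]},
    {"id": "brutalism", "tone": ["precise", "serious", "bold"]},
]

best = pick_style(
    STYLES,
    tags={"precise", "serious"},
    recommended={"fintech-precise", "swiss-grid"},
    avoid={"brutalism", "neon-cyberpunk"},
)
```

clean-saas and fintech-precise tie at 2.0 on tone tags; the +10 boost breaks the tie, and brutalism is out at -100 regardless of its tag overlap.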

The returned style entry carries compatible_palettes, compatible_type_pairs, compatible_components, and motion hints in its tokens block. Every downstream lane reads from it.

Lane 3 — Palette

def _lane_palette(brief, style):
    compatible = set(style.get("compatible_palettes", []))
    scored = [(e, _score(e, brief)) for e in entries]
    valid = [(e, s) for (e, s) in scored if s > -50]   # drop forbidden
    ranked = sorted(valid,
        key=lambda es: (es[0].get("id") in compatible, es[1]),
        reverse=True,
    )
    return ranked[0][0] if ranked else {}

Two-key sort: compatibility-first, then score. The v2.2 fix (task #57) is the forbidden drop before the sort — without it, a compatible-but-forbidden palette would beat a non-compatible-but-acceptable palette, because the tuple sort puts (True, -100) ahead of (False, 5). Now forbidden entries are dropped before they enter the rank.
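The tuple-sort pitfall is easy to reproduce in isolation. The two palette ids below are hypothetical; the sort key and the > -50 threshold are the ones used in the lane above.

```python
compatible = {"compatible-but-forbidden"}

# (entry, score) pairs as the lane would see them after scoring.
candidates = [
    ({"id": "compatible-but-forbidden"}, -100.0),
    ({"id": "acceptable"}, 5.0),
]


def rank(pairs):
    # Same two-key sort as the palette lane: compatibility first, then score.
    return sorted(
        pairs,
        key=lambda es: (es[0]["id"] in compatible, es[1]),
        reverse=True,
    )


# Without the drop, (True, -100.0) sorts ahead of (False, 5.0).
without_drop = rank(candidates)[0][0]["id"]

# With the drop, the forbidden entry never reaches the sort.
valid = [(e, s) for (e, s) in candidates if s > -50]
with_drop = rank(valid)[0][0]["id"]
```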

Lane 4 — Type pair

Same shape as palette: two-key sort, same v2.2 forbidden-drop fix. The forbidden filter is the most aggressive here because typography is the strongest AI fingerprint. Choices such as Inter as display or Cormorant Garamond as display have to fail closed.

Lane 5 — Motion

def _lane_motion(brief, style):
    scored = [(e, _score(e, brief)) for e in entries]
    valid = [e for (e, s) in scored if s > -50]
    ranked = sorted(valid, key=lambda e: _score(e, brief), reverse=True)
    return ranked[:5]

Returns the top 5, not just 1. A landing page has different motion needs across entry, hover, scroll, and exit; the recommender hands back a kit, not a single preset.

Auxiliary lane — Components

Filters by compatible_styles containing the chosen style id, drops forbidden, returns top 12. Twelve is enough for a landing page (nav, hero, feature bento, pricing card, dashboard mockup, testimonial card, footer, etc.) without bloating the recommendation payload.
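The page shows no code for this lane, so the following is only a sketch of the behavior described above. The function name lane_components, the score callback, and the sample entries are assumptions, not the signature used in core.py.

```python
from typing import Any, Callable, Dict, List


def lane_components(
    entries: List[Dict[str, Any]],
    style_id: str,
    score: Callable[[Dict[str, Any]], float],
    limit: int = 12,
) -> List[Dict[str, Any]]:
    # Keep only components that declare the chosen style as compatible.
    compatible = [e for e in entries if style_id in e.get("compatible_styles", [])]
    scored = [(e, score(e)) for e in compatible]
    # Drop forbidden entries (scored -100) before ranking.
    valid = [(e, s) for (e, s) in scored if s > -50]
    ranked = sorted(valid, key=lambda es: es[1], reverse=True)
    return [e for (e, _) in ranked[:limit]]


# Hypothetical entries and a toy scorer that counts shared tone tags.
COMPONENTS = [
    {"id": "fintech-rate-card", "compatible_styles": ["fintech-precise"], "tone": ["precise"]},
    {"id": "neon-hero", "compatible_styles": ["neon-cyberpunk"], "tone": ["precise"]},
    {"id": "comparison-matrix", "compatible_styles": ["fintech-precise", "swiss-grid"],
     "tone": ["precise", "serious"]},
]
TAGS = {"precise", "serious"}


def toy_score(entry: Dict[str, Any]) -> float:
    return float(sum(1 for t in entry.get("tone", []) if t in TAGS))


picked = [e["id"] for e in lane_components(COMPONENTS, "fintech-precise", toy_score)]
```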

Auxiliary lane — Brand exemplars

def _lane_brands(style, industry):
    brands = load_brands()
    bias = set(industry.get("exemplars", []))
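    # Ascending sort puts False before True, so negate the membership test
    # to rank industry exemplars first, then alphabetically by name.
    ranked = sorted(brands, key=lambda b: (b.get("id") not in bias, b.get("name", "")))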
    return ranked[:5]

Industry-biased, alphabetically stable as a secondary sort key. The LLM gets 5 brand exemplars with their full design language — see Brand Library 110.

Auxiliary lane — Guardrails

def _lane_guardrails() -> List[Dict[str, Any]]:
    """Always-on. The anti-pattern rules are non-negotiable."""
    data = load("anti-patterns")
    return data.get("entries", [])

All 100 anti-pattern rules attached to every recommendation. The LLM treats them as hard constraints during generation. See Linter Rules for how they're also enforced post-generation.


The scoring function

def _score(entry: Dict[str, Any], brief: Brief) -> float:
    score = 0.0
    tags = set(brief.tone) | set(brief.audience)

    for field_name in ("tone", "character", "characteristics", "tokens"):
        value = entry.get(field_name)
        if isinstance(value, list):
            score += sum(1.0 for t in value if t in tags)
        elif isinstance(value, dict):
            score += sum(1.0 for v in value.values() if isinstance(v, str) and v in tags)

    # Forbidden filter — apply -100 penalty across id / category / name / typography family
    ...

Two halves: positive scoring on tone | audience tag overlap, negative -100 penalty for anything matching brief.forbidden.

The positive half is intentionally naive. No learned weights, no fancy embedding lookup, no temperature. Just count how many of the brief's tone + audience tags appear in the entry's tone, character, characteristics, or tokens fields. The recommender prefers transparent scoring over clever scoring because explainability matters more than the last 5% of fit accuracy.
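A worked example of the positive half, using the same loop as _score above. The Brief dataclass here is a cut-down assumption with only the two fields the scorer reads, and the style entry is made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Brief:
    # Assumed minimal shape: only the fields the positive half reads.
    tone: List[str] = field(default_factory=list)
    audience: List[str] = field(default_factory=list)


def positive_score(entry: Dict[str, Any], brief: Brief) -> float:
    score = 0.0
    tags = set(brief.tone) | set(brief.audience)
    for field_name in ("tone", "character", "characteristics", "tokens"):
        value = entry.get(field_name)
        if isinstance(value, list):
            score += sum(1.0 for t in value if t in tags)
        elif isinstance(value, dict):
            score += sum(1.0 for v in value.values() if isinstance(v, str) and v in tags)
    return score


brief = Brief(tone=["precise", "serious"], audience=["institutional"])
entry = {
    "id": "fintech-precise",
    "tone": ["precise", "serious", "calm"],                # 2 matches
    "tokens": {"mood": "institutional", "radius": "4px"},  # 1 match
}
result = positive_score(entry, brief)  # 2.0 + 1.0 = 3.0
```

Every point in the total traces back to one named tag, which is what makes the score explainable.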


The forbidden filter

brief.forbidden is a list of strings. Each string is checked against four places on every entry:

  1. The entry's id. "brutalism" in forbidden kills the style with id brutalism.
  2. The entry's category. "crypto" in forbidden kills every brand with category Fintech / Crypto.
  3. The entry's name (lower-cased). "cormorant" in forbidden kills any type pair named "Cormorant + ..." or "... + Cormorant".
  4. Every typography family inside display, body, or mono blocks — both space-form and hyphen-form. "cormorant-garamond" and "cormorant garamond" both match.

A single match returns -100 immediately. There's no partial credit — forbidden means dropped.

for forbidden in brief.forbidden:
    fbd = forbidden.lower()
    if fbd == entry_id.lower() or fbd == entry_cat.lower():
        return -100.0
    if fbd in name_lower:
        return -100.0
    for fam in type_fams:
        if fbd in fam:
            return -100.0

This is the layer that makes "don't use purple gradients" or "no Inter on display" actually stick. Without it, the LLM is free to ignore the avoid-list inside the brief because it's just another string in a prompt.
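A self-contained sketch of the loop above. The page does not show how type_fams is built, so family_forms and the family key inside the display, body, and mono blocks are assumptions; the comparison logic mirrors the snippet.

```python
from typing import Any, Dict, List


def family_forms(entry: Dict[str, Any]) -> List[str]:
    # Assumption: each typography block is a dict with a "family" string.
    forms: List[str] = []
    for block in ("display", "body", "mono"):
        value = entry.get(block)
        family = value.get("family", "") if isinstance(value, dict) else ""
        if family:
            lowered = family.lower()
            forms.append(lowered)                    # space form
            forms.append(lowered.replace(" ", "-"))  # hyphen form
    return forms


def forbidden_penalty(entry: Dict[str, Any], forbidden: List[str]) -> float:
    entry_id = str(entry.get("id", "")).lower()
    entry_cat = str(entry.get("category", "")).lower()
    name_lower = str(entry.get("name", "")).lower()
    type_fams = family_forms(entry)
    for item in forbidden:
        fbd = item.lower()
        if fbd == entry_id or fbd == entry_cat:
            return -100.0
        if fbd in name_lower:
            return -100.0
        if any(fbd in fam for fam in type_fams):
            return -100.0
    return 0.0


# Hypothetical type pair entry.
pair = {
    "id": "editorial-serif",
    "name": "Editorial Serif",
    "display": {"family": "Cormorant Garamond"},
    "body": {"family": "Source Sans"},
}
hyphen_hit = forbidden_penalty(pair, ["cormorant-garamond"])
space_hit = forbidden_penalty(pair, ["Cormorant Garamond"])
no_hit = forbidden_penalty(pair, ["brutalism"])
```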


Why the recommender is deterministic

Same Brief → same Recommendation. Byte-identical. Verified by the smoke tests.

Three properties enforce determinism:

  1. No LLM call inside the recommender. Pure dict-to-dict over local manifests. No network, no model, no temperature.
  2. Stable sort across all lanes. Python's sorted() is stable; ties resolve by manifest order, which is fixed at JSON write time.
  3. Manifests are versioned. _meta.version on every manifest. A user can pin the engine to a manifest version and expect identical output across runs.

Determinism is the contract that lets MASTER.md be committed. If two developers on the same team run ux recommend with the same brief, they get the same recommendation. The design system is reproducible the way the build is reproducible.
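One way to check the byte-identical claim is to fingerprint the serialized output. This is a sketch, not the project's smoke test: fingerprint, toy_recommend, and the three-entry manifest are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict

# Hypothetical manifest; "b" and "c" tie on score.
MANIFEST = [
    {"id": "a", "score": 1},
    {"id": "b", "score": 2},
    {"id": "c", "score": 2},
]


def toy_recommend(_brief: Dict[str, Any]) -> Dict[str, Any]:
    # sorted() is stable even with reverse=True, so the tie between
    # "b" and "c" resolves by manifest order on every run.
    ranked = sorted(MANIFEST, key=lambda e: e["score"], reverse=True)
    return {"style": ranked[0]["id"], "runner_up": ranked[1]["id"]}


def fingerprint(recommendation: Dict[str, Any]) -> str:
    # Canonical JSON: sorted keys and fixed separators give stable bytes.
    payload = json.dumps(recommendation, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = fingerprint(toy_recommend({"industry": "fintech"}))
second = fingerprint(toy_recommend({"industry": "fintech"}))
```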


How to extend the recommender

Add a sixth lane

The pattern is mechanical. Say you want a "layout pattern" lane that returns the top landing-page scaffold:

def _lane_landing(brief, style):
    data = load("landing-patterns")
    entries = data.get("entries", [])
    compatible = set(style.get("compatible_layouts", []))
    scored = [(e, _score(e, brief)) for e in entries]
    valid = [(e, s) for (e, s) in scored if s > -50]
    ranked = sorted(
        valid,
        key=lambda es: (es[0].get("id") in compatible, es[1]),
        reverse=True,
    )
    return ranked[:3]

Wire it into recommend():

with ThreadPoolExecutor(max_workers=6) as pool:
    style_future = pool.submit(_lane_style, brief, industry)
    ...
    landing_future = pool.submit(_lane_landing, brief, style)
    landing = landing_future.result()

Add a landing: List[Dict] = field(default_factory=list) slot to Recommendation. Bump the worker pool to 6. Done.
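A runnable sketch of the wiring, with stub lanes in place of the real ones. The lane bodies and the returned dict shape are hypothetical. What the sketch shows is the ordering: the style must be resolved before any lane that takes it as an argument is submitted.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def lane_style(brief: Dict[str, Any]) -> Dict[str, Any]:
    # Stub: the real lane scores styles.json against the brief.
    return {"id": "fintech-precise", "compatible_layouts": ["split-hero"]}


def lane_palette(brief: Dict[str, Any], style: Dict[str, Any]) -> Dict[str, Any]:
    return {"id": "palette-for-" + style["id"]}


def lane_landing(brief: Dict[str, Any], style: Dict[str, Any]) -> List[Dict[str, Any]]:
    return [{"id": layout_id} for layout_id in style.get("compatible_layouts", [])]


def recommend(brief: Dict[str, Any]) -> Dict[str, Any]:
    # Sequential step: downstream lanes read from the chosen style.
    style = lane_style(brief)
    with ThreadPoolExecutor(max_workers=2) as pool:
        palette_future = pool.submit(lane_palette, brief, style)
        landing_future = pool.submit(lane_landing, brief, style)
        # Results are collected by name, so the output does not depend
        # on which thread finishes first.
        return {
            "style": style,
            "palette": palette_future.result(),
            "landing": landing_future.result(),
        }


result = recommend({"industry": "fintech-neobank"})
```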

Custom scoring

Replace _score() with your own implementation. The contract is _score(entry: Dict, brief: Brief) -> float. Anything > 0 is a positive signal, anything <= -50 is dropped before the rank, anything <= -100 should mean "never pick this." Keep it explainable — every lane uses the same scorer, so a one-line change ripples through the whole recommender.
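As an illustration of a replacement that honors the contract, here is a sketch with per-field weights. The weights, the name weighted_score, and the cut-down Brief are all assumptions, and the forbidden check is reduced to an exact id match to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Brief:
    # Assumed minimal shape for the example.
    tone: List[str] = field(default_factory=list)
    audience: List[str] = field(default_factory=list)
    forbidden: List[str] = field(default_factory=list)


# Hypothetical weights: tone matches count double.
FIELD_WEIGHTS = {"tone": 2.0, "character": 1.0, "characteristics": 1.0}


def weighted_score(entry: Dict[str, Any], brief: Brief) -> float:
    forbidden = {f.lower() for f in brief.forbidden}
    if str(entry.get("id", "")).lower() in forbidden:
        return -100.0  # contract: never pick this
    tags = set(brief.tone) | set(brief.audience)
    score = 0.0
    for name, weight in FIELD_WEIGHTS.items():
        values = entry.get(name)
        if isinstance(values, list):
            score += weight * sum(1 for t in values if t in tags)
    return score


brief = Brief(tone=["precise"], audience=["institutional"], forbidden=["brutalism"])
kept = weighted_score(
    {"id": "swiss-grid", "tone": ["precise"], "character": ["institutional"]}, brief
)  # 2.0 for tone + 1.0 for character
dropped = weighted_score({"id": "brutalism", "tone": ["precise"]}, brief)
```

The weights live in one table, so a reader can still trace every point in a score back to a field and a tag.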

Add a new manifest

Drop a new file under data/, add its name to MANIFESTS in engine/data_loader.py, write a lane that reads it. The data loader caches it on first call.
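The internals of engine/data_loader.py are not shown on this page, so the following is only a sketch of the pattern under stated assumptions: a MANIFESTS allow-list and functools.lru_cache for the first-call cache. The data_dir parameter is added here so the example can run against a temporary directory; the real load() takes just the manifest name.

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path
from typing import Any, Dict

# Assumed allow-list; "landing-patterns" is the newly added manifest.
MANIFESTS = ("styles", "palettes", "landing-patterns")


@lru_cache(maxsize=None)
def load(name: str, data_dir: str = "data") -> Dict[str, Any]:
    if name not in MANIFESTS:
        raise KeyError(f"unknown manifest: {name}")
    with open(Path(data_dir) / f"{name}.json", encoding="utf-8") as fh:
        return json.load(fh)


# Demo: write a manifest to a temp directory and load it twice.
with tempfile.TemporaryDirectory() as tmp:
    manifest = {"_meta": {"version": "1"}, "entries": [{"id": "split-hero"}]}
    (Path(tmp) / "landing-patterns.json").write_text(json.dumps(manifest), encoding="utf-8")
    first = load("landing-patterns", tmp)
    second = load("landing-patterns", tmp)  # served from the cache
```

Because the cache hands back the same dict object on every call, lanes should treat manifest data as read-only.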


Example — a fintech-neobank brief

Input:

{
  "project_type": "landing",
  "industry": "fintech-neobank",
  "audience": ["procurement", "institutional"],
  "tone": ["precise", "serious"],
  "must_have": ["a11y-AA", "rtl"],
  "forbidden": ["purple-to-blue-gradient", "cormorant-garamond", "brutalism"],
  "region": "mena"
}

What each lane returns (illustrative — exact picks depend on current manifest contents):

| Lane | Pick | Why |
| --- | --- | --- |
| Industry | fintech-neobank | exact id match (Strategy 1) |
| Style | fintech-precise | in industry recommended_styles (+10 bias), tone tags precise and serious both score |
| Palette | saudi-institutional-cool | in style.compatible_palettes, RTL-compatible, region tag mena matches |
| Type pair | ibm-plex-sans-pair | in style.compatible_type_pairs, cormorant-garamond forbidden filter drops the editorial option |
| Motion | fade-up-12px, tab-slide, route-crossfade, accordion-reveal, skeleton-shimmer | matches precise + serious, no overshoot, no spring |
| Components | fintech-rate-card, table-with-stripe-hover, comparison-matrix, pricing-tier-pill, ... +7 more | compatible with fintech-precise |
| Brand exemplars | stripe, ramp, monzo, mercury, n26 | fintech-neobank.exemplars boost |
| Guardrails | all 100 anti-pattern rules | always on |

Plus a rationale array of seven strings that the LLM can render verbatim:

[
  "Industry: Fintech — Neobank",
  "Style: Fintech Precise (matches 'precise' + 'serious' tone)",
  "Palette: Saudi institutional cool (RTL-compatible, no purple-blue)",
  "Type pair: IBM Plex Sans (geometric, neutral, Cormorant excluded)",
  "Motion presets considered: 5",
  "Compatible components: 12",
  "Guardrails active: 100 anti-pattern rules"
]

The whole recommendation is one JSON document. The LLM doesn't have to guess — it reads the rationale, picks tokens, generates code.


Why this design

Three taste calls that don't show up in the code but matter:

  1. One scoring function for every lane. Easier to reason about, easier to debug, easier to extend. Every lane is a one-page Python function — if you can read one, you can read all eight.
  2. Forbidden as -100, not as filter. Filtering first would mean an empty result set is possible; the -100 penalty lets the sort still rank everything in case the threshold has to be relaxed for debugging. The > -50 drop happens at sort time, not at score time.
  3. Industry as bias, not as gate. The industry lane could have hard-coded "if industry is X, return style Y." Instead it returns a boost list, so a brief with strong tone signals against the industry's typical picks can still steer the recommender. The user is the boss; the industry is a hint.

See also: Architecture · Linter Rules · Brand Library 110 · All 22 Commands

Source: github.com/Laith0003/ux-skill/blob/main/engine/recommender/core.py
