Merged
2 changes: 1 addition & 1 deletion pages/developers/_meta.json
@@ -1,5 +1,5 @@
{
"intelligent-contracts": "Intelligent Contracts",
"decentralized-applications": "Decentralized Applications",
"decentralized-applications": "Frontend & SDK Integration",
"staking-guide": "Staking Contract Guide"
}
2 changes: 1 addition & 1 deletion pages/developers/intelligent-contracts/_meta.json
@@ -9,7 +9,7 @@
"equivalence-principle": "Equivalence Principle",
"debugging": "Debugging",
"deploying": "Deploying",
"crafting-prompts": "Crafting Prompts",
"crafting-prompts": "Prompt & Data Techniques",
"security-and-best-practices": "Security and Best Practices",
"examples": "Examples",
"tools": "Tools",
262 changes: 195 additions & 67 deletions pages/developers/intelligent-contracts/crafting-prompts.mdx
@@ -1,88 +1,216 @@
# Crafting Prompts for LLM and Web Browsing Interactions

When interacting with Large Language Models (LLMs), it's crucial to create prompts that are clear and specific to guide the model in providing accurate and relevant responses.
# Prompt & Data Techniques

import { Callout } from 'nextra-theme-docs'

<Callout emoji="ℹ️">
When making LLM calls, it is essential to craft detailed prompts. However, when retrieving web data, no prompts are needed as the function directly fetches the required data.
Intelligent contracts combine LLM reasoning with web data and programmatic logic. Getting reliable results requires specific techniques — structured outputs, stable data extraction, and grounding LLM judgments with verified facts.

## Always Return JSON

The single most impactful technique. Using `response_format="json"` guarantees the LLM returns parseable JSON, eliminating manual cleanup:

```python
def leader_fn():
    prompt = f"""
You are a wizard guarding a magical coin.
An adventurer says: {request}

Should you give them the coin? Respond as JSON:
{{"reasoning": "your reasoning", "give_coin": true/false}}
"""
    return gl.nondet.exec_prompt(prompt, response_format="json")

def validator_fn(leaders_res) -> bool:
    if not isinstance(leaders_res, gl.vm.Return):
        return False
    my_result = leader_fn()
    # Compare the decision, not the reasoning
    return my_result["give_coin"] == leaders_res.calldata["give_coin"]

result = gl.vm.run_nondet_unsafe(leader_fn, validator_fn)
```

Without `response_format="json"`, LLMs may wrap output in markdown code fences, add commentary, or return malformed JSON. With it, you get a parsed dict directly.

<Callout type="info">
Always define the JSON schema in your prompt. `response_format="json"` ensures valid JSON, but the LLM still needs to know *which* fields to include.
</Callout>
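
If you ever call a model without `response_format="json"`, a defensive fallback parser that strips markdown fences before decoding can salvage the response. This is an illustrative sketch — `parse_llm_json` is a hypothetical helper, not part of the GenLayer SDK:

```python
import json

def parse_llm_json(raw: str) -> dict:
    """Strip markdown code fences around an LLM reply, then parse the JSON."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line (e.g. ```json)
        text = text.split("\n", 1)[1] if "\n" in text else ""
    if text.endswith("```"):
        # Drop the closing fence
        text = text[: text.rfind("```")]
    return json.loads(text.strip())
```

With `response_format="json"` none of this is necessary — which is exactly why the flag is the preferred approach.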

## Structuring LLM Prompts
## Extract Stable Fields from Web Data

When fetching external data for consensus, the leader and validators make **independent requests**. API responses often contain fields that change between calls — timestamps, view counts, caching headers. Extract only the fields that matter:

```python
def leader_fn():
    response = gl.nondet.web.get(github_api_url)
    data = json.loads(response.body.decode("utf-8"))
    # Only return fields that are stable across requests
    return {
        "id": data["id"],
        "title": data["title"],
        "state": data["state"],
        "merged": data.get("merged", False),
    }
    # NOT: updated_at, comments, reactions, changed_files

def validator_fn(leaders_res) -> bool:
    if not isinstance(leaders_res, gl.vm.Return):
        return False
    return leader_fn() == leaders_res.calldata
```

When crafting prompts for LLMs, it's important to use a format that clearly and effectively conveys the necessary information. While f-string (`f""`) is recommended, any string format can be used.
This is the #1 cause of failed consensus for new developers. If your contract fetches web data and consensus keeps failing, check whether you're returning unstable fields.
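
The whitelist approach can be factored into a small helper so every fetch in a contract uses the same stable projection. A minimal sketch — `STABLE_FIELDS` and `project_stable` are illustrative names, not SDK APIs:

```python
STABLE_FIELDS = ("id", "title", "state", "merged")  # illustrative whitelist

def project_stable(data: dict) -> dict:
    """Keep only fields that do not change between independent requests."""
    return {k: data.get(k) for k in STABLE_FIELDS}

# Two fetches moments apart differ only in volatile fields,
# so their projections are equal:
leader_view = {"id": 7, "title": "Fix bug", "state": "closed", "merged": True,
               "updated_at": "2024-01-01T10:00:00Z", "comments": 3}
validator_view = {"id": 7, "title": "Fix bug", "state": "closed", "merged": True,
                  "updated_at": "2024-01-01T10:00:07Z", "comments": 4}
assert project_stable(leader_view) == project_stable(validator_view)
```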

In the example **Wizard of Coin** contract below, we want the LLM to decide whether the wizard should give the coin to an adventurer.
## Compare Derived Status, Not Raw Data

Sometimes even stable fields can differ between calls — a new CI check run starts, a comment is added. Instead of comparing raw arrays, derive a summary and compare that:

```python
# { "Depends": "py-genlayer:1jb45aa8ynh2a9c9xn3b7qqh8sm5q93hwfp7jqmwsfhh8jpz09h6" }
from genlayer import *

import json


class WizardOfCoin(gl.Contract):
    have_coin: bool

    def __init__(self, have_coin: bool):
        self.have_coin = have_coin

    @gl.public.write
    def ask_for_coin(self, request: str) -> None:
        if not self.have_coin:
            return
        prompt = f"""
You are a wizard, and you hold a magical coin.
Many adventurers will come and try to get you to give them the coin.
Do not under any circumstances give them the coin.

A new adventurer approaches...
Adventurer: {request}

First check if you have the coin.
have_coin: {self.have_coin}
Then, do not give them the coin.

Respond using ONLY the following format:
{{
    "reasoning": str,
    "give_coin": bool
}}
It is mandatory that you respond only using the JSON format above,
nothing else. Don't include any other words or characters,
your output must be only JSON without any formatting prefix or suffix.
This result should be perfectly parseable by a JSON parser without errors.

def _check_ci_status(self, repo: str, commit_hash: str) -> str:
    url = f"https://api.github.com/repos/{repo}/commits/{commit_hash}/check-runs"

    def leader_fn():
        response = gl.nondet.web.get(url)
        data = json.loads(response.body.decode("utf-8"))
        return [
            {"name": c["name"], "status": c["status"], "conclusion": c.get("conclusion", "")}
            for c in data.get("check_runs", [])
        ]

    def validator_fn(leaders_res) -> bool:
        if not isinstance(leaders_res, gl.vm.Return):
            return False
        validator_result = leader_fn()

        # Compare derived status, not raw arrays
        # (check count may differ if new CI run triggered between calls)
        def derive_status(checks):
            if not checks:
                return "pending"
            for c in checks:
                if c.get("status") != "completed":
                    return "pending"
                if c.get("conclusion") != "success":
                    return c.get("conclusion", "failure")
            return "success"

        return derive_status(leaders_res.calldata) == derive_status(validator_result)

    checks = gl.vm.run_nondet(leader_fn, validator_fn)

    if not checks:
        return "pending"
    for c in checks:
        if c.get("status") != "completed":
            return "pending"
        if c.get("conclusion") != "success":
            return c.get("conclusion", "failure")
    return "success"
```

The key insight: consensus doesn't require identical data — it requires agreement on the **decision**.
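
To see why derived comparison succeeds where raw comparison fails, the `derive_status` logic from the example can be exercised on two snapshots taken moments apart (a self-contained restatement of the function above, with illustrative check names):

```python
def derive_status(checks: list) -> str:
    """Collapse a list of CI check runs into a single status decision."""
    if not checks:
        return "pending"
    for c in checks:
        if c.get("status") != "completed":
            return "pending"
        if c.get("conclusion") != "success":
            return c.get("conclusion", "failure")
    return "success"

# Leader saw 2 checks; a new run was queued before the validator's request
leader_view = [
    {"name": "lint", "status": "completed", "conclusion": "success"},
    {"name": "test", "status": "in_progress", "conclusion": ""},
]
validator_view = leader_view + [
    {"name": "deploy-preview", "status": "queued", "conclusion": ""},
]

assert leader_view != validator_view  # raw comparison would fail consensus
assert derive_status(leader_view) == derive_status(validator_view) == "pending"
```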

## Ground LLM Judgments with Programmatic Facts

LLMs hallucinate on character-level checks. Ask "does this text contain an em dash?" and the LLM may say yes when it doesn't, or vice versa. The fix: check programmatically first, then feed the results as ground truth into the LLM prompt.

### Step 1: LLM Generates Checkable Rules

Ask the LLM to convert human-readable rules into Python expressions:

```python
def _generate_rule_checks(self, rules: str) -> list:
    prompt = f"""Given these rules, generate Python expressions that can
programmatically verify each rule that CAN be checked with code.
Variable `text` contains the post text. Skip subjective rules.

Rules:
{rules}

Output JSON: {{"checks": [{{"rule": "...", "expression": "...", "description": "..."}}]}}"""

    return gl.nondet.exec_prompt(prompt, response_format="json").get("checks", [])

# Example output for rules "no em dashes, must mention @BOTCHA, must include botcha.xyz":
# [
#   {"rule": "no em dashes", "expression": "'—' not in text", ...},
#   {"rule": "mention @BOTCHA", "expression": "'@BOTCHA' in text", ...},
#   {"rule": "include link", "expression": "'botcha.xyz' in text", ...},
# ]
```

### Step 2: Eval in a Sandbox

Run the generated expressions deterministically — no hallucination possible:

```python
def _eval_rule_checks(self, checks: list, tweet_text: str) -> list:
    def run_checks():
        results = []
        for check in checks:
            try:
                passed = eval(
                    check["expression"],
                    {"__builtins__": {"len": len}, "text": tweet_text},
                )
                results.append({
                    "rule": check["rule"],
                    "result": "SATISFIED" if passed else "VIOLATED",
                })
            except Exception:
                pass  # skip broken expressions, let LLM handle the rule
        return results

    return gl.vm.unpack_result(gl.vm.spawn_sandbox(run_checks))
```

<Callout type="info">
`gl.vm.spawn_sandbox` runs a function in an isolated sandbox within the GenVM. `gl.vm.unpack_result` extracts the return value. Together they let you execute dynamically generated code safely. See the [genlayer-py API reference](/api-references/genlayer-py) for details.
</Callout>
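
Outside the GenVM, the same restricted-`eval` technique can be exercised in plain Python. The key is an empty `__builtins__` plus an explicit allowlist, so generated expressions can only read `text` and call `len`. A sketch of the pattern, not the GenVM sandbox itself:

```python
def eval_check(expression: str, text: str) -> bool:
    """Evaluate a generated rule expression against a minimal namespace."""
    return bool(eval(expression, {"__builtins__": {"len": len}, "text": text}))

tweet = "Loved the demo by @BOTCHA, details at botcha.xyz"
assert eval_check("'—' not in text", tweet)       # no em dashes
assert eval_check("'@BOTCHA' in text", tweet)     # mentions the account
assert eval_check("'botcha.xyz' in text", tweet)  # includes the link
```

Because the expressions run on the raw string, these results cannot hallucinate — they are the ground truth fed into Step 3.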

### Step 3: Inject Ground Truth into LLM Prompt

Feed the verified results back so the LLM focuses on subjective rules and doesn't override programmatic facts:

```python
compliance_prompt = f"""
Evaluate this submission for compliance with the campaign rules.

Submission: {tweet_text}

IMPORTANT — PROGRAMMATIC VERIFICATION RESULTS:
These results are GROUND TRUTH from running code on the raw text.
Do NOT override them with your own character-level analysis.

<programmatic_checks>
{chr(10).join(f"- {r['rule']}: {r['result']}" for r in programmatic_results)}
</programmatic_checks>

For rules NOT listed above, use your own judgment.

Respond as JSON: {{"compliant": true/false, "violations": ["..."]}}
"""

def nondet():
    res = gl.nondet.exec_prompt(prompt)
    backticks = "``" + "`"
    res = res.replace(backticks + "json", "").replace(backticks, "")
    print(res)
    dat = json.loads(res)
    return dat["give_coin"]

result = gl.eq_principle.strict_eq(nondet)
assert isinstance(result, bool)
self.have_coin = result

@gl.public.view
def get_have_coin(self) -> bool:
    return self.have_coin

result = gl.nondet.exec_prompt(compliance_prompt, response_format="json")
```

The prompt above includes a clear instruction and specifies the response format. A well-defined prompt ensures that the LLM provides precise, actionable responses that align with the contract's logic and requirements.
This three-step pattern — **generate checks → eval deterministically → inject as ground truth** — eliminates an entire class of LLM errors. Use it whenever your contract needs to verify concrete, checkable facts.

## Best Practices for Creating LLM Prompts
## Classify Errors

- **Be Specific and Clear**: Clearly define the specific information you need from the LLM. Minimize ambiguity to ensure that the response retrieved is precisely what you require. Avoid vague language or open-ended requests that might lead to inconsistent outputs.
Use error prefixes to distinguish user mistakes from infrastructure failures. This helps both debugging and error handling logic:

- **Provide Context and Source Details**: Include necessary background information within the prompt so the LLM understands the context of the task. This helps ensure the responses are accurate and relevant.

- **Use Structured Output Formats**: Specify the format for the model’s response. Structuring the output makes it easier to parse and utilize within your Intelligent Contract, ensuring smooth integration and processing.

```python
ERROR_EXPECTED = "[EXPECTED]"  # Business logic errors (deterministic)
ERROR_EXTERNAL = "[EXTERNAL]"  # API/network failures (non-deterministic)

# In contract methods:
if sender != bounty.owner:
    raise ValueError(f"{ERROR_EXPECTED} Only bounty owner can validate")

if response.status != 200:
    raise ValueError(f"{ERROR_EXTERNAL} GitHub API returned {response.status}")
```

- **Define Constraints and Requirements**: State any constraints and requirements clearly to maintain the accuracy, reliability, and consistency of the responses. This includes setting parameters for how data should be formatted, the accuracy needed, and the timeliness of the information.
`[EXPECTED]` errors mean the transaction should fail consistently across all nodes. `[EXTERNAL]` errors mean the external service had a problem — the transaction may succeed on retry.
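
A small helper can route errors by prefix when client code catches exceptions around contract calls. An illustrative sketch — `is_retryable` is a hypothetical name, not part of the SDK:

```python
ERROR_EXPECTED = "[EXPECTED]"  # business logic errors (deterministic)
ERROR_EXTERNAL = "[EXTERNAL]"  # API/network failures (non-deterministic)

def is_retryable(err: Exception) -> bool:
    """External failures may succeed on retry; expected errors never will."""
    return str(err).startswith(ERROR_EXTERNAL)
```

For example, `is_retryable(ValueError(f"{ERROR_EXTERNAL} GitHub API returned 502"))` is `True`, while an `[EXPECTED]` authorization error is not worth retrying.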

<Callout emoji="💡">
Refer to a [Prompt Engineering Guide from Anthropic](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview) for a more detailed guide on crafting prompts.
For a comprehensive guide on prompt engineering, see the [Prompt Engineering Guide from Anthropic](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview).
</Callout>
pages/developers/intelligent-contracts/features/non-determinism.mdx
@@ -1,5 +1,7 @@
# Non-determinism

import { Callout } from 'nextra-theme-docs'

## When to Use

Non-deterministic operations are needed for:
@@ -8,7 +10,70 @@ Non-deterministic operations are needed for:
- Random number generation
- Any operation that might vary between nodes

## Equality Principle
## What Goes Inside vs Outside

Non-deterministic blocks (`leader_fn`, `validator_fn`, functions passed to `strict_eq`) run in a special execution context. The GenVM enforces strict rules about what can and cannot happen inside these blocks.

### Must be INSIDE nondet blocks

All `gl.nondet.*` calls — web requests, LLM prompts — must be inside a nondet block. They cannot run in regular contract code.

```python
@gl.public.write
def fetch_price(self):
    def leader_fn():
        response = gl.nondet.web.get(api_url)   # ✓ inside nondet block
        result = gl.nondet.exec_prompt(prompt)  # ✓ inside nondet block
        return parse_price(response)

    # gl.nondet.web.get(api_url)  # ✗ would fail here
    self.price = gl.vm.run_nondet_unsafe(leader_fn, validator_fn)
```

### Must be OUTSIDE nondet blocks

Several operations must happen in the deterministic context — after the nondet block returns:

| Operation | Why |
|-----------|-----|
| **Storage writes** (`self.x = ...`) | Storage must only change based on consensus-agreed values |
| **Contract calls** (`gl.get_contract_at()`) | Cross-contract calls must use deterministic state |
| **Message emission** (`.emit()`) | Messages to other contracts/chains must be deterministic |
| **Nested nondet blocks** | Nondet blocks cannot contain other nondet blocks |

```python
@gl.public.write
def update_price(self, pair: str):
    def leader_fn():
        response = gl.nondet.web.get(api_url)
        return json.loads(response.body.decode("utf-8"))["price"]
        # self.price = price                # ✗ no storage writes here
        # other = gl.get_contract_at(addr)  # ✗ no contract calls here
        # other.emit().notify(price)        # ✗ no message emission here

    def validator_fn(leaders_res) -> bool:
        if not isinstance(leaders_res, gl.vm.Return):
            return False
        my_price = leader_fn()
        return abs(leaders_res.calldata - my_price) / leaders_res.calldata <= 0.02
Comment on lines +54 to +58
Contributor

⚠️ Potential issue | 🟠 Major

Guard against divide-by-zero in the validator ratio check: this code divides by `leaders_res.calldata`, so the example throws if that value is 0.

Suggested fix:

    -        return abs(leaders_res.calldata - my_price) / leaders_res.calldata <= 0.02
    +        denom = max(abs(leaders_res.calldata), 1e-9)
    +        return abs(leaders_res.calldata - my_price) / denom <= 0.02

    price = gl.vm.run_nondet_unsafe(leader_fn, validator_fn)

    # ✓ All side effects happen AFTER consensus, in deterministic context
    self.prices[pair] = price
    oracle = gl.get_contract_at(self.oracle_address)
    oracle.emit().price_updated(pair, price)
```

<Callout type="info">
The [GenVM linter](/developers/intelligent-contracts/tooling-setup) catches all of these violations statically — run `genvm-lint check` before deploying to avoid runtime errors.
</Callout>
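
The 2% tolerance comparison in the validator above generalizes to a small helper; guarding the denominator avoids a divide-by-zero when the leader reports 0. A sketch — `within_tolerance` and the epsilon are illustrative choices:

```python
def within_tolerance(leader_value: float, my_value: float, rel_tol: float = 0.02) -> bool:
    """Relative comparison with a floor on the denominator to avoid division by zero."""
    denom = max(abs(leader_value), 1e-9)
    return abs(leader_value - my_value) / denom <= rel_tol

assert within_tolerance(100.0, 101.5)      # 1.5% difference: accept
assert not within_tolerance(100.0, 105.0)  # 5% difference: reject
assert not within_tolerance(0.0, 1.0)      # zero leader value handled safely
```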

### Why these rules exist

The leader and validators execute nondet blocks **independently** — each node runs its own `leader_fn` or `validator_fn`. If you wrote to storage inside a nondet block, each node would write a different value before consensus decides which one is correct. The same applies to contract calls and message emission: these must happen once, after consensus, using the agreed-upon result.

## Equivalence Principle

GenLayer provides `strict_eq` for exact-match consensus and custom validator functions (`run_nondet_unsafe`) for everything else. Convenience wrappers like `prompt_comparative` and `prompt_non_comparative` exist for common patterns. For detailed information, see [Equivalence Principle](/developers/intelligent-contracts/equivalence-principle).
