Commit a7528b6

chore(ci): switch from gpt-4 to gpt-5 for default evals agent (#518)
* chore: switch from gpt-4 to gpt-5 for default evals agent

Signed-off-by: Calum Murray <[email protected]>

* cleanup: address review comments

Signed-off-by: Calum Murray <[email protected]>

---------

Signed-off-by: Calum Murray <[email protected]>
1 parent b706778 commit a7528b6

4 files changed: +3 -5 lines changed

.github/workflows/gevals.yaml

Lines changed: 1 addition & 2 deletions
@@ -120,11 +120,10 @@ jobs:
 # OpenAI Agent configuration
 MODEL_BASE_URL: ${{ secrets.MODEL_BASE_URL }}
 MODEL_KEY: ${{ secrets.MODEL_KEY }}
-MODEL_NAME: ${{ secrets.MODEL_NAME }}
 # LLM Judge configuration
 JUDGE_BASE_URL: ${{ secrets.JUDGE_BASE_URL }}
 JUDGE_API_KEY: ${{ secrets.JUDGE_API_KEY }}
-JUDGE_MODEL_NAME: ${{ secrets.JUDGE_MODEL_NAME }}
+JUDGE_MODEL_NAME: ${{ secrets.JUDGE_MODEL_NAME }} # we still need this one, as only the agent model is specified in yaml
 
 - name: Cleanup
 if: always()

evals/README.md

Lines changed: 0 additions & 1 deletion
@@ -62,7 +62,6 @@ The tasks and MCP configuration are shared - only the agent configuration differ
 # Set your model credentials
 export MODEL_BASE_URL='https://your-api-endpoint.com/v1'
 export MODEL_KEY='your-api-key'
-export MODEL_NAME='your-model-name'
 
 # Run the test
 ./gevals eval examples/kube-mcp-server/openai-agent/eval.yaml
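
With `MODEL_NAME` dropped from the README, the agent model now comes from the YAML configuration and only the endpoint and key are exported before running. A minimal sketch of a pre-flight check for those variables follows; the `missing_vars` helper is hypothetical and not part of the repository, and the list of variables passed to it is an assumption based on the two exports shown in this hunk.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): print the name of each
# listed environment variable that is unset or empty, one per line.
missing_vars() {
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Assumed usage: warn about anything unset before invoking the eval command.
missing=$(missing_vars MODEL_BASE_URL MODEL_KEY)
if [ -n "$missing" ]; then
  echo "Unset variables:" $missing >&2
fi
```

Passing the names as arguments keeps the check independent of which variables a given setup needs, so judge-related variables could be added to the call if a local run requires them.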

evals/openai-agent/agent.yaml

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ metadata:
 name: "openai-agent"
 builtin:
 type: "openai-agent"
-model: "gpt-4" # Change to your model
+model: "gpt-5" # Change to your model
 # Before running, set environment variables:
 # export MODEL_BASE_URL="https://api.openai.com/v1"
 # export MODEL_KEY="sk-..."

evals/openai-agent/eval-inline.yaml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ config:
 # Inline agent configuration - no separate agent.yaml file needed
 agent:
 type: "builtin.openai-agent"
-model: "gpt-4" # Change to your model
+model: "gpt-5" # Change to your model
 # Before running, set environment variables:
 # export MODEL_BASE_URL="https://api.openai.com/v1"
 # export MODEL_KEY="sk-..."
