# Add CLI options for backend args (like headers and verify) #230
Base branch: `main`

Changes from all commits: `4b946ae`, `c216081`, `b17e67f`, `e816a81`
---

**File 1: CLI reference docs.** Replaces the previous "Coming Soon" placeholder (`@@ -1 +1,36 @@`):
# CLI Reference

This page provides a reference for the `guidellm` command-line interface. For more advanced configuration, including environment variables and `.env` files, see the [Configuration Guide](./configuration.md).

## `guidellm benchmark run`

This command is the primary entrypoint for running benchmarks. It has many options that can be specified on the command line or in a scenario file.

### Scenario Configuration
| Option | Description |
| --- | --- |
| `--scenario <PATH or NAME>` | The name of a built-in scenario or the path to a scenario configuration file. Options specified on the command line override the scenario file. |
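
For example, a run can load a scenario and override one of its values on the command line (a sketch; the scenario file name is hypothetical):

```bash
# Values from chat.json are used unless overridden on the CLI
guidellm benchmark run \
  --scenario chat.json \
  --max-seconds 60
```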
### Target and Backend Configuration

These options configure how `guidellm` connects to the system under test.

| Option | Description |
| --- | --- |
| `--target <URL>` | **Required.** The endpoint of the target system, e.g., `http://localhost:8080`. Can also be set with the `GUIDELLM__OPENAI__BASE_URL` environment variable. |
| `--backend-type <TYPE>` | The type of backend to use. Defaults to `openai_http`. |
| `--backend-args <JSON>` | A JSON string for backend-specific arguments. For example: `--backend-args '{"headers": {"Authorization": "Bearer my-token"}, "verify": false}'` to pass custom headers and disable certificate verification. |
| `--model <NAME>` | The ID of the model to benchmark within the backend. |
### Data and Request Configuration

These options define the data to be used for benchmarking and how requests will be generated.

| Option | Description |
| --- | --- |
| `--data <SOURCE>` | The data source. This can be a HuggingFace dataset ID, a path to a local data file, or a synthetic data configuration. See the [Data Formats Guide](./data_formats.md) for more details. |
| `--rate-type <TYPE>` | The type of request generation strategy to use (e.g., `constant`, `poisson`, `sweep`). |
| `--rate <NUMBER>` | The rate of requests per second for `constant` or `poisson` strategies, or the number of steps for a `sweep`. |
| `--max-requests <NUMBER>` | The maximum number of requests to run for each benchmark. |
| `--max-seconds <NUMBER>` | The maximum number of seconds to run each benchmark. |
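
Putting these options together, a typical invocation might look like the following (a sketch; the token and model name are placeholders):

```bash
guidellm benchmark run \
  --target https://localhost:8443 \
  --backend-type openai_http \
  --backend-args '{"headers": {"Authorization": "Bearer my-token"}, "verify": false}' \
  --model my-model \
  --data "prompt_tokens=256,output_tokens=128" \
  --rate-type constant \
  --rate 1 \
  --max-requests 100
```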
---

**File 2: Configuration docs.** Also replaces a "Coming Soon" placeholder (`@@ -1 +1,58 @@`):
# Configuration

The `guidellm` application can be configured using command-line arguments, environment variables, or a `.env` file. This page details the file-based and environment variable configuration options.

## Configuration Methods

Settings are loaded with the following priority (highest priority first):

1. Command-line arguments.
2. Environment variables.
3. Values in a `.env` file in the directory where the command is run.
4. Default values.
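
For example (a sketch; the URLs are placeholders), a command-line argument wins over an environment variable:

```bash
export GUIDELLM__OPENAI__BASE_URL="http://staging-server:8080"
# The CLI flag takes precedence, so this run targets http://localhost:8080
guidellm benchmark run --target "http://localhost:8080" \
  --data "prompt_tokens=1,output_tokens=1" --rate-type constant --rate 1 --max-requests 1
```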
## Environment Variable Format

All settings can be configured using environment variables. The variables must be prefixed with `GUIDELLM__`, and nested settings are separated by a double underscore (`__`).

For example, to set the `api_key` for the `openai` backend, you would use the following environment variable:

```bash
export GUIDELLM__OPENAI__API_KEY="your-api-key"
```
### Target and Backend Configuration

You can configure the connection to the target system using environment variables. This is an alternative to using the `--target-*` command-line flags.

| Environment Variable | Description | Example |
| --- | --- | --- |
| `GUIDELLM__OPENAI__BASE_URL` | The endpoint of the target system. Equivalent to the `--target` CLI option. | `export GUIDELLM__OPENAI__BASE_URL="http://localhost:8080"` |
| `GUIDELLM__OPENAI__API_KEY` | The API key to use for bearer token authentication. | `export GUIDELLM__OPENAI__API_KEY="your-secret-api-key"` |
| `GUIDELLM__OPENAI__BEARER_TOKEN` | The full bearer token to use for authentication. | `export GUIDELLM__OPENAI__BEARER_TOKEN="Bearer your-secret-token"` |
| `GUIDELLM__OPENAI__HEADERS` | A JSON string representing a dictionary of headers to send to the target. These headers will override any default headers. | `export GUIDELLM__OPENAI__HEADERS='{"Authorization": "Bearer my-token"}'` |
| `GUIDELLM__OPENAI__ORGANIZATION` | The OpenAI organization to use for requests. | `export GUIDELLM__OPENAI__ORGANIZATION="org-12345"` |
| `GUIDELLM__OPENAI__PROJECT` | The OpenAI project to use for requests. | `export GUIDELLM__OPENAI__PROJECT="proj-67890"` |
| `GUIDELLM__OPENAI__VERIFY` | Set to `false` or `0` to disable certificate verification. | `export GUIDELLM__OPENAI__VERIFY=false` |
| `GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS` | The default maximum number of tokens to request for completions. | `export GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS=2048` |
### General HTTP Settings

These settings control the behavior of the underlying HTTP client.

| Environment Variable | Description |
| --- | --- |
| `GUIDELLM__REQUEST_TIMEOUT` | The timeout in seconds for HTTP requests. Defaults to 300. |
| `GUIDELLM__REQUEST_HTTP2` | Set to `true` or `1` to enable HTTP/2 support. Defaults to `true`. |
| `GUIDELLM__REQUEST_FOLLOW_REDIRECTS` | Set to `true` or `1` to allow the client to follow redirects. Defaults to `true`. |
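
For example, to tighten the request timeout and turn off HTTP/2 (a sketch using the settings above):

```bash
export GUIDELLM__REQUEST_TIMEOUT=120
export GUIDELLM__REQUEST_HTTP2=false
```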
### Using a `.env` file

You can also place these variables in a `.env` file in your project's root directory:

```dotenv
# .env file
GUIDELLM__OPENAI__BASE_URL="http://localhost:8080"
GUIDELLM__OPENAI__API_KEY="your-api-key"
GUIDELLM__OPENAI__HEADERS='{"Authorization": "Bearer my-token"}'
GUIDELLM__OPENAI__VERIFY=false
```
---

**File 3: Data Formats docs.** A new file (`@@ -0,0 +1,62 @@`):
# Data Formats

The `--data` argument for the `guidellm benchmark run` command accepts several different formats for specifying the data to be used for benchmarking.

## Local Data Files

You can provide a path to a local data file in one of the following formats:

- **CSV (`.csv`)**: A comma-separated values file. The loader will attempt to find a column with a common name for the prompt (e.g., `prompt`, `text`, `instruction`).
- **JSON (`.json`)**: A JSON file containing a list of objects, where each object represents a row of data.
- **JSON Lines (`.jsonl`)**: A file where each line is a valid JSON object.
- **Text (`.txt`)**: A plain text file, where each line is treated as a separate prompt.

If the prompt column cannot be automatically determined, you can specify it using the `--data-args` option:

```bash
--data-args '{"text_column": "my_custom_prompt_column"}'
```
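
For instance (a sketch; the file name and column name are hypothetical), a `data.jsonl` file with a non-standard prompt column:

```json
{"my_custom_prompt_column": "Summarize the plot of Pride and Prejudice."}
{"my_custom_prompt_column": "Explain HTTP/2 multiplexing in one paragraph."}
```

could then be loaded with `--data data.jsonl --data-args '{"text_column": "my_custom_prompt_column"}'`.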
## Synthetic Data

You can generate synthetic data on the fly by providing a configuration string or file.

### Configuration Options

| Parameter | Description |
| --- | --- |
| `prompt_tokens` | **Required.** The average number of tokens for the generated prompts. |
| `output_tokens` | **Required.** The average number of tokens for the generated outputs. |
| `samples` | The total number of samples to generate. Defaults to 1000. |
| `source` | The source text to use for generating the synthetic data. Defaults to a built-in copy of "Pride and Prejudice". |
| `prompt_tokens_stdev` | The standard deviation of the number of tokens generated for prompts. |
| `prompt_tokens_min` | The minimum number of tokens generated for prompts. |
| `prompt_tokens_max` | The maximum number of tokens generated for prompts. |
| `output_tokens_stdev` | The standard deviation of the number of tokens generated for outputs. |
| `output_tokens_min` | The minimum number of tokens generated for outputs. |
| `output_tokens_max` | The maximum number of tokens generated for outputs. |
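
For example (a sketch), combining the required averages with bounds on the prompt length distribution:

```bash
--data "prompt_tokens=256,prompt_tokens_stdev=32,prompt_tokens_min=128,prompt_tokens_max=512,output_tokens=128"
```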
### Configuration Formats

You can provide the synthetic data configuration in one of three ways:

1. **Key-Value String:**

   ```bash
   --data "prompt_tokens=256,output_tokens=128,samples=500"
   ```

2. **JSON String:**

   ```bash
   --data '{"prompt_tokens": 256, "output_tokens": 128, "samples": 500}'
   ```

3. **YAML or Config File:**
   Create a file (e.g., `my_config.yaml`):

   ```yaml
   prompt_tokens: 256
   output_tokens: 128
   samples: 500
   ```

   And use it with the `--data` argument:

   ```bash
   --data my_config.yaml
   ```
---

**File 4: `OpenAIHTTPBackend`.** The constructor now accepts extra backend kwargs:

```diff
@@ -94,6 +94,7 @@ def __init__(
         extra_query: Optional[dict] = None,
         extra_body: Optional[dict] = None,
         remove_from_body: Optional[list[str]] = None,
+        **kwargs,
     ):
         super().__init__(type_="openai_http")
         self._target = target or settings.openai.base_url
```
```diff
@@ -110,20 +111,36 @@ def __init__(
         self._model = model

+        # Start with default headers based on other params
+        default_headers: dict[str, str] = {}
         api_key = api_key or settings.openai.api_key
-        self.authorization = (
-            f"Bearer {api_key}" if api_key else settings.openai.bearer_token
-        )
+        bearer_token = settings.openai.bearer_token
+        if api_key:
+            default_headers["Authorization"] = f"Bearer {api_key}"
+        elif bearer_token:
+            default_headers["Authorization"] = bearer_token

         self.organization = organization or settings.openai.organization
+        if self.organization:
+            default_headers["OpenAI-Organization"] = self.organization

         self.project = project or settings.openai.project
+        if self.project:
+            default_headers["OpenAI-Project"] = self.project

+        # User-provided headers from kwargs or settings override defaults
+        user_headers = kwargs.pop("headers", settings.openai.headers or {})
+        default_headers.update(user_headers)
+        self.headers = default_headers
+
         self.timeout = timeout if timeout is not None else settings.request_timeout
         self.http2 = http2 if http2 is not None else settings.request_http2
         self.follow_redirects = (
             follow_redirects
             if follow_redirects is not None
             else settings.request_follow_redirects
         )
+        self.verify = kwargs.pop("verify", settings.openai.verify)

         self.max_output_tokens = (
             max_output_tokens
             if max_output_tokens is not None
```

**Review comment** (on lines +132 to +134, the header-merging block): It would be nice to layer this per-key and support removing headers. This would also allow you to simplify `default_headers`:

```python
default_headers = {
    "Authorization": f"Bearer {api_key}" if api_key else settings.openai.bearer_token,
    "OpenAI-Organization": self.organization,
    "OpenAI-Project": project or settings.openai.project,
}
```

**Review comment** (on the `self.verify` line): Due to above suggestion.
```diff
@@ -160,9 +177,7 @@ def info(self) -> dict[str, Any]:
             "timeout": self.timeout,
             "http2": self.http2,
             "follow_redirects": self.follow_redirects,
-            "authorization": bool(self.authorization),
-            "organization": self.organization,
-            "project": self.project,
+            "headers": self.headers,
             "text_completions_path": TEXT_COMPLETIONS_PATH,
             "chat_completions_path": CHAT_COMPLETIONS_PATH,
         }
```
```diff
@@ -383,6 +398,7 @@ def _get_async_client(self) -> httpx.AsyncClient:
                 http2=self.http2,
                 timeout=self.timeout,
                 follow_redirects=self.follow_redirects,
+                verify=self.verify,
             )
             self._async_client = client
         else:
```
```diff
@@ -394,16 +410,7 @@ def _headers(self) -> dict[str, str]:
         headers = {
             "Content-Type": "application/json",
         }
-
-        if self.authorization:
-            headers["Authorization"] = self.authorization
-
-        if self.organization:
-            headers["OpenAI-Organization"] = self.organization
-
-        if self.project:
-            headers["OpenAI-Project"] = self.project
-
+        headers.update(self.headers)
         return headers

     def _params(self, endpoint_type: EndpointType) -> dict[str, str]:
```
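
The net effect of these changes can be sketched as follows (an illustrative snippet, not part of the PR; the URL and token are placeholders):

```python
from guidellm.backend import OpenAIHTTPBackend

# headers passed via kwargs override the Authorization header that would
# otherwise be built from api_key; verify=False disables TLS certificate
# verification on the underlying httpx client.
backend = OpenAIHTTPBackend(
    target="https://localhost:8443",
    api_key="ignored-key",
    headers={"Authorization": "Bearer my-token"},
    verify=False,
)
assert backend.headers["Authorization"] == "Bearer my-token"
assert backend.verify is False
```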
---

**File 5: backend tests.** A new test module (`@@ -0,0 +1,64 @@`):
```python
import pytest

from guidellm.backend import OpenAIHTTPBackend
from guidellm.config import settings


@pytest.mark.smoke
def test_openai_http_backend_default_initialization():
    backend = OpenAIHTTPBackend()
    assert backend.verify is True


@pytest.mark.smoke
def test_openai_http_backend_custom_ssl_verification():
    backend = OpenAIHTTPBackend(verify=False)
    assert backend.verify is False


@pytest.mark.smoke
def test_openai_http_backend_custom_headers_override():
    # Set a default api_key, which would normally create an Authorization header
    settings.openai.api_key = "default-api-key"

    # Set custom headers that override the default Authorization and add a new header
    openshift_token = "Bearer sha256~xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    override_headers = {
        "Authorization": openshift_token,
        "Custom-Header": "Custom-Value",
    }

    # Initialize the backend
    backend = OpenAIHTTPBackend(headers=override_headers)

    # Check that the override headers are used
    assert backend.headers["Authorization"] == openshift_token
    assert backend.headers["Custom-Header"] == "Custom-Value"
    assert len(backend.headers) == 2

    # Reset the settings
    settings.openai.api_key = None
    settings.openai.headers = None


@pytest.mark.smoke
def test_openai_http_backend_kwarg_headers_override_settings():
    # Set headers via settings (simulating environment variables)
    settings.openai.headers = {"Authorization": "Bearer settings-token"}

    # Set different headers via kwargs (simulating --backend-args)
    override_headers = {
        "Authorization": "Bearer kwargs-token",
        "Custom-Header": "Custom-Value",
    }

    # Initialize the backend with kwargs
    backend = OpenAIHTTPBackend(headers=override_headers)

    # Check that the kwargs headers took precedence
    assert backend.headers["Authorization"] == "Bearer kwargs-token"
    assert backend.headers["Custom-Header"] == "Custom-Value"
    assert len(backend.headers) == 2

    # Reset the settings
    settings.openai.headers = None
```
---

**File 6: CLI tests.** A new test module (`@@ -0,0 +1,32 @@`):
```python
import pytest
from click.testing import CliRunner

from guidellm.__main__ import cli


@pytest.mark.smoke
def test_benchmark_run_with_backend_args():
    runner = CliRunner()
    result = runner.invoke(
        cli,
        [
            "benchmark",
            "run",
            "--backend-args",
            '{"headers": {"Authorization": "Bearer my-token"}, "verify": false}',
            "--target",
            "http://localhost:8000",
            "--data",
            "prompt_tokens=1,output_tokens=1",
            "--rate-type",
            "constant",
            "--rate",
            "1",
            "--max-requests",
            "1",
        ],
    )
    # This will fail because it can't connect to the server,
    # but it will pass the header parsing, which is what we want to test.
    assert result.exit_code != 0
    assert "Invalid header format" not in result.output
```
**Review comment:** Follow same pattern as other optionals.