Allow input csv and output csv in experiment run #103

Merged · 7 commits · Sep 24, 2024
README.md (2 changes: 1 addition & 1 deletion)
@@ -33,7 +33,7 @@ A pre-print for this work is available on [arXiv](https://arxiv.org/abs/2408.118

The benefit of _asynchronous querying_ is that it allows for multiple requests to be sent to an API _without_ having to wait for the LLM's response, which is particularly useful to fully utilise the rate limits of an API. This is especially useful when an experiment file contains a large number of prompts and/or has several models to query. [_Asynchronous programming_](https://docs.python.org/3/library/asyncio.html) is simply a way for programs to avoid getting stuck on long tasks (like waiting for an LLM response from an API) and instead keep running other things at the same time (to send other queries).
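To illustrate the general idea, here is a minimal sketch using Python's `asyncio` (not `prompto`'s actual implementation; `fake_api_call` is a hypothetical stand-in for a real API request):

```python
import asyncio

async def fake_api_call(prompt: str) -> str:
    # Stand-in for an LLM API request: while we await the "response",
    # the event loop is free to send the other queries
    await asyncio.sleep(1)
    return f"response to: {prompt}"

async def main() -> None:
    prompts = ["prompt 1", "prompt 2", "prompt 3"]
    # Send all queries concurrently instead of waiting for each response in turn
    responses = await asyncio.gather(*(fake_api_call(p) for p in prompts))
    print(responses)

asyncio.run(main())  # finishes in about 1 second rather than 3
```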

With `prompto`, you are able to define your experiments of LLMs in a jsonl file where each line contains the prompt and any parameters to be used for a query of a model from a specific API. The library will process the experiment file and query models and store results. You are also able to query _multiple_ models from _different_ APIs in a single experiment file and `prompto` will take care of querying the models _asynchronously_ and in _parallel_.
With `prompto`, you are able to define your experiments of LLMs in a jsonl or csv file where each line/row contains the prompt and any parameters to be used for a query of a model from a specific API. The library will process the experiment file and query models and store results. You are also able to query _multiple_ models from _different_ APIs in a single experiment file and `prompto` will take care of querying the models _asynchronously_ and in _parallel_.

The library is designed to be extensible and can be used to query different models.

docs/experiment_file.md (20 changes: 20 additions & 0 deletions)
@@ -2,6 +2,8 @@

An experiment file is a [JSON Lines (jsonl)](https://jsonlines.org/) file that contains the prompts for the experiments along with any other parameters or metadata that is required for the prompt. Each line in the jsonl file is a valid JSON value that defines a particular input to the LLM for which we will obtain a response. We often refer to a single line in the jsonl file as a "`prompt_dict`" (prompt dictionary).

From `prompto` version 0.2.0 onwards, it's also possible to use `csv` files as input to the pipeline. See the [CSV input section](#csv-input) for more details.

For all models/APIs, we require the following keys in the `prompt_dict`:

* `prompt`: the prompt for the model
@@ -15,6 +17,9 @@ For all models/APIs, we require the following keys in the `prompt_dict`:

In addition, there are other optional keys that can be included in the `prompt_dict`:

* `id`: a unique identifier for the prompt
* This is a string that uniquely identifies the prompt, which is useful for tracking responses and matching them back to the original prompts
* It is not strictly required, but is often useful to have
* `parameters`: the parameter settings / generation config for the query (given as a dictionary)
* This is a dictionary that contains the parameters for the query. The parameters are specific to the model and the API being used. For example, for the Gemini API (`"api": "gemini"`), some parameters to configure are `temperature`, `max_output_tokens`, `top_p` and `top_k`, which control the generation of the response. For the OpenAI API (`"api": "openai"`), some of these parameters are named differently: for instance, the maximum number of output tokens is set using the `max_tokens` parameter, and `top_k` is not available. For Ollama (`"api": "ollama"`), the parameters are different again, e.g. the maximum number of tokens to predict is set using `num_predict`
* See the documentation for the specific API for the list of parameters that can be set and their default values
Expand All @@ -23,3 +28,18 @@ In addition, there are other optional keys that can be included in the `prompt_d
* Note that you can use parallel processing without using the "group" key, but using this key gives you full control to group the prompts in a way that makes sense for your use case. See the [specifying rate limits documentation](rate_limits.md) for more details on parallel processing

Lastly, there are other optional keys that are only available for certain APIs/models. For example, for the Gemini API, you can have a `multimedia` key which is a list of dictionaries defining the multimedia files (e.g. images/videos) to be used in the prompt to a multimodal LLM. For these, see the documentation for the specific API/model for more details.
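To make the structure concrete, here is an illustrative sketch of building an experiment file with the keys described above and writing it out as jsonl (the model names, parameter values and group label are placeholders, not recommendations):

```python
import json

# Two illustrative prompt_dicts: one with the optional "id", "parameters"
# and "group" keys, one with only the required keys
prompt_dicts = [
    {
        "id": "id-0",
        "prompt": "What is the capital of France?",
        "api": "gemini",
        "model_name": "gemini-1.5-flash",
        "parameters": {"temperature": 0.7, "max_output_tokens": 100},
        "group": "geography",
    },
    {
        "prompt": "What is the capital of Germany?",
        "api": "openai",
        "model_name": "gpt-3.5-turbo",
    },
]

# Each prompt_dict becomes one line of the jsonl experiment file
with open("experiment.jsonl", "w") as f:
    for prompt_dict in prompt_dicts:
        f.write(json.dumps(prompt_dict) + "\n")
```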

## CSV input

When using CSV input, each `prompt_dict` is defined as a row in the CSV file. The CSV file should have a header row whose columns correspond to the keys described above, with the exception of the `parameters` key: each parameter (each key in the dictionary) should have its own column in the CSV file, _prefixed with "parameters-"_. For example, if you have a `temperature` parameter in the `parameters` dictionary, you should have a column named `parameters-temperature` in the CSV file, and the values for the parameters go in the corresponding columns.

For example, the following jsonl and csv inputs are equivalent:

```json
{"id": "id-0", "prompt": "What is the capital of France?", "api": "openai", "model_name": "gpt-3.5-turbo", "parameters": {"temperature": 0.5, "max_tokens": 100}}
```

```csv
id,prompt,api,model_name,parameters-temperature,parameters-max_tokens
id-0,What is the capital of France?,openai,gpt-3.5-turbo,0.5,100
```
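If you already have a jsonl experiment file, this mapping is straightforward to apply programmatically. Below is a minimal sketch (not part of `prompto` itself) that flattens each `parameters` dictionary into `parameters-`-prefixed columns as described above:

```python
import csv
import json

def jsonl_to_csv(jsonl_path: str, csv_path: str) -> None:
    # Flatten each prompt_dict's "parameters" dictionary into
    # "parameters-<name>" keys, per the convention described above
    rows = []
    with open(jsonl_path) as f:
        for line in f:
            prompt_dict = json.loads(line)
            parameters = prompt_dict.pop("parameters", {})
            for name, value in parameters.items():
                prompt_dict[f"parameters-{name}"] = value
            rows.append(prompt_dict)

    # Use the union of keys across all rows as the CSV header;
    # DictWriter leaves cells blank for keys a given row does not have
    fieldnames = sorted({key for row in rows for key in row})
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

jsonl_to_csv("experiment.jsonl", "experiment.csv")
```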
File renamed without changes.
@@ -0,0 +1,4 @@
24-09-2024, 09:14: Error (i=1, id=9): NotImplementedError - API unknown-api not recognised or implemented
24-09-2024, 09:14: Error (i=2, id=10): NotImplementedError - API unknown-api not recognised or implemented
24-09-2024, 09:14: Error (i=3, id=11): NotImplementedError - API unknown-api not recognised or implemented
24-09-2024, 09:14: Completed experiment: test.jsonl! Experiment processing time: 3.703 seconds, Average time per query: 1.234 seconds

This file was deleted.

This file was deleted.

@@ -1,3 +1,3 @@
{"id": 9, "prompt": ["Hello", "My name is Bob and I'm 6 years old", "How old am I next year?"], "api": "test", "model_name": "test", "parameters": {"candidate_count": 1, "max_output_tokens": 64, "temperature": 1, "top_k": 40}, "response": "This is a test response"}
{"id": 10, "prompt": ["Can you give me a random number between 1-10?", "What is +5 of that number?", "What is half of that number?"], "api": "test", "model_name": "test", "parameters": {"candidate_count": 1, "max_output_tokens": 128, "temperature": 0.5, "top_k": 40}, "response": "This is a test response"}
{"id": 11, "prompt": "How many theaters are there in London's South End?", "api": "test", "model_name": "test", "response": "This is a test response"}
{"id": 9, "prompt": ["Hello", "My name is Bob and I'm 6 years old", "How old am I next year?"], "api": "test", "model_name": "test", "parameters": {"candidate_count": 1, "max_output_tokens": 64, "temperature": 1, "top_k": 40}, "timestamp_sent": "24-09-2024-09-14-39", "response": "This is a test response"}
{"id": 10, "prompt": ["Can you give me a random number between 1-10?", "What is +5 of that number?", "What is half of that number?"], "api": "test", "model_name": "test", "parameters": {"candidate_count": 1, "max_output_tokens": 128, "temperature": 0.5, "top_k": 40}, "timestamp_sent": "24-09-2024-09-14-40", "response": "This is a test response"}
{"id": 11, "prompt": "How many theaters are there in London's South End?", "api": "test", "model_name": "test", "timestamp_sent": "24-09-2024-09-14-41", "response": "ValueError - This is a test error which we should handle and return"}
@@ -0,0 +1,2 @@
24-09-2024, 09:14: Error (i=3, id=11): ValueError - This is a test error which we should handle and return
24-09-2024, 09:14: Completed experiment: test2.jsonl! Experiment processing time: 3.615 seconds, Average time per query: 1.205 seconds

This file was deleted.

This file was deleted.
