hellmer

This package enables batch processing of ellmer chat models, supporting sequential or parallel processing.

Installation

devtools::install_github("dylanpieper/hellmer")

Overview

Process multiple chat interactions with:

  • Sequential or parallel processing
  • State persistence and recovery
  • Progress tracking
  • Structured data extraction
  • Tool integration
  • Configurable output verbosity
  • Automatic retry with backoff
  • Timeout handling
  • Sound notifications

Basic Usage

Sequential Processing

chat <- chat_batch(chat_claude("You reply concisely"))

prompts <- list(
  "What is 2+2?",
  "Name one planet.",
  "Is water wet?",
  "What color is the sky?",
  "Count to 3.",
  "Say hello.",
  "Name a primary color.",
  "What is 5x5?",
  "True or false: Birds can fly.",
  "What day comes after Monday?"
)

result <- chat$batch(prompts)

result$progress()  # processing statistics
result$texts()     # response texts
result$chats()     # chat objects

Parallel Processing

Simply swap chat_batch() for chat_parallel() to enable parallel processing.

chat <- chat_parallel(chat_claude("You reply concisely"))

Features

State Management

Batch processing automatically saves state and can resume interrupted operations:

result <- chat$batch(prompts, state_path = "chat_state.rds")

If state_path is not defined, a temporary file will be created by default.
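
Re-running the same call with the same state_path resumes an interrupted batch instead of starting over. A minimal sketch of the resume pattern:

# Assumes the earlier run saved its state to "chat_state.rds";
# completed prompts are skipped and processing continues from there
result <- chat$batch(prompts, state_path = "chat_state.rds")
result$progress()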

Structured Data Extraction

Extract structured data using type specifications:

type_sentiment <- type_object(
  "Extract sentiment scores",
  positive_score = type_number("Positive sentiment score, 0.0 to 1.0"),
  negative_score = type_number("Negative sentiment score, 0.0 to 1.0"),
  neutral_score = type_number("Neutral sentiment score, 0.0 to 1.0")
)

prompts <- list(
  "I love this product! It's amazing!",
  "This is okay, nothing special.",
  "Terrible experience, very disappointed."
)

result <- chat$batch(prompts, type_spec = type_sentiment)
structured_data <- result$structured_data()
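
The extracted values are easy to tabulate. A small sketch, assuming structured_data() returns one named list of scores per prompt:

# Bind the per-prompt score lists into a data frame
scores <- do.call(rbind, lapply(structured_data, as.data.frame))
scores$prompt <- unlist(prompts)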

Tool Integration

Register and use tools (R functions):

square_number <- function(num) num^2

chat$register_tool(tool(
  square_number,
  "Calculates the square of a given number",
  num = type_integer("The number to square")
))

prompts <- list(
  "What is the square of 3?",
  "Calculate the square of 5."
)
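
With the tool registered, run the batch as usual; the model can call square_number() when a prompt requires it:

result <- chat$batch(prompts)
result$texts()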

Output Control

Control verbosity with the echo parameter (sequential only):

  • "none": Silent operation with progress bar
  • "text": Show chat responses only
  • "all": Show both prompts and responses

chat <- chat_batch(
  chat_claude("You reply concisely"), 
  echo = "none"
)

Automatic Retry

Automatically retry failed requests with backoff:

chat <- chat_batch(
  chat_claude(),
  max_retries = 3,    # Maximum number of retry attempts
  initial_delay = 1,  # Initial delay in seconds
  max_delay = 32,     # Maximum delay between retries
  backoff_factor = 2  # Multiply delay by this factor after each retry
)

If a request fails, the code will:

  1. Wait for the initial_delay
  2. Retry the request
  3. If it fails again, wait for (delay × backoff_factor)
  4. Continue until success or max_retries is reached

If the code detects an authorization or API key issue, it will stop immediately.
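
Given the defaults shown above, you can preview the implied delay schedule with a line of R (plain arithmetic, not part of the package API):

# initial_delay * backoff_factor^attempt, capped at max_delay: 1 2 4
pmin(1 * 2^(0:2), 32)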

Timeout Handling

The timeout parameter sets the maximum time to wait for the chat model's response to each prompt. Note that it is still bounded by any timeouts propagated up from the underlying ellmer chat model.

chat <- chat_parallel(
  ellmer::chat_ollama(
    model = "deepseek-r1:8b",
    echo = "none"
  ),
  timeout = 60
)

Sound Notifications

Toggle sound notifications on batch completion, interruption, and error:

chat <- chat_batch(
  chat_claude(),
  beep = TRUE
)

Quick References

chat_batch()

Creates a sequential batch processor.

chat_batch(
  chat_model = chat_claude(),  # Base chat model
  echo = "none",               # Output verbosity (sequential only)
  beep = TRUE,                 # Toggle sound notifications
  max_retries = 3,             # Maximum retry attempts
  initial_delay = 1,           # Initial retry delay in seconds
  max_delay = 32,              # Maximum delay between retries
  backoff_factor = 2           # Exponential backoff multiplier
)

chat_parallel()

Creates a parallel batch processor.

chat_parallel(
  chat_model = chat_claude(),  # Base chat model
  beep = TRUE,                 # Enable sound notifications
  plan = "multisession",       # "multisession" or "multicore"
  workers = 4                  # Number of parallel workers
)

chat$batch()

Processes a list or vector of prompts.

batch(
  prompts,            # List of prompts to process
  type_spec = NULL,   # Optional type specification for structured data
  state_path = NULL,  # Optional path for state persistence
  chunk_size = 4      # Number of prompts per chunk (parallel only)
)

You can mimic sequential processing with chat_parallel() by setting chunk_size = 1, but this will likely decrease performance compared to chat_batch() (see tests/test_benchmark.R).
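
For example (a sketch; this forgoes the chunked dispatch that makes chat_parallel() efficient):

chat <- chat_parallel(chat_claude("You reply concisely"))
result <- chat$batch(prompts, chunk_size = 1)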

Results Methods

  • texts(): Returns response texts in the same format as the input prompts (i.e., a list if prompts were provided as a list, or a character vector if prompts were provided as a vector)
  • chats(): Returns a list of chat objects
  • progress(): Returns processing statistics
  • structured_data(): Returns extracted structured data (if type_spec is provided)
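
Because texts() mirrors the input format, pairing responses with their prompts is straightforward. A small sketch, assuming the prompts were supplied as a list:

answers <- data.frame(
  prompt = unlist(prompts),
  response = unlist(result$texts())
)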