3 changes: 3 additions & 0 deletions README.md
@@ -179,6 +179,7 @@ services:
# LLM_MODEL: "qwen3:8b"
# OLLAMA_HOST: "http://host.docker.internal:11434"
# OLLAMA_CONTEXT_LENGTH: "8192" # Sets Ollama NumCtx (context window)
# OLLAMA_HEADERS: "Authorization=Bearer mytoken" # Optional headers for reverse-proxy auth
# TOKEN_LIMIT: 1000 # Recommended for smaller models

# Option 5: Anthropic/Claude
@@ -569,6 +570,7 @@ For best results with the enhanced OCR features:
| `VISION_LLM_TEMPERATURE` | Sampling temperature for Vision OCR generation. Lower is more deterministic. Important: For OpenAI GPT-5 it must be explicitly set to `1.0`. | No | |
| `OLLAMA_CONTEXT_LENGTH` | (Ollama only) Integer. Sets NumCtx (context window) for the Ollama runner. If unset or 0, the model default is used. | No | |
| `OLLAMA_OCR_TOP_K` | (Ollama only) Top-k token sampling for Vision OCR. Lower favors more likely tokens; higher increases diversity. | No | |
| `OLLAMA_HEADERS` | (Ollama only) Comma-separated `Key=Value` pairs added as HTTP headers to every Ollama request. Useful for authorization when Ollama is behind a reverse proxy (e.g. `Authorization=Bearer mytoken`). | No | |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document the comma-in-value limitation of `OLLAMA_HEADERS`.

The current description implies a simple `Key=Value,Key2=Value2` split. Two edge cases users may hit:

  1. Comma in value – e.g. `Accept=text/html, application/json` is unparseable with this format; no escaping mechanism is mentioned.
  2. `=` in value – e.g. a Base64 Basic auth token (`Authorization=Basic dXNlcjpwYXNz`) must be parsed by splitting on the first `=` only.

Mentioning at least the comma limitation prevents silent misconfiguration.
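
Both edge cases can be reproduced with a small standalone sketch. It assumes the split-on-comma, then split-on-first-`=` strategy; `parseHeaders` is a hypothetical name used only for illustration, not a function in the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaders is a hypothetical helper: it splits the raw string on commas,
// then splits each pair on the first '=' only.
func parseHeaders(raw string) map[string]string {
	headers := map[string]string{}
	for _, pair := range strings.Split(raw, ",") {
		parts := strings.SplitN(strings.TrimSpace(pair), "=", 2)
		if len(parts) == 2 && parts[0] != "" {
			headers[parts[0]] = parts[1]
		}
	}
	return headers
}

func main() {
	// '=' inside a value survives because only the first '=' splits the pair.
	fmt.Printf("%q\n", parseHeaders("Authorization=Basic dXNlcjpwYXNz=="))
	// A comma inside a value truncates it; the remainder has no '=' and is dropped.
	fmt.Printf("%q\n", parseHeaders("Accept=text/html, application/json"))
}
```

Under this strategy the second call yields only `Accept: text/html`, with no warning, which is the silent misconfiguration the comment describes.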

📝 Suggested documentation addition
-| `OLLAMA_HEADERS`                    | (Ollama only) Comma-separated `Key=Value` pairs added as HTTP headers to every Ollama request. Useful for authorization when Ollama is behind a reverse proxy (e.g. `Authorization=Bearer mytoken`). | No       |                            |
+| `OLLAMA_HEADERS`                    | (Ollama only) Comma-separated `Key=Value` pairs added as HTTP headers to every Ollama request. Useful for authorization when Ollama is behind a reverse proxy (e.g. `Authorization=Bearer mytoken`). Each pair is split on the **first** `=`; header values must not contain commas. | No       |                            |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 573, Update the README entry for OLLAMA_HEADERS to
explicitly state that values cannot contain commas (there is no escaping for
commas), so header values with commas (e.g., "Accept=text/html,
application/json") are not supported; also clarify parsing semantics that
keys/values are parsed by splitting on the first '=' (so values may contain '='
characters such as Base64 tokens), and give a short example of a valid header
string and a recommended workaround (e.g., use a reverse proxy or alternate env
var) to avoid silent misconfiguration.
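
One way to avoid silent misconfiguration is to reject malformed pairs instead of skipping them. The following is a sketch only: `parseHeadersStrict` is a hypothetical function, and trimming whitespace around keys and values is an assumption that goes beyond what the PR's parser does.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeadersStrict is a hypothetical variant that returns an error for any
// pair lacking a "Key=Value" shape, rather than dropping it quietly.
func parseHeadersStrict(raw string) (map[string]string, error) {
	headers := map[string]string{}
	for _, pair := range strings.Split(raw, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue // tolerate trailing or doubled commas
		}
		// strings.Cut splits on the first '=' only, so values may contain '='.
		key, value, found := strings.Cut(pair, "=")
		key = strings.TrimSpace(key)
		if !found || key == "" {
			return nil, fmt.Errorf("malformed header pair %q: expected Key=Value", pair)
		}
		headers[key] = strings.TrimSpace(value)
	}
	return headers, nil
}

func main() {
	if _, err := parseHeadersStrict("Accept=text/html, application/json"); err != nil {
		fmt.Println("rejected:", err)
	}
	headers, err := parseHeadersStrict("Authorization=Bearer mytoken, X-Env=prod")
	fmt.Println(headers, err)
}
```

A caller could log the returned error at startup so that a value containing a comma is surfaced immediately rather than producing a truncated header.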

| `AZURE_DOCAI_ENDPOINT` | Azure Document Intelligence endpoint. Required if OCR_PROVIDER is `azure`. | Cond. | |
| `AZURE_DOCAI_KEY` | Azure Document Intelligence API key. Required if OCR_PROVIDER is `azure`. | Cond. | |
| `AZURE_DOCAI_MODEL_ID` | Azure Document Intelligence model ID. Optional if using `azure` provider. | No | prebuilt-read |
@@ -917,6 +919,7 @@ When using local LLMs (like those through Ollama), you might need to adjust cert

- Use `TOKEN_LIMIT` environment variable to control the maximum number of tokens sent to the LLM
- For Ollama, set `OLLAMA_CONTEXT_LENGTH` to control the model's context window (NumCtx). This is independent of `TOKEN_LIMIT` and configures the server-side KV cache size. If unset or 0, the model default is used. Choose a value within the model's supported window (e.g., 8192).
- If Ollama is behind a reverse proxy that requires authentication, set `OLLAMA_HEADERS` to a comma-separated list of `Key=Value` header pairs (e.g. `Authorization=Bearer mytoken`).
- Smaller models might truncate content unexpectedly if given too much text
- Start with a conservative limit (e.g., 1000 tokens) and adjust based on your model's capabilities
- Set to `0` to disable the limit (use with caution)
10 changes: 9 additions & 1 deletion main.go
@@ -992,6 +992,9 @@ func createLLM() (llms.Model, error) {
log.Warnf("Invalid OLLAMA_CONTEXT_LENGTH value: %v, ignoring", err)
}
}
if client := ocr.OllamaHTTPClient(); client != nil {
opts = append(opts, ollama.WithHTTPClient(client))
}
llm, err := ollama.New(opts...)
if err != nil {
return nil, err
@@ -1100,6 +1103,9 @@ func createVisionLLM() (llms.Model, error) {
log.Warnf("Invalid OLLAMA_CONTEXT_LENGTH value: %v, ignoring", err)
}
}
if client := ocr.OllamaHTTPClient(); client != nil {
opts = append(opts, ollama.WithHTTPClient(client))
}
llm, err := ollama.New(opts...)
if err != nil {
return nil, err
@@ -1136,6 +1142,7 @@ func createVisionLLM() (llms.Model, error) {
}
}


func createCustomHTTPClient() *http.Client {
// Create custom transport that adds headers
customTransport := &headerTransport{
@@ -1160,8 +1167,9 @@ type headerTransport struct {

 // RoundTrip implements the http.RoundTripper interface
 func (t *headerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
+	req = req.Clone(req.Context())
 	for key, value := range t.headers {
-		req.Header.Add(key, value)
+		req.Header.Set(key, value)
 	}
 	return t.transport.RoundTrip(req)
 }
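
The `Add`/`Set` distinction in this hunk matters when a request already carries the header being injected. A minimal sketch using only `net/http` (the function names `withAdd` and `withSet` are illustrative, not from the repository):

```go
package main

import (
	"fmt"
	"net/http"
)

// withAdd simulates injecting a header with Add onto a request that already
// has a value for it: the new value is appended alongside the old one.
func withAdd() http.Header {
	h := http.Header{}
	h.Set("Authorization", "Bearer old")
	h.Add("Authorization", "Bearer new")
	return h
}

// withSet simulates the same injection with Set: the old value is replaced.
func withSet() http.Header {
	h := http.Header{}
	h.Set("Authorization", "Bearer old")
	h.Set("Authorization", "Bearer new")
	return h
}

func main() {
	fmt.Println(withAdd().Values("Authorization")) // two values
	fmt.Println(withSet().Values("Authorization")) // one value
}
```

With `Add`, a server or proxy may receive two `Authorization` values and pick either one; `Set` makes the configured header win deterministically.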
42 changes: 42 additions & 0 deletions ocr/llm_provider.go
@@ -6,6 +6,7 @@ import (
"encoding/base64"
"fmt"
"image"
"net/http"
"os"
"strings"

@@ -208,6 +209,44 @@ func createOpenAIClient(config Config) (llms.Model, error) {
)
}

// OllamaHTTPClient returns an *http.Client with headers from OLLAMA_HEADERS injected,
// or nil if OLLAMA_HEADERS is not set.
func OllamaHTTPClient() *http.Client {
	raw := os.Getenv("OLLAMA_HEADERS")
	if raw == "" {
		return nil
	}
	headers := map[string]string{}
	for _, pair := range strings.Split(raw, ",") {
		parts := strings.SplitN(strings.TrimSpace(pair), "=", 2)
		if len(parts) == 2 && parts[0] != "" {
			headers[parts[0]] = parts[1]
		}
	}
	if len(headers) == 0 {
		return nil
	}
	return &http.Client{
		Transport: &ollamaHeaderTransport{
			base:    http.DefaultTransport,
			headers: headers,
		},
	}
}

type ollamaHeaderTransport struct {
	base    http.RoundTripper
	headers map[string]string
}

func (t *ollamaHeaderTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	req = req.Clone(req.Context())
	for k, v := range t.headers {
		req.Header.Set(k, v)
	}
	return t.base.RoundTrip(req)
}
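
A header-injecting transport of this shape can be exercised against a local test server. The sketch below is self-contained and uses a hypothetical `headerInjector` type as a stand-in for the transport in this diff; the `probe` helper and the token value are likewise illustrative.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// headerInjector is a hypothetical stand-in for a header-adding RoundTripper.
type headerInjector struct {
	base    http.RoundTripper
	headers map[string]string
}

func (t *headerInjector) RoundTrip(req *http.Request) (*http.Response, error) {
	req = req.Clone(req.Context()) // leave the caller's request untouched
	for k, v := range t.headers {
		req.Header.Set(k, v)
	}
	return t.base.RoundTrip(req)
}

// probe sends one request through the injecting client. It returns the
// Authorization header the server saw and the one left on the caller's request.
func probe() (seen, callerSide string, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, r.Header.Get("Authorization"))
	}))
	defer srv.Close()

	client := &http.Client{Transport: &headerInjector{
		base:    http.DefaultTransport,
		headers: map[string]string{"Authorization": "Bearer mytoken"},
	}}
	req, err := http.NewRequest(http.MethodGet, srv.URL, nil)
	if err != nil {
		return "", "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), req.Header.Get("Authorization"), err
}

func main() {
	seen, callerSide, err := probe()
	fmt.Printf("server saw %q, caller request has %q, err=%v\n", seen, callerSide, err)
}
```

Cloning before mutating follows the `http.RoundTripper` contract, which says a `RoundTrip` implementation should not modify the request it is given; the probe checks this by confirming the caller's request still has no `Authorization` header afterwards.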

// createOllamaClient creates a new Ollama vision model client
func createOllamaClient(config Config) (llms.Model, error) {
host := os.Getenv("OLLAMA_HOST")
@@ -221,6 +260,9 @@ func createOllamaClient(config Config) (llms.Model, error) {
if config.OllamaContextLength > 0 {
opts = append(opts, ollama.WithRunnerNumCtx(config.OllamaContextLength))
}
if client := OllamaHTTPClient(); client != nil {
opts = append(opts, ollama.WithHTTPClient(client))
}
return ollama.New(opts...)
}
