Ways of telling by how much a too-large input is too large #5

Open
domenic opened this issue Aug 30, 2024 · 3 comments · May be fixed by #31
domenic commented Aug 30, 2024

It's possible that an input fed to these APIs is too large. This will be signaled by a rejected promise, probably with a "QuotaExceededError" DOMException.

However, @uskay points out that this does not allow you to give informative error messages, telling the user or web developer by how much the input is too large.

There are two possible APIs one could imagine here:

Measure, then summarize

const summarizer = await ai.summarizer.create();
const summarizerCapabilities = await ai.summarizer.capabilities();

const tokenCount = await summarizer.countTokens(input);
if (tokenCount > summarizerCapabilities.maxTokens) {
  console.error(`Too large! You tried to summarize ${tokenCount} tokens, but only up to ${summarizerCapabilities.maxTokens} are possible!`);
} else {
  console.log(await summarizer.summarize(input));
}

This API is probably bad because it requires two round trips to the language model: one to tokenize, and then a second to tokenize-plus-summarize.

More informative errors

This would probably look something like:

const summarizer = await ai.summarizer.create();
const summarizerCapabilities = await ai.summarizer.capabilities();

try {
  console.log(await summarizer.summarize(input));
} catch (e) {
  if (e.name === "TooManyTokensError") {
    console.error(`Too large! You tried to summarize ${e.tokenCount} tokens, but only up to ${summarizerCapabilities.maxTokens} are possible!`);
  } else {
    throw e;
  }
}

This is probably better since it only has one round trip.

domenic commented Nov 26, 2024

In a separate thread @andreban pointed out that you probably need both. Above I pointed out situations where "more informative errors" is better, but his scenario is

A developer may want to provide feedback to the user on how close they are to reaching the limit before the user submits the content to be summarized, so you'd need to know max tokens and be able to count tokens, without actually summarizing

which cannot really be handled without the "measure, then summarize" approach. We'll just have to be sure to put appropriate warnings on the countTokens() API.
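To make that scenario concrete, here is a minimal sketch of pre-submit feedback. The summarizer object below is a mock: countTokens() and maxTokens follow the shape discussed above, but the tokenizer is a toy whitespace split, since the real model-backed API doesn't exist yet.

```javascript
// Mock of the proposed API surface; a real summarizer would come from
// ai.summarizer.create() and count tokens asynchronously via the model.
const summarizer = {
  maxTokens: 10,
  // Stand-in for the real (async, model-backed) countTokens().
  async countTokens(input) {
    return input.trim() === "" ? 0 : input.trim().split(/\s+/).length;
  },
};

// In a real UI this could run on every input event, before submission.
async function quotaFeedback(draft) {
  const used = await summarizer.countTokens(draft);
  const pct = Math.round((used / summarizer.maxTokens) * 100);
  return used > summarizer.maxTokens
    ? `Over the limit by ${used - summarizer.maxTokens} tokens`
    : `${used}/${summarizer.maxTokens} tokens (${pct}%)`;
}
```

The key point is that countTokens() never invokes summarization, so it can be called cheaply and repeatedly while the user types.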

domenic added a commit that referenced this issue Jan 22, 2025
domenic linked a pull request Jan 22, 2025 that will close this issue
domenic commented Jan 22, 2025

I've put up an initial draft for this in #31.

The prompt API will change to align with this.

domenic commented Feb 26, 2025

Let's assume that whatwg/webidl#1465 works out. Then we will have a QuotaExceededError with properties quota and requested.
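Under that assumption, error handling could look roughly like the sketch below. The QuotaExceededError class and the summarize() function here are stand-ins for the proposed interface (an error carrying quota and requested), not the real implementation.

```javascript
// Stand-in for the QuotaExceededError proposed in whatwg/webidl#1465:
// an exception carrying `quota` and `requested` properties.
class QuotaExceededError extends Error {
  constructor(message, { quota, requested }) {
    super(message);
    this.name = "QuotaExceededError";
    this.quota = quota;
    this.requested = requested;
  }
}

// Hypothetical summarize() that rejects when the input exceeds its quota.
// Token counting is a toy whitespace split for illustration.
async function summarize(input, quota = 5) {
  const requested = input.trim().split(/\s+/).length;
  if (requested > quota) {
    throw new QuotaExceededError("Input too large", { quota, requested });
  }
  return `summary of ${requested} tokens`;
}

// The error's properties let the caller say exactly how far over it was.
async function trySummarize(input) {
  try {
    return await summarize(input);
  } catch (e) {
    if (e.name === "QuotaExceededError") {
      return `Too large: requested ${e.requested}, quota ${e.quota} ` +
             `(${e.requested - e.quota} over)`;
    }
    throw e;
  }
}
```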

Should we rename other properties and methods to align with this?

Considerations:

  • It might be good to move away from "tokens" since that is somewhat specific to the LLM implementation strategy.
  • Similarly, it might be good to move away from the current prompt API's oncontextoverflow, since the word "context" doesn't appear anywhere else in the API.
  • Probably reusing the "quota" language is a good idea?
  • We should probably align prompt API and writing assistance APIs to some degree.
    • Right now prompt API has countPromptTokens(), maxTokens, tokensSoFar, tokensLeft.
    • Writing assistance APIs don't need a tokensSoFar/tokensLeft, since they're not stateful, but they would benefit from a count method and a max value.

Proposal 1:

  • maxTokens => quota
  • tokensSoFar => consumed (or used?)
  • tokensLeft => nothing, just subtract
  • countPromptTokens() => consumption() (or usage()?)
  • oncontextoverflow => onquotaoverflow

These are nice and short. However, I'm a bit worried that these names are too generic. Compare ai.languageModel.consumption("string") to ai.languageModel.countTokens("string"), or ai.languageModel.tokensSoFar to ai.languageModel.consumed.

So proposal 2 would be to add a prefix, and change a few names to fit better with that prefix:

  • maxTokens => inputQuota
  • tokensSoFar => inputUsage
  • tokensLeft => nothing, just subtract
  • countPromptTokens() => measureInputUsage()
  • oncontextoverflow => oninputquotaoverflow
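For concreteness, here is how proposal 2's names might read at a call site. The summarizer object is mocked (measureInputUsage() just counts whitespace-separated words), since no implementation of these names exists yet.

```javascript
// Mock summarizer exposing proposal 2's names: inputQuota and
// measureInputUsage(). Token counting is a toy whitespace split.
const summarizer = {
  inputQuota: 4,
  async measureInputUsage(input) {
    return input.trim().split(/\s+/).length;
  },
};

// Pre-flight check using the proposed names. There is no tokensLeft
// equivalent: remaining quota is just inputQuota minus measured usage.
async function checkInput(input) {
  const usage = await summarizer.measureInputUsage(input);
  if (usage > summarizer.inputQuota) {
    return `Too large: usage ${usage} exceeds quota ${summarizer.inputQuota}`;
  }
  return `OK: ${summarizer.inputQuota - usage} tokens of quota left`;
}
```

Reading these out loud (summarizer.inputQuota, summarizer.measureInputUsage(input)) suggests the prefix does resolve the genericness concern from proposal 1.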
