Ways of telling by how much a too-large input is too large #5

Open
domenic opened this issue Aug 30, 2024 · 3 comments · May be fixed by #31
domenic commented Aug 30, 2024

It's possible that an input fed to these APIs is too large. This will be signaled by a rejected promise, probably with a "QuotaExceededError" DOMException.

However, @uskay points out that this does not allow you to give informative error messages, telling the user or web developer by how much the input is too large.

There are two possible APIs one could imagine here:

Measure, then summarize

const summarizer = await ai.summarizer.create();
const summarizerCapabilities = await ai.summarizer.capabilities();

const tokenCount = await summarizer.countTokens(input);
if (tokenCount > summarizerCapabilities.maxTokens) {
  console.error(`Too large! You tried to summarize ${tokenCount} tokens, but only up to ${summarizerCapabilities.maxTokens} are possible!`);
} else {
  console.log(await summarizer.summarize(input));
}

This API is probably bad because it requires two round trips to the language model: one to tokenize, and then a second to tokenize-plus-summarize.

More informative errors

This would probably look something like:

const summarizer = await ai.summarizer.create();
const summarizerCapabilities = await ai.summarizer.capabilities();

try {
  console.log(await summarizer.summarize(input));
} catch (e) {
  if (e.name === "TooManyTokensError") {
    console.error(`Too large! You tried to summarize ${e.tokenCount} tokens, but only up to ${summarizerCapabilities.maxTokens} are possible!`);
  } else {
    throw e;
  }
}

This is probably better since it only has one round trip.

domenic commented Nov 26, 2024

In a separate thread @andreban pointed out that you probably need both. Above I pointed out situations where "more informative errors" is better, but his scenario is

A developer may want to provide feedback to the user on how close they are to reaching the limit before the user submits the content to be summarized, so you'd need to know max tokens and be able to count tokens, without actually summarizing

which cannot really be handled without the "measure, then summarize" approach. We'll just have to be sure to put appropriate warnings on the countTokens() API.
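To make that scenario concrete, here is a minimal sketch of pre-submit feedback. The summarizer object below is a mock: countTokens() and maxTokens follow the shape discussed above, but the tokenizer is a toy whitespace split, since the real model-backed API doesn't exist yet.

```javascript
// Mock of the proposed API surface; a real summarizer would come from
// ai.summarizer.create() and count tokens asynchronously via the model.
const summarizer = {
  maxTokens: 10,
  // Stand-in for the real (async, model-backed) countTokens().
  async countTokens(input) {
    return input.trim() === "" ? 0 : input.trim().split(/\s+/).length;
  },
};

// In a real UI this could run on every input event, before submission.
async function quotaFeedback(draft) {
  const used = await summarizer.countTokens(draft);
  const pct = Math.round((used / summarizer.maxTokens) * 100);
  return used > summarizer.maxTokens
    ? `Over the limit by ${used - summarizer.maxTokens} tokens`
    : `${used}/${summarizer.maxTokens} tokens (${pct}%)`;
}
```

The key point is that countTokens() never invokes summarization, so it can be called cheaply and repeatedly while the user types.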

domenic added a commit that referenced this issue Jan 22, 2025
domenic linked a pull request Jan 22, 2025 that will close this issue
domenic commented Jan 22, 2025

I've put up an initial draft for this in #31.

The prompt API will change to align with this.

domenic commented Feb 26, 2025

Let's assume that whatwg/webidl#1465 works out. Then we will have a QuotaExceededError with properties quota and requested.
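Under that assumption, error handling could look roughly like the sketch below. The QuotaExceededError class and the summarize() function here are stand-ins for the proposed interface (an error carrying quota and requested), not the real implementation.

```javascript
// Stand-in for the QuotaExceededError proposed in whatwg/webidl#1465:
// an exception carrying `quota` and `requested` properties.
class QuotaExceededError extends Error {
  constructor(message, { quota, requested }) {
    super(message);
    this.name = "QuotaExceededError";
    this.quota = quota;
    this.requested = requested;
  }
}

// Hypothetical summarize() that rejects when the input exceeds its quota.
// Token counting is a toy whitespace split for illustration.
async function summarize(input, quota = 5) {
  const requested = input.trim().split(/\s+/).length;
  if (requested > quota) {
    throw new QuotaExceededError("Input too large", { quota, requested });
  }
  return `summary of ${requested} tokens`;
}

// The error's properties let the caller say exactly how far over it was.
async function trySummarize(input) {
  try {
    return await summarize(input);
  } catch (e) {
    if (e.name === "QuotaExceededError") {
      return `Too large: requested ${e.requested}, quota ${e.quota} ` +
             `(${e.requested - e.quota} over)`;
    }
    throw e;
  }
}
```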

Should we rename other properties and methods to align with this?

Considerations:

  • It might be good to move away from "tokens" since that is somewhat specific to the LLM implementation strategy.
  • Similarly, it might be good to move away from the current prompt API's oncontextoverflow, since the word "context" doesn't appear anywhere else in the API.
  • Probably reusing the "quota" language is a good idea?
  • We should probably align prompt API and writing assistance APIs to some degree.
    • Right now prompt API has countPromptTokens(), maxTokens, tokensSoFar, tokensLeft.
    • Writing assistance APIs don't need a tokensSoFar/tokensLeft, since they're not stateful, but they would benefit from a count method and a max value.

Proposal 1:

  • maxTokens => quota
  • tokensSoFar => consumed (or used?)
  • tokensLeft => nothing, just subtract
  • countPromptTokens() => consumption() (or usage()?)
  • oncontextoverflow => onquotaoverflow

These are nice and short. However, I'm a bit worried that these names are too generic. Compare ai.languageModel.consumption("string") to ai.languageModel.countTokens("string"), or ai.languageModel.tokensSoFar to ai.languageModel.consumed.

So proposal 2 would be to add a prefix, and change a few names to fit better with that prefix:

  • maxTokens => inputQuota
  • tokensSoFar => inputUsage
  • tokensLeft => nothing, just subtract
  • countPromptTokens() => measureInputUsage()
  • oncontextoverflow => oninputquotaoverflow
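For concreteness, here is how proposal 2's names might read at a call site. The summarizer object is mocked (measureInputUsage() just counts whitespace-separated words), since no implementation of these names exists yet.

```javascript
// Mock summarizer exposing proposal 2's names: inputQuota and
// measureInputUsage(). Token counting is a toy whitespace split.
const summarizer = {
  inputQuota: 4,
  async measureInputUsage(input) {
    return input.trim().split(/\s+/).length;
  },
};

// Pre-flight check using the proposed names. There is no tokensLeft
// equivalent: remaining quota is just inputQuota minus measured usage.
async function checkInput(input) {
  const usage = await summarizer.measureInputUsage(input);
  if (usage > summarizer.inputQuota) {
    return `Too large: usage ${usage} exceeds quota ${summarizer.inputQuota}`;
  }
  return `OK: ${summarizer.inputQuota - usage} tokens of quota left`;
}
```

Reading these out loud (summarizer.inputQuota, summarizer.measureInputUsage(input)) suggests the prefix does resolve the genericness concern from proposal 1.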
