
perf(llm): Optimize pruneLines functions in countTokens #5310


Open · wants to merge 3 commits into main

Conversation

0x23d11 (Contributor) commented Apr 23, 2025

Description

Closes #4947

This PR addresses issue #4947 by optimizing the performance of the pruneLinesFromTop and pruneLinesFromBottom functions in core/llm/countTokens.ts.

Problem

The previous implementations used Array.prototype.shift() and Array.prototype.pop() within a while loop to remove lines from the beginning or end of a prompt until it fit within the token limit. shift() has O(n) time complexity because every subsequent element must be shifted down one slot, so pruning from the top of an n-line prompt degrades to O(n²) overall. For very long prompts (e.g., thousands of lines), this repeated per-iteration mutation becomes computationally expensive and causes significant performance degradation.
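
Paraphrased, the old top-pruning hot loop had roughly this shape (a sketch for illustration, not the exact original code; countTokens stands in for the tokenizer helper already defined in the same file):

```typescript
// countTokens is the tokenizer helper defined in core/llm/countTokens.ts.
declare function countTokens(text: string, modelName: string): number;

function pruneLinesFromTop(prompt: string, maxTokens: number, modelName: string): string {
  let totalTokens = countTokens(prompt, modelName);
  const lines = prompt.split("\n");
  while (totalTokens > maxTokens && lines.length > 0) {
    // shift() is O(n): every remaining line is moved down one slot,
    // so this loop is O(n²) in the number of lines.
    totalTokens -= countTokens(lines.shift()!, modelName);
  }
  return lines.join("\n");
}
```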

Solution

The implemented solution refactors these functions to avoid costly array modifications within the loop (a condensed sketch follows the list below):

  1. The prompt is split into lines.
  2. The token count for each line is calculated once upfront and stored in an array (lineTokens).
  3. The total initial token count is calculated by summing lineTokens and adding the count for necessary newline characters (\n).
  4. A while loop runs as long as totalTokens exceeds maxTokens.
  5. Inside the loop, instead of removing elements from the lines array, an index pointer (start or end) is adjusted.
  6. The pre-calculated token count for the line being (conceptually) removed, along with its corresponding newline token, is subtracted from totalTokens.
  7. After the loop, Array.prototype.slice() (an O(n) operation performed only once) is used with the final start or end index to extract the desired lines.
  8. The resulting lines are joined back into a string.
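
A condensed sketch of the refactored pruneLinesFromTop following these steps (paraphrased, not the exact diff; pruneLinesFromBottom mirrors it with an end pointer and lines.slice(0, end)):

```typescript
// countTokens is the tokenizer helper defined in core/llm/countTokens.ts.
declare function countTokens(text: string, modelName: string): number;

function pruneLinesFromTop(prompt: string, maxTokens: number, modelName: string): string {
  // Step 1: split into lines.
  const lines = prompt.split("\n");
  // Step 2: per-line token counts, computed once upfront.
  const lineTokens = lines.map((line) => countTokens(line, modelName));
  const newlineTokens = countTokens("\n", modelName);
  // Step 3: initial total = sum of line tokens plus one newline token per join.
  let totalTokens =
    lineTokens.reduce((sum, t) => sum + t, 0) +
    newlineTokens * Math.max(lines.length - 1, 0);

  // Steps 4-6: advance an index pointer instead of mutating the array.
  // (This sketch glosses over the no-newline edge case when one line remains.)
  let start = 0;
  while (totalTokens > maxTokens && start < lines.length) {
    totalTokens -= lineTokens[start] + newlineTokens; // O(1) per iteration
    start++;
  }

  // Steps 7-8: a single O(n) slice at the end, then join back into a string.
  return lines.slice(start).join("\n");
}
```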

Benefits

This approach drastically reduces the computational cost for large prompts: the expensive O(n) shift() calls (and repeated array mutation) inside the loop are replaced by cheap O(1) index increments/decrements, taking the prune from O(n²) to O(n) overall. The per-line token calculation and the final slice are each performed only once.

Checklist

  • [x] I've read the contributing guide
  • [ ] The relevant docs, if any, have been updated or created
  • [ ] The relevant tests, if any, have been updated or created

Screenshots

[ For visual changes, include screenshots. Screen recordings are particularly helpful, and appreciated! ]

Testing instructions

[ For new or modified features, provide step-by-step testing instructions to validate the intended behavior of the change, including any relevant tests to run. ]

0x23d11 requested a review from a team as a code owner on Apr 23, 2025 at 12:11
0x23d11 requested review from sestinj and removed the request for a team on Apr 23, 2025 at 12:11
netlify bot commented Apr 23, 2025

Deploy Preview for continuedev canceled.

🔨 Latest commit: 35b3189
🔍 Latest deploy log: https://app.netlify.com/sites/continuedev/deploys/6808d8e238a57c00082bdaa9

sestinj (Contributor) left a comment

This is a fairly complex change that affects a lot of things, so if we are going to consider merging it, I would ask that it come with very good testing. Are you able to add unit tests that cover edge cases and make sure that this hasn't regressed in any way?

RomneyDa (Collaborator) commented Apr 23, 2025

@0x23d11 before writing tests for this, note that #5138 changes pruning logic quite a bit and will affect this.

EDIT: I didn't look closely enough; this is for pruning lines, not messages.

sestinj closed this on Apr 23, 2025
continuedev deleted a comment from sestinj on Apr 23, 2025
RomneyDa reopened this on Apr 23, 2025
RomneyDa (Collaborator) commented

@0x23d11 There are some tests in countTokens.test.ts that you could just flesh out a bit with more examples and then unskip!
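
For illustration only, an added case might look something like this (hypothetical prompt, budget, and model name; the actual exports and assertions in countTokens.test.ts may differ):

```typescript
import { countTokens, pruneLinesFromTop } from "./countTokens";

describe("pruneLinesFromTop", () => {
  it("drops leading lines until the prompt fits the token budget", () => {
    const prompt = ["first line", "second line", "third line"].join("\n");
    const maxTokens = 5; // hypothetical budget smaller than the full prompt
    const pruned = pruneLinesFromTop(prompt, maxTokens, "gpt-4");

    // The pruned prompt must fit within the budget...
    expect(countTokens(pruned, "gpt-4")).toBeLessThanOrEqual(maxTokens);
    // ...and must be a suffix of the original, i.e. only top lines were removed.
    expect(prompt.endsWith(pruned)).toBe(true);
  });
});
```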

0x23d11 (Contributor, Author) commented Apr 24, 2025

@RomneyDa ok, I've seen the tests; I'll write some more examples for them.

After that it should be good to go, right? Assuming the new test additions are good enough.

sestinj (Contributor) commented Apr 29, 2025

@0x23d11 I'd love to merge this PR, please let me know if you have the chance to write some tests, or if you'd like any help!
