Feat/pass countinference to serverless getweights #1373

Open · wants to merge 11 commits into main

Conversation

bigbitbus (Contributor)

Description

Pass through the `countinference` and `service_secret` parameters to the Roboflow API backend.
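A minimal sketch of the pass-through described above. The parameter names `countinference` and `service_secret` come from this PR; the function name, base URL, and serialization details below are illustrative assumptions, not the repo's actual API client code.

```python
# Hypothetical helper illustrating optional parameter pass-through.
# Only the parameter names are from the PR; everything else is assumed.
from typing import Optional

API_BASE_URL = "https://api.roboflow.com"  # assumed base URL

def build_getweights_params(
    api_key: str,
    countinference: Optional[bool] = None,
    service_secret: Optional[str] = None,
) -> dict:
    """Build query parameters, forwarding the optional values only when set."""
    params = {"api_key": api_key}
    if countinference is not None:
        # forwarded so the backend can decide whether to count this inference
        params["countinference"] = str(countinference).lower()
    if service_secret is not None:
        params["service_secret"] = service_secret
    return params
```

The point of the `is not None` checks is that callers who do not supply the new parameters see no change in the request, keeping the feature non-breaking.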

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this change been tested? Please provide a test case or an example of how you tested the change.

Testing in staging (ongoing)

Any specific deployment considerations?

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs

  • Docs updated? What were the changes:

@bigbitbus bigbitbus force-pushed the feat/pass-countinference-to-serverless-getweights branch from 51dcd37 to 9ab0396 Compare June 18, 2025 19:52
codeflash-ai bot added a commit that referenced this pull request Jun 24, 2025
…(`feat/pass-countinference-to-serverless-getweights`)

Here's an optimized rewrite of your program, addressing profiling hot spots and general efficiency improvements.

**Optimization Summary:**

1. **Avoid Redundant Method Calls:** 
   - Minimize repeated lookups and calculations.
   - Cache computations/results when possible within function scope.
2. **Lazy Imports:** 
   - Move GC and optional torch imports where needed (they are only used upon eviction).
3. **Deque Optimizations:** 
   - In `WithFixedSizeCache.add_model`, avoid repeated `self._key_queue.remove(queue_id)` by checking position or maintaining a set for fast checks (no need, since only called if known present, and block is rare). Still, code can be reduced for clarity.
4. **Reduce logging** in the hot add path (except in DEBUG mode); logging was a major time sink during profiling.
5. **Batch Removals:** 
   - Accumulate models to remove and do a single `gc.collect()` call after, instead of per-iteration. 
6. **Data structure** choices are left unchanged (deque is still best for explicit ordering here).
7. **General Logic**: Use local variables for lookups on attributes used multiple times (minor, but helps).
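The eviction-related points above (lazy `gc` import, batched removals, a single `gc.collect()` after the batch) can be sketched as follows. The class mirrors the PR's `WithFixedSizeCache` in spirit, but the bodies are illustrative, not the repo's actual code:

```python
# Hedged sketch of a fixed-size model cache with batched eviction.
# Names mirror the PR's WithFixedSizeCache; implementation is assumed.
from collections import deque

class FixedSizeCacheSketch:
    def __init__(self, max_size: int = 8):
        self._models = {}
        self._key_queue = deque()  # explicit LRU ordering, oldest on the left
        self._max_size = max_size

    def add_model(self, queue_id: str, model) -> None:
        if queue_id in self._models:
            # refresh recency instead of re-adding the model
            self._key_queue.remove(queue_id)
            self._key_queue.append(queue_id)
            return
        evicted = []
        # accumulate evictions first instead of collecting per iteration
        while len(self._key_queue) >= self._max_size:
            oldest = self._key_queue.popleft()
            evicted.append(self._models.pop(oldest))
        self._key_queue.append(queue_id)
        self._models[queue_id] = model
        if evicted:
            import gc  # lazy import: only paid for when something was evicted
            del evicted
            gc.collect()  # single collection for the whole batch
```

The design choice worth noting is that `gc.collect()` runs once per `add_model` call at most, regardless of how many models were evicted, which is the batching optimization described in point 5.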

---




**Key Runtime Optimizations:**
- Only call `gc.collect()` after all removals in a batch, not after every single model eviction.
- Reduced logging in hot code paths (this was responsible for noticeable time in profiling).
- Use local variables when repeatedly accessing class attributes.
- Use direct inlining for `_resolve_queue_id` for this use case.
- Defensive handling if queue/model state falls out of sync—never throws unnecessarily.

**Performance Note:**
If you profile again after these changes, most of the time will now be in actual model loading and removal. That is, this code will not be a noticeable bottleneck anymore in the workflow. If LRU cache size is much larger, consider further data structure optimizations such as a dict for constant-time eviction and presence checking, but for N ~ 8 this is not needed.
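For the larger-cache case mentioned above, a dict-based structure gives constant-time eviction and presence checks. A hedged sketch using `OrderedDict` (illustrative only; not the PR's code, and as the note says, unnecessary at N ~ 8):

```python
# Assumed alternative for large caches: OrderedDict preserves insertion
# order and supports O(1) recency refresh and O(1) oldest-entry eviction.
from collections import OrderedDict

class DictLRUCacheSketch:
    def __init__(self, max_size: int):
        self._models = OrderedDict()
        self._max_size = max_size

    def add_model(self, queue_id: str, model) -> None:
        if queue_id in self._models:
            self._models.move_to_end(queue_id)  # O(1) recency refresh
            return
        while len(self._models) >= self._max_size:
            self._models.popitem(last=False)  # O(1) eviction of the oldest
        self._models[queue_id] = model
```

Compared with the deque version, this trades the explicit queue for a single ordered mapping, removing the O(n) `deque.remove` on the refresh path.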

codeflash-ai bot commented Jun 24, 2025

⚡️ Codeflash found optimizations for this PR

📄 50% (0.50x) speedup for WithFixedSizeCache.add_model in inference/core/managers/decorators/fixed_size_cache.py

⏱️ Runtime: 1.08 seconds → 722 milliseconds (best of 12 runs)

I created a new dependent PR with the suggested changes. Please review:

If you approve, it will be merged into this PR (branch feat/pass-countinference-to-serverless-getweights).
