FXC-3294: Add opt-in local cache for simulation results, hashed by simulation + runtime context (#2871)
Conversation
8 files reviewed, 10 comments
Diff Coverage
Diff: origin/develop...HEAD, staged and unstaged changes
Summary:
- tidy3d/config/sections.py
- tidy3d/web/api/container.py
- tidy3d/web/api/run.py
- tidy3d/web/api/webapi.py
- tidy3d/web/cache.py
Force-pushed 19688af to 4539ed9.
Force-pushed 4539ed9 to 80d7e47.
lucas-flexcompute
left a comment
I believe this works for photonforge!
Just to make sure I understand correctly, we basically do:

```python
cache = get_cache()
entry = cache.try_fetch(simulation)
if entry:
    ...
# later, after loading results
cache.store_result(simulation_data, task_id, path)
```
Can I set path to be the actual cache path? I don't want the user having to worry about where the data will reside.
I also just realized there's no way to retrieve the cached data without copying (see line 254 in 80d7e47).
Can we have the option to skip copying? Imagine doing this for hundreds of files in a single circuit simulation… Ideally, I'd like to just retrieve the data itself.
Force-pushed 80d7e47 to fafeb3c.
I just added the option to specify no path (default).
I would just rely on …
Ah, I see! Sorry, I completely missed that function! No problem, then!
Force-pushed fafeb3c to c8e6049.
Perfect. One correction from my last comment:
If I don't use …
Yup! Technically, storing is done in `load`.
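For context, the store-on-load behavior mentioned here could look roughly like this. The signatures are hypothetical sketches; only `store_result` and `load` are named in the thread, and `postprocess` is injected purely for illustration:

```python
def load(task_id, path, cache=None, postprocess=None):
    """Hypothetical sketch: loading results also populates the cache."""
    data = postprocess(path)  # build the result object from the local file
    if cache is not None:
        # Storing happens here rather than in run(): any successful load
        # of results records them for reuse on future identical runs.
        cache.store_result(data, task_id, path)
    return data
```

This keeps the cache-write in one place, so both fresh downloads and explicit loads populate the cache.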
yaugenst-flex
left a comment
Thanks @marcorudolphflex, this will be a super useful feature to have!
Force-pushed 54acc71 to 99a21b6.
Force-pushed 9db9a0c to 86586e0.
Force-pushed 86586e0 to 6e39d6c.
Force-pushed 8b3d741 to df6c375.
Force-pushed df6c375 to 2b7c535.
yaugenst-flex
left a comment
Thanks @marcorudolphflex, LGTM! Can you make sure all previous comments are resolved, and then we can merge.
momchil-flex
left a comment
Thanks, this is much cleaner now!
My main comments are about making sure this is well documented for the user.
Force-pushed 3c650da to d4c2c89.
Summary
Add opt-in local cache for simulation results hashed by simulation + runtime context
Background
Every repeated run of the same simulation re-uploads and re-downloads data. We want to cache the resulting .hdf5 locally (opt-in) and reuse it on identical runs. The cache key must go beyond `_hash_self()` because the solver version, workflow path, environment, and runtime options also influence the artifact we return.
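A composite key of that kind can be sketched as follows. The function and field names here are illustrative, not the PR's actual implementation, and the real key may include additional context:

```python
import hashlib
import json

def build_cache_key(sim_hash: str, solver_version: str,
                    workflow_type: str, env: str) -> str:
    """Compose a cache key from the simulation hash plus runtime context.

    Illustrative sketch: any change in solver version, workflow, or
    environment yields a different key, so stale artifacts are never reused.
    """
    context = json.dumps(
        {
            "sim": sim_hash,
            "version": solver_version,
            "workflow": workflow_type,
            "env": env,
        },
        sort_keys=True,  # stable ordering: identical contexts hash identically
    )
    return hashlib.sha256(context.encode()).hexdigest()
```

For example, the same simulation hash run under two solver versions produces two distinct cache keys, so a version upgrade transparently invalidates old entries.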
Greptile Overview
Updated On: 2025-10-07 11:35:12 UTC
Summary
This PR introduces a comprehensive opt-in local caching system for Tidy3D simulation results to eliminate redundant uploads and downloads when running identical simulations. The cache stores simulation result HDF5 files locally using a composite cache key that combines the simulation hash with runtime context (solver version, workflow type, environment variables) to ensure cache validity beyond simple simulation parameters.

The implementation centers around a new `SimulationCache` class in `tidy3d/web/cache.py` that provides thread-safe LRU caching with configurable size and entry limits. The cache integrates into the existing web API workflow by intercepting `run()` calls early to check for cached results and storing successful simulation outputs in the `load()` function. Cache entries are validated using SHA256 checksums and support atomic file operations to prevent corruption during concurrent access.

Configuration is handled through a new `SimulationCacheSettings` class in `config.py` with sensible defaults (disabled by default, 10 GB max size, 128 max entries, `~/.tidy3d/cache/simulations` directory). The feature is exposed through an optional `use_cache` parameter across all API entry points (`run()`, `run_async()`, autograd functions), allowing per-call override of the global cache settings.

The cache system handles both individual `Job` objects and batch operations through `Batch` objects, with comprehensive error handling for edge cases like cache corruption, missing files, and network failures. The implementation uses proper thread synchronization, a global singleton pattern, and LRU eviction policies to manage cache size.

Important Files Changed
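The thread-safe LRU bookkeeping with entry and size limits described above can be sketched like this. This is illustrative only: the real `SimulationCache` manages files on disk, while this sketch tracks only an in-memory index of entry sizes:

```python
import threading
from collections import OrderedDict

class LRUCacheIndex:
    """Minimal sketch of LRU eviction with entry-count and byte limits."""

    def __init__(self, max_entries=128, max_bytes=10 * 1024**3):
        self._lock = threading.Lock()
        self._entries = OrderedDict()  # key -> size in bytes, LRU first
        self.max_entries = max_entries
        self.max_bytes = max_bytes

    def touch(self, key, size):
        """Record an access (or insert) and return any evicted keys."""
        with self._lock:
            self._entries.pop(key, None)
            self._entries[key] = size  # most recently used moves to the end
            evicted = []
            while (len(self._entries) > self.max_entries
                   or sum(self._entries.values()) > self.max_bytes):
                old_key, _ = self._entries.popitem(last=False)  # drop LRU
                evicted.append(old_key)
            return evicted
```

The single lock around both the reorder and the eviction loop keeps concurrent callers from observing a partially evicted index, mirroring the thread-safety goal stated above.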
Confidence score: 4/5
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant WebAPI as "web.run()"
    participant Cache as "SimulationCache"
    participant Job as "Job"
    participant Server as "Server"
    participant Stub as "Tidy3dStub"
    User->>WebAPI: "run(simulation, use_cache=True)"
    WebAPI->>Cache: "resolve_simulation_cache(use_cache)"
    Cache-->>WebAPI: "SimulationCache instance or None"
    alt Cache enabled
        WebAPI->>Cache: "try_fetch(simulation)"
        Cache->>Cache: "build_cache_key(simulation_hash, workflow_type, version)"
        Cache->>Cache: "_fetch(cache_key)"
        alt Cache hit
            Cache-->>WebAPI: "CacheEntry"
            WebAPI->>Cache: "materialize(path)"
            Cache-->>WebAPI: "Cached data path"
            WebAPI->>Stub: "postprocess(path)"
            Stub-->>WebAPI: "SimulationData"
            WebAPI-->>User: "SimulationData (from cache)"
        else Cache miss
            Cache-->>WebAPI: "None"
            WebAPI->>Job: "upload(simulation)"
            Job->>Server: "Upload simulation"
            Server-->>Job: "task_id"
            WebAPI->>Job: "start(task_id)"
            Job->>Server: "Start simulation"
            WebAPI->>Job: "monitor(task_id)"
            Job->>Server: "Poll status"
            Server-->>Job: "Status updates"
            WebAPI->>Job: "download(task_id, path)"
            Job->>Server: "Download results"
            Server-->>Job: "Simulation data file"
            WebAPI->>Stub: "postprocess(path)"
            Stub-->>WebAPI: "SimulationData"
            WebAPI->>Cache: "store_result(stub_data, task_id, path, workflow_type)"
            Cache->>Cache: "_store(cache_key, task_id, source_path, metadata)"
            WebAPI-->>User: "SimulationData"
        end
    else Cache disabled
        WebAPI->>Job: "upload(simulation)"
        Job->>Server: "Upload simulation"
        Server-->>Job: "task_id"
        WebAPI->>Job: "start(task_id)"
        Job->>Server: "Start simulation"
        WebAPI->>Job: "monitor(task_id)"
        Job->>Server: "Poll status"
        Server-->>Job: "Status updates"
        WebAPI->>Job: "download(task_id, path)"
        Job->>Server: "Download results"
        Server-->>Job: "Simulation data file"
        WebAPI->>Stub: "postprocess(path)"
        Stub-->>WebAPI: "SimulationData"
        WebAPI-->>User: "SimulationData"
    end
```
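The branching in the diagram corresponds roughly to the following control-flow sketch. The collaborators are injected as parameters purely for illustration; the real `web.run()` takes no such arguments, and only `resolve_simulation_cache`, `try_fetch`, and `store_result` are named in the diagram:

```python
def run(simulation, use_cache=False, resolve_simulation_cache=None,
        submit_and_download=None, postprocess=None):
    """Sketch of the cache-hit / cache-miss / cache-disabled branches."""
    cache = resolve_simulation_cache(use_cache)
    if cache is not None:
        entry = cache.try_fetch(simulation)
        if entry is not None:
            # Cache hit: skip upload/run/download entirely.
            return postprocess(entry.path)
    # Cache miss or cache disabled: full round trip to the server.
    task_id, path = submit_and_download(simulation)
    data = postprocess(path)
    if cache is not None:
        # Record the fresh result so identical future runs hit the cache.
        cache.store_result(data, task_id, path)
    return data
```

Note that the server round trip and the cache write are independent of each other: a disabled cache changes nothing about the existing workflow.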