feat: add runners to pb protocol#2433
feat: add runners to pb protocol#2433MasterPtato wants to merge 3 commits into05-27-chore_pegboard_add_exit_code_log_from_container_runnerfrom
Conversation
d3bc56e to
9e9302b
Compare
13d14bb to
35fc915
Compare
9e9302b to
b4a61a7
Compare
35fc915 to
bb8ccd5
Compare
faaaf17 to
542b3c5
Compare
542b3c5 to
885534d
Compare
There was a problem hiding this comment.
PR Summary
This PR introduces a major architectural change to the Rivet project by adding runners to the pegboard protocol and removing the isolate-v8-runner implementation. The changes consolidate runner management into a more generic protocol while maintaining backward compatibility.
- Removes
isolate-v8-runnerpackage and related JavaScript runtime functionality, suggesting a shift away from V8-based isolation - Adds new
actor2workflow with support for multi-slot runners and improved state management inpackages/edge/services/pegboard/src/workflows/actor2/ - Introduces runner tracking by remaining slots with new key structures in
packages/edge/services/pegboard/src/keys/runner.rs - Refactors container runner to use generic runner terminology and WebSocket-based communication in
packages/edge/infra/client/container-runner/ - Adds subnet configuration and IP management for runners in
pegboard_configure.shwith new variables likeSUBNET_IPV4andSUBNET_IPV6
76 file(s) reviewed, 49 comment(s)
Edit PR Review Bot Settings | Greptile
docker/universal/Dockerfile
Outdated
| --mount=type=cache,target=/root/.npm,id=universal-user-npm \ | ||
| --mount=type=cache,target=/root/.yarn,id=universal-user-yarn \ | ||
| RUSTFLAGS="--cfg tokio_unstable" RIVET_BUILD_HUB=0 cargo build --bin rivet-server --bin rivet-edge-server --bin rivet-guard --bin rivet-client --bin rivet-isolate-v8-runner --bin rivet-container-runner && \ | ||
| RUSTFLAGS="--cfg tokio_unstable" RIVET_BUILD_HUB=0 cargo build --bin rivet-server --bin rivet-edge-server --bin rivet-guard --bin rivet-client --bin rivet-container-runner && \ |
There was a problem hiding this comment.
logic: The rivet-isolate-v8-runner binary has been removed from the build command. Verify that this functionality is either no longer needed or has been replaced elsewhere in the codebase.
| http2: { | ||
| protocol: "http", | ||
| internalPort: 8085, | ||
| routing: { | ||
| guard: {}, | ||
| }, | ||
| }, |
There was a problem hiding this comment.
logic: HTTP2 port is configured but no tests are implemented to verify HTTP2 functionality
packages/common/hub-embed/build.rs
Outdated
| // Copy dist directory to out_dir | ||
| let dist_path = hub_path.join("dist"); | ||
| fs_extra::dir::copy( | ||
| dist_path.clone(), | ||
| out_dir, | ||
| &fs_extra::dir::CopyOptions::new().content_only(true), | ||
| )?; |
There was a problem hiding this comment.
logic: This will fail if the dist directory doesn't exist when RIVET_BUILD_HUB is false. The old code created an empty directory in this case, but now it tries to copy a potentially non-existent directory.
| // Copy dist directory to out_dir | |
| let dist_path = hub_path.join("dist"); | |
| fs_extra::dir::copy( | |
| dist_path.clone(), | |
| out_dir, | |
| &fs_extra::dir::CopyOptions::new().content_only(true), | |
| )?; | |
| // Copy dist directory to out_dir if it exists | |
| let dist_path = hub_path.join("dist"); | |
| if dist_path.exists() { | |
| fs_extra::dir::copy( | |
| dist_path.clone(), | |
| out_dir, | |
| &fs_extra::dir::CopyOptions::new().content_only(true), | |
| )?; | |
| } else { | |
| // Create empty directory when dist doesn't exist | |
| fs::create_dir_all(out_dir)? | |
| } |
packages/common/hub-embed/build.rs
Outdated
| println!("cargo:rerun-if-changed={}", hub_path.display()); | ||
| println!("cargo:rerun-if-env-changed=FONTAWESOME_PACKAGE_TOKEN"); |
There was a problem hiding this comment.
style: These rerun-if-changed triggers are now outside the build conditional, meaning they'll trigger rebuilds even when RIVET_BUILD_HUB is false.
| let mut create_sub = ctx | ||
| .subscribe::<pegboard::workflows::actor::CreateComplete>(("actor_id", actor_id)) | ||
|
|
||
| if let build::types::BuildAllocationType::None = build.allocation_type { |
There was a problem hiding this comment.
style: Consider extracting common workflow setup code to avoid duplication between allocation type branches
| pub struct RunnerConfig { | ||
| pub image: Image, | ||
| pub root_user_enabled: bool, | ||
| pub resources: Resources, | ||
| pub env: HashableMap<String, String>, | ||
| pub ports: HashableMap<String, Port>, | ||
| pub network_mode: NetworkMode, | ||
| } |
There was a problem hiding this comment.
style: RunnerConfig duplicates fields from deprecated ActorConfig. Consider reusing the existing types to maintain consistency.
| if let BuildAllocationType::Multi = input.build_allocation_type { | ||
| loop { | ||
| let Some(entry) = stream.try_next().await? else { | ||
| return Ok(None); | ||
| }; | ||
|
|
||
| let old_runner_allocation_key = keys::subspace() | ||
| .unpack::<keys::datacenter::RunnersByRemainingSlotsKey>(entry.key()) | ||
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?; | ||
|
|
||
| let data = old_runner_allocation_key | ||
| .deserialize(entry.value()) | ||
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?; | ||
|
|
||
| // Add read conflict only for this key | ||
| tx.add_conflict_range( | ||
| entry.key(), | ||
| &end_of_key_range(entry.key()), | ||
| ConflictRangeType::Read, | ||
| )?; | ||
|
|
||
| // Clear old entry | ||
| tx.clear(entry.key()); | ||
|
|
||
| let new_remaining_slots = | ||
| old_runner_allocation_key.remaining_slots.saturating_sub(1); | ||
|
|
||
| // Write new allocation key with 1 less slot | ||
| let new_allocation_key = keys::datacenter::RunnersByRemainingSlotsKey::new( | ||
| input.image_id, | ||
| new_remaining_slots, | ||
| old_runner_allocation_key.runner_id, | ||
| ); | ||
| tx.set(&keys::subspace().pack(&new_allocation_key), entry.value()); | ||
|
|
||
| // Update runner record | ||
| let remaining_slots_key = | ||
| keys::runner::RemainingSlotsKey::new(old_runner_allocation_key.runner_id); | ||
| tx.set( | ||
| &keys::subspace().pack(&remaining_slots_key), | ||
| &remaining_slots_key | ||
| .serialize(new_remaining_slots) | ||
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | ||
| ); | ||
|
|
||
| // Insert actor index key | ||
| let client_actor_key = | ||
| keys::client::ActorKey::new(data.client_id, input.actor_id); | ||
| tx.set( | ||
| &keys::subspace().pack(&client_actor_key), | ||
| &client_actor_key | ||
| .serialize(input.generation) | ||
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | ||
| ); | ||
|
|
||
| return Ok(Some(AllocateActorOutput { | ||
| runner_id: old_runner_allocation_key.runner_id, | ||
| new_runner: false, | ||
| client_id: data.client_id, | ||
| client_workflow_id: data.client_workflow_id, | ||
| })); | ||
| } | ||
| } |
There was a problem hiding this comment.
logic: The allocation loop for multi-type builds lacks a check for remaining_slots > 0, which could lead to allocating to runners with no capacity
| if let BuildAllocationType::Multi = input.build_allocation_type { | ||
| let remaining_slots = input.build_allocation_total_slots.saturating_sub(1); | ||
| let total_slots = input.build_allocation_total_slots; | ||
|
|
||
| // Insert runner records | ||
| let remaining_slots_key = keys::runner::RemainingSlotsKey::new(runner_id); | ||
| tx.set( | ||
| &keys::subspace().pack(&remaining_slots_key), | ||
| &remaining_slots_key | ||
| .serialize(remaining_slots) | ||
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | ||
| ); | ||
|
|
||
| let total_slots_key = keys::runner::TotalSlotsKey::new(runner_id); | ||
| tx.set( | ||
| &keys::subspace().pack(&total_slots_key), | ||
| &total_slots_key | ||
| .serialize(total_slots) | ||
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | ||
| ); | ||
|
|
||
| // Insert runner index key | ||
| let runner_idx_key = keys::datacenter::RunnersByRemainingSlotsKey::new( | ||
| input.image_id, | ||
| remaining_slots, | ||
| runner_id, | ||
| ); | ||
| tx.set( | ||
| &keys::subspace().pack(&runner_idx_key), | ||
| &runner_idx_key | ||
| .serialize(keys::datacenter::RunnersByRemainingSlotsKeyData { | ||
| client_id: old_client_allocation_key.client_id, | ||
| client_workflow_id, | ||
| }) | ||
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | ||
| ); | ||
| } |
There was a problem hiding this comment.
logic: Runner records are only created when build_allocation_type is Multi, but this creates an inconsistency since Single type builds should also have runner records
| if let BuildAllocationType::Multi = input.build_allocation_type { | |
| let remaining_slots = input.build_allocation_total_slots.saturating_sub(1); | |
| let total_slots = input.build_allocation_total_slots; | |
| // Insert runner records | |
| let remaining_slots_key = keys::runner::RemainingSlotsKey::new(runner_id); | |
| tx.set( | |
| &keys::subspace().pack(&remaining_slots_key), | |
| &remaining_slots_key | |
| .serialize(remaining_slots) | |
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | |
| ); | |
| let total_slots_key = keys::runner::TotalSlotsKey::new(runner_id); | |
| tx.set( | |
| &keys::subspace().pack(&total_slots_key), | |
| &total_slots_key | |
| .serialize(total_slots) | |
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | |
| ); | |
| // Insert runner index key | |
| let runner_idx_key = keys::datacenter::RunnersByRemainingSlotsKey::new( | |
| input.image_id, | |
| remaining_slots, | |
| runner_id, | |
| ); | |
| tx.set( | |
| &keys::subspace().pack(&runner_idx_key), | |
| &runner_idx_key | |
| .serialize(keys::datacenter::RunnersByRemainingSlotsKeyData { | |
| client_id: old_client_allocation_key.client_id, | |
| client_workflow_id, | |
| }) | |
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | |
| ); | |
| } | |
| // Insert runner records for all allocation types | |
| let (remaining_slots, total_slots) = match input.build_allocation_type { | |
| BuildAllocationType::Multi => { | |
| (input.build_allocation_total_slots.saturating_sub(1), input.build_allocation_total_slots) | |
| }, | |
| BuildAllocationType::Single => (0, 1), | |
| }; | |
| // Insert runner records | |
| let remaining_slots_key = keys::runner::RemainingSlotsKey::new(runner_id); | |
| tx.set( | |
| &keys::subspace().pack(&remaining_slots_key), | |
| &remaining_slots_key | |
| .serialize(remaining_slots) | |
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | |
| ); | |
| let total_slots_key = keys::runner::TotalSlotsKey::new(runner_id); | |
| tx.set( | |
| &keys::subspace().pack(&total_slots_key), | |
| &total_slots_key | |
| .serialize(total_slots) | |
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | |
| ); | |
| // Insert runner index key | |
| let runner_idx_key = keys::datacenter::RunnersByRemainingSlotsKey::new( | |
| input.image_id, | |
| remaining_slots, | |
| runner_id, | |
| ); | |
| tx.set( | |
| &keys::subspace().pack(&runner_idx_key), | |
| &runner_idx_key | |
| .serialize(keys::datacenter::RunnersByRemainingSlotsKeyData { | |
| client_id: old_client_allocation_key.client_id, | |
| client_workflow_id, | |
| }) | |
| .map_err(|x| fdb::FdbBindingError::CustomError(x.into()))?, | |
| ); |
| destroy::clear_ports_and_resources( | ||
| input.actor_id, | ||
| input.image_id, | ||
| Some(build.allocation_type), | ||
| ingress_ports, | ||
| Some(input.runner_id), | ||
| Some(input.client_id), | ||
| Some(input.client_workflow_id), | ||
| selected_resources_memory_mib, | ||
| selected_resources_cpu_millicores, | ||
| &tx, | ||
| ) | ||
| .await | ||
| } |
There was a problem hiding this comment.
logic: clear_ports_and_resources is called with build.allocation_type before checking if build exists, which could cause a panic if build is not found
| destroy::clear_ports_and_resources( | |
| input.actor_id, | |
| input.image_id, | |
| Some(build.allocation_type), | |
| ingress_ports, | |
| Some(input.runner_id), | |
| Some(input.client_id), | |
| Some(input.client_workflow_id), | |
| selected_resources_memory_mib, | |
| selected_resources_cpu_millicores, | |
| &tx, | |
| ) | |
| .await | |
| } | |
| destroy::clear_ports_and_resources( | |
| input.actor_id, | |
| input.image_id, | |
| build.map(|b| b.allocation_type), | |
| ingress_ports, | |
| Some(input.runner_id), | |
| Some(input.client_id), | |
| Some(input.client_workflow_id), | |
| selected_resources_memory_mib, | |
| selected_resources_cpu_millicores, | |
| &tx, | |
| ) | |
| .await | |
| } |
| @@ -1,2 +1,3 @@ | |||
| pub mod actor; | |||
| pub mod actor2; | |||
There was a problem hiding this comment.
style: Consider using a more descriptive name like actor_v2 or similar instead of actor2 to better indicate this is a new version
885534d to
d0f985a
Compare
|
Claude finished @MasterPtato's task —— View job PR Review: feat: add runners to pb protocolI've completed a comprehensive review of this major architectural change. This PR introduces significant improvements to the Rivet platform by adding runner support to the pegboard protocol while removing the isolate-v8-runner implementation. Review Checklist:
🏗️ Architecture & DesignStrengths:
Concerns:
🐛 Potential Bugs & IssuesCritical Issues
Security Concerns
🚀 Performance ConsiderationsImprovements:
Potential Issues:
🔒 Security AssessmentPositive:
Concerns:
🧪 Test CoverageMissing Tests:
Existing Tests:
📝 Code Quality IssuesStyle & Maintainability
Error Handling
🔧 RecommendationsImmediate Fixes Required
Architecture Improvements
Performance Optimizations
✅ Overall AssessmentThis PR represents a significant and necessary architectural improvement that consolidates runner management and enables multi-slot allocation. While the core design is sound, several critical bugs and missing error handling need to be addressed before merging. Verdict: Approve with required changes - The architectural direction is correct, but the implementation needs refinement in error handling, test coverage, and bug fixes identified above. Priority Issues to Address:
|
|
Claude finished @MasterPtato's task —— View job PR Review: feat: add runners to pb protocolI've completed a comprehensive review of this major architectural change. This PR introduces significant improvements to the Rivet platform by adding runner support to the pegboard protocol while removing the isolate-v8-runner implementation. Review Checklist:
🏗️ Architecture & DesignStrengths:
Concerns:
🐛 Potential Bugs & IssuesCritical Issues
Security Concerns
🚀 Performance ConsiderationsImprovements:
Potential Issues:
🔒 Security AssessmentPositive:
Concerns:
🧪 Test CoverageMissing Tests:
Existing Tests:
📝 Code Quality IssuesStyle & Maintainability
Error Handling
🔧 RecommendationsImmediate Fixes Required
Architecture Improvements
Performance Optimizations
✅ Overall AssessmentThis PR represents a significant and necessary architectural improvement that consolidates runner management and enables multi-slot allocation. While the core design is sound, several critical bugs and missing error handling need to be addressed before merging. Verdict: Approve with required changes - The architectural direction is correct, but the implementation needs refinement in error handling, test coverage, and bug fixes identified above. Priority Issues to Address:
|
c6943e6 to
d651a8c
Compare
0074232 to
c518c13
Compare
|
Claude finished @MasterPtato's task —— View job PR Review: feat: add runners to pb protocolI've completed a comprehensive review of this major architectural change. This PR introduces significant improvements to the Rivet platform by adding runner support to the pegboard protocol while removing the isolate-v8-runner implementation. Review Checklist:
🏗️ Architecture & DesignStrengths:
Concerns:
🐛 Potential Bugs & IssuesCritical Issues
Security Concerns
🚀 Performance ConsiderationsImprovements:
Potential Issues:
🔒 Security AssessmentPositive:
Concerns:
🧪 Test CoverageMissing Tests:
Existing Tests:
📝 Code Quality IssuesStyle & Maintainability
Error Handling
🔧 RecommendationsImmediate Fixes Required
Architecture Improvements
Performance Optimizations
✅ Overall AssessmentThis PR represents a significant and necessary architectural improvement that consolidates runner management and enables multi-slot allocation. While the core design is sound, several critical bugs and missing error handling need to be addressed before merging. Verdict: Approve with required changes - The architectural direction is correct, but the implementation needs refinement in error handling, test coverage, and bug fixes identified above. Priority Issues to Address:
|
|
Claude finished @MasterPtato's task —— View job PR Review: feat: add runners to pb protocolI've completed a comprehensive review of this major architectural change. This PR introduces significant improvements to the Rivet platform by adding runner support to the pegboard protocol while removing the isolate-v8-runner implementation. Review Checklist:
🏗️ Architecture & DesignStrengths:
Concerns:
🐛 Potential Bugs & IssuesCritical Issues
Security Concerns
🚀 Performance ConsiderationsImprovements:
Potential Issues:
🔒 Security AssessmentPositive:
Concerns:
🧪 Test CoverageMissing Tests:
Existing Tests:
📝 Code Quality IssuesStyle & Maintainability
Error Handling
🔧 RecommendationsImmediate Fixes Required
Architecture Improvements
Performance Optimizations
✅ Overall AssessmentThis PR represents a significant and necessary architectural improvement that consolidates runner management and enables multi-slot allocation. While the core design is sound, several critical bugs and missing error handling need to be addressed before merging. Verdict: Approve with required changes - The architectural direction is correct, but the implementation needs refinement in error handling, test coverage, and bug fixes identified above. Priority Issues to Address:
|
c518c13 to
6c559a1
Compare
d651a8c to
017878a
Compare
|
Claude finished @MasterPtato's task —— View job PR Review: feat: add runners to pb protocolI've completed a comprehensive review of this major architectural change. This PR introduces significant improvements to the Rivet platform by adding runner support to the pegboard protocol while removing the isolate-v8-runner implementation. Review Checklist:
🏗️ Architecture & DesignStrengths:
Concerns:
🐛 Potential Bugs & IssuesCritical Issues
Security Concerns
🚀 Performance ConsiderationsImprovements:
Potential Issues:
🔒 Security AssessmentPositive:
Concerns:
🧪 Test CoverageMissing Tests:
Existing Tests:
📝 Code Quality IssuesStyle & Maintainability
Error Handling
🔧 RecommendationsImmediate Fixes Required
Architecture Improvements
Performance Optimizations
✅ Overall AssessmentThis PR represents a significant and necessary architectural improvement that consolidates runner management and enables multi-slot allocation. While the core design is sound, several critical bugs and missing error handling need to be addressed before merging. Verdict: Approve with required changes - The architectural direction is correct, but the implementation needs refinement in error handling, test coverage, and bug fixes identified above. Priority Issues to Address:
|
|
Claude finished @MasterPtato's task —— View job PR Review: feat: add runners to pb protocolI've completed a comprehensive review of this major architectural change. This PR introduces significant improvements to the Rivet platform by adding runner support to the pegboard protocol while removing the isolate-v8-runner implementation. Review Checklist:
🏗️ Architecture & DesignStrengths:
Concerns:
🐛 Potential Bugs & IssuesCritical Issues
Security Concerns
🚀 Performance ConsiderationsImprovements:
Potential Issues:
🔒 Security AssessmentPositive:
Concerns:
🧪 Test CoverageMissing Tests:
Existing Tests:
📝 Code Quality IssuesStyle & Maintainability
Error Handling
🔧 RecommendationsImmediate Fixes Required
Architecture Improvements
Performance Optimizations
✅ Overall AssessmentThis PR represents a significant and necessary architectural improvement that consolidates runner management and enables multi-slot allocation. While the core design is sound, several critical bugs and missing error handling need to be addressed before merging. Verdict: Approve with required changes - The architectural direction is correct, but the implementation needs refinement in error handling, test coverage, and bug fixes identified above. Priority Issues to Address:
|
017878a to
d6072b5
Compare
6c559a1 to
cdf9857
Compare
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Claude finished @MasterPtato's task —— View job PR Review: feat: add runners to pb protocolI've completed a comprehensive review of this major architectural change. This PR introduces significant improvements to the Rivet platform by adding runner support to the pegboard protocol while removing the isolate-v8-runner implementation. Review Tasks:
🏗️ Architecture & DesignStrengths:
Concerns:
🐛 Potential Bugs & IssuesCritical Issues
Security Concerns
🚀 Performance ConsiderationsImprovements:
Potential Issues:
🔒 Security AssessmentPositive:
Concerns:
🧪 Test CoverageMissing Tests:
Existing Tests:
📝 Code Quality IssuesStyle & Maintainability
Error Handling
🔧 RecommendationsImmediate Fixes Required
Architecture Improvements
Performance Optimizations
✅ Overall AssessmentThis PR represents a significant and necessary architectural improvement that consolidates runner management and enables multi-slot allocation. While the core design is sound, several critical bugs and missing error handling need to be addressed before merging. Verdict: Approve with required changes - The architectural direction is correct, but the implementation needs refinement in error handling, test coverage, and bug fixes identified above. Priority Issues to Address:
|
|
Claude finished @MasterPtato's task —— View job PR Review: feat: add runners to pb protocolI've completed a comprehensive review of this major architectural change. This PR introduces significant improvements to the Rivet platform by adding runner support to the pegboard protocol while removing the isolate-v8-runner implementation. Review Checklist:
🏗️ Architecture & DesignStrengths:
Concerns:
🐛 Potential Bugs & IssuesCritical Issues
Security Concerns
🚀 Performance ConsiderationsImprovements:
Potential Issues:
🔒 Security AssessmentPositive:
Concerns:
🧪 Test CoverageMissing Tests:
Existing Tests:
📝 Code Quality IssuesStyle & Maintainability
Error Handling
🔧 RecommendationsImmediate Fixes Required
Architecture Improvements
Performance Optimizations
✅ Overall AssessmentThis PR represents a significant and necessary architectural improvement that consolidates runner management and enables multi-slot allocation. While the core design is sound, several critical bugs and missing error handling need to be addressed before merging. Verdict: Approve with required changes - The architectural direction is correct, but the implementation needs refinement in error handling, test coverage, and bug fixes identified above. Priority Issues to Address:
|
d6072b5 to
f387ef2
Compare
cdf9857 to
7a40790
Compare
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
|
Claude encountered an error —— View job I'll analyze this and get back to you. |

Changes