Skip to content
This repository was archived by the owner on Jan 27, 2026. It is now read-only.
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
97 commits
Select commit Hold shift + click to select a range
c42793f
ability to automatically set shm size based on sys memory
JannikSt Feb 6, 2025
f1f7538
clippy
JannikSt Feb 6, 2025
6d4aa01
Merge pull request #107 from PrimeIntellect-ai/improvement/shm-size-s…
JannikSt Feb 6, 2025
ca57e74
bump version
JannikSt Feb 6, 2025
faf4375
Merge pull request #108 from PrimeIntellect-ai/improvement/shm-size-s…
JannikSt Feb 6, 2025
ad90427
Feature/disable ejection (#111)
JannikSt Feb 6, 2025
bf52611
support ability to restore metrics via orchestrator (#112)
JannikSt Feb 6, 2025
8e8808a
bump version
JannikSt Feb 6, 2025
535e0c3
merge
JannikSt Feb 6, 2025
b698e32
improve resiliency of validator (#114)
JannikSt Feb 6, 2025
7ddf448
bump version
JannikSt Feb 6, 2025
bbabc26
resolve conflicts
JannikSt Feb 6, 2025
b7adeeb
add improvement to metrics reporting (#119)
JannikSt Feb 12, 2025
519307a
resolve conflicts
JannikSt Feb 12, 2025
0d3d5cd
fmt
JannikSt Feb 12, 2025
8cb71b4
very basic validator functionality
mattdf Feb 4, 2025
bf88c80
fmt
mattdf Feb 4, 2025
c58574b
add signature ...
mattdf Feb 4, 2025
92629cd
use sign_request
mattdf Feb 4, 2025
75e055d
use custom serializer for consistency
mattdf Feb 5, 2025
61a482b
add validator arg to miner, implement rounding robust partial eq
mattdf Feb 5, 2025
36349d6
fmt, clippy fix
mattdf Feb 5, 2025
7a47331
fix remote makefile entry
mattdf Feb 5, 2025
9a85c57
misc makefile fixes
mattdf Feb 5, 2025
58f97ba
fix clippy
mattdf Feb 12, 2025
9e801d3
fix fmt...
mattdf Feb 12, 2025
7255a74
make app_state unused
mattdf Feb 12, 2025
e172fbd
Merge pull request #96 from PrimeIntellect-ai/feature/validator-chall…
mattdf Feb 12, 2025
7350607
minor readme adjustment
JannikSt Feb 13, 2025
c3480c9
remove redundant files
JannikSt Feb 13, 2025
07b440b
add license file (#124)
mattdf Feb 13, 2025
463ec6c
basic adjustment of readme, add security, add contributing and docs (…
JannikSt Feb 13, 2025
888810f
fix worker naming in readme
JannikSt Feb 13, 2025
593520f
add readme images (#125)
JannikSt Feb 14, 2025
5849a61
resolve conflicts
JannikSt Feb 14, 2025
12596a9
add disclosure
JannikSt Feb 14, 2025
4c6e15b
add video to remote gpu setup in docs (#128)
JannikSt Feb 26, 2025
047a514
fix submodule setup (#129)
JannikSt Feb 27, 2025
a46c301
Fix/submodule path (#130)
JannikSt Feb 27, 2025
10ea9d7
do not run format on xl-runner
JannikSt Mar 7, 2025
e217951
fix runner setup
JannikSt Mar 7, 2025
b2fe49d
add shared volume to containers to support model downloads (#134)
JannikSt Mar 7, 2025
65753cc
- Readme improvements (#131)
burnpiro Mar 7, 2025
36bb005
feature/toploc-integration (#127)
JannikSt Mar 10, 2025
82a9b4d
align smart contract module (#135)
JannikSt Mar 12, 2025
40fcaf8
support dynamic staking system and ability to automatically increase …
JannikSt Mar 12, 2025
c76d601
fix worker unit formatting (#138)
JannikSt Mar 15, 2025
541430f
remove s3 from release pipeline (#141)
JannikSt Mar 15, 2025
b72760d
support new env vars in validator docker (#143)
JannikSt Mar 15, 2025
ea1da66
improve orchestrator docker image (#144)
JannikSt Mar 15, 2025
4094f5e
fix stake setting approach (#142)
JannikSt Mar 15, 2025
7f306a6
Feature/docker compose setup (#145)
JannikSt Mar 17, 2025
9420a14
add basic ability to generate new wallets from worker cli (#147)
JannikSt Mar 18, 2025
a5716fc
support toploc-server auth (#146)
JannikSt Mar 19, 2025
624128f
move private keys into env files rather than cli args (#150)
JannikSt Mar 19, 2025
089b8dc
only allow discovery svc upload if you have ai token (#149)
JannikSt Mar 19, 2025
d2576a9
support setting penalty in validator arg (#137)
JannikSt Mar 19, 2025
f503fb7
Feature/discovery blacklist detection (#139)
JannikSt Mar 19, 2025
5980e8d
Improvement/discovery security check (#151)
JannikSt Mar 19, 2025
527d7dd
update email in security.md (#154)
JohannesHa Mar 20, 2025
f0bce1c
add install scripts (#155)
JannikSt Mar 20, 2025
914aaf9
Feature/node chain sync (#148)
JannikSt Mar 21, 2025
fb7c9e3
improve boot sequence (#157)
JannikSt Mar 21, 2025
5721523
Improvement/manual transcation approval (#152)
JannikSt Mar 21, 2025
f5e40f4
fix typo (#158)
samsja Mar 21, 2025
b65d788
support legacy private key setup (#160)
JannikSt Mar 22, 2025
19f9ad4
disable spinner (#161)
JannikSt Mar 22, 2025
573fb59
add ability to sign message from CLI (#163)
JannikSt Mar 23, 2025
9cb0eb7
fix permission issue with installer (#162)
JannikSt Mar 23, 2025
7cefac5
add balance cmd and improve registration flow (#164)
JannikSt Mar 24, 2025
7100b5c
improve README & general setup info (#165)
JannikSt Mar 24, 2025
706c247
Create generate-node-wallet command to skip generating provider walle…
manveerxyz Mar 25, 2025
bf1a951
improve stake approval ux by showing response immediatly (#167)
JannikSt Mar 25, 2025
c664091
Feature: interconnect check w. issue tracker (#166)
JannikSt Mar 25, 2025
4312d39
Run cargo fmt
manveerxyz Mar 25, 2025
5a61192
Merge pull request #169 from PrimeIntellect-ai/improvement/generate-n…
manveerxyz Mar 26, 2025
0ec478a
Preserve folder structure for toploc-validator (#168)
JannikSt Mar 26, 2025
0185496
add timestamp to latest status change for node (#170)
JannikSt Mar 26, 2025
840c7e8
Discovery state change tests (#171)
JannikSt Mar 26, 2025
e07e886
adjust beta versioning (#172)
JannikSt Mar 26, 2025
1b7ba47
Improvement/dev release versioning - fix indentation (#173)
JannikSt Mar 26, 2025
f66444e
Fix: node recovery edge case (#174)
JannikSt Mar 26, 2025
f21bcb8
add creation and update timestamps, add sorting to discovery output, …
JannikSt Mar 26, 2025
6ab8c18
Fix: work validation improvements & cleanup (#176)
JannikSt Mar 27, 2025
3974b92
fix install script with new dev tagging (#177)
JannikSt Mar 27, 2025
4f295cd
whitelist provider info in doc
JannikSt Mar 27, 2025
5d53823
add info reg. whitelist provider
JannikSt Mar 27, 2025
8934a12
Fix: chainsync can break on old data (#178)
JannikSt Mar 27, 2025
f2c2df7
Fix: discovery wrong status on missing pool (#179)
JannikSt Mar 27, 2025
ce07605
Improvement/log level (#180)
JannikSt Mar 27, 2025
fd45f37
Synthetic Data Validator: Sequential and without cache (#187)
JannikSt Mar 28, 2025
083adc4
load validator addresses from contract (#181)
JannikSt Mar 28, 2025
17368a1
add discovery upload retries (#182)
JannikSt Mar 28, 2025
3fb118b
detect all gpus during hw check
JannikSt Mar 30, 2025
9e2c1fb
use gpu with highest count
JannikSt Mar 30, 2025
2040e25
resolve conflicts
JannikSt Apr 17, 2025
e937021
cleanup
JannikSt Apr 18, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions discovery/src/api/routes/node.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,24 @@ pub async fn register_node(
return HttpResponse::BadRequest()
.json(ApiResponse::new(false, "Invalid x-address header"));
}
if let Some(contracts) = data.contracts.clone() {
if (contracts
.compute_registry
.get_node(
node.provider_address.parse().unwrap(),
node.id.parse().unwrap(),
)
.await)
.is_err()
{
return HttpResponse::BadRequest().json(ApiResponse::new(
false,
"Node not found in compute registry",
));
}
}

let node_store = data.node_store.clone();

let update_node = node.clone();
let existing_node = data.node_store.get_node(update_node.id.clone());
Expand Down
157 changes: 82 additions & 75 deletions discovery/src/chainsync/sync.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,103 +33,110 @@ impl ChainSync {
last_chain_sync,
}
}

async fn sync_single_node(
node_store: Arc<NodeStore>,
contracts: Arc<Contracts>,
node: DiscoveryNode,
) -> Result<(), Error> {
let mut n = node.clone();

// Safely parse provider_address and node_address
let provider_address = Address::from_str(&node.provider_address).map_err(|e| {
eprintln!("Failed to parse provider address: {}", e);
anyhow::anyhow!("Invalid provider address")
})?;

let node_address = Address::from_str(&node.id).map_err(|e| {
eprintln!("Failed to parse node address: {}", e);
anyhow::anyhow!("Invalid node address")
})?;
async fn sync_single_node(
node_store: Arc<NodeStore>,
contracts: Arc<Contracts>,
node: DiscoveryNode,
) -> Result<(), Error> {
let mut n = node.clone();

let node_info = contracts
.compute_registry
.get_node(provider_address, node_address)
.await
.map_err(|e| {
eprintln!("Error retrieving node info: {}", e);
anyhow::anyhow!("Failed to retrieve node info")
// Safely parse provider_address and node_address
let provider_address = Address::from_str(&node.provider_address).map_err(|e| {
eprintln!("Failed to parse provider address: {}", e);
anyhow::anyhow!("Invalid provider address")
})?;

let provider_info = contracts
.compute_registry
.get_provider(provider_address)
.await
.map_err(|e| {
eprintln!("Error retrieving provider info: {}", e);
anyhow::anyhow!("Failed to retrieve provider info")
let node_address = Address::from_str(&node.id).map_err(|e| {
eprintln!("Failed to parse node address: {}", e);
anyhow::anyhow!("Invalid node address")
})?;

let (is_active, is_validated) = node_info;
n.is_active = is_active;
n.is_validated = is_validated;
n.is_provider_whitelisted = provider_info.is_whitelisted;
let node_info = contracts
.compute_registry
.get_node(provider_address, node_address)
.await
.map_err(|e| {
eprintln!("Error retrieving node info: {}", e);
anyhow::anyhow!("Failed to retrieve node info")
})?;

// Handle potential errors from async calls
let is_blacklisted = contracts
.compute_pool
.is_node_blacklisted(node.node.compute_pool_id, node_address)
.await
.map_err(|e| {
eprintln!("Error checking if node is blacklisted: {}", e);
anyhow::anyhow!("Failed to check blacklist status")
})?;
n.is_blacklisted = is_blacklisted;
match node_store.update_node(n) {
Ok(_) => (),
Err(e) => {
error!("Error updating node: {}", e);
let provider_info = contracts
.compute_registry
.get_provider(provider_address)
.await
.map_err(|e| {
eprintln!("Error retrieving provider info: {}", e);
anyhow::anyhow!("Failed to retrieve provider info")
})?;

let (is_active, is_validated) = node_info;
n.is_active = is_active;
n.is_validated = is_validated;
n.is_provider_whitelisted = provider_info.is_whitelisted;

// Handle potential errors from async calls
let is_blacklisted = contracts
.compute_pool
.is_node_blacklisted(node.node.compute_pool_id, node_address)
.await
.map_err(|e| {
eprintln!("Error checking if node is blacklisted: {}", e);
anyhow::anyhow!("Failed to check blacklist status")
})?;
n.is_blacklisted = is_blacklisted;
match node_store.update_node(n) {
Ok(_) => (),
Err(e) => {
error!("Error updating node: {}", e);
}
}
}

Ok(())
}
Ok(())
}

pub async fn run(&self) -> Result<(), Error> {
let node_store_clone = self.node_store.clone();
let contracts_clone = self.contracts.clone();
let cancel_token = self.cancel_token.clone();
let chain_sync_interval = self.chain_sync_interval;
let last_chain_sync = self.last_chain_sync.clone();
pub async fn run(&self) -> Result<(), Error> {
let node_store_clone = self.node_store.clone();
let contracts_clone = self.contracts.clone();
let cancel_token = self.cancel_token.clone();
let chain_sync_interval = self.chain_sync_interval;
let last_chain_sync = self.last_chain_sync.clone();

tokio::spawn(async move {
let mut interval = tokio::time::interval(chain_sync_interval);
loop {
tokio::select! {
_ = interval.tick() => {
let nodes = node_store_clone.get_nodes();
match nodes {
Ok(nodes) => {
for node in nodes {
if let Err(e) = ChainSync::sync_single_node(node_store_clone.clone(), contracts_clone.clone(), node).await {
error!("Error syncing node: {}", e);
tokio::spawn(async move {
let mut interval = tokio::time::interval(chain_sync_interval);
loop {
tokio::select! {
_ = interval.tick() => {
let nodes = node_store_clone.get_nodes();
match nodes {
Ok(nodes) => {
for node in nodes {
if let Err(e) = ChainSync::sync_single_node(node_store_clone.clone(), contracts_clone.clone(), node).await {
error!("Error syncing node: {}", e);
}
}
// Update the last chain sync time
let mut last_sync = last_chain_sync.lock().await;
*last_sync = Some(SystemTime::now());
}
Err(e) => {
error!("Error getting nodes: {}", e);
}
// Update the last chain sync time
let mut last_sync = last_chain_sync.lock().await;
*last_sync = Some(SystemTime::now());
}
Err(e) => {
error!("Error getting nodes: {}", e);
}
}
}
_ = cancel_token.cancelled() => {
break;
_ = cancel_token.cancelled() => {
break;
}
}
}
}
});
Ok(())
});
Ok(())
}
}
}
15 changes: 15 additions & 0 deletions orchestrator/src/discovery/monitor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -206,6 +206,21 @@ impl<'b> DiscoveryMonitor<'b> {
Ok(())
}

async fn get_nodes(&self) -> Result<Vec<OrchestratorNode>, Error> {
let discovery_nodes = self.fetch_nodes_from_discovery().await?;

for discovery_node in &discovery_nodes {
if let Err(e) = self.sync_single_node_with_discovery(discovery_node).await {
error!("Error syncing node with discovery: {}", e);
}
}
Ok(discovery_nodes
.into_iter()
.map(OrchestratorNode::from)
.collect())
}
}

async fn get_nodes(&self) -> Result<Vec<OrchestratorNode>, Error> {
let discovery_nodes = self.fetch_nodes_from_discovery().await?;

Expand Down
18 changes: 18 additions & 0 deletions shared/src/web3/contracts/implementations/compute_pool_contract.rs
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,24 @@ impl ComputePool {
Ok(result)
}

pub async fn submit_work(
&self,
pool_id: U256,
node: Address,
data: Vec<u8>,
) -> Result<FixedBytes<32>, Box<dyn std::error::Error>> {
let result = self
.instance
.instance()
.function("submitWork", &[pool_id.into(), node.into(), data.into()])?
.send()
.await?
.watch()
.await?;
println!("Result: {:?}", result);
Ok(result)
}

pub async fn blacklist_node(
&self,
pool_id: u32,
Expand Down
53 changes: 53 additions & 0 deletions validator/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -489,3 +489,56 @@ mod tests {
assert_eq!(resp.result, expected_response.result);
}
}

#[cfg(test)]
mod tests {

use actix_web::{test, App};
use actix_web::{
web::{self, post},
HttpResponse, Scope,
};
use shared::models::challenge::{calc_matrix, ChallengeRequest, ChallengeResponse, FixedF64};

pub async fn handle_challenge(challenge: web::Json<ChallengeRequest>) -> HttpResponse {
let result = calc_matrix(&challenge);
HttpResponse::Ok().json(result)
}

pub fn challenge_routes() -> Scope {
web::scope("/challenge")
.route("", post().to(handle_challenge))
.route("/", post().to(handle_challenge))
}

#[actix_web::test]
async fn test_challenge_route() {
let app = test::init_service(App::new().service(challenge_routes())).await;

let vec_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0];
let vec_b = [9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0];

// convert vectors to FixedF64
let data_a: Vec<FixedF64> = vec_a.iter().map(|x| FixedF64(*x)).collect();
let data_b: Vec<FixedF64> = vec_b.iter().map(|x| FixedF64(*x)).collect();

let challenge_request = ChallengeRequest {
rows_a: 3,
cols_a: 3,
data_a,
rows_b: 3,
cols_b: 3,
data_b,
};

let req = test::TestRequest::post()
.uri("/challenge")
.set_json(&challenge_request)
.to_request();

let resp: ChallengeResponse = test::call_and_read_body_json(&app, req).await;
let expected_response = calc_matrix(&challenge_request);

assert_eq!(resp.result, expected_response.result);
}
}
1 change: 1 addition & 0 deletions worker/src/api/server.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
use crate::api::routes::challenge::challenge_routes;
use crate::api::routes::invite::invite_routes;
use crate::api::routes::task::task_routes;
use crate::console::Console;
use crate::docker::DockerService;
use crate::operations::heartbeat::service::HeartbeatService;
use crate::state::system_state::SystemState;
Expand Down
15 changes: 10 additions & 5 deletions worker/src/checks/hardware/hardware_check.rs
Original file line number Diff line number Diff line change
Expand Up @@ -141,11 +141,16 @@ impl HardwareChecker {
}

fn collect_gpu_specs(&self) -> Result<Option<GpuSpecs>, Box<dyn std::error::Error>> {
Ok(detect_gpu().map(|gpu| GpuSpecs {
count: Some(gpu.count.unwrap_or(0)),
model: gpu.model,
memory_mb: gpu.memory_mb,
}))
let gpu_specs = detect_gpu();
if gpu_specs.is_empty() {
return Ok(None);
}

let main_gpu = gpu_specs
.into_iter()
.max_by_key(|gpu| gpu.count.unwrap_or(0));

Ok(main_gpu)
}

fn collect_memory_specs(&self) -> Result<(u32, u32), Box<dyn std::error::Error>> {
Expand Down
Loading
Loading