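Hello, and thank you for the excellent work. I have a question about setting up the sandbox environment. For normal training, does the https://seed-sandbox.byteintl.net/faas/sandbox/ URL in internal_sandbox.py need to be replaced with http://127.0.0.1:12345/faas/sandbox/? And does the sandbox server need to be started beforehand with uvicorn sandbox_api:app --host 127.0.0.1 --port 12345 --workers 4? Could you provide more detailed running instructions? Thanks.
Following the steps below, I currently get an error:
step 1: Replace https://seed-sandbox.byteintl.net/faas/sandbox/ with http://127.0.0.1:12345/faas/sandbox/;
step 2: Run uvicorn sandbox_api:app --host 127.0.0.1 --port 12345 --workers 4;
step 3: Run the training script;
Error message: Traceback (most recent call last):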
File "xxx/SimpleTIR/recipe/simpletir/main_simpletir.py", line 59, in main
run_simpletir(config)
File "xxx/SimpleTIR/recipe/simpletir/main_simpletir.py", line 81, in run_simpletir
ray.get(runner.run.remote(config))
File "/usr/local/conda/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
return fn(*args, **kwargs)
File "/usr/local/conda/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "/usr/local/conda/lib/python3.10/site-packages/ray/_private/worker.py", line 2782, in get
values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
File "/usr/local/conda/lib/python3.10/site-packages/ray/_private/worker.py", line 929, in get_objects
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError: ray::TaskRunner.run() (pid=100716, ip=10.164.126.42, actor_id=cbc06dbbad9d239b86903e8201000000, repr=<main_simpletir.TaskRunner object at 0x7f58bc319300>)
File "xxx/SimpleTIR/recipe/simpletir/main_simpletir.py", line 213, in run
trainer.fit()
File "xxx/SimpleTIR/recipe/simpletir/simpletir_ray_trainer.py", line 1074, in fit
gen_batch_output = generation_manager.run_llm_loop(
File "xxx/SimpleTIR/recipe/simpletir/agent_utils.py", line 435, in run_llm_loop
next_obs, dones, is_void_turn, code_info = self.execute_predictions(
File "xxx/SimpleTIR/recipe/simpletir/agent_utils.py", line 606, in execute_predictions
sandbox_success, sandbox_stdout, sandbox_stderr = asyncio.run(
File "/usr/local/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "xxx/SimpleTIR/sandbox/internal_sandbox.py", line 35, in parallel_sandbox
results = await asyncio.gather(*tasks_async, return_exceptions=False)
File "xxx/SimpleTIR/sandbox/internal_sandbox.py", line 21, in single_sandbox
response = await run_code_async(request, client_timeout=30.0, max_attempts=2)
File "xxx/.local/lib/python3.10/site-packages/sandbox_fusion/async_client.py", line 48, in run_code
return await _run_code(request)
File "xxx/.local/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
return await copy(fn, *args, **kwargs)
File "xxx/.local/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 111, in __call__
do = await self.iter(retry_state=retry_state)
File "xxx/.local/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
result = await action(retry_state)
File "xxx/.local/lib/python3.10/site-packages/tenacity/_utils.py", line 99, in inner
return call(*args, **kwargs)
File "xxx/.local/lib/python3.10/site-packages/sandbox_fusion/client.py", line 49, in on_retry_error
raise e
File "xxx/.local/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 114, in __call__
result = await fn(*args, **kwargs)
File "xxx/.local/lib/python3.10/site-packages/sandbox_fusion/client.py", line 70, in async_wrapper
return await func(*args, **kwargs)
File "xxx/.local/lib/python3.10/site-packages/sandbox_fusion/async_client.py", line 42, in _run_code
raise Exception(f'Faas api responded with code {result.status}: {await result.text()}')
Exception: Faas api responded with code 404: {"detail":"Not Found"}
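
For reference, a 404 with the body {"detail":"Not Found"} is the default FastAPI/Starlette response for an unmatched route, which suggests the local server is reachable but the requested path is not one it serves. A minimal sketch for checking which routes the server actually exposes is below. It assumes sandbox_api is a FastAPI app that still serves the default /openapi.json document (this is an assumption, not something confirmed by the repository), and the helper names list_routes and fetch_routes are hypothetical.

```python
import json
import urllib.request

# HTTP verbs that can appear as keys of an OpenAPI path item.
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def list_routes(spec: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings from an OpenAPI spec dict."""
    routes = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for key in sorted(item):
            # Skip non-method keys such as "parameters" or "summary".
            if key in HTTP_METHODS:
                routes.append(f"{key.upper()} {path}")
    return routes


def fetch_routes(base_url: str, timeout: float = 5.0) -> list[str]:
    """Download <base_url>/openapi.json and list the routes it declares.

    Assumption: the server is a FastAPI app with the default OpenAPI
    endpoint enabled; otherwise urlopen raises urllib.error.HTTPError.
    """
    url = base_url.rstrip("/") + "/openapi.json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return list_routes(json.load(resp))


# Example (requires the uvicorn server from step 2 to be running):
#   for route in fetch_routes("http://127.0.0.1:12345"):
#       print(route)
```

Comparing the printed paths with the URL that the sandbox_fusion client ends up requesting should show whether the /faas/sandbox/ prefix is actually served by the local app.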