fix: add cleanup method for clean the processes #45

zlH518 · 2025-11-08T17:09:42Z

I found that after running simple_example.py, nvidia-smi still showed that I had processes consuming memory. I suspected that the created processes were not being cleared, so I added code to clean up the processes, which solved the problem.
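For readers who want a picture of what such a cleanup step can look like, here is a minimal sketch. It is not the code from this PR: the `WorkerGroup` class, its `cleanup` method, and the sleeping child command are all hypothetical stand-ins, and a real worker would be holding a CUDA context rather than sleeping. The idea is simply to request termination first, wait briefly, and force-kill anything still alive so that no child outlives the script.

```python
import subprocess
import sys


class WorkerGroup:
    """Hypothetical owner of worker processes (illustrative only)."""

    def __init__(self, n_workers):
        # Assumption: each child just sleeps; a real worker would hold GPU memory.
        cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
        self.procs = [subprocess.Popen(cmd) for _ in range(n_workers)]

    def cleanup(self, timeout=5.0):
        """Stop every child process and reap it so none lingers."""
        for p in self.procs:
            if p.poll() is None:  # still running
                p.terminate()     # polite request first
        for p in self.procs:
            try:
                p.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                p.kill()          # force-stop if it ignored the request
                p.wait()
        self.procs = []
```

Calling `cleanup()` from a `try`/`finally` block around the main work is one way to make sure it also runs when the script raises.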

zlH518 · 2025-11-08T17:13:55Z

I also found that setting n_stage > 1 causes device errors.
I fixed the device transfer logic in modeling_qwen3.py. Key improvements:
Dynamic device lookup: Instead of hardcoding torch.device(0) or torch.device(i+1), the target device is determined by querying the device of the actual model parameters.
Device transfer after embedding: After the embedding step, hidden_states is immediately moved to the device of the first layer to avoid a device mismatch.
Inter-layer device transfer: When moving between device groups, the actual device of the next layer is queried instead of assuming a device index.
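The pattern described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the actual change to modeling_qwen3.py: `TinyStagedModel` and `module_device` are hypothetical names, and the layers are plain `nn.Linear` modules. The target device is always read from a module's own parameters, so the same loop handles both the move after the embedding and the moves between layers on different devices.

```python
import torch
from torch import nn


def module_device(module):
    """Return the device that holds the module's first parameter."""
    return next(module.parameters()).device


class TinyStagedModel(nn.Module):
    """Hypothetical stand-in for a model whose layers may sit on different devices."""

    def __init__(self, vocab=16, dim=8, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_layers)])

    def forward(self, input_ids):
        # Inputs go to wherever the embedding weights actually live.
        hidden_states = self.embed(input_ids.to(module_device(self.embed)))
        for layer in self.layers:
            # Query the layer's real device instead of assuming an index.
            target = module_device(layer)
            if hidden_states.device != target:
                hidden_states = hidden_states.to(target)
            hidden_states = layer(hidden_states)
        return hidden_states
```

On a single device the transfers are skipped; if some layers were placed on another device (for example with `layer.to("cuda:1")`), the loop would move `hidden_states` there just before that layer runs.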

zheyishine · 2025-11-17T06:42:31Z

Appreciate it!
It would be better to apply your second change (the device transfer fix) to scaffold.py, since most of our modeling code is auto-generated by scaffold.py rather than copied and revised by hand.

zlH518 added 2 commits November 9, 2025 01:07

fix: add cleanup method for clean the processes

e37ee04

fix: modefy the logit about the device move

5febd83
