Skip to content

Conversation

@sparkzky
Copy link
Collaborator

@sparkzky sparkzky commented Nov 29, 2025

初步的cpu多核心上的负载均衡

目前DragonOS启动时会有两个CPU核心,但是之前进程只会调度到第一个核心上运行,第二个核心空转,现在在调度进程之前加入核心选择,进程将调度到当前负载较低的核心上执行

核心选择策略:

优先级从高到低:
1. 负载均衡未启用 → 返回当前CPU
2. 单核系统 → 返回当前CPU
3. IDLE策略任务 → 找最空闲CPU
4. 当前CPU负载 < 原CPU负载 → 当前CPU(缓存热度)
5. 原CPU负载 ≤ 2 → 原CPU(缓存亲和性)
6. 当前CPU负载 ≤ 1 → 当前CPU
7. 默认 → 找负载最低的CPU

Note

Adds a load balancer that selects target CPUs on task wakeup and runs periodic imbalance checks; integrates migration-aware enqueue, enables LB after SMP init, and adds user tests.

  • Scheduler/Core:
    • Load Balancer: New kernel/src/sched/load_balance.rs implementing CPU selection on wakeup and periodic imbalance checks (migration stubs present).
    • Integration: Wire into scheduler_tick via LoadBalancer::should_balance/run_load_balance(); export nr_running_lockless() for lockless load queries.
    • Migration Handling: Update CpuRunQueue::activate_task to refresh CFS refs when ENQUEUE_MIGRATED is set.
    • Module hookup: Register pub mod load_balance; in sched/mod.rs.
  • Process wakeups:
    • Use LoadBalancer::select_task_rq in ProcessManager::wakeup/wakeup_stop to choose target_cpu; set ENQUEUE_MIGRATED when moving between CPUs.
  • SMP init:
    • Enable load balancing post-initialization in smp_init(); minor doc tweak for smp_get_processor_id().
  • User tests:
    • Add user/apps/c_unitest/test_load_balance.c and test_smp_balance.c to validate multicore distribution and basic balancing behavior.

Written by Cursor Bugbot for commit e14d92b. This will update automatically on new commits. Configure here.

Copilot AI review requested due to automatic review settings November 29, 2025 13:52
@github-actions github-actions bot added the enhancement New feature or request label Nov 29, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements initial multi-core CPU load balancing for the DragonOS scheduler. The feature enables tasks to be distributed across multiple CPU cores based on load conditions, with a priority-based selection strategy that balances performance (cache locality) against load distribution.

Key changes include:

  • Added a new load balancer module with CPU selection logic for task wakeup
  • Integrated load balancing into the scheduler tick and process wakeup paths
  • Added test programs to verify load distribution across CPUs

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
kernel/src/sched/load_balance.rs New module implementing CPU selection strategy and load balancing logic
kernel/src/sched/mod.rs Integrated load balancer into scheduler tick, added nr_running_lockless() method and migration flag handling
kernel/src/process/mod.rs Updated process wakeup to use load balancer for CPU selection with migration support
kernel/src/smp/mod.rs Enabled load balancing after SMP initialization
kernel/src/smp/core.rs Updated documentation comment style from @brief to ##
user/apps/c_unitest/test_smp_balance.c Simple test program to verify multi-core load balancing
user/apps/c_unitest/test_load_balance.c Comprehensive test suite for load balancing functionality

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

pthread_t threads[NUM_WORKERS];
int thread_ids[NUM_WORKERS];
int i;
int cpu_usage[2] = {0}; /* 假设最多2个CPU */
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array size mismatch: cpu_usage is declared with size 2 but later code accesses indices up to 16 (lines 160-162, 166-169). This will cause buffer overflow if final_cpu >= 2. The array size should be at least 16 to match the loop bounds.

Suggested change
int cpu_usage[2] = {0}; /* 假设最多2个CPU */
int cpu_usage[16] = {0}; /* 支持最多16个CPU */

Copilot uses AI. Check for mistakes.
result = 1;
}

// 这个测例会报错,先注释掉
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Corrected spelling of '测例' (test case) to '测试用例' for consistency with standard terminology.

Suggested change
// 这个测例会报错,先注释掉
// 这个测试用例会报错,先注释掉

Copilot uses AI. Check for mistakes.

use super::{cpu_rq, CpuRunQueue, DequeueFlag, EnqueueFlag, SchedPolicy};

/// ## 负载均衡间隔(单位:jiffies),执行一次负载均衡检查
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The comment says 'execute one load balance check' but doesn't clarify what happens after the interval. Consider rephrasing to '每隔100个jiffies执行一次负载均衡检查' (execute load balance check every 100 jiffies) for clarity.

Suggested change
/// ## 负载均衡间隔(单位:jiffies),执行一次负载均衡检查
/// ## 负载均衡间隔(单位:jiffies),每隔100个jiffies执行一次负载均衡检查

Copilot uses AI. Check for mistakes.
drop(guard);
// TODO:处理负载均衡

//todo 检查并执行负载均衡(只检测不均衡,migrate_tasks 内部是空的)
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO comment contains important implementation status (that migrate_tasks is empty), but uses lowercase 'todo' instead of the standard 'TODO:' format. For consistency with other TODOs in the codebase, use 'TODO:' prefix.

Suggested change
//todo 检查并执行负载均衡(只检测不均衡,migrate_tasks 内部是空的)
// TODO: 检查并执行负载均衡(只检测不均衡,migrate_tasks 内部是空的)

Copilot uses AI. Check for mistakes.
}

// 如果原CPU负载不高,保持缓存亲和性
if prev_load <= 2 {
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Magic number 2 used for load threshold without explanation. Consider defining a named constant like CACHE_HOT_LOAD_THRESHOLD to make the cache affinity strategy more explicit.

Copilot uses AI. Check for mistakes.
}

// 如果当前CPU负载低,直接使用当前cpu即可
if current_load <= 1 {
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Magic number 1 used for load threshold without explanation. Consider defining a named constant like IDLE_LOAD_THRESHOLD to clarify when a CPU is considered idle enough for immediate use.

Copilot uses AI. Check for mistakes.
let current_load = Self::get_rq_load_lockless(&current_rq);

// 如果当前CPU负载很高,不主动拉取任务
if current_load > 2 {
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Magic number 2 appears again for load threshold. This should use the same named constant as suggested for line 103 to maintain consistency in the load balancing strategy.

Copilot uses AI. Check for mistakes.
@fslongjin
Copy link
Member

貌似多核线程同步有问题,测例过不了

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is being reviewed by Cursor Bugbot

Details

You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

pthread_t threads[NUM_WORKERS];
int thread_ids[NUM_WORKERS];
int i;
int cpu_usage[2] = {0}; /* 假设最多2个CPU */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Buffer overflow: array of 2 accessed with index up to 15

The cpu_usage array is declared with only 2 elements at line 122, but both the write operation at line 161 and the read loop at lines 166-168 use bounds checking for indices 0-15. If final_cpu is between 2 and 15, writing to cpu_usage[stats[i].final_cpu]++ causes a buffer overflow, and the subsequent loop reading cpu_usage[i] for i from 0 to 15 reads out of bounds. Compare with test_fork_balancing() which correctly uses cpu_count[16].

Additional Locations (1)

Fix in Cursor Fix in Web

#[inline]
pub fn nr_running_lockless(&self) -> usize {
self.nr_running
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Data race: reading non-atomic field without synchronization

The nr_running_lockless() method reads the nr_running field (a plain usize) without holding any lock, while other code paths modify this field under the run queue lock. In Rust's memory model, reading a non-atomic value that can be concurrently modified constitutes a data race (undefined behavior). This could contribute to the multi-core synchronization issues mentioned in the PR discussion. The field would need to be AtomicUsize with appropriate memory ordering for safe concurrent access.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants