Skip to content

[Bug/Refactor] 124 unsafe millisecond casts missed by PR #1182 β€” mofa-foundation, mofa-extra, mofa-plugins, mofa-gateway, mofa-cliΒ #1290

@rahulkr182

Description

@rahulkr182

πŸ” Issue Description

While looking at PR #1182, I noticed the fix only touched 4 files in mofa-kernel and mofa-runtime. After running a quick grep across the whole workspace I found 124 more places doing the exact same unsafe cast pattern that #1182 was supposed to fix. A few of them are genuinely buggy β€” not just a style thing β€” so I wanted to flag it.


πŸ“Œ Issue Type

  • Bug
  • Feature Request
  • Enhancement
  • Documentation
  • Refactor
  • Other (please specify)

πŸ“ Description

How I found this

After #1182 merged I ran:

grep -rn "timestamp_millis() as u64\|\.as_millis() as u64\|num_milliseconds() as u64" \
  --include="*.rs" crates/ tests/ examples/ | grep -v target/

Got 124 hits across 8 crates that weren't touched. I split them into three groups based on risk:


Group 1 β€” timestamp_millis() as u64 Β· 5 lines Β· actually buggy

chrono::DateTime::timestamp_millis() returns i64. If the machine clock is before the Unix epoch (happens more often than you'd think β€” Docker containers, CI VMs with misconfigured time, NTP backward jumps), that i64 is negative. Casting it to u64 wraps it around to something near u64::MAX, which completely corrupts any audit trail or telemetry timestamp stored with it.

Lines I found:

  • crates/mofa-foundation/src/workflow/executor.rs β€” lines 187 and 200 (inside create_review_context)
  • crates/mofa-foundation/src/agent/components/tool.rs β€” line 336
  • crates/mofa-kernel/src/hitl/audit.rs β€” line 75 (inside ReviewAuditEvent::new)
  • examples/workflow_viz/src/main.rs β€” line 567

Group 2 β€” num_milliseconds() as u64 Β· 1 line Β· actually buggy

This one felt the worst to me. In workflow/executor.rs line 209:

now.signed_duration_since(paused_at).num_milliseconds() as u64

signed_duration_since returns a signed duration β€” it's negative if paused_at happens to be in the future (e.g. NTP synced the clock forward while a workflow was paused). num_milliseconds() returns i64. Casting that negative value to u64 makes total_wait_time_ms explode to ~u64::MAX, which would totally break any HITL wait-time analytics built on top.

Simple fix:

u64::try_from(
    now.signed_duration_since(paused_at).num_milliseconds().max(0)
).unwrap_or(0)

Group 3 β€” as_millis() as u64 Β· 118 lines Β· coding standard violation

These come from Instant::elapsed().as_millis() or Duration::as_millis(), which return u128. Casting directly to u64 is what Β§VII.2 explicitly bans:

"All numeric conversions must explicitly handle overflow cases. Silent wrapping via 'as' casts is prohibited for conversions that may lose data or change sign."

In practice these won't overflow until the year 584,554 AD so there's no immediate correctness risk, but it's the same pattern #1182 was fixing and it's spread everywhere.

Per crate:

Crate Lines
mofa-foundation 73
mofa-extra 19
mofa-plugins 8
mofa-runtime 6
tests/ 7
mofa-gateway 3
mofa-cli 3
mofa-kernel 2
examples 3
Total 124

🎯 Proposed Solution

I'd split this into two PRs to keep things reviewable:

PR 1 β€” the actual bugs (6 lines changed):

  • Add pub fn chrono_now_ms() -> u64 to mofa-kernel/src/utils.rs (mirrors the existing now_ms() but using chrono).
  • Swap the 5 Group 1 sites to use mofa_kernel::utils::chrono_now_ms().
  • Fix the Group 2 site with the .max(0) clamp.

PR 2 β€” the standards sweep (118 lines):


πŸ“Ž Additional Context


πŸ™‹ Claiming This Issue

  • I'm willing to solve this issue by myself

Metadata

Metadata

Assignees

No one assigned

    Labels

    area/platformInstallation, OS integration, environmentkind/bugSomething is brokenpriority/p1High impactrefactorrustPull requests that update rust codestatus/manual-triageRequires manual review

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions