# lme optimization

- **Branch:** lme_timeout_fix_optimization_tmp
- **Author:** Nabendu Maiti
- **Date:** 2026-03-19
- **SHA:** 69bbeed
- **Audience:** Patch reviewers, architects
- Executive Summary
- Problem Statement
- Files Changed
- Architecture Context
- Change Details by Layer
- Sequence Diagrams
- Risk Assessment
- Testing Coverage
- Architectural Observations
## Executive Summary

This commit addresses a class of AMT/LME stall bugs where rpc-go could hang indefinitely or wait excessively long when the Intel ME Interface (MEI) kernel driver returned transient "device busy" errors or when the AMT firmware was slow to confirm an APF channel open. The changes introduce:
- **Timed waits**, replacing unbounded `WaitGroup.Wait()` blocks at the channel-open barrier in both the RPS executor and the local HTTP transport.
- **Retry-with-backoff logic** at three call sites where transient MEI busy errors are expected: LME channel connect, `GetGeneralSettings`, and HTTP transport `RoundTrip`.
- **APF timer acceleration:** resets the APF session flush timer to 500 ms whenever an `APF_CHANNEL_DATA` message arrives, eliminating stalls caused by a slow timer firing after data has already been delivered.
- **LMS probe result caching:** avoids redundant TCP probes on every WSMAN call during a multi-step activation.
- **Reduced global timeouts** to align with realistic AMT response characteristics and avoid unnecessary delay in failure scenarios.
- **URL normalization** to reject structurally invalid `--url` values early and prevent misrouted activation flows.
## Problem Statement

| Symptom | Root Cause |
|---|---|
| `rpc activate --local` hangs for 30 s+ then fails | `WaitGroup.Wait()` was unbounded; if AMT never confirmed channel open (e.g., MEI busy), the goroutine blocked forever |
| Repeated "device or resource busy" errors causing abort | Single-attempt MEI connect with no retry; transient kernel driver contention treated as fatal |
| Activation stalls mid-sequence after first WSMAN call | Each WSMAN call re-probed the LMS port (TCP connect + TLS handshake overhead per call) |
| APF data flushed too late during LME sessions | APF session timer used a long default; 500 ms flush acceleration needed for channel data deliveries |
| Malformed `--url` (e.g., `"http://"`) routed incorrectly | No validation before URL was evaluated as non-empty |
| Premature `EnsureWSMAN` during local activation | WSMAN init triggered before it was needed, wasting an LME slot |
## Files Changed

| File | +Lines | −Lines | Category |
|---|---|---|---|
| `internal/rps/executor.go` | +90 | −2 | Core: LME channel loop |
| `internal/local/amt/localTransport.go` | +89 | −3 | Core: HTTP transport |
| `internal/commands/activate/local.go` | +38 | −1 | Local activation retry |
| `internal/lm/engine.go` | +23 | −1 | APF timer override |
| `internal/commands/activate/activate.go` | +26 | −4 | URL normalization, WSMAN cleanup |
| `internal/local/amt/wsman.go` | +38 | −4 | LMS probe caching |
| `internal/commands/activate/activate_test.go` | +12 | 0 | Tests: URL normalization |
| `pkg/utils/constants.go` | +6 | −6 | Timeout constants tuning |
| `internal/rps/rps.go` | +1 | 0 | Minor |
| `pkg/heci/linux.go` | +1 | 0 | Cosmetic (blank line) |
| **Total** | **+289** | **−41** | |
## Architecture Context

```
┌──────────────────────────────────────────────────────────────────────┐
│                            rpc-go Process                            │
│                                                                      │
│  ┌─────────────┐    ┌──────────────┐    ┌────────────────────────┐   │
│  │ CLI /       │    │ RPS          │    │ Local Activation       │   │
│  │ Activate    │───▶│ Executor     │    │ Service                │   │
│  │ Command     │    │ (WebSocket)  │    │ (local.go)             │   │
│  └──────┬──────┘    └──────┬───────┘    └──────────┬─────────────┘   │
│         │                  │                       │                 │
│         ▼                  ▼                       ▼                 │
│  ┌──────────────────────────────────────────────────────────────┐    │
│  │              GoWSMANMessages / LocalTransport                │    │
│  │  ┌──────────────────────────────────────────────────────┐    │    │
│  │  │    LMS probe cache (plainProbeDone / useLocalLMX)    │    │    │
│  │  └──────────────────────────────────────────────────────┘    │    │
│  └──────────────────┬───────────────────────────────────────────┘    │
│                     │                                                │
│      ┌──────────────┴───────────────┐                                │
│      │          LM Engine           │                                │
│      │  ┌───────────────────────┐   │                                │
│      │  │  APF Session + Timer  │   │                                │
│      │  │  (500ms override on   │   │                                │
│      │  │   CHANNEL_DATA)       │   │                                │
│      │  └───────────────────────┘   │                                │
│      └──────────────┬───────────────┘                                │
└─────────────────────┼────────────────────────────────────────────────┘
                      │ ioctl / HECI driver
                      ▼
              ┌───────────────┐
              │  MEI kernel   │
              │  driver       │
              │  /dev/mei0    │
              └───────┬───────┘
                      │ HECI bus
                      ▼
              ┌───────────────┐
              │  Intel AMT    │
              │  Firmware     │
              └───────────────┘
```
## Change Details by Layer

### Timeout Constants (`pkg/utils/constants.go`)

```
               BEFORE          AFTER
             ┌─────────┐     ┌─────────┐
LMS conn     │ 10 s    │     │ 6 s     │
LMS dialer   │ 5 s     │     │ 4 s     │
HECI read    │ 30 s    │     │ 15 s    │
HECI retry   │ 3000 ms │     │ 2000 ms │
HECI backoff │ 500 ms  │     │ 300 ms  │
AMT response │ 30 s    │     │ 10 s    │
             └─────────┘     └─────────┘
```
These constants govern every wait in the LME/LMS/MEI stack.
| Constant | Old Value | New Value | Unit | Rationale |
|---|---|---|---|---|
| `LMSConnectionTimeout` | 10 | 6 | seconds | LMS TCP connect completes in < 1 s on a healthy system; 6 s is ample |
| `LMSDialerTimeout` | 5 | 4 | seconds | Aligns with reduced connection timeout |
| `HeciReadTimeout` | 30 | 15 | seconds | AMT typically responds in < 5 s; 15 s catches stalls without waiting 30 s |
| `HeciRetryDelay` | 3000 | 2000 | ms | Reduces dead time between HECI retries |
| `HeciConnectRetryBackoff` | 500 | 300 | ms | Shorter initial backoff on transient MEI busy; linear scale preserves total budget |
| `AMTResponseTimeout` | 30 | 10 | seconds | Matches typical AMT response SLA; 30 s was masking hangs |
> **Architectural note:** All timeouts are centralized here. No hardcoded values were introduced in this commit; retry delays always reference `utils.HeciConnectRetryBackoff`.
### LM Engine (`internal/lm/engine.go`): APF Timer Override

A new wrapper `processWithLocalTimerOverride(message []byte)` replaces direct `apf.Process()` calls in both `execute()` and `Listen()`.
```go
const lmeAPFChannelDataFlushOverride = 500 * time.Millisecond

func (lme *LMEConnection) processWithLocalTimerOverride(message []byte) bytes.Buffer {
	processed := apf.Process(message, lme.Session)
	if len(message) > 0 && message[0] == apf.APF_CHANNEL_DATA && lme.Session.Timer != nil {
		if !lme.Session.Timer.Stop() {
			select {
			case <-lme.Session.Timer.C:
			default:
			}
		}
		lme.Session.Timer.Reset(lmeAPFChannelDataFlushOverride)
	}
	return processed
}
```

The APF session timer controls how quickly buffered channel data is flushed. When the default timer period is long and `APF_CHANNEL_DATA` arrives, the data sits in the buffer until the timer fires. This creates visible stalls in the WSMAN response path. The 500 ms override causes the flush to happen much sooner after data arrival, without disrupting orderly session teardown on other message types.
| Function | Before | After |
|---|---|---|
| `execute()` | `apf.Process(result, lme.Session)` | `lme.processWithLocalTimerOverride(result)` |
| `Listen()` | `apf.Process(result2, lme.Session)` | `lme.processWithLocalTimerOverride(result2)` |
The `Stop()` + drain-channel pattern prevents the `Reset()` from racing with an already-fired timer:

- `Timer.Stop() == false` → timer already fired → drain `Timer.C` → `Reset()`
- `Timer.Stop() == true` → timer not yet fired → `Reset()` directly

This is the standard Go pattern for safe timer reuse.
### RPS Executor (`internal/rps/executor.go`)

The RPS path is used during remote/cloud-driven provisioning, where RPS sends activation commands over WebSocket to rpc-go, which forwards them via LME to AMT firmware.
#### `connectLMEWithRetry()`

```
attempt 0 → Connect() ── success → return nil
                └ transient busy → wait 300 ms
attempt 1 → Connect() ── success → return nil
                └ transient busy → wait 600 ms
attempt 2 → Connect() ── success/failure → return
```
| Parameter | Value |
|---|---|
| Max retries | 2 (3 total attempts) |
| Backoff | `(attempt+1) × 300 ms` (linear: 300, 600 ms) |
| Retry trigger | `"device or resource busy"`, `"resource busy"`, `"no such device"`, `"device unavailable"` |
| Non-transient errors | Returned immediately without retry |
#### Timed Channel-Open Wait

Before:

```go
e.waitGroup.Wait() // blocks forever if AMT never confirms
```

After:

```go
channelOpenTimer := time.NewTimer(channelOpenTimeout) // bounded by LMETimerTimeout (capped at AMTResponseTimeout)
channelOpenDone := make(chan struct{})
go func() {
	defer close(channelOpenDone)
	e.waitGroup.Wait()
}()
select {
case <-channelOpenDone:
	// channel confirmed, proceed
case <-channelOpenTimer.C:
	e.lastError = fmt.Errorf("timeout waiting for AMT channel open...")
	return true // signal error to caller
}
```

| Aspect | Before | After |
|---|---|---|
| Mechanism | `context.WithTimeout` / `ctx.Done()` | `time.NewTimer` / `timer.C` |
| Goroutine overhead | Context creates goroutine + cancelFunc | Timer is a single heap object |
| Leak risk | `defer cancel()` was correct, but context propagation adds complexity | Direct timer; explicit Stop-drain in defer |
| Semantics | `ctx.Done()` closed on deadline or cancel | `timer.C` fires once at deadline |
The change removes the `"context"` import and replaces it with `"strings"` (for the new `isTransientMEIBusy` helper), netting zero new imports.
### Local HTTP Transport (`internal/local/amt/localTransport.go`)

The `LocalTransport` implements `http.RoundTripper` for WSMAN over MEI when LMS is not available. It is used by `GoWSMANMessages` during local activation.
Same retry pattern as §5.3.1, but applied to `LocalTransport.RoundTrip()`. It uses `isMEIDeviceBusyError()`, which checks only for `"device or resource busy"` and `"resource busy"`, a subset of the executor's `isTransientMEIBusy` (`"no such device"` is excluded at the transport level, since it implies a more permanent absence).
| Error string | Executor `isTransientMEIBusy` | Transport `isMEIDeviceBusyError` |
|---|---|---|
| `"device or resource busy"` | ✓ | ✓ |
| `"resource busy"` | ✓ | ✓ |
| `"no such device"` | ✓ | ✗ |
| `"device unavailable"` | ✓ | ✗ |
> **Reviewer note:** The narrower set in the transport is intentional: a missing MEI device (`"no such device"`) at the transport layer should fall through to an LMS fallback, not retry directly.
Same pattern as §5.3.2. On timeout, the transport explicitly calls `l.Close()` to unblock the `Listen()` goroutine before returning the error, preventing goroutine leaks.
Previously the inner select loop had no deadline branch. A `responseTimer` is now added:

```go
case <-responseTimer.C:
	respErr = fmt.Errorf("timeout waiting for LME response after %s", responseTimeout)
	break Loop
```

This ensures `RoundTrip` always returns within `AMTResponseTimeout` (10 s).
```
BEFORE RoundTrip()                AFTER RoundTrip()
─────────────────────             ──────────────────────────────
Connect()                         for attempt 0..2:
    │                                 Connect() ── ok ──────────┐
    ▼                                 └── busy? ── wait 300ms   │
go Listen()                                                     │
    │                             go Listen()                   │
    ▼                                 │                         │
WaitGroup.Wait() ◄── hangs        select:                       │
                     forever          ├─ WG done → proceed ◄────┘
                     if no ACK        └─ timer → Close + error

read loop:                        read loop:
  select:                           select:
    ├─ data channel                   ├─ data channel
    └─ error channel                  ├─ error channel
       (no deadline)                  └─ responseTimer (10s)
```
### WSMAN Client (`internal/local/amt/wsman.go`): LMS Probe Caching

```go
type GoWSMANMessages struct {
	// ...existing fields...
	plainProbeDone bool // true once LMS TCP probe has run
	useLocalLMX    bool // true if the LMS probe failed (use MEI instead)
}
```

BEFORE `SetupWsmanClient()` (non-TLS path):

```
Every call:
    ctx, cancel := context.WithTimeout(...)
    dialer.DialContext(tcp4, LMSAddress:LMSPort)
    if err → use localTransport
    else   → use LMS
```

AFTER `SetupWsmanClient()` (non-TLS path):

```
First call only:
    dial LMS, record result in plainProbeDone + useLocalLMX
Subsequent calls:
    skip probe entirely, use cached useLocalLMX decision
    lazy-init localTransport only when needed
```
A full local ACM activation may call `SetupWsmanClient` many times (once per WSMAN GET/PUT). Without caching, each call attempted a TCP connect-and-close to the LMS port: wasted syscalls and time on systems without LMS.
```go
// Before
dialer := &cryptotls.Dialer{Config: tlsConfig}
conn, _ := dialer.DialContext(ctx, "tcp", utils.LMSAddress+":"+utils.LMSTLSPort)

// After
dialer := &cryptotls.Dialer{
	Config:    tlsConfig,
	NetDialer: &net.Dialer{Timeout: probeTimeout}, // explicit inner timeout
}
conn, _ := dialer.DialContext(ctx, "tcp", net.JoinHostPort(g.target, utils.LMSTLSPort))
```

- `net.Dialer{Timeout: probeTimeout}` gives the TCP layer its own deadline, independent of the TLS context.
- `net.JoinHostPort` replaces string concatenation and handles IPv6 addresses correctly (wraps the host in `[...]`).
- `g.target` replaces the hardcoded `utils.LMSAddress`, enabling the transport to work against custom LMS addresses.
### Activate Command (`internal/commands/activate/activate.go`): URL Normalization

```go
func normalizeActivateURL(raw string) string {
	value := strings.TrimSpace(raw)
	if value == "" {
		return ""
	}
	u, err := url.Parse(value)
	if err != nil {
		return value // unparseable → pass through, fail later
	}
	scheme := strings.ToLower(u.Scheme)
	if (scheme == "http" || scheme == "https" || scheme == "ws" || scheme == "wss") && u.Host == "" {
		return "" // scheme-only or malformed → treat as empty
	}
	return value
}
```

| Input | Output | Why |
|---|---|---|
| `"wss://rps.example.com:8080"` | unchanged | valid |
| `"http://"` | `""` | scheme with no host |
| `"https://"` | `""` | scheme with no host |
| `"somestring"` | `"somestring"` | no scheme → pass through |
| `""` | `""` | empty |
This is called at the top of `Validate()`, before any mode-detection logic, so `cmd.URL == ""` is always a reliable sentinel by the time the mode-selection branches run.
#### Removed Premature `EnsureWSMAN`

```go
// REMOVED from Run() in the local-activation branch:
if err := cmd.EnsureWSMAN(ctx); err != nil {
	return err
}
```

`EnsureWSMAN` opens an LME connection eagerly. When local activation itself subsequently opens the LME for WSMAN calls, this created two concurrent slots, wasting an MEI connection and contributing to the "device busy" errors the rest of this commit fixes.
### Local Activation Service (`internal/commands/activate/local.go`): `GetGeneralSettings` Retry

`GetGeneralSettings()` is a WSMAN call needed to obtain the AMT digest realm used when hashing passwords for CCM/ACM activation. It is now wrapped in `getGeneralSettingsWithRetry()`.
```
attempt 0: GetGeneralSettings() ──── ok ──────────────── return response
               └── transient busy? ── wait 300 ms
attempt 1: GetGeneralSettings() ──── ok ──────────────── return response
               └── transient busy? ── wait 600 ms
attempt 2: GetGeneralSettings() ──── ok ──────────────── return response
               └── transient busy? ── wait 900 ms
attempt 3: GetGeneralSettings() ──── any result ──────── return (final)
```
| Parameter | Value |
|---|---|
| Max retries | 3 (4 total attempts) |
| Backoff | `(attempt+1) × HeciConnectRetryBackoff` (300, 600, 900 ms) |
| Transient signals | `"device or resource busy"`, `"resource busy"`, `"no such device"`, `"device unavailable"` |
| Function | Step where called |
|---|---|
| `activateCCM()` | Before digest realm extraction |
| `activateACMWithTLS()` | Before cert chain operations |
| `activateACMLegacy()` | Before digest realm extraction |
## Sequence Diagrams

**Before** (unbounded wait, no retry):

```mermaid
sequenceDiagram
    participant E as RPS Executor
    participant L as LM Engine
    participant M as MEI Driver
    participant A as AMT Firmware
    E->>M: Connect()
    M-->>E: ok / busy (fatal, no retry)
    E->>L: go Listen()
    E->>E: waitGroup.Wait() [BLOCKS FOREVER if no ACK]
    A-->>L: CHANNEL_OPEN_CONFIRMATION
    L->>E: waitGroup.Done()
    E->>A: Send request data
    A-->>L: CHANNEL_DATA
    L-->>E: data channel
    Note over E: No response deadline → hangs if AMT silent
```
**After** (retry, bounded waits, timer override):

```mermaid
sequenceDiagram
    participant E as RPS Executor
    participant L as LM Engine
    participant M as MEI Driver
    participant A as AMT Firmware
    E->>M: Connect() [attempt 0]
    alt MEI busy
        M-->>E: EBUSY
        E->>E: sleep 300ms
        E->>M: Connect() [attempt 1]
    end
    M-->>E: ok
    E->>L: go Listen()
    E->>E: channelOpenTimer(LMETimerTimeout)
    par goroutine
        E->>E: go waitGroup.Wait() → close(done)
    and select
        A-->>L: CHANNEL_OPEN_CONFIRMATION
        L->>E: waitGroup.Done() → done closed
        E->>E: case <-done: proceed
    and timeout path
        E->>E: case <-timer: log error, return
    end
    E->>A: Send request data
    A-->>L: APF_CHANNEL_DATA
    Note over L: processWithLocalTimerOverride:<br/>reset session timer → 500ms
    L-->>E: data channel
    E->>E: responseTimer(AMTResponseTimeout=10s)
    Note over E: response or timeout, always returns
```
```mermaid
flowchart TD
    A([getGeneralSettingsWithRetry]) --> B[attempt = 0]
    B --> C[wsman.GetGeneralSettings]
    C -->|success| D(["return response, nil"])
    C -->|error| E{"transient<br/>MEI busy?"}
    E -->|No| F(["return empty, lastErr"])
    E -->|Yes| G{"attempt ==<br/>maxRetries (3)?"}
    G -->|Yes| F
    G -->|No| H["sleep (attempt+1) × 300 ms"]
    H --> I[attempt++]
    I --> C
```
```mermaid
flowchart LR
    subgraph "1st WSMAN call"
        A[SetupWsmanClient] --> B{plainProbeDone?}
        B -->|No| C[TCP probe LMS port]
        C -->|LMS up| D["useLocalLMX=false<br/>plainProbeDone=true"]
        C -->|LMS down| E["useLocalLMX=true<br/>plainProbeDone=true"]
    end
    subgraph "2nd+ WSMAN calls"
        F[SetupWsmanClient] --> G{plainProbeDone?}
        G -->|Yes| H{useLocalLMX?}
        H -->|true| I[lazy-init LocalTransport]
        H -->|false| J[use LMS TCP]
    end
```
## Risk Assessment

| Area | Risk | Severity | Mitigation |
|---|---|---|---|
| Reduced `AMTResponseTimeout` (30 → 10 s) | Legitimate slow AMT responses (e.g., slow TPM operations during ACM activate) may now time out | Medium | 10 s covers known AMT response times; adjustable via constants; can be reverted independently |
| Reduced `HeciReadTimeout` (30 → 15 s) | Slow HECI reads under load may time out | Low-Medium | Platform testing should confirm; stalls that took 15–30 s were already symptomatic of errors |
| `plainProbeDone` per `GoWSMANMessages` instance | If custom code creates new instances per call, caching is ineffective | Low | Within rpc-go, instances are held for the duration of an activation flow |
| `normalizeActivateURL` drops scheme-only URLs silently | A user passing `http://` by mistake gets no error message | Low | Logs no warning; could be improved with an explicit error; not a regression |
| Removed `EnsureWSMAN` | If some activation path relied on the early connection check | Low | `EnsureWSMAN` was never a precondition for correctness; activation WSMAN calls open their own connection |
| APF 500 ms timer override | If firmware expects the default timer for flow control | Low | 500 ms is faster than the default but still gradual; APF flow control is governed by window size, not the timer |
| `isMEIDeviceBusyError` vs `isTransientMEIBusy` divergence | Different retry criteria at transport vs. executor could cause inconsistent behaviour | Low | Documented above; the narrower set at the transport layer is intentional |
| Goroutine leak on channel-open timeout (LocalTransport) | `Listen()` goroutine blocked on HECI read | Low | `l.Close()` is called on timeout to signal the goroutine |
## Testing Coverage

| Change | Test Type | Location | Coverage |
|---|---|---|---|
| `normalizeActivateURL` | Unit tests | `activate_test.go` (+12 lines) | Valid URLs, empty, scheme-only, non-URL strings |
| `getGeneralSettingsWithRetry` | Manual / integration | — | Not unit-tested in this commit; a mock-based test would require extending the wsman mock |
| `connectLMEWithRetry` | Manual | — | No unit test; `isTransientMEIBusy` behaviour is inferable from its string matching |
| `isMEIDeviceBusyError` | Manual | — | Same as above |
| Timed channel-open wait | Manual | — | Requires a mock that delays `waitGroup.Done()` |
| APF timer override | Manual | — | Requires a mock APF session with a controllable timer |
| LMS probe caching | Manual | — | Requires a test harness with a controllable TCP listener |

> **Reviewer suggestion:** The retry helpers (`isTransientMEIBusy`, `isMEIDeviceBusyError`) are straightforward string-match functions and would benefit from table-driven unit tests in a follow-up commit.
## Architectural Observations

Three independent layers (RPS executor, local transport, local activation service) now share the same transient-retry pattern. The logic is duplicated rather than extracted into a shared helper. This is acceptable for the current scope but warrants a utility function (`RetryOnMEIBusy(fn func() error, maxRetries int) error`) if a fourth call site emerges.
The commit deliberately moves away from `context.WithTimeout` for internal waits in favour of `time.NewTimer`. This is architecturally sound for leaf-level waits where context propagation is not needed: it reduces goroutine overhead and makes the timeout lifecycle explicit. For cross-goroutine cancellation (e.g., user Ctrl+C), the outer `os/signal` channel in `executor.go` remains in place.
All timing parameters live in `pkg/utils/constants.go`. The commit does not introduce configurable timeouts (e.g., via `config.yaml`). Operators needing different timeouts must recompile. This is consistent with the existing design but could become a friction point on hardened or slower hardware.
String-matching on kernel error messages is a fragile pattern: kernel messages can differ across kernel versions, distributions, and locales. A future improvement would be to map `syscall.EBUSY` directly from the HECI driver's error return rather than text searching.
The switch from `utils.LMSAddress + ":" + utils.LMSTLSPort` to `net.JoinHostPort(g.target, utils.LMSTLSPort)` quietly improves IPv6 correctness. This is a low-risk but meaningful improvement if LMS is ever exposed on an IPv6 address.