🚀 The feature, motivation and pitch
Starting from a baseline of 1150 tokens per die for large-EP inference, per-die throughput improves by roughly 10% in the AFD decoupled (attention/FFN disaggregated) deployment scenario.
Test configuration
```bash
bash infer.sh --MODEL_DIR=/xxx/deepseekv3-lite-base-latest_bugtest \
  --EXE_MODE="dynamo" \
  --FFN_MODE="dynamo" \
  --BATCH_SIZE=864 \
  --LAYER_NUM=10 \
  --WORLD_SIZE=16 \
  --ATTN_DIES=12 \
  --NEXT_N=1 \
  --N_ROUTED_EXPERTS_PER_RANK=3 \
  --REMAINDER_ROUTER_EXPERT=0 \
  --EXPERTS_SHARE_NUM_COPY=1 \
  --DENSE_TP_SIZE=4 \
  --ON_CLOUD=0 \
  --ENABLE_CACHE_COMPILE=0 \
  --ENABLE_SUPERKERNEL=0 \
  --ENABLE_PREFETCH=1 \
  --LAYER_OUT="FA" \
  --ACTUAL_SEQ_LEN=4096
```
With this die-ratio allocation, a throughput improvement of about 10% is achieved, reaching approximately 1260+ tokens per die.
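For reference, below is a minimal sketch of the die split and expert placement these flags appear to imply. It assumes `WORLD_SIZE` is the total number of dies, `ATTN_DIES` of which serve attention while the rest serve the FFN/MoE side, and that each FFN rank hosts `N_ROUTED_EXPERTS_PER_RANK` routed experts plus `EXPERTS_SHARE_NUM_COPY` copies of the shared expert; these semantics are inferred from the flag names, not confirmed by `infer.sh`.

```python
# Hypothetical sanity check for the AFD die split implied by the flags above.
# Parameter semantics are assumed from the flag names, not taken from infer.sh.

def afd_split(world_size: int, attn_dies: int,
              n_routed_experts_per_rank: int,
              remainder_router_expert: int,
              experts_share_num_copy: int) -> dict:
    """Derive the attention/FFN die split and routed-expert placement."""
    ffn_dies = world_size - attn_dies              # dies left for the FFN/MoE side
    routed_experts = (ffn_dies * n_routed_experts_per_rank
                      + remainder_router_expert)   # total routed experts hosted
    return {
        "attn_dies": attn_dies,
        "ffn_dies": ffn_dies,
        "attn_to_ffn_ratio": attn_dies / ffn_dies,
        "routed_experts_total": routed_experts,
        "shared_expert_copies": experts_share_num_copy,
    }

# Values from the test configuration above:
print(afd_split(world_size=16, attn_dies=12,
                n_routed_experts_per_rank=3,
                remainder_router_expert=0,
                experts_share_num_copy=1))
# -> 12 attention dies : 4 FFN dies (a 3:1 split), 12 routed experts in total
```

Under these assumptions, the 12:4 (3:1) attention-to-FFN split is the "ratio allocation" referred to above, and 1150 × 1.10 ≈ 1265 tokens per die is consistent with the reported ~1260+.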
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.