Commit b7428a0

Add thumbnail previews for each post

Signed-off-by: mgoin <[email protected]>
1 parent 2d4369b commit b7428a0

13 files changed: +105 -25 lines

_layouts/home.html

Lines changed: 42 additions & 0 deletions

@@ -0,0 +1,42 @@
+---
+layout: default
+---
+
+<div class="home">
+  {%- if page.title -%}
+    <h1 class="page-heading">{{ page.title }}</h1>
+  {%- endif -%}
+
+  {{ content }}
+
+  {% if site.paginate %}
+    {% assign posts = paginator.posts %}
+  {% else %}
+    {% assign posts = site.posts %}
+  {% endif %}
+
+  {%- if posts.size > 0 -%}
+    <ul class="post-list">
+      {%- assign date_format = site.minima.date_format | default: "%b %-d, %Y" -%}
+      {%- for post in posts -%}
+        <li class="post-item">
+          {%- if post.image -%}
+            <div class="post-thumbnail">
+              <a href="{{ post.url | relative_url }}">
+                <img src="{{ post.image | relative_url }}" alt="{{ post.title | escape }}">
+              </a>
+            </div>
+          {%- endif -%}
+          <div class="post-content">
+            <span class="post-meta">{{ post.date | date: date_format }}</span>
+            <h3>
+              <a class="post-link" href="{{ post.url | relative_url }}">
+                {{ post.title | escape }}
+              </a>
+            </h3>
+          </div>
+        </li>
+      {%- endfor -%}
+    </ul>
+  {%- endif -%}
+</div>
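
Note: the thumbnail logic above keys off a single front-matter field. The post-thumbnail block is guarded by {%- if post.image -%}, so posts that lack an image key simply render without a thumbnail; that is why the remaining changes in this commit consolidate each post's front matter onto one image field. A minimal sketch of the front matter a post needs (the title and figure path below are hypothetical, not files from this commit):

---
layout: post
title: "An Example Post"
author: "The vLLM Team"
# `image` is the only key the new home layout reads for the thumbnail;
# this path is illustrative only.
image: /assets/figures/example/thumbnail.png
---

The site.paginate branch keeps the layout compatible with the jekyll-paginate plugin: when _config.yml sets paginate (for example, paginate: 5), paginator.posts holds only the current page's posts, and the layout otherwise falls back to site.posts.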

_posts/2025-01-14-struct-decode-intro.md

Lines changed: 1 addition & 0 deletions

@@ -2,6 +2,7 @@
 layout: post
 title: "Structured Decoding in vLLM: a gentle introduction"
 author: "Guest Post by BentoML and Red Hat"
+image: /assets/figures/struct-decode-intro/vllm-xgrammar-decode-time-per-output-token.png
 ---
 
 **TL/DR**:

_posts/2025-01-21-stack-release.md

Lines changed: 1 addition & 3 deletions

@@ -1,8 +1,6 @@
 ---
 layout: post
-title: "High Performance and Easy Deployment of vLLM in K8S with “vLLM production-stack”"
-thumbnail-img: /assets/figures/stack/stack-thumbnail.png
-share-img: /assets/figures/stack/stack-thumbnail.png
+title: "High Performance and Easy Deployment of vLLM in K8S with "vLLM production-stack""
 author: LMCache Team
 image: /assets/figures/stack/stack-thumbnail.png
 ---

_posts/2025-02-24-ptpc-fp8-rocm.md

Lines changed: 0 additions & 2 deletions

@@ -3,8 +3,6 @@ layout: post
 title: "PTPC-FP8: Boosting vLLM Performance on AMD ROCm"
 author: "AMD and Embedded LLM"
 image: /assets/figures/ptpc/PTPC-tumbnail.png
-thumbnail-img: /assets/figures/ptpc/PTPC-tumbnail.png
-share-img: /assets/figures/ptpc/PTPC-tumbnail.png
 math: true
 ---
 
_posts/2025-04-05-llama4.md

Lines changed: 0 additions & 2 deletions

@@ -3,8 +3,6 @@ layout: post
 title: "Llama 4 in vLLM"
 author: "The vLLM Team"
 image: /assets/figures/llama4/perf.png
-thumbnail-img: /assets/figures/llama4/perf.png
-share-img: /assets/figures/llama4/perf.png
 ---
 
 We're excited to announce that vLLM now supports the [Llama 4 herd of models](https://ai.meta.com/blog/llama-4-multimodal-intelligence/): **Scout** (17B-16E) and **Maverick** (17B-128E). You can run these powerful long-context, natively multi-modal (up to 8-10 images with good results), mixture-of-experts models in vLLM today by updating to version v0.8.3 or later:

_posts/2025-04-11-transformers-backend.md

Lines changed: 0 additions & 2 deletions

@@ -3,8 +3,6 @@ layout: post
 title: "Transformers backend integration in vLLM"
 author: "The Hugging Face Team"
 image: /assets/figures/transformers-backend/transformers-backend.png
-thumbnail-img: /assets/figures/transformers-backend/transformers-backend.png
-share-img: /assets/figures/transformers-backend/transformers-backend.png
 ---
 
 The [Hugging Face Transformers library](https://huggingface.co/docs/transformers/main/en/index)

_posts/2025-04-23-openrlhf-vllm.md

Lines changed: 2 additions & 4 deletions

@@ -1,10 +1,8 @@
 ---
 layout: post
 title: "Accelerating RLHF with vLLM, Best Practice from OpenRLHF"
-author: "The OpenRLHF Team"
-image: /assets/figures/openrlhf-vllm/ray.png
-thumbnail-img: /assets/figures/openrlhf-vllm/ray.png
-share-img: /assets/figures/openrlhf-vllm/ray.png
+author: "The OpenRLHF Team"
+image: /assets/figures/openrlhf-vllm/ray.png
 ---
 
 As demand grows for training reasoning-capable large language models (LLMs), Reinforcement Learning from Human Feedback (RLHF) has emerged as a cornerstone technique. However, conventional RLHF pipelines—especially those using Proximal Policy Optimization (PPO)—are often hindered by substantial computational overhead. This challenge is particularly pronounced with models that excel at complex reasoning tasks (such as OpenAI-o1 and DeepSeek-R1), where generating long chain-of-thought (CoT) outputs can account for up to 90% of total training time. These models must produce detailed, step-by-step reasoning that can span thousands of tokens, making inference significantly more time-consuming than the training phase itself. As a pioneering inference framework, vLLM provides a user-friendly interface for generating RLHF samples and updating model weights.

_posts/2025-06-30-minimax-m1.md

Lines changed: 3 additions & 2 deletions

@@ -2,8 +2,9 @@
 layout: post
 title: "MiniMax-M1 Hybrid Architecture Meets vLLM: Long Context, Fast Inference"
 author: "MiniMax"
-benchmark-img: /assets/figures/minimax-m1/benchmark.png
-moe-img: /assets/figures/minimax-m1/moe.png
+image: /assets/figures/minimax-m1/benchmark.png
+benchmark-img: /assets/figures/minimax-m1/benchmark.png
+moe-img: /assets/figures/minimax-m1/moe.png
 lightning_attention-img: /assets/figures/minimax-m1/lightning_attention.png
 ---

_posts/2025-09-11-qwen3-next.md

Lines changed: 0 additions & 2 deletions

@@ -3,8 +3,6 @@ layout: post
 title: "vLLM Now Supports Qwen3-Next: Hybrid Architecture with Extreme Efficiency"
 author: "The vLLM Team"
 image: /assets/figures/qwen3-next/qwen.png
-thumbnail-img: /assets/figures/qwen3-next/qwen.png
-share-img: /assets/figures/qwen3-next/qwen.png
 ---
 
 We’re excited to announce that **vLLM now supports Qwen3-Next**, the latest generation of foundation models from the Qwen team. Qwen3-Next introduces a **hybrid architecture with extreme efficiency for long context support**, and vLLM offers full support of its functionalities.

_posts/2025-09-16-vllm-meetup.md

Lines changed: 2 additions & 1 deletion

@@ -1,7 +1,8 @@
 ---
 layout: post
 title: "The First vLLM Meetup in Korea"
-author: "vLLM Team"
+author: "vLLM Team"
+image: /assets/figures/vllm-meetup/image-3.png
 ---
 
 <p align="center">
