
Blog post on KServe + llm-d + vLLM from Red Hat and Tesla #192

Open
terrytangyuan wants to merge 10 commits into llm-d:main from terrytangyuan:blog-kserve-llm-d

Conversation

@terrytangyuan
Member

This blog post tells the story of the collaboration between Red Hat and Tesla to overcome significant scaling and operational challenges in LLM deployment. It explains how migrating from a simple vLLM deployment to a robust MLOps platform built on KServe, llm-d's intelligent routing, and vLLM enables deep customization and improves efficiency through prefix-cache-aware routing that maximizes GPU utilization.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Copilot AI review requested due to automatic review settings March 4, 2026 16:44
@netlify

netlify bot commented Mar 4, 2026

Deploy Preview for elaborate-kangaroo-25e1ee ready!

🔨 Latest commit: 6408d37
🔍 Latest deploy log: https://app.netlify.com/projects/elaborate-kangaroo-25e1ee/deploys/69b17c136741610008684f21
😎 Deploy Preview: https://deploy-preview-192--elaborate-kangaroo-25e1ee.netlify.app

Contributor

Copilot AI left a comment

Pull request overview

Adds a new blog post describing a Red Hat + Tesla collaboration story for production-grade LLM inference using KServe, llm-d routing, and vLLM, and registers new authors for attribution.

Changes:

  • Add three author entries to blog/authors.yml for the new post.
  • Add a new blog post markdown file with frontmatter, content, and an architecture diagram reference.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

  • blog/authors.yml: Adds new author profiles referenced by the new blog post.
  • blog/2026-03-06_production-grade-ai-inference-kserve-red-hat-and-tesla-success-story.md: Introduces the new Red Hat + Tesla success story blog post (frontmatter + content).


terrytangyuan and others added 4 commits March 4, 2026 11:50
…nd-tesla-success-story.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
…nd-tesla-success-story.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
…nd-tesla-success-story.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
… community section

- Convert saikrishna.jpg and scottcabrinha.jpg to WebP and update authors.yml image_url to local paths
- Download KServe architecture diagram, convert to WebP, store under static/img/blogs/<slug>/
- Update blog post image reference from remote GitHub blob URL to local WebP path
- Add "Get Involved with llm-d" community section with links to Slack, GitHub, community calls, social media, and YouTube

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Pete Cheslock <pete.cheslock@redhat.com>
petecheslock and others added 2 commits March 4, 2026 15:57
- Move LinkedIn URLs from url field to socials.linkedin for all LinkedIn-based authors
- Add socials.github for authors with known GitHub profiles
- Add socials.linkedin for terrytangyuan, cabrinha, robshaw, and saikrishna

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Pete Cheslock <pete.cheslock@redhat.com>
…kedIn socials

- Remove url field from all authors who have socials defined
- Add linkedin socials for petecheslock, cnuland, terrytangyuan, cabrinha, robshaw
- Only redhat org entry retains url field (no socials)

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Pete Cheslock <pete.cheslock@redhat.com>
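Based on the commit notes above, a restructured entry in Docusaurus's `blog/authors.yml` might look like the following sketch. The title and image path shown here are illustrative placeholders, not values taken from this PR; the `socials` keys (`github`, `linkedin`) are the standard Docusaurus author socials fields.

```yaml
# Hypothetical authors.yml entry after the restructuring:
# the top-level url field is removed and socials are added instead.
terrytangyuan:
  name: Yuan Tang
  title: Maintainer                               # placeholder title
  image_url: /img/authors/terrytangyuan.webp      # hypothetical local WebP path
  socials:
    github: terrytangyuan
    linkedin: terrytangyuan
```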
@petecheslock
Member

Preview URL: https://deploy-preview-192--elaborate-kangaroo-25e1ee.netlify.app/blog/production-grade-ai-inference-kserve-red-hat-and-tesla-success-story

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
title: "Production-Grade AI Inference with KServe, llm-d, and vLLM: A Red Hat and Tesla Success Story"
description: "The story of the collaboration between Red Hat and Tesla to overcome significant scaling and operational challenges in LLM deployment, and how migrating from a simple vLLM deployment to a robust MLOps platform built on KServe, llm-d's intelligent routing, and vLLM enables deep customization and improved efficiency through prefix-cache-aware routing that maximizes GPU utilization."
slug: production-grade-ai-inference-kserve-red-hat-and-tesla-success-story
date: 2026-03-06T09:00
Member Author

  • Update the date here (and the date in the file name) before this gets merged/published



4 participants