Fix broken serverless tutorial links #237

Merged: 1 commit merged on Apr 24, 2025

docs/overview.md: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ Get started with Serverless:

- [Build your first Serverless app.](/serverless/get-started)
- [Run any LLM as an endpoint using vLLM workers.](/serverless/vllm/get-started)
-- [Tutorial: Create a Serverless endpoint with Stable Diffusion.](/tutorials/serverless/gpu/run-your-first)
+- [Tutorial: Create a Serverless endpoint with Stable Diffusion.](/tutorials/serverless/run-your-first)

## Pods

docs/serverless/vllm/overview.md: 1 addition & 1 deletion
@@ -57,4 +57,4 @@ For more information on creating a custom docker image, see [Build Docker Image
- [Get started](/serverless/vllm/get-started): Learn how to deploy a vLLM Worker as a Serverless Endpoint, with detailed guides on configuration and sending requests.
- [Configurable Endpoints](/serverless/vllm/configurable-endpoints): Select your Hugging Face model and vLLM takes care of the low-level details of model loading, hardware configuration, and execution.
- [Environment variables](/serverless/vllm/environment-variables): Explore the environment variables available for the vLLM Worker, including detailed documentation and examples.
-- [Run Gemma 7b](/tutorials/serverless/gpu/run-gemma-7b): Walk through deploying Google's Gemma model using RunPod's vLLM Worker, guiding you to set up a Serverless Endpoint with a gated large language model (LLM).
+- [Run Gemma 7b](/tutorials/serverless/run-gemma-7b): Walk through deploying Google's Gemma model using RunPod's vLLM Worker, guiding you to set up a Serverless Endpoint with a gated large language model (LLM).

docs/tutorials/introduction/containers/overview.md: 1 addition & 1 deletion
@@ -11,4 +11,4 @@ While the documentation around the introduction section gives a holistic view an

- If you are looking for an understanding of Containers and Docker, see [Container overview](/tutorials/introduction/containers).
- If you are looking to run your first Pod with RunPod, see [Run your first Fast Stable Diffusion with Jupyter Notebook](/tutorials/pods/run-your-first).
-- For Serverless implementation, see [Run your first serverless endpoint with Stable Diffusion](/tutorials/serverless/gpu/run-your-first).
+- For Serverless implementation, see [Run your first serverless endpoint with Stable Diffusion](/tutorials/serverless/run-your-first).

docs/tutorials/introduction/overview.md: 4 additions & 4 deletions
@@ -13,13 +13,13 @@ Explore how to run and deploy AI applications using RunPod's Serverless platform

### GPUs

-- [Generate images with SDXL Turbo](/tutorials/serverless/gpu/generate-sdxl-turbo): Learn how to build a web application using RunPod's Serverless Workers and SDXL Turbo from Stability AI, a fast text-to-image model, and send requests to an Endpoint to generate images from text-based inputs.
-- [Run Google's Gemma model](/tutorials/serverless/gpu/run-gemma-7b): Deploy Google's Gemma model on RunPod's vLLM Worker, create a Serverless Endpoint, and interact with the model using OpenAI APIs and Python.
-- [Run your first serverless endpoint with Stable Diffusion](/tutorials/serverless/gpu/run-your-first): Use RunPod's Stable Diffusion v1 inference endpoint to generate images, set up your serverless worker, start a job, check job status, and retrieve results.
+- [Generate images with SDXL Turbo](/tutorials/serverless/generate-sdxl-turbo): Learn how to build a web application using RunPod's Serverless Workers and SDXL Turbo from Stability AI, a fast text-to-image model, and send requests to an Endpoint to generate images from text-based inputs.
+- [Run Google's Gemma model](/tutorials/serverless/run-gemma-7b): Deploy Google's Gemma model on RunPod's vLLM Worker, create a Serverless Endpoint, and interact with the model using OpenAI APIs and Python.
+- [Run your first serverless endpoint with Stable Diffusion](/tutorials/serverless/run-your-first): Use RunPod's Stable Diffusion v1 inference endpoint to generate images, set up your serverless worker, start a job, check job status, and retrieve results.

### CPUs

-- [Run an Ollama Server on a RunPod CPU](/tutorials/serverless/cpu/run-ollama-inference): Set up and run an Ollama server on RunPod CPU for inference with this step-by-step tutorial.
+- [Run an Ollama Server on a RunPod CPU](/tutorials/serverless/run-ollama-inference): Set up and run an Ollama server on RunPod CPU for inference with this step-by-step tutorial.

## Pods

docs/tutorials/migrations/banana/overview.md: 1 addition & 1 deletion
@@ -122,7 +122,7 @@ gh repo clone runpod-workers/worker-template
Now that you've got a basic RunPod Worker template created:

- Continue reading to see how you'd migrate from Banana to RunPod
-- See [Generate SDXL Turbo](/tutorials/serverless/gpu/generate-sdxl-turbo) for a general approach on deploying your first Serverless Endpoint with RunPod.
+- See [Generate SDXL Turbo](/tutorials/serverless/generate-sdxl-turbo) for a general approach on deploying your first Serverless Endpoint with RunPod.

## Project structure

docusaurus.config.js: 2 additions & 2 deletions
@@ -279,7 +279,7 @@ const config = {
redirects.push(existingPath.replace('/serverless/endpoints/', '/serverless/references/'));
}
else if (existingPath.includes('/tutorials/serverless/')) {
-redirects.push(existingPath.replace('tutorials/serverless/', 'tutorials/serverless/gpu/'));
+redirects.push(existingPath.replace('/tutorials/serverless/', '/tutorials/serverless/gpu/'));
}
return redirects;
},
@@ -311,7 +311,7 @@
},
{
to: '/tutorials/serverless/run-ollama-inference',
-from: '/tutorials/serverless/gpu/run-ollama-inference',
+from: '/tutorials/serverless/cpu/run-ollama-inference',
},
]
},
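
Note on the docusaurus.config.js changes: the generated-redirect branch now anchors its search string with a leading slash, matching the other branches, and the explicit redirect entry points `from` at the old `/cpu/` path where the Ollama tutorial previously lived. Below is a minimal sketch of how the `createRedirects` hook behaves with the fixed pattern; the standalone function and sample path are illustrative only, not part of this PR.

```js
// Minimal sketch of the createRedirects logic touched above, runnable as plain Node.
// Assumption: Docusaurus's @docusaurus/plugin-client-redirects invokes a function like
// this for every existing route and registers each returned path as an old URL that
// redirects to that route.
function createRedirects(existingPath) {
  const redirects = [];
  if (existingPath.includes('/tutorials/serverless/')) {
    // The leading slash keeps the search string consistent with the other branches
    // and anchored at a path-segment boundary.
    redirects.push(existingPath.replace('/tutorials/serverless/', '/tutorials/serverless/gpu/'));
  }
  return redirects;
}

// Example: the old nested /gpu/ URL redirects to the flattened tutorial path.
console.log(createRedirects('/tutorials/serverless/run-your-first'));
// -> [ '/tutorials/serverless/gpu/run-your-first' ]
```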