@@ -16,6 +16,46 @@ RunPod Serverless is a cloud computing platform that lets you run AI models and
* **Cost efficiency**: Pay only for what you use, with per-second billing and no costs when idle.
* **Fast deployment**: Get your code running in the cloud in minutes with minimal configuration.

+ ## Deployment options
+
+ RunPod Serverless offers three ways to deploy your workloads, each designed for different use cases:
+
+ ### 1. Quick Deploys
+
+ **Best for**: Getting popular AI models running quickly with minimal setup.
+
+ Quick Deploys are pre-configured templates for popular AI models that you can deploy with just a few clicks:
+ * No coding required
+ * Pre-optimized configurations
+ * Wide selection of popular AI models
+ * Minimal technical knowledge needed
+
+ [Get started with Quick Deploys →](/serverless/quick-deploys)
+
+ ### 2. vLLM endpoints
+
+ **Best for**: Deploying and serving large language models (LLMs).
+
+ vLLM endpoints are specifically optimized for running LLMs:
+ * Support for any [Hugging Face model](https://huggingface.co/models)
+ * Optimized for LLM inference
+ * Simple configuration via environment variables
+ * High-performance serving with vLLM (see the request sketch below)
+
+ [Get started with vLLM endpoints →](/serverless/workers/vllm/get-started)
+
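+ For illustration, here is a minimal sketch of sending a request to a deployed vLLM endpoint from Python. The endpoint ID is a placeholder, and the input schema (a `prompt` field) is an assumption based on common vLLM worker setups; check your endpoint's configuration for the fields it expects.
+
+ ```python
+ import os
+ import requests
+
+ # Placeholder endpoint ID; use the ID shown for your endpoint in the RunPod console.
+ ENDPOINT_ID = "your-endpoint-id"
+ API_KEY = os.environ["RUNPOD_API_KEY"]
+
+ # /runsync submits a job and blocks until the worker returns a result.
+ response = requests.post(
+     f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
+     headers={"Authorization": f"Bearer {API_KEY}"},
+     json={"input": {"prompt": "Explain what a Serverless endpoint is."}},  # assumed schema
+     timeout=120,
+ )
+ print(response.json())
+ ```
+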
+ ### 3. Custom endpoints
+
+ **Best for**: Running custom code or specialized AI workloads.
+
+ Custom endpoints give you complete control over your application:
+ * Write your own Python code (see the handler sketch below)
+ * Package it in Docker containers
+ * Full flexibility for any use case
+ * Custom processing logic
+
+ [Get started with custom endpoints →](/serverless/get-started)
+
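+ As a minimal sketch of the handler code behind a custom endpoint (assuming the `runpod` Python SDK; the greeting logic and the `name` input field are illustrative only):
+
+ ```python
+ import runpod
+
+ def handler(job):
+     # job["input"] holds the JSON payload the caller sent to the endpoint.
+     name = job["input"].get("name", "world")  # "name" is an illustrative field
+     # The return value is delivered back to the caller as the job's output.
+     return {"greeting": f"Hello, {name}!"}
+
+ # Start the worker loop that pulls jobs from the endpoint's queue.
+ runpod.serverless.start({"handler": handler})
+ ```
+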
## Key concepts

### Endpoints
@@ -67,43 +107,6 @@ When a user/client sends a request to your Serverless endpoint:
* **Media processing**: Handle video transcoding, image generation, or audio processing.
* **Scientific computing**: Run simulations, data analysis, or other specialized workloads.

-
- ## Get started with Serverless
-
- There are multiple ways to get started with Serverless:
-
- ### Custom endpoints
-
- For complete control over your application logic:
-
- 1. Write your own handler function in Python.
- 2. Package it in a Docker container.
- 3. Deploy it using the RunPod console.
-
- [Get started with custom endpoints →](/serverless/get-started)
-
- ### Quick Deploys
-
- [Quick Deploys](/serverless/quick-deploys) are the fastest way to deploy popular AI models with minimal configuration:
-
- 1. Go to the [Serverless page](https://www.runpod.io/console/serverless) in the RunPod console.
- 2. Select a Quick Deploy from the menu and click **configure**.
- 3. Select your GPU type and worker settings.
- 4. Deploy with a single click.
-
- [Get started with Quick Deploys →](/serverless/quick-deploys)
-
- ### vLLM endpoints
-
- Deploy a pre-built endpoint specifically designed for large language models:
-
- 1. Use pre-built Docker images optimized for LLMs.
- 2. Choose any [Hugging Face](https://huggingface.co/models) model.
- 3. Configure with simple environment variables.
- 4. Deploy with a single click.
-
- [Get started with vLLM endpoints →](/serverless/workers/vllm/get-started)
-
## Next steps

Ready to get started with RunPod Serverless?
@@ -112,4 +115,4 @@ Ready to get started with RunPod Serverless?
- [Try a Quick Deploy model.](/serverless/quick-deploys)
- [Deploy large language models in minutes with vLLM.](/serverless/workers/vllm/overview)
- [Learn about handler functions.](/serverless/workers/handlers/overview)
- - [Learn about endpoints.](/serverless/endpoints/overview)
+ - [Learn about endpoints.](/serverless/endpoints/overview)