diff --git a/.cursor/rules/google-style-guide.mdc b/.cursor/rules/google-style-guide.mdc
new file mode 100644
index 00000000..87764bcf
--- /dev/null
+++ b/.cursor/rules/google-style-guide.mdc
@@ -0,0 +1,86 @@
+---
+description: |
+ Enforce Google's Developer Style Guide principles for technical documentation.
+ These rules guide the AI to create clear, consistent, and user-friendly documentation.
+globs:
+ - "*.md"
+ - "*.mdx"
+ - "*.txt"
+alwaysApply: true
+---
+
+# Google Developer Style Guide for Technical Documentation
+
+## Document Structure
+- Always use sentence case for all Markdown headings (e.g., '# This is a heading' not '# This Is A Heading').
+- Begin each main section with a brief one- or two-sentence overview that summarizes the section's content.
+- Organize content into logical sections with clear and concise headings and subheadings.
+- Structure the documentation in a hierarchical manner, using heading levels (# for main titles, ## for sections, ### for subsections).
+
+## Lists and Formatting
+- Use Markdown numbered lists (1., 2., etc.) for sequential steps or ordered procedures.
+- Use Markdown unordered lists (-, *, etc.) for collections of related items that don't have a specific order.
+- Format code-related text using Markdown code blocks with the appropriate language identifier for syntax highlighting:
+
+ ```python
+ def example_function():
+ return "Hello, world!"
+ ```
+- Format UI elements such as button labels and menu items using bold Markdown syntax (**UI Element**).
+- Use italic text (*text*) sparingly, primarily for emphasis, terms, or book titles.
+- Present pairs of related data (like terms and definitions) using description lists or bold terms followed by their explanations.
+- Use unambiguous date formatting, preferably YYYY-MM-DD.
+
+## Language and Tone
+- Always address the reader using the second person pronoun "you" instead of "we" or "us".
+- Prefer active voice in sentences. For example, instead of "The file was saved by the system," write "The system saved the file."
+- Maintain a friendly, conversational, and helpful tone, similar to explaining a concept to a colleague.
+- Use standard American English spelling and punctuation consistently.
+- Avoid highly technical jargon without providing clear explanations or definitions.
+- Avoid idioms and culturally specific references that might not be universally understood.
+- Avoid unnecessary repetition of adjectives and adverbs.
+- Write in a clear, concise, and factual manner, avoiding overly casual or promotional language.
+
+## Links and References
+- When creating hyperlinks using Markdown, ensure the link text clearly describes the target page (e.g., `[Learn more about the API](https://example.com/api)`).
+- Prioritize linking to official documentation, well-established technical websites, or academic resources.
+- For fundamental concepts crucial to understanding the current topic, provide a brief explanation within the documentation rather than immediately linking externally.
+- Reserve external links for more detailed or supplementary information.
+
+## Code Examples
+- Always enclose code examples in Markdown code blocks using triple backticks (```) and specify the programming language.
+- Precede every code block with a brief paragraph explaining its context and purpose.
+- Follow the code block with an explanation of its key parts and expected output.
+- Provide substantial, real-world code examples that demonstrate complete or significant functionality rather than isolated snippets.
+- If the code example pertains to a specific file or directory, mention its location relative to the project root.
+
+## Images and Diagrams
+- When including images or diagrams, use Markdown image syntax and provide descriptive alt text, for example: `![Diagram of the request flow](images/request-flow.png)`.
+- Prefer PNG format for diagrams and illustrations, and WebP format for other images where appropriate.
+- Ensure all images serve a purpose and enhance understanding of the content.
+
+## Warnings, Notes, and Important Information
+- Format warnings using Markdown blockquotes with a clear prefix:
+
+ > **Warning:** This action cannot be undone.
+
+- Format notes using Markdown blockquotes:
+
+ > **Note:** Additional configuration may be required for custom installations.
+
+- Keep warning, note, and important information messages brief and to the point, focusing on essential information.
+
+## Step-by-Step Instructions
+- Present step-by-step instructions using Markdown numbered lists.
+- Begin each step with a clear action verb (e.g., "Click", "Open", "Enter").
+- Ensure each step represents a single, actionable task.
+- Provide sufficient detail for the target audience to understand and execute each action without requiring additional assumptions.
\ No newline at end of file
diff --git a/docs/sdks/go/overview.md b/docs/sdks/go/overview.md
index 647cafe0..85e8ecca 100644
--- a/docs/sdks/go/overview.md
+++ b/docs/sdks/go/overview.md
@@ -1,43 +1,39 @@
---
title: Overview
sidebar_position: 1
+description: "Get started with RunPod Go SDK for building web applications, server-side implementations, and automating tasks. Learn how to install, configure, and secure your API key."
---
-Get started with setting up your RunPod projects using Go.
-Whether you're building web applications, server-side implementations, or automating tasks, the RunPod Go SDK provides the tools you need.
-This guide outlines the steps to get your development environment ready and integrate RunPod into your Go projects.
+This guide helps you set up and use the RunPod Go SDK in your projects. You'll learn how to install the SDK, configure your environment, and integrate RunPod into your Go applications.
## Prerequisites
-Before you begin, ensure that you have the following:
+Before you begin, ensure you have:
-- Go installed on your machine (version 1.16 or later)
-- A RunPod account with an API key and Endpoint Id
+- Go 1.16 or later installed
+- A RunPod account with an API key and endpoint ID
-## Install the RunPod SDK {#install}
+## Install the SDK
-Before integrating RunPod into your project, you'll need to install the SDK.
+To install the RunPod SDK in your project:
-To install the RunPod SDK, run the following `go get` command in your project directory.
+1. Run this command in your project directory:
+ ```bash
+ go get github.com/runpod/go-sdk
+ ```
-```command
-go get github.com/runpod/go-sdk
-```
-
-This command installs the `runpod-sdk` package.
-Then run the following command to install the dependencies:
-
-```command
-go mod tidy
-```
-
-For more details about the package, visit the [Go package page](https://pkg.go.dev/github.com/runpod/go-sdk/pkg/sdk) or the [GitHub repository](https://github.com/runpod/go-sdk).
+2. Install dependencies:
+ ```bash
+ go mod tidy
+ ```
-## Add your API key
+For more details, visit:
+- [Go package documentation](https://pkg.go.dev/github.com/runpod/go-sdk/pkg/sdk)
+- [GitHub repository](https://github.com/runpod/go-sdk)
-To use the RunPod SDK in your project, you first need to import it and configure it with your API key and endpoint ID. Ensure these values are securely stored, preferably as environment variables.
+## Configure your environment
-Below is a basic example of how to initialize and use the RunPod SDK in your Go project.
+Set up your API key and endpoint ID in your Go application:
```go
func main() {
@@ -54,21 +50,20 @@ func main() {
}
```
-This snippet demonstrates how to import the SDK, initialize it with your API key, and reference a specific endpoint using its ID.
-
-### Secure your API key
+## Secure your API key
-When working with the RunPod SDK, it's essential to secure your API key.
-Storing the API key in environment variables is recommended, as shown in the initialization example. This method keeps your key out of your source code and reduces the risk of accidental exposure.
+Always store your API key securely:
-:::note
+- Use environment variables (recommended)
+- Avoid storing keys in source code
+- Use secure secrets management solutions
-Use environment variables or secure secrets management solutions to handle sensitive information like API keys.
+> **Note:** Never commit API keys to version control. Use environment variables or secure secrets management solutions to handle sensitive information.
-:::
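On macOS or Linux, for example, you can export the values in your shell before starting your application. This is a sketch; the variable names are illustrative and should match whatever your code reads:

```shell
# Keep credentials out of source code: export them as environment variables.
# Variable names here are illustrative; use the names your application reads.
export RUNPOD_API_KEY="your-api-key"
export ENDPOINT_ID="your-endpoint-id"

# Confirm the values are set before running your program
echo "RUNPOD_API_KEY is ${#RUNPOD_API_KEY} characters long"
```

Add these lines to a local, git-ignored file (such as `.env` loaded by your shell) rather than committing them.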
+## Next steps
-For more information, see the following:
+For more information, see:
-- [RunPod SDK Go Package](https://pkg.go.dev/github.com/runpod/go-sdk/pkg/sdk)
-- [RunPod GitHub Repository](https://github.com/runpod/go-sdk)
-- [Endpoints](/sdks/go/endpoints)
+- [Endpoints documentation](endpoints.md)
+- [Go package documentation](https://pkg.go.dev/github.com/runpod/go-sdk/pkg/sdk)
+- [GitHub repository](https://github.com/runpod/go-sdk)
diff --git a/docs/sdks/graphql/configurations.md b/docs/sdks/graphql/configurations.md
index 4a67215a..db962e48 100644
--- a/docs/sdks/graphql/configurations.md
+++ b/docs/sdks/graphql/configurations.md
@@ -4,50 +4,62 @@ sidebar_position: 1
description: "Configure your environment with essential arguments: containerDiskInGb, dockerArgs, env, imageName, name, and volumeInGb, to ensure correct setup and operation of your container."
---
-For details on queries, mutations, fields, and inputs, see the [RunPod GraphQL Spec](https://graphql-spec.runpod.io/).
+This guide explains the essential configuration arguments for your RunPod environment. For complete API details, see the [RunPod GraphQL Spec](https://graphql-spec.runpod.io/).
-When configuring your environment, certain arguments are essential to ensure the correct setup and operation. Below is a detailed overview of each required argument:
+## Required arguments
-### `containerDiskInGb`
+The following arguments are required for proper container setup and operation:
+
+### Container disk size
+
+`containerDiskInGb` specifies the container's disk size in gigabytes:
-- **Description**: Specifies the size of the disk allocated for the container in gigabytes. This space is used for the operating system, installed applications, and any data generated or used by the container.
- **Type**: Integer
-- **Example**: `10` for a 10 GB disk size.
+- **Example**: `10` for a 10 GB disk
+- **Use**: Operating system, applications, and container data
-### `dockerArgs`
+### Docker arguments
+
+`dockerArgs` overrides the container's start command:
-- **Description**: If specified, overrides the [container start command](https://docs.docker.com/engine/reference/builder/#cmd). If this argument is not provided, it will rely on the start command provided in the docker image.
- **Type**: String
-- **Example**: `sleep infinity` to run the container in the background.
+- **Example**: `sleep infinity` for background operation
+- **Use**: Custom container startup behavior
+
+### Environment variables
-
+`env` sets container environment variables:
-### `env`
+- **Type**: Dictionary/Object
+- **Example**: `{"DATABASE_URL": "postgres://user:password@localhost/dbname"}`
+- **Use**: Application configuration and credentials
-- **Description**: A set of environment variables to be set within the container. These can configure application settings, external service credentials, or any other configuration data required by the software running in the container.
-- **Type**: Dictionary or Object
-- **Example**: `{"DATABASE_URL": "postgres://user:password@localhost/dbname"}`.
+### Docker image
-### `imageName`
+`imageName` specifies the container image:
-- **Description**: The name of the Docker image to use for the container. This should include the repository name and tag, if applicable.
- **Type**: String
-- **Example**: `"nginx:latest"` for the latest version of the Nginx image.
+- **Example**: `"nginx:latest"`
+- **Use**: Container base image and version
-### `name`
+### Container name
+
+`name` identifies your container instance:
-- **Description**: The name assigned to the container instance. This name is used for identification and must be unique within the context it's being used.
- **Type**: String
-- **Example**: `"my-app-container"`.
+- **Example**: `"my-app-container"`
+- **Use**: Container identification and management
+
+### Persistent volume
-### `volumeInGb`
+`volumeInGb` defines persistent storage size:
-- **Description**: Defines the size of an additional persistent volume in gigabytes. This volume is used for storing data that needs to persist between container restarts or redeployments.
- **Type**: Integer
-- **Example**: `5` for a 5GB persistent volume.
+- **Example**: `5` for 5GB storage
+- **Use**: Data persistence between restarts
+
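Assembled into a request, the required arguments above might look like the following sketch. It assumes the `podFindAndDeployOnDemand` mutation from the RunPod GraphQL spec; the values are illustrative, and a real deployment may need additional fields (such as a GPU type):

```graphql
mutation {
  podFindAndDeployOnDemand(
    input: {
      name: "my-app-container"
      imageName: "nginx:latest"
      dockerArgs: "sleep infinity"
      containerDiskInGb: 10
      volumeInGb: 5
      env: [{ key: "DATABASE_URL", value: "postgres://user:password@localhost/dbname" }]
    }
  ) {
    id
    imageName
  }
}
```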
+## Optional arguments
-Ensure that these arguments are correctly specified in your configuration to avoid errors during deployment.
+Additional configuration options may be available for specific use cases. See the [RunPod GraphQL Spec](https://graphql-spec.runpod.io/) for details.
-Optional arguments may also be available, providing additional customization and flexibility for your setup.
+> **Note:** Ensure all required arguments are correctly specified to avoid deployment errors.
diff --git a/docs/sdks/javascript/overview.md b/docs/sdks/javascript/overview.md
index 5cb33fc1..1fc9fb51 100644
--- a/docs/sdks/javascript/overview.md
+++ b/docs/sdks/javascript/overview.md
@@ -4,31 +4,36 @@ sidebar_position: 1
description: "Get started with RunPod JavaScript SDK, a tool for building web apps, server-side implementations, and automating tasks. Learn how to install, integrate, and secure your API key for seamless development."
---
-Get started with setting up your RunPod projects using JavaScript. Whether you're building web applications, server-side implementations, or automating tasks, the RunPod JavaScript SDK provides the tools you need.
-This guide outlines the steps to get your development environment ready and integrate RunPod into your JavaScript projects.
+This guide helps you set up and use the RunPod JavaScript SDK in your projects. You'll learn how to install the SDK, configure your environment, and integrate RunPod into your JavaScript applications.
-## Install the RunPod SDK
+## Install the SDK
-Before integrating RunPod into your project, you'll need to install the SDK.
-Using Node.js and npm (Node Package Manager) simplifies this process.
-Ensure you have Node.js and npm installed on your system before proceeding.
+To use the RunPod SDK in your project:
-To install the RunPod SDK, run the following npm command in your project directory.
+1. Ensure you have Node.js and npm installed on your system
+2. Run one of these commands in your project directory:
-```command
-npm install --save runpod-sdk
-# or
-yarn add runpod-sdk
-```
+ ```bash
+ npm install --save runpod-sdk
+ # or
+ yarn add runpod-sdk
+ ```
+
+This installs the `runpod-sdk` package and adds it to your project's `package.json` dependencies.
+
+For more details, visit:
+- [npm package page](https://www.npmjs.com/package/runpod-sdk)
+- [GitHub repository](https://github.com/runpod/js-sdk)
-This command installs the `runpod-sdk` package and adds it to your project's `package.json` dependencies.
-For more details about the package, visit the [npm package page](https://www.npmjs.com/package/runpod-sdk) or the [GitHub repository](https://github.com/runpod/js-sdk).
+## Configure your environment
-## Add your API key
+To use the RunPod SDK, you need to:
-To use the RunPod SDK in your project, you first need to import it and configure it with your API key and endpoint ID. Ensure these values are securely stored, preferably as environment variables.
+1. Import the SDK
+2. Configure it with your API key and endpoint ID
+3. Store sensitive information securely
-Below is a basic example of how to initialize and use the RunPod SDK in your JavaScript project.
+Here's how to initialize the SDK:
```javascript
const { RUNPOD_API_KEY, ENDPOINT_ID } = process.env;
@@ -38,22 +43,22 @@ const runpod = runpodSdk(RUNPOD_API_KEY);
const endpoint = runpod.endpoint(ENDPOINT_ID);
```
-This snippet demonstrates how to import the SDK, initialize it with your API key, and reference a specific endpoint using its ID.
-Remember, the RunPod SDK uses the ES Module (ESM) system and supports asynchronous operations, making it compatible with modern JavaScript development practices.
+The SDK uses ES Modules (ESM) and supports asynchronous operations for modern JavaScript development.
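Because the snippet above reads `RUNPOD_API_KEY` and `ENDPOINT_ID` from `process.env`, a small guard lets you fail fast when configuration is missing. This is a sketch; the helper name is illustrative and not part of the RunPod SDK:

```javascript
// Fail fast when a required environment variable is missing.
// The helper name is illustrative, not part of the RunPod SDK.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example: validate both values before initializing the SDK
// const apiKey = requireEnv("RUNPOD_API_KEY");
// const endpointId = requireEnv("ENDPOINT_ID");
```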
-### Secure your API key
+## Secure your API key
-When working with the RunPod SDK, it's essential to secure your API key.
-Storing the API key in environment variables is recommended, as shown in the initialization example. This method keeps your key out of your source code and reduces the risk of accidental exposure.
+Always store your API key securely:
-:::note
+- Use environment variables (recommended)
+- Avoid storing keys in source code
+- Use secure secrets management solutions
-Use environment variables or secure secrets management solutions to handle sensitive information like API keys.
+> **Note:** Never commit API keys to version control. Use environment variables or secure secrets management solutions to handle sensitive information.
-:::
+## Next steps
-For more information, see the following:
+For more information, see:
-- [RunPod SDK npm Package](https://www.npmjs.com/package/runpod-sdk)
-- [RunPod GitHub Repository](https://github.com/runpod/js-sdk)
-- [Endpoints](/sdks/javascript/endpoints)
+- [Endpoints documentation](endpoints.md)
+- [npm package documentation](https://www.npmjs.com/package/runpod-sdk)
+- [GitHub repository](https://github.com/runpod/js-sdk)
diff --git a/docs/sdks/overview.md b/docs/sdks/overview.md
index 40e5e0bd..b9d33789 100644
--- a/docs/sdks/overview.md
+++ b/docs/sdks/overview.md
@@ -1,42 +1,74 @@
---
title: Overview
-description: "Unlock serverless functionality with RunPod SDKs, enabling developers to create custom logic, simplify deployments, and programmatically manage infrastructure, including Pods, Templates, and Endpoints."
+description: "Learn how to use RunPod SDKs to build, deploy, and manage AI applications. Find solutions for common use cases and get started quickly with your preferred programming language."
sidebar_position: 1
---
-RunPod SDKs provide developers with tools to use the RunPod API for creating serverless functions and managing infrastructure.
-They enable custom logic integration, simplify deployments, and allow for programmatic infrastructure management.
+This guide helps you use RunPod SDKs to build and manage AI applications. Choose your preferred programming language and follow the guides that match your goals.
-## Interacting with Serverless Endpoints
+## Quick start
-Once deployed, serverless functions is exposed as an Endpoints, you can allow external applications to interact with them through HTTP requests.
+Get started quickly with your preferred programming language:
-#### Interact with Serverless Endpoints:
+- [Python SDK](python/overview.md) - Best for AI/ML applications
+- [JavaScript SDK](javascript/overview.md) - Ideal for web applications
+- [Go SDK](go/overview.md) - Great for high-performance services
+- [GraphQL API](graphql/configurations.md) - Direct API access
-Your Serverless Endpoints works similarly to an HTTP request.
-You will need to provide an Endpoint Id and a reference to your API key to complete requests.
+## Common use cases
-## Infrastructure management
+### Build AI applications
+- [Create serverless endpoints](python/endpoints.md)
+- [Deploy ML models](python/apis.md)
+- [Monitor application performance](python/structured-logging.md)
-The RunPod SDK facilitates the programmatic creation, configuration, and management of various infrastructure components, including Pods, Templates, and Endpoints.
+### Manage infrastructure
+- [Set up GPU instances](python/apis.md#list-available-gpus)
+- [Configure templates](python/apis.md#create-templates)
+- [Scale resources](python/apis.md#create-endpoints)
-### Managing Pods
+### Monitor and debug
+- [Track application logs](python/structured-logging.md)
+- [Monitor performance](python/apis.md)
+- [Debug issues](python/structured-logging.md#log-levels)
-Pods are the fundamental building blocks in RunPod, representing isolated environments for running applications.
+## Choose your SDK
-#### Manage Pods:
+Each SDK is optimized for different use cases:
-1. **Create a Pod**: Use the SDK to instantiate a new Pod with the desired configuration.
-2. **Configure the Pod**: Adjust settings such as GPU, memory allocation, and network access according to your needs.
-3. **Deploy Applications**: Deploy your applications or services within the Pod.
-4. **Monitor and scale**: Utilize the SDK to monitor Pod performance and scale resources as required.
+### Python SDK
+Best for:
+- AI/ML model deployment
+- Data processing pipelines
+- Scientific computing
+- Quick prototyping
-### Manage Templates and Endpoints
+### JavaScript SDK
+Best for:
+- Web applications
+- Frontend integrations
+- Browser-based tools
+- Node.js services
-Templates define the base environment for Pods, while Endpoints enable external access to services running within Pods.
+### Go SDK
+Best for:
+- High-performance services
+- Microservices
+- CLI tools
+- System utilities
-#### Use Templates and Endpoints:
+### GraphQL API
+Best for:
+- Custom integrations
+- Direct API access
+- Complex queries
+- Real-time updates
-1. **Create a Template**: Define a Template that specifies the base configuration for Pods.
-2. **Instantiate Pods from Templates**: Use the Template to create Pods with a consistent environment.
-3. **Expose Services via Endpoints**: Configure Endpoints to allow external access to applications running in Pods.
+## Next steps
+
+1. Choose your preferred programming language
+2. Follow the quick start guide
+3. Explore use case examples
+4. Build your application
+
+> **Note:** All SDKs provide similar core functionality. Choose based on your team's expertise and project requirements.
diff --git a/docs/sdks/python/_loggers.md b/docs/sdks/python/_loggers.md
deleted file mode 100644
index 36fd1fdc..00000000
--- a/docs/sdks/python/_loggers.md
+++ /dev/null
@@ -1,56 +0,0 @@
----
-title: Loggers
-description: "Enable efficient application monitoring and debugging with RunPod's structured logging interface, simplifying issue identification and resolution, and ensuring smooth operation."
----
-
-Logging is essential for insight into your application's performance and health.
-It facilitates quick identification and resolution of issues, ensuring smooth operation.
-
-Because of this, RunPod provides a structured logging interface, simplifying application monitoring and debugging, for your Handler code.
-
-To setup logs, instantiate the `RunPodLogger()` module.
-
-```python
-import runpod
-
-log = runpod.RunPodLogger()
-```
-
-Then set the log level.
-In the following example, there are two logs levels being set.
-
-```python
-import runpod
-import os
-
-log = runpod.RunPodLogger()
-
-
-def handler(job):
- try:
- job_input = job["input"]
- log.info("Processing job input")
-
- name = job_input.get("name", "World")
- log.info("Processing completed successfully")
-
- return f"Hello, {name}!"
- except Exception as e:
- # Log the exception with an error level log
- log.error(f"An error occurred: {str(e)}")
- return "An error occurred during processing."
-
-
-runpod.serverless.start({"handler": handler})
-```
-
-## Log levels
-
-RunPod provides a logging interface with types you're already familiar with.
-
-The following provides a list of log levels you can set inside your application.
-
-- `debug`: For in-depth troubleshooting. Use during development to track execution flow.
-- `info`: (default) Indicates normal operation. Confirms the application is running as expected.
-- `warn`: Alerts to potential issues. Signals unexpected but non-critical events.
-- `error`: Highlights failures. Marks inability to perform a function, requiring immediate attention.
diff --git a/docs/sdks/python/apis.md b/docs/sdks/python/apis.md
index b93edef0..677a860f 100644
--- a/docs/sdks/python/apis.md
+++ b/docs/sdks/python/apis.md
@@ -8,13 +8,11 @@ description: "Learn how to manage computational resources with the RunPod API, i
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
-This document outlines the core functionalities provided by the RunPod API, including how to interact with Endpoints, manage Templates, and list available GPUs.
-These operations let you dynamically manage computational resources within the RunPod environment.
+This guide explains how to use the RunPod API to manage computational resources. You'll learn how to work with endpoints, templates, and GPUs programmatically.
-## Get Endpoints
+## Get endpoints
-To retrieve a comprehensive list of all available endpoint configurations within RunPod, you can use the `get_endpoints()` function.
-This function returns a list of endpoint configurations, allowing you to understand what's available for use in your projects.
+Retrieve a list of all available endpoint configurations:
```python
import runpod
@@ -22,17 +20,14 @@ import os
runpod.api_key = os.getenv("RUNPOD_API_KEY")
-# Fetching all available endpoints
+# Get all available endpoints
endpoints = runpod.get_endpoints()
-
-# Displaying the list of endpoints
print(endpoints)
```
-## Create Template
+## Create templates
-Templates in RunPod serve as predefined configurations for setting up environments efficiently.
-The `create_template()` function facilitates the creation of new templates by specifying a name and a Docker image.
+Templates are predefined configurations for your environments. Create a new template:
@@ -44,14 +39,14 @@ import os
runpod.api_key = os.getenv("RUNPOD_API_KEY")
try:
- # Creating a new template with a specified name and Docker image
- new_template = runpod.create_template(name="test", image_name="runpod/base:0.1.0")
-
- # Output the created template details
+ # Create a new template
+ new_template = runpod.create_template(
+ name="test",
+ image_name="runpod/base:0.1.0"
+ )
print(new_template)
except runpod.error.QueryError as err:
- # Handling potential errors during template creation
print(err)
print(err.query)
```
@@ -77,11 +72,9 @@ except runpod.error.QueryError as err:
-## Create Endpoint
+## Create endpoints
-Creating a new endpoint with the `create_endpoint()` function.
-This function requires you to specify a `name` and a `template_id`.
-Additional configurations such as GPUs, number of Workers, and more can also be specified depending your requirements.
+Create a new endpoint using a template:
@@ -93,28 +86,25 @@ import os
runpod.api_key = os.getenv("RUNPOD_API_KEY")
try:
- # Creating a template to use with the new endpoint
+ # Create a template first
new_template = runpod.create_template(
- name="test", image_name="runpod/base:0.4.4", is_serverless=True
+ name="test",
+ image_name="runpod/base:0.4.4",
+ is_serverless=True
)
-
- # Output the created template details
print(new_template)
- # Creating a new endpoint using the previously created template
+ # Create an endpoint using the template
new_endpoint = runpod.create_endpoint(
name="test",
template_id=new_template["id"],
gpu_ids="AMPERE_16",
workers_min=0,
- workers_max=1,
+ workers_max=1
)
-
- # Output the created endpoint details
print(new_endpoint)
except runpod.error.QueryError as err:
- # Handling potential errors during endpoint creation
print(err)
print(err.query)
```
@@ -153,9 +143,9 @@ except runpod.error.QueryError as err:
-## Get GPUs
+## List available GPUs
-For understanding the computational resources available, the `get_gpus()` function lists all GPUs that can be allocated to endpoints in RunPod. This enables optimal resource selection based on your computational needs.
+Get information about available GPUs:
@@ -167,10 +157,8 @@ import os
runpod.api_key = os.getenv("RUNPOD_API_KEY")
-# Fetching all available GPUs
+# Get all available GPUs
gpus = runpod.get_gpus()
-
-# Displaying the GPUs in a formatted manner
print(json.dumps(gpus, indent=2))
```
@@ -189,17 +177,15 @@ print(json.dumps(gpus, indent=2))
"displayName": "A100 SXM 80GB",
"memoryInGb": 80
}
- // Additional GPUs omitted for brevity
]
```
-## Get GPU by Id
+## Get GPU details
-Use `get_gpu()` and pass in a GPU Id to retrieve details about a specific GPU model by its ID.
-This is useful when understanding the capabilities and costs associated with various GPU models.
+Retrieve detailed information about a specific GPU:
@@ -211,9 +197,9 @@ import os
runpod.api_key = os.getenv("RUNPOD_API_KEY")
-gpus = runpod.get_gpu("NVIDIA A100 80GB PCIe")
-
-print(json.dumps(gpus, indent=2))
+# Get details for a specific GPU
+gpu = runpod.get_gpu("NVIDIA A100 80GB PCIe")
+print(json.dumps(gpu, indent=2))
```
@@ -244,7 +230,6 @@ print(json.dumps(gpus, indent=2))
```
-
-Through these functionalities, the RunPod API enables efficient and flexible management of computational resources, catering to a wide range of project requirements.
+> **Note:** The API provides flexible resource management options. Choose configurations that best match your project requirements.
diff --git a/docs/sdks/python/overview.md b/docs/sdks/python/overview.md
index 8e699cc6..9cca4877 100644
--- a/docs/sdks/python/overview.md
+++ b/docs/sdks/python/overview.md
@@ -1,148 +1,73 @@
---
title: Overview
sidebar_position: 1
-description: "Get started with setting up your RunPod projects using Python. Learn how to install the RunPod SDK, create a Python virtual environment, and configure your API key for access to the RunPod platform."
+description: "Get started with RunPod Python SDK for building AI applications, deploying ML models, and managing computational resources. Learn how to set up your environment and start building."
---
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
+This guide helps you use the RunPod Python SDK to build AI applications and manage computational resources. You'll learn how to set up your environment and start building with Python.
-Get started with setting up your RunPod projects using Python.
-Depending on the specific needs of your project, there are various ways to interact with the RunPod platform.
-This guide provides an approach to get you up and running.
+## Quick start
-## Install the RunPod SDK
+1. Set up your Python environment:
+ ```bash
+ python3 -m venv env
+ source env/bin/activate # On macOS/Linux
+ # or
+ env\Scripts\activate # On Windows
+ ```
-Create a Python virtual environment to install the RunPod SDK library.
-Virtual environments allow you to manage dependencies for different projects separately, avoiding conflicts between project requirements.
+2. Install the SDK:
+ ```bash
+ python -m pip install runpod
+ ```
-To get started, install setup a virtual environment then install the RunPod SDK library.
+3. Configure your API key:
+ ```python
+ import runpod
+ import os
-
-
+ runpod.api_key = os.getenv("RUNPOD_API_KEY")
+ ```
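The same pattern can be wrapped in a small helper that fails with a clear message when the variable is unset. This is a sketch; the function name is illustrative and not part of the SDK:

```python
import os


def load_api_key(env_var: str = "RUNPOD_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if unset."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running")
    return key
```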
-Create a Python virtual environment with [venv](https://docs.python.org/3/library/venv.html):
+## Common use cases
- ```command
- python3 -m venv env
- source env/bin/activate
- ```
+### Deploy ML models
+- [Create serverless endpoints](endpoints.md)
+- [Configure GPU resources](apis.md#list-available-gpus)
+- [Monitor model performance](structured-logging.md)
-
-
+### Build AI applications
+- [Set up development environment](apis.md#create-templates)
+- [Deploy applications](apis.md#create-endpoints)
+- [Track application logs](structured-logging.md)
-Create a Python virtual environment with [venv](https://docs.python.org/3/library/venv.html):
+### Manage resources
+- [Configure GPU instances](apis.md#list-available-gpus)
+- [Set up templates](apis.md#create-templates)
+- [Scale endpoints](apis.md#create-endpoints)
- ```command
- python -m venv env
- env\Scripts\activate
- ```
+## Key features
-
-
+### Serverless deployment
+- Deploy ML models as serverless endpoints
+- Automatic scaling based on demand
+- Pay-per-use pricing model
-Create a Python virtual environment with [venv](https://docs.python.org/3/library/venv.html):
+### Resource management
+- GPU instance configuration
+- Template-based deployment
+- Resource monitoring
- ```command
- python3 -m venv env
- source env/bin/activate
- ```
+### Monitoring and logging
+- Structured logging interface
+- Performance tracking
+- Error handling
-
-
+## Next steps
-To install the SDK, run the following command from the terminal.
+1. [Set up your environment](apis.md)
+2. [Deploy your first model](endpoints.md)
+3. [Monitor your application](structured-logging.md)
+4. [Scale your resources](apis.md#create-endpoints)
-```command
-python -m pip install runpod
-```
-
-
-
-You should have the RunPod SDK installed and ready to use.
-
-## Get RunPod SDK version
-
-To ensure you've setup your RunPod SDK in Python, choose from one of the following methods to print the RunPod Python SDK version to your terminal.
-
-
-
-
- Run the following command using pip to get the RunPod SDK version.
-
- ```command
- pip show runpod
- ```
-
- You should see something similar to the following output.
-
- ```command
- runpod==1.6.1
- ```
-
-
-
-
- Run the following command from your terminal to get the RunPod SDK version.
-
- ```command
- python3 -c "import runpod; print(runpod.__version__)"
- ```
-
-
-
-
- To ensure you've setup your installation correctly, get the RunPod SDK version.
- Create a new file called `main.py`.
- Add the following to your Python file and execute the script.
-
- ```python
- import runpod
-
- version = runpod.version.get_version()
-
- print(f"RunPod version number: {version}")
- ```
-
- You should see something similar to the following output.
-
- ```text
- RunPod version number: 1.X.0
- ```
-
-
-
-
-You can find the latest version of the RunPod Python SDK on [GitHub](https://github.com/runpod/runpod-python/releases).
-
-Now that you've installed the RunPod SDK, add your API key.
-
-## Add your API key
-
-Set `api_key` and reference its variable in your Python application.
-This authenticates your requests to the RunPod platform and allows you to access the [RunPod API](/sdks/python/apis).
-
-```python
-import runpod
-import os
-
-runpod.api_key = os.getenv("RUNPOD_API_KEY")
-```
-
-:::note
-
-It's recommended to use environment variables to set your API key.
-You shouldn't load your API key directly into your code.
-
-For these examples, the API key loads from an environment variable called `RUNPOD_API_KEY`.
-
-:::
-
-Now that you've have the RunPod Python SDK installed and configured, you can start using the RunPod platform.
-
-For more information, see:
-
-- [APIs](/sdks/python/apis)
-- [Endpoints](/sdks/python/endpoints)
+> **Note:** The Python SDK is optimized for AI/ML applications. Use it for model deployment, data processing, and scientific computing.
diff --git a/docs/sdks/python/structured-logging.md b/docs/sdks/python/structured-logging.md
new file mode 100644
index 00000000..ca9cddac
--- /dev/null
+++ b/docs/sdks/python/structured-logging.md
@@ -0,0 +1,96 @@
+---
+title: Structured logging
+description: "Monitor and debug your applications with RunPod's structured logging interface. Track performance, identify issues, and gain insights into your running serverless functions."
+---
+
+# Structured logging
+
+RunPod's structured logging interface helps you monitor and debug your applications. This guide shows you how to set up and use the RunPod logger to track performance metrics, identify issues, and ensure smooth operation of your deployments.
+
+## Quick start
+
+### Initialize the logger
+
+```python
+from runpod import RunPodLogger
+
+# Initialize the logger
+logger = RunPodLogger()
+```
+
+### Using the logger in a handler function
+
+```python
+from runpod import RunPodLogger
+
+logger = RunPodLogger()
+
+def handler(event):
+ logger.info("Processing request")
+
+ try:
+ # Your logic here
+ input_data = event["input"]
+ logger.debug(f"Received input: {input_data}")
+
+ # Process data
+ result = process_data(input_data)
+
+ logger.info("Request processed successfully")
+ return result
+ except Exception as e:
+ logger.error(f"Error processing request: {str(e)}")
+ return {"error": str(e)}
+```
+
+## Log levels
+
+RunPod logger supports different log levels for various situations:
+
+- **Debug**: Detailed information useful for debugging
+ ```python
+ logger.debug("Loading model: model_size=7B, quantization=4bit")
+ ```
+
+- **Info**: General information about the application's operation
+ ```python
+ logger.info("Request processing started")
+ ```
+
+- **Warning**: Potential issues that aren't errors but might need attention
+ ```python
+ logger.warn("Memory usage above 80%: 12.8 GB of 16 GB used")
+ ```
+
+- **Error**: Errors that allow the application to continue running
+ ```python
+ logger.error("Failed to process input 123: ValueError")
+ ```
+
+## Best practices
+
+For effective logging:
+
+1. **Use appropriate log levels**: Reserve debug for development information, info for normal operations, warnings for potential issues, and errors for actual problems.
+
+2. **Include context**: Add relevant values directly to the log message to make logs more useful.
+ ```python
+ logger.info("Generated image: dimensions=1024x1024, generation_time=2.3s")
+ ```
+
+3. **Log the beginning and end of key operations**: This helps you track execution flow and identify where issues occur.
+ ```python
+ logger.info("Starting model inference")
+ # ... inference code ...
+ logger.info(f"Model inference completed in {elapsed_time}s")
+ ```
+
+4. **Include error details**: When catching exceptions, include specific error information.
+ ```python
+ try:
+ result = process_data(input_data)  # Operation that might fail
+ except Exception as e:
+ logger.error(f"Operation failed ({type(e).__name__}): {str(e)}")
+ ```
+
+> **Note**: The default log level is `info`. You can adjust this based on your debugging needs and production requirements.
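+
+If you embed the same key-value context into many messages, a small helper (hypothetical, plain Python) keeps the format consistent:
+
+```python
+def with_context(message, **context):
+    # Render key=value pairs into a single log message string
+    details = ", ".join(f"{k}={v}" for k, v in sorted(context.items()))
+    return f"{message} ({details})" if details else message
+
+print(with_context("Generated image", dimensions="1024x1024", generation_time="2.3s"))
+```
+
+You can then pass the result to any log level, for example `logger.info(with_context("Generated image", dimensions="1024x1024"))`.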
diff --git a/docs/serverless/build/first-endpoint.md b/docs/serverless/build/first-endpoint.md
new file mode 100644
index 00000000..342f537c
--- /dev/null
+++ b/docs/serverless/build/first-endpoint.md
@@ -0,0 +1,185 @@
+---
+title: Create your first endpoint
+description: "Build and deploy your own custom serverless endpoint on RunPod. Learn to set up your environment, create a handler function, and deploy your container to RunPod Serverless."
+sidebar_position: 1
+---
+
+# Create your first endpoint
+
+In this guide, you'll learn how to build and deploy a custom serverless endpoint that can process any type of data.
+
+## Prerequisites
+
+Before you begin, make sure you have:
+
+- A RunPod account ([Sign up here](https://www.runpod.io/console/serverless))
+- Docker installed on your machine ([Get Docker](https://docs.docker.com/get-docker/))
+- Python 3.10 or later
+- Basic Python knowledge
+
+## Step 1: Set up your project
+
+1. Create a new directory for your project:
+
+```bash
+mkdir my-runpod-endpoint
+cd my-runpod-endpoint
+```
+
+2. Create a Python virtual environment:
+
+```bash
+# Create the virtual environment
+python -m venv venv
+
+# Activate it (macOS/Linux)
+source venv/bin/activate
+
+# Or on Windows
+# venv\Scripts\activate
+```
+
+3. Install the RunPod Python SDK:
+
+```bash
+pip install runpod
+```
+
+## Step 2: Create a handler function
+
+Create a file named `handler.py` with this basic template:
+
+```python
+import runpod
+from datetime import datetime, timezone
+
+def handler(event):
+    """
+    This function is called when a request is sent to your endpoint.
+    """
+    # Get the input from the request
+    job_input = event["input"]
+
+    # Process the input
+    # Replace this with your actual processing logic
+    result = {
+        "message": f"Received input: {job_input}",
+        "processed": True,
+        "timestamp": datetime.now(timezone.utc).isoformat()
+    }
+
+    # Return the result
+    return result
+
+# Start the serverless function
+runpod.serverless.start({"handler": handler})
+```
+
+## Step 3: Create a test input file
+
+Create a file named `test_input.json` to test your handler locally:
+
+```json
+{
+ "input": {
+ "text": "Hello, RunPod!",
+ "parameter": 42
+ }
+}
+```
+
+## Step 4: Test your handler locally
+
+Run your handler locally to make sure it works:
+
+```bash
+python handler.py
+```
+
+You should see output that looks like:
+
+```
+--- Starting Serverless Worker | Version X.X.X ---
+INFO | Using test_input.json as job input.
+DEBUG | Retrieved local job: {'input': {'text': 'Hello, RunPod!', 'parameter': 42}, 'id': 'local_test'}
+INFO | local_test | Started.
+DEBUG | local_test | Handler output: {'message': "Received input: {'text': 'Hello, RunPod!', 'parameter': 42}", 'processed': True, 'timestamp': '2023-08-01T15:30:45Z'}
+INFO | Job local_test completed successfully.
+INFO | Job result: {'output': {'message': "Received input: {'text': 'Hello, RunPod!', 'parameter': 42}", 'processed': True, 'timestamp': '2023-08-01T15:30:45Z'}}
+INFO | Local testing complete, exiting.
+```
+
+## Step 5: Create a Dockerfile
+
+Create a `Dockerfile` to package your handler:
+
+```dockerfile
+FROM python:3.10-slim
+
+WORKDIR /app
+
+# Install dependencies
+RUN pip install --no-cache-dir runpod
+
+# Copy handler code
+COPY handler.py /app/
+COPY test_input.json /app/
+
+# Start the handler
+CMD ["python", "-u", "handler.py"]
+```
+
+## Step 6: Build and push your Docker image
+
+1. Build your Docker image:
+
+```bash
+docker build --platform linux/amd64 -t username/my-runpod-endpoint:latest .
+```
+
+Replace `username` with your Docker Hub username or container registry prefix.
+
+2. Push your image to Docker Hub or your container registry:
+
+```bash
+docker push username/my-runpod-endpoint:latest
+```
+
+## Step 7: Deploy to RunPod
+
+1. Go to the [RunPod Console](https://www.runpod.io/console/serverless)
+2. Click **New Endpoint**
+3. Enter your Docker image URL
+4. Configure your endpoint:
+ - **Name**: Choose a descriptive name
+ - **GPU Type**: Select a GPU type (or CPU)
+ - **Min/Max Workers**: Set scaling parameters
+ - **Idle Timeout**: How long to keep workers running after inactivity
+
+5. Click **Deploy**
+
+## Step 8: Test your endpoint
+
+Once deployed, you can test your endpoint using the RunPod console or with a curl request:
+
+```bash
+curl -X POST "https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/run" \
+ -H "Content-Type: application/json" \
+ -H "Authorization: Bearer YOUR_API_KEY" \
+ -d '{
+ "input": {
+ "text": "Hello from API request!",
+ "parameter": 100
+ }
+ }'
+```
+
+Replace `YOUR_ENDPOINT_ID` and `YOUR_API_KEY` with your actual values.
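+
+You can also call your endpoint from Python using the RunPod SDK. This is a sketch; it assumes your API key is stored in the `RUNPOD_API_KEY` environment variable:
+
+```python
+import os
+import runpod
+
+runpod.api_key = os.getenv("RUNPOD_API_KEY")
+endpoint = runpod.Endpoint("YOUR_ENDPOINT_ID")
+
+# run_sync blocks until the job finishes or the timeout expires
+result = endpoint.run_sync(
+    {"input": {"text": "Hello, RunPod!", "parameter": 42}},
+    timeout=60
+)
+print(result)
+```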
+
+## Next steps
+
+Now that you've created your first endpoint, you might want to:
+
+- [Learn about handler functions](/docs/serverless/build/handler-functions) - More advanced handler patterns
+- [Build a custom worker](/docs/serverless/build/custom-workers) - Create workers with custom dependencies
+- [Start from a template](/docs/serverless/build/from-template) - Use starter templates for different use cases
+- [Configure autoscaling](/docs/serverless/manage/scaling) - Optimize for performance and cost
\ No newline at end of file
diff --git a/docs/serverless/examples/text-generation.md b/docs/serverless/examples/text-generation.md
new file mode 100644
index 00000000..978bf7a4
--- /dev/null
+++ b/docs/serverless/examples/text-generation.md
@@ -0,0 +1,303 @@
+---
+title: Text generation
+description: "Build a text generation API with large language models on RunPod Serverless. This complete guide covers setup, deployment, optimization, and integration."
+---
+
+# Text generation with LLMs
+
+This guide shows you how to build and deploy a text generation API using large language models (LLMs) on RunPod Serverless.
+
+## Overview
+
+You'll learn how to:
+1. Set up a text generation endpoint
+2. Configure for optimal performance and cost
+3. Send requests and process responses
+4. Integrate with your applications
+
+## Prerequisites
+
+- A RunPod account with serverless access
+- Basic understanding of Python and Docker
+- Familiarity with LLMs (optional)
+
+## Option 1: Use quick deploy (easiest)
+
+RunPod offers pre-configured endpoints for popular LLMs:
+
+1. Go to the [RunPod Console](https://www.runpod.io/console/serverless)
+2. Click **Quick Deploy**
+3. Select a text generation model (Llama, Mistral, etc.)
+4. Configure GPU and worker settings
+5. Deploy
+
+## Option 2: Build a custom endpoint
+
+For more flexibility, you can create a custom endpoint:
+
+### Step 1: Create a handler with vLLM
+
+Create a file named `handler.py`:
+
+```python
+import runpod
+from vllm import LLM, SamplingParams
+
+# Initialize the model (runs once when the worker starts)
+def init_model():
+    global model
+    model = LLM(
+        model="meta-llama/Llama-2-7b-chat-hf",  # Replace with your preferred chat model
+        tensor_parallel_size=1,  # Adjust based on GPU type
+        trust_remote_code=True
+    )
+    return model
+
+# Initialize model globally
+model = init_model()
+
+# Define sampling parameters
+default_params = SamplingParams(
+    temperature=0.7,
+    top_p=0.95,
+    max_tokens=512
+)
+
+def handler(event):
+    """
+    Handle inference requests
+    """
+    try:
+        # Get input from request
+        job_input = event["input"]
+        prompt = job_input.get("prompt", "")
+        system_prompt = job_input.get("system_prompt", "You are a helpful AI assistant.")
+
+        # Get custom generation parameters or use defaults
+        params = job_input.get("params", {})
+        sampling_params = SamplingParams(
+            temperature=params.get("temperature", default_params.temperature),
+            top_p=params.get("top_p", default_params.top_p),
+            max_tokens=params.get("max_tokens", default_params.max_tokens)
+        )
+
+        # Format the prompt using the Llama 2 chat template
+        formatted_prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"
+
+        # Generate text
+        outputs = model.generate([formatted_prompt], sampling_params)
+        generated_text = outputs[0].outputs[0].text
+
+        return {
+            "generated_text": generated_text,
+            "model": "meta-llama/Llama-2-7b-chat-hf",
+            # Whitespace-based counts are a rough approximation of true token usage
+            "usage": {
+                "prompt_tokens": len(prompt.split()),
+                "completion_tokens": len(generated_text.split()),
+                "total_tokens": len(prompt.split()) + len(generated_text.split())
+            }
+        }
+
+    except Exception as e:
+        return {"error": str(e)}
+
+# Start the serverless function
+runpod.serverless.start({"handler": handler})
+```
+
+### Step 2: Create a Dockerfile
+
+Create a `Dockerfile`:
+
+```dockerfile
+FROM runpod/pytorch:2.2.0-py3.10-cuda12.1.0-devel
+
+WORKDIR /app
+
+# Install dependencies
+RUN pip install --no-cache-dir runpod vllm transformers accelerate
+
+# Copy handler code
+COPY handler.py .
+
+# Set environment variables
+# Replace with your Hugging Face token; avoid baking secrets into images in
+# production and prefer setting them as endpoint environment variables instead
+ENV HUGGING_FACE_HUB_TOKEN="your_hf_token"
+ENV RUNPOD_VLLM_MODEL="meta-llama/Llama-2-7b-chat-hf"
+
+# Start the handler
+CMD ["python", "-u", "handler.py"]
+```
+
+### Step 3: Build and push the image
+
+```bash
+docker build --platform linux/amd64 -t your-username/llm-endpoint:latest .
+docker push your-username/llm-endpoint:latest
+```
+
+### Step 4: Deploy the endpoint
+
+1. Go to the RunPod Serverless console
+2. Create a new endpoint with your image
+3. Select an appropriate GPU (A10G, A100, etc.)
+4. Configure workers based on expected traffic
+5. Deploy
+
+## Sending requests
+
+Send requests to your endpoint:
+
+```python
+import requests
+import json
+
+# Replace with your endpoint ID and API key
+ENDPOINT_ID = "your-endpoint-id"
+API_KEY = "your-api-key"
+
+def generate_text(prompt, system_prompt="You are a helpful AI assistant.", **params):
+ url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"  # /runsync waits for and returns the result
+
+ payload = {
+ "input": {
+ "prompt": prompt,
+ "system_prompt": system_prompt,
+ "params": params
+ }
+ }
+
+ headers = {
+ "Content-Type": "application/json",
+ "Authorization": f"Bearer {API_KEY}"
+ }
+
+ response = requests.post(url, headers=headers, data=json.dumps(payload))
+ return response.json()
+
+# Example usage
+result = generate_text(
+ "Explain quantum computing in simple terms.",
+ temperature=0.5,
+ max_tokens=300
+)
+
+print(result)
+```
+
+## Performance optimization
+
+### Model size vs. performance
+
+| Model Size | GPU Recommendation | Throughput | Latency |
+|------------|-------------------|------------|---------|
+| 7B-8B | L4, RTX 4090 | Medium-High | Low |
+| 13B-14B | A10G, A6000 | Medium | Medium |
+| 30B-70B | A100 40GB/80GB | Low-Medium | Medium-High |
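+
+As a rough rule of thumb, fp16/bf16 weights need about 2 bytes per parameter, which gives a quick sanity check for the GPU choices above (the KV cache and activations need extra headroom on top):
+
+```python
+def estimate_weights_vram_gb(num_params_billions, bytes_per_param=2):
+    # fp16/bf16 weights only; runtime memory (KV cache, activations) is extra
+    return num_params_billions * bytes_per_param
+
+# A 7B model needs roughly 14 GB for weights alone in fp16
+print(estimate_weights_vram_gb(7))
+```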
+
+### Quantization
+
+Add quantization to reduce memory usage and increase throughput:
+
+```python
+# Modify the init_model function
+def init_model():
+    global model
+    model = LLM(
+        model="TheBloke/Llama-2-7B-Chat-AWQ",  # AWQ requires pre-quantized weights
+        tensor_parallel_size=1,
+        trust_remote_code=True,
+        quantization="awq"  # Use AWQ quantization
+    )
+    return model
+```
+
+### Caching
+
+Enable prefix caching to reuse computation across requests that share a common prompt prefix:
+
+```python
+# Modify the init_model function
+def init_model():
+    global model
+    model = LLM(
+        model="meta-llama/Llama-2-7b-chat-hf",
+        tensor_parallel_size=1,
+        trust_remote_code=True,
+        enable_prefix_caching=True  # Reuse the KV cache for shared prompt prefixes
+    )
+    return model
+```
+
+## Monitoring and scaling
+
+### Configure optimal scaling
+
+For text generation endpoints:
+
+- **For experimentation**: Min 0, Max 1-2, Idle 60s
+- **For production**: Min 1, Max 5+, Idle 300s
+
+### Monitor performance
+
+Check your endpoint metrics to:
+- Track usage patterns
+- Identify bottlenecks
+- Optimize cost
+
+## Integration examples
+
+### Web application
+
+```javascript
+async function generateText() {
+ const prompt = document.getElementById('prompt').value;
+
+ const response = await fetch('https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/runsync', {
+ method: 'POST',
+ headers: {
+ 'Content-Type': 'application/json',
+ 'Authorization': 'Bearer YOUR_API_KEY'
+ },
+ body: JSON.stringify({
+ input: {
+ prompt: prompt
+ }
+ })
+ });
+
+ const data = await response.json();
+ document.getElementById('result').innerText = data.output.generated_text;
+}
+```
+
+### Async processing for long tasks
+
+For long-running generation, use the async API:
+
+```python
+import time
+
+# Submit the job (reuses the headers and payload from the example above)
+response = requests.post(
+    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/run",
+    headers=headers,
+    json=payload
+)
+job_id = response.json()["id"]
+
+# Poll until the job finishes
+status_url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{job_id}"
+while True:
+    status = requests.get(status_url, headers=headers).json()
+    if status["status"] == "COMPLETED":
+        result = status["output"]
+        break
+    if status["status"] == "FAILED":
+        raise RuntimeError(f"Job failed: {status}")
+    time.sleep(1)
+
+## Next steps
+
+- [Explore image generation](/docs/serverless/examples/image-generation)
+- [Learn about chaining endpoints](/docs/serverless/examples/chaining-endpoints)
+- [Optimize costs](/docs/serverless/manage/optimize)
\ No newline at end of file
diff --git a/docs/serverless/get-started.md b/docs/serverless/get-started.md
index bbea805a..cba51c72 100644
--- a/docs/serverless/get-started.md
+++ b/docs/serverless/get-started.md
@@ -1,132 +1,201 @@
---
-title: "Get started with Endpoints"
+title: "Step-by-step guide"
sidebar_position: 2
-description: Master the art of building Docker images, deploying Serverless endpoints, and sending requests with this comprehensive guide, covering prerequisites, RunPod setup, and deployment steps.
+description: "Follow this detailed step-by-step guide to build and deploy a custom serverless endpoint on RunPod. Set up your development environment, create a handler, and deploy your application."
---
-## Build a Serverless Application on RunPod
+# Building a custom endpoint: Step-by-step
-Follow these steps to set up a development environment, create a handler file, test it locally, and build a Docker image for deployment:
+This comprehensive guide walks you through creating and deploying a custom serverless endpoint on RunPod from scratch. You'll build a simple application that you can later adapt to your specific needs.
-1. Create a Python virtual environment and install RunPod SDK
+## Prerequisites
-```bash
-# 1. Create a Python virtual environment
-python3 -m venv venv
+Before you begin, make sure you have:
+- A RunPod account ([Sign up here](https://www.runpod.io/console/serverless))
+- Docker installed on your machine ([Get Docker](https://docs.docker.com/get-docker/))
+- Python 3.10 or later
+- Basic understanding of Python and Docker
-# 2. Activate the virtual environment
-# On macOS/Linux:
+## Step 1: Set up your development environment
-source venv/bin/activate
+1. Create a new directory for your project:
+ ```bash
+ mkdir my-serverless-app
+ cd my-serverless-app
+ ```
-# On Windows:
-venv\Scripts\activate
+2. Create and activate a Python virtual environment:
+ ```bash
+ # Create virtual environment
+ python3 -m venv venv
-# 3. Install the RunPod SDK
-pip install runpod
-```
+ # Activate it (macOS/Linux)
+ source venv/bin/activate
+ # OR (Windows)
+ venv\Scripts\activate
+ ```
+
+3. Install the RunPod SDK:
+ ```bash
+ pip install runpod
+ ```
+
+## Step 2: Create your handler function
-2. Create the handler file (rp_handler.py):
+Create a file named `handler.py` with this basic template:
```python
import runpod
-import time
+from datetime import datetime, timezone
def handler(event):
- input = event['input']
- instruction = input.get('instruction')
- seconds = input.get('seconds', 0)
-
- # Placeholder for a task; replace with image or text generation logic as needed
- time.sleep(seconds)
- result = instruction.replace(instruction.split()[0], 'created', 1)
-
- return result
-
-if __name__ == '__main__':
- runpod.serverless.start({'handler': handler})
+    """
+    Process requests sent to your serverless endpoint.
+    """
+    try:
+        # Get input from the request
+        job_input = event["input"]
+
+        # Process the input (customize this for your needs)
+        # This is where your application logic goes
+        result = {
+            "message": f"Processed input: {job_input}",
+            "status": "success",
+            "timestamp": datetime.now(timezone.utc).isoformat()
+        }
+
+        # Return the result
+        return result
+
+    except Exception as e:
+        # Return error information if something goes wrong
+        return {"error": str(e)}
+
+# Start the serverless function
+runpod.serverless.start({"handler": handler})
```
-3. Create a test_input.json file in the same folder:
+## Step 3: Create a test input file
-```python
+Create a file named `test_input.json` to test your handler locally:
+
+```json
{
"input": {
- "instruction": "create a image",
- "seconds": 15
+ "message": "Hello, RunPod!",
+ "parameter": 42
}
}
```
-4. Test the handler code locally:
+## Step 4: Test locally
-```python
-python3 rp_handler.py
+Run your handler locally to ensure it works correctly:
-# You should see an output like this:
---- Starting Serverless Worker | Version 1.7.0 ---
-INFO | Using test_input.json as job input.
-DEBUG | Retrieved local job: {'input': {'instruction': 'create a image', 'seconds': 15}, 'id': 'local_test'}
-INFO | local_test | Started.
-DEBUG | local_test | Handler output: created a image
-DEBUG | local_test | run_job return: {'output': 'created a image'}
-INFO | Job local_test completed successfully.
-INFO | Job result: {'output': 'created a image'}
-INFO | Local testing complete, exiting.
+```bash
+python handler.py
```
-5. Create a Dockerfile:
-
-```docker
-FROM python:3.10-slim
-
-WORKDIR /
-RUN pip install --no-cache-dir runpod
-COPY rp_handler.py /
+You should see output similar to:
-# Start the container
-CMD ["python3", "-u", "rp_handler.py"]
```
-
-6. Build and push your Docker image
-
-```command
-docker build --platform linux/amd64 --tag /: .
+--- Starting Serverless Worker | Version X.X.X ---
+INFO | Using test_input.json as job input.
+DEBUG | Retrieved local job: {'input': {'message': 'Hello, RunPod!', 'parameter': 42}, 'id': 'local_test'}
+INFO | local_test | Started.
+DEBUG | local_test | Handler output: {'message': "Processed input: {'message': 'Hello, RunPod!', 'parameter': 42}", 'status': 'success', 'timestamp': '2023-08-01T15:30:45Z'}
+INFO | Job local_test completed successfully.
+INFO | Local testing complete, exiting.
```
-7. Push to your container registry:
+## Step 5: Containerize your application
+
+1. Create a `Dockerfile`:
+ ```dockerfile
+ FROM python:3.10-slim
+
+ WORKDIR /app
+
+ # Install dependencies
+ RUN pip install --no-cache-dir runpod
+
+ # Copy your handler code
+ COPY handler.py /app/
+ COPY test_input.json /app/
+
+ # Start the handler
+ CMD ["python", "-u", "handler.py"]
+ ```
+
+2. Build your Docker image:
+ ```bash
+ docker build --platform linux/amd64 -t your-username/serverless-app:latest .
+ ```
+
+ Replace `your-username` with your Docker Hub username or registry prefix.
+
+3. Test your container locally:
+ ```bash
+ docker run your-username/serverless-app:latest
+ ```
+
+4. Push to Docker Hub or your registry:
+ ```bash
+ docker push your-username/serverless-app:latest
+ ```
+
+## Step 6: Deploy to RunPod
+
+1. Go to the [RunPod Serverless Console](https://www.runpod.io/console/serverless)
+2. Click **New Endpoint**
+3. Enter your Docker image URL (e.g., `your-username/serverless-app:latest`)
+4. Configure your endpoint:
+ - **Name**: Choose a descriptive name for your endpoint
+ - **GPU Type**: Select the appropriate GPU (or CPU) based on your needs
+ - **Min Workers**: Set to 0 for scale-to-zero or 1+ to keep workers warm
+ - **Max Workers**: Set the maximum number of concurrent workers
+ - **Idle Timeout**: How long to keep workers alive after finishing a job
+ - **Flash Boot**: Enable for faster cold starts (if needed)
+
+5. Click **Deploy** to create your endpoint
+
+## Step 7: Test your endpoint
+
+Once deployed, you can test your endpoint using the RunPod console or with curl:
-```command
-docker push /:
+```bash
+curl -X POST "https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/run" \
+ -H "Authorization: Bearer YOUR_API_KEY" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "input": {
+ "message": "Hello from API request!",
+ "parameter": 100
+ }
+ }'
```
-:::note
+Replace `YOUR_ENDPOINT_ID` and `YOUR_API_KEY` with your actual values.
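+
+You can make the same request from Python with the RunPod SDK. This is a sketch; it assumes your API key is set in the `RUNPOD_API_KEY` environment variable:
+
+```python
+import os
+import runpod
+
+runpod.api_key = os.getenv("RUNPOD_API_KEY")
+endpoint = runpod.Endpoint("YOUR_ENDPOINT_ID")
+
+# Submit the job asynchronously, then fetch the result
+run_request = endpoint.run({"input": {"message": "Hello from the SDK!", "parameter": 100}})
+print(run_request.status())  # e.g. IN_QUEUE or IN_PROGRESS
+print(run_request.output())  # waits for the job to complete
+```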
-When building your docker image, you might need to specify the platform you are building for.
-This is important when you are building on a machine with a different architecture than the one you are deploying to.
+## Step 8: Monitor and adjust
-When building for RunPod providers use `--platform=linux/amd64`.
+1. Check the logs and metrics in the RunPod console
+2. Adjust worker count and idle timeout based on your observed traffic patterns
+3. Update your endpoint as needed by pushing new Docker images
-:::
+## Next steps
-Alternatively, you can clone our [worker-basic](https://github.com/runpod-workers/worker-basic) repository to quickly build a Docker image and push it to your container registry for a faster start.
+Now that you've deployed your custom endpoint, you can:
-Now that you've pushed your container registry, you're ready to deploy your Serverless Endpoint to RunPod.
+- Add more complex processing logic to your handler function
+- Integrate with machine learning models or other libraries
+- Set up CI/CD for automated deployments
+- Connect your endpoint to your applications
-## Deploy a Serverless Endpoint
+For advanced usage, explore:
-This step will walk you through deploying a Serverless Endpoint to RunPod. You can refer to this walkthrough to deploy your own custom Docker image.
+- [Worker development](workers/overview.md)
+- [Endpoint management](endpoints/manage-endpoints.md)
+- [Configure autoscaling](manage/scaling.md)
-
+> **Pro tip**: For local development, you can use our [example repository](https://github.com/runpod-workers/worker-basic) as a starting point.
diff --git a/docs/serverless/index.md b/docs/serverless/index.md
new file mode 100644
index 00000000..5950cdd1
--- /dev/null
+++ b/docs/serverless/index.md
@@ -0,0 +1,96 @@
+---
+title: RunPod Serverless
+description: "Deploy, scale, and manage AI applications with RunPod Serverless. Build custom endpoints or use pre-configured models with pay-per-second pricing."
+sidebar_position: 1
+slug: /serverless
+---
+
+# RunPod Serverless
+
+Deploy and scale AI applications without managing infrastructure. RunPod Serverless handles the complexity, so you can focus on building.
+
+## Choose your path
+
+
+
+
+
+🚀 I want to deploy quickly
+
+
+
+Deploy pre-built AI models in minutes without writing code.