
Comparing changes

base repository: runpod/runpod-python
base: 1.7.2
head repository: runpod/runpod-python
compare: main
  • 18 commits
  • 32 files changed
  • 9 contributors

Commits on Oct 7, 2024

  1. fix: sls-core not reporting status code 2 (#363)

    ef0xa authored Oct 7, 2024

    Commit: b148e14
  2. fix: sls-core-error (#364)

    ef0xa authored Oct 7, 2024

    Commit: 5d1cec6

Commits on Oct 12, 2024

  1. Blocking job take call means 5-sec debounce no longer needed (#366)

    Fix: the 5-second debounce was causing unnecessary delays in serverless workers.

    Refactored rp_job.get_job to behave correctly under pause and unpause conditions, with more debug lines.
    Refactored rp_scale.JobScaler to handle shutdowns gracefully, cleaning up hanging tasks and connections, with better debug lines.
    Removed the unnecessarily long asyncio.sleep calls rp_scale.JobScaler made before the blocking get_job calls (see the sketch after this entry).
    Improved worker_state's JobProgress and JobsQueue to timestamp when jobs are added or removed.
    Moved the worker.run_worker logic into rp_scale.JobScaler, where it belongs, and simplified it to job_scaler.start().
    Fixed non-errors being logged as errors in the tracer.
    Updated the unit tests mandated by these changes.
    deanq authored Oct 12, 2024

    Commit: 5a6b911
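
A minimal sketch of the blocking-take pattern described in the commit above, using plain asyncio rather than the SDK's internals (the `process` coroutine is a stand-in for a real handler):

```python
import asyncio


async def process(job: dict) -> None:
    """Stand-in for the real job handler."""
    await asyncio.sleep(0)


async def worker_loop(jobs: "asyncio.Queue[dict]") -> None:
    # Old pattern: poll for work, then `await asyncio.sleep(5)` between polls.
    # New pattern: block on the queue; the coroutine yields until a job
    # actually arrives, so no fixed debounce is needed.
    while True:
        job = await jobs.get()
        try:
            await process(job)
        finally:
            jobs.task_done()
```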

Commits on Oct 15, 2024

  1. debounce at HTTP 429 response (#367)

    deanq authored Oct 15, 2024

    Commit: df77102

Commits on Oct 24, 2024

  1. fix: long-running jobs crash with SIGTERM (#370)

    deanq authored Oct 24, 2024

    Commit: 4ab05f3
  2. Bump pylint from 3.2.5 to 3.3.1 (#361)

    Bumps [pylint](https://github.com/pylint-dev/pylint) from 3.2.5 to 3.3.1.
    - [Release notes](https://github.com/pylint-dev/pylint/releases)
    - [Commits](pylint-dev/pylint@v3.2.5...v3.3.1)
    
    ---
    updated-dependencies:
    - dependency-name: pylint
      dependency-type: direct:development
      update-type: version-update:semver-minor
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Oct 24, 2024

    Commit: 26bd0cf
  3. added parameters min_download and min_upload to create_pod (#360)

    dxqbYD authored Oct 24, 2024

    Commit: 85a402c

Commits on Nov 20, 2024

  1. Fix: failed requests due to race conditions in the job queue vs job progress (#376)
    
    * fix: JobsProgress is now asyncio-safe, preventing race conditions when job_progress.get_job_count() is checked before taking more jobs.
    * fix: strict job count when evaluating whether new jobs can be taken: `jobs_needed = concurrency - queue - in_progress` (see the sketch below)
    * debug: better debug logs
    * improved unit tests
    deanq authored Nov 20, 2024

    Commit: a477453
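
The strict job count in the commit above is simple arithmetic; a minimal sketch (the function name is illustrative, not the SDK's):

```python
def jobs_needed(concurrency: int, queued: int, in_progress: int) -> int:
    """jobs_needed = concurrency - queue - in_progress, floored at zero."""
    return max(concurrency - queued - in_progress, 0)


# Example: concurrency of 4 with 1 job queued and 2 in progress leaves room for 1 more.
assert jobs_needed(4, 1, 2) == 1
assert jobs_needed(2, 1, 2) == 0  # never goes negative
```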

Commits on Dec 5, 2024

  1. Fix: JobScaler issues that cause request failures (#383)

    * Integrated asyncio.Queue within JobScaler (removing JobsQueue) to take full advantage of its blocking .get and .put methods
    * asyncio.Queue(maxsize) now dictates concurrency (via concurrency_modifier); see the sketch after these entries
    * JobScaler.set_scale() adjusts concurrency at runtime when needed and safe to do so
    * JobScaler.current_occupancy() uses the asyncio.Queue size and the JobsProgress set size to gate capacity
    * Simpler, cleaner job acquisition steps
    * Removed legacy tracers for HTTP clients
    deanq authored Dec 5, 2024

    Commit: 2543f34
  2. Fixes issue #373 for required input validation (#379)

    * fixed and added a test case for input validation "not working as expected" (#373); a usage sketch follows these entries
    * added an additional unit test for None
    gabewillen authored Dec 5, 2024

    Commit: 51f22bd
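
For the first commit above (#383), a minimal sketch of how `asyncio.Queue(maxsize)` can bound concurrency on its own, independent of the JobScaler code:

```python
import asyncio


async def main() -> None:
    # maxsize acts as the concurrency limit: put() blocks once the queue is full.
    queue: "asyncio.Queue[int]" = asyncio.Queue(maxsize=2)

    await queue.put(1)
    await queue.put(2)
    print("occupancy:", queue.qsize())  # 2 of 2 slots used

    # A third put() would now block until a consumer calls get(),
    # which is what gates how many jobs are held at once.
    assert queue.full()


asyncio.run(main())
```

For the input-validation fix (#373/#379), a usage sketch assuming the `runpod.serverless.utils.rp_validator.validate` helper; the schema and values here are illustrative:

```python
from runpod.serverless.utils.rp_validator import validate

schema = {
    "prompt": {"type": str, "required": True},
    "steps": {"type": int, "required": False, "default": 20},
}

# The input below is missing the required "prompt" key, so validation
# should surface errors instead of silently passing.
result = validate({"steps": 30}, schema)
if "errors" in result:
    print("invalid input:", result["errors"])
else:
    print("validated:", result["validated_input"])
```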

Commits on Dec 7, 2024

  1. Update cryptography requirement from <44.0.0 to <45.0.0 (#380)

    Updates the requirements on [cryptography](https://github.com/pyca/cryptography) to permit the latest version.
    - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
    - [Commits](pyca/cryptography@0.1...44.0.0)
    
    ---
    updated-dependencies:
    - dependency-name: cryptography
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Dean Quiñanola <[email protected]>
    dependabot[bot] and deanq authored Dec 7, 2024

    Commit: 527db3c
  2. Feature/E-2131 Utility function for resolving model-cache paths from Huggingface repositories (#377)
    
    * added a utility for resolving model cache paths from a huggingface repository
    
    * Added a TODO for the `path_template` key word argument
    
    * added unit tests for model cache resolver
    
    * fixed module documentation
    
    * resolve to None when a repository is improperly formatted
    
    * fixed comment wording
    
    ---------
    
    Co-authored-by: Dean Quiñanola <[email protected]>
    gabewillen and deanq authored Dec 7, 2024

    Commit: 0a57890

Commits on Dec 10, 2024

  1. fix: streamed errors were previously swallowed (#384)

    This created false-positive completed tasks
    deanq authored Dec 10, 2024

    Commit: d7a2131

Commits on Jan 2, 2025

  1. Fix: handle uncaught exception only for Serverless workers (#388)

    * refactor: moved handle_uncaught_exception to rp_scale
    * refactor: bind handle_uncaught_exception on JobScaler init
    * fix: python <3.11 compatibility
    deanq authored Jan 2, 2025

    Commit: 3f78233
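
A generic sketch of binding an uncaught-exception hook at worker start-up rather than at import time; this shows the technique only, not the SDK's actual handler:

```python
import sys


def handle_uncaught_exception(exc_type, exc, tb) -> None:
    # In a serverless worker this would be reported through the job/runner
    # error path; here it is simply logged to stderr.
    print(f"uncaught exception: {exc_type.__name__}: {exc}", file=sys.stderr)


def start_worker() -> None:
    # Bound when the worker (e.g. the JobScaler) starts, so merely importing
    # the library does not install the hook.
    sys.excepthook = handle_uncaught_exception
    # ... run the worker loop ...
```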

Commits on Jan 3, 2025

  1. fix: change log to debug (#389)

    Yhlong00 authored Jan 3, 2025

    Commit: 7912c20
  2. Add allowed CUDA versions parameter to endpoint creation (#375)

    * Add allowed CUDA versions parameter to endpoint creation
    * Add gpu_count parameter to endpoint creation functions
    nielsrolf authored Jan 3, 2025

    Commit: 9c5918e

Commits on Jan 16, 2025

  1. Update: GQL uses API Key in Authorization Header (#394)

    deanq authored Jan 16, 2025

    Commit: 2c62255

Commits on Mar 25, 2025

  1. Update async job streaming stop condition and test cases (#404)

    zealotjin authored Mar 25, 2025

    Commit: 969cc35
4 changes: 2 additions & 2 deletions .github/workflows/CD-publish_to_pypi.yml
@@ -13,10 +13,10 @@ jobs:

steps:
- uses: actions/checkout@v4
- name: Set up Python 3.11.0
- name: Set up Python 3.11.10
uses: actions/setup-python@v5
with:
python-version: 3.11.0
python-version: 3.11.10

- name: Install pypa/build
run: >-
4 changes: 2 additions & 2 deletions .github/workflows/CD-test_publish_to_pypi.yml
@@ -14,10 +14,10 @@ jobs:

steps:
- uses: actions/checkout@v4
- name: Set up Python 3.11.0
- name: Set up Python 3.11.10
uses: actions/setup-python@v5
with:
python-version: 3.11.0
python-version: 3.11.10

- name: Install pypa/build
run: >-
2 changes: 1 addition & 1 deletion .github/workflows/CI-pytests.yml
@@ -15,7 +15,7 @@ jobs:
run_tests:
strategy:
matrix:
python-version: [3.8, 3.9, 3.10.12, 3.11.0]
python-version: [3.8, 3.9, 3.10.15, 3.11.10]
runs-on: ubuntu-latest

steps:
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -54,7 +54,7 @@ runpod = "runpod.cli.entry:runpod_cli"
test = [
"asynctest",
"nest_asyncio",
"pylint==3.2.5",
"faker",
"pytest-asyncio",
"pytest-cov",
"pytest-timeout",
2 changes: 1 addition & 1 deletion requirements.txt
@@ -5,7 +5,7 @@ backoff >= 2.2.1
boto3 >= 1.26.165
click >= 8.1.7
colorama >= 0.2.5, < 0.4.7
cryptography < 44.0.0
cryptography < 45.0.0
fastapi[all] >= 0.94.0
paramiko >= 3.3.1
prettytable >= 3.9.0
26 changes: 11 additions & 15 deletions runpod/api/ctl_commands.py
@@ -107,6 +107,8 @@ def create_pod(
template_id: Optional[str] = None,
network_volume_id: Optional[str] = None,
allowed_cuda_versions: Optional[list] = None,
min_download = None,
min_upload = None,
) -> dict:
"""
Create a pod
@@ -125,7 +127,8 @@ def create_pod(
for example {EXAMPLE_VAR:"example_value", EXAMPLE_VAR2:"example_value 2"}, will
inject EXAMPLE_VAR and EXAMPLE_VAR2 into the pod with the mentioned values
:param template_id: the id of the template to use for the pod
:param min_download: minimum download speed in Mbps
:param min_upload: minimum upload speed in Mbps
:example:
>>> pod_id = runpod.create_pod("test", "runpod/stack", "NVIDIA GeForce RTX 3070")
@@ -167,6 +170,8 @@ def create_pod(
template_id,
network_volume_id,
allowed_cuda_versions,
min_download,
min_upload,
)
)

@@ -297,24 +302,13 @@ def create_endpoint(
workers_min: int = 0,
workers_max: int = 3,
flashboot=False,
allowed_cuda_versions: str = "12.1,12.2,12.3,12.4,12.5",
gpu_count: int = 1,
):
"""
Create an endpoint
:param name: the name of the endpoint
:param template_id: the id of the template to use for the endpoint
:param gpu_ids: the ids of the GPUs to use for the endpoint
:param network_volume_id: the id of the network volume to use for the endpoint
:param locations: the locations to use for the endpoint
:param idle_timeout: the idle timeout for the endpoint
:param scaler_type: the scaler type for the endpoint
:param scaler_value: the scaler value for the endpoint
:param workers_min: the minimum number of workers for the endpoint
:param workers_max: the maximum number of workers for the endpoint
:example:
>>> endpoint_id = runpod.create_endpoint("test", "template_id")
:param allowed_cuda_versions: Comma-separated string of allowed CUDA versions (e.g., "12.4,12.5").
"""
raw_response = run_graphql_query(
endpoint_mutations.generate_endpoint_mutation(
@@ -329,6 +323,8 @@ def create_endpoint(
workers_min,
workers_max,
flashboot,
allowed_cuda_versions,
gpu_count
)
)
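
A hedged usage sketch of the parameters added above; the keyword names follow the diff, but the concrete values (GPU type, template ID, speeds) are placeholders:

```python
import runpod

runpod.api_key = "YOUR_API_KEY"

# Require a minimum of 700 Mbps download / 500 Mbps upload when matching a pod.
pod = runpod.create_pod(
    name="example-pod",
    image_name="runpod/stack",
    gpu_type_id="NVIDIA GeForce RTX 3070",
    min_download=700,
    min_upload=500,
)

# Restrict an endpoint to specific CUDA versions and request two GPUs per worker.
endpoint = runpod.create_endpoint(
    name="example-endpoint",
    template_id="YOUR_TEMPLATE_ID",
    allowed_cuda_versions="12.4,12.5",
    gpu_count=2,
)
```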

3 changes: 2 additions & 1 deletion runpod/api/graphql.py
@@ -21,11 +21,12 @@ def run_graphql_query(query: str) -> Dict[str, Any]:
from runpod import api_key # pylint: disable=import-outside-toplevel, cyclic-import

api_url_base = os.environ.get("RUNPOD_API_BASE_URL", "https://api.runpod.io")
url = f"{api_url_base}/graphql?api_key={api_key}"
url = f"{api_url_base}/graphql"

headers = {
"Content-Type": "application/json",
"User-Agent": USER_AGENT,
"Authorization": f"Bearer {api_key}",
}

data = json.dumps({"query": query})
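
For illustration, an equivalent hand-rolled GraphQL call with the key carried in the Authorization header rather than the query string (the query itself is a placeholder):

```python
import requests

API_KEY = "YOUR_API_KEY"

response = requests.post(
    "https://api.runpod.io/graphql",
    json={"query": "query { myself { id } }"},  # placeholder query
    headers={
        "Content-Type": "application/json",
        # The API key now travels as a Bearer token, not as ?api_key=... in the URL.
        "Authorization": f"Bearer {API_KEY}",
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
```
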
11 changes: 11 additions & 0 deletions runpod/api/mutations/endpoints.py
@@ -15,6 +15,8 @@ def generate_endpoint_mutation(
workers_min: int = 0,
workers_max: int = 3,
flashboot=False,
allowed_cuda_versions: str = "12.1,12.2,12.3,12.4,12.5",
gpu_count: int = None,
):
"""Generate a string for a GraphQL mutation to create a new endpoint."""
input_fields = []
@@ -44,6 +46,12 @@ def generate_endpoint_mutation(
input_fields.append(f"workersMin: {workers_min}")
input_fields.append(f"workersMax: {workers_max}")

if allowed_cuda_versions is not None:
input_fields.append(f'allowedCudaVersions: "{allowed_cuda_versions}"')

if gpu_count is not None:
input_fields.append(f"gpuCount: {gpu_count}")

# Format the input fields into a string
input_fields_string = ", ".join(input_fields)

@@ -65,11 +73,14 @@ def generate_endpoint_mutation(
scalerValue
workersMin
workersMax
allowedCudaVersions
gpuCount
}}
}}
"""



def update_endpoint_template_mutation(endpoint_id: str, template_id: str):
"""Generate a string for a GraphQL mutation to update an existing endpoint's template."""
input_fields = []
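
Since the new fields are only emitted when set, the generated mutation is easy to inspect. A quick sketch, assuming the leading positional parameters mirror `create_endpoint` (only the two new keyword arguments are taken from the diff above):

```python
from runpod.api.mutations.endpoints import generate_endpoint_mutation

mutation = generate_endpoint_mutation(
    "example-endpoint",   # name
    "YOUR_TEMPLATE_ID",   # template_id
    "AMPERE_16",          # gpu_ids (assumed positional parameters)
    allowed_cuda_versions="12.4,12.5",
    gpu_count=2,
)

# The generated string should now contain:
#   allowedCudaVersions: "12.4,12.5"
#   gpuCount: 2
print(mutation)
```
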
9 changes: 8 additions & 1 deletion runpod/api/mutations/pods.py
@@ -28,6 +28,8 @@ def generate_pod_deployment_mutation(
template_id=None,
network_volume_id=None,
allowed_cuda_versions: Optional[List[str]] = None,
min_download=None,
min_upload=None,
):
"""
Generates a mutation to deploy a pod on demand.
@@ -89,9 +91,14 @@ def generate_pod_deployment_mutation(
)
input_fields.append(f"allowedCudaVersions: [{allowed_cuda_versions_string}]")

if min_download is not None:
input_fields.append(f'minDownload: {min_download}')

if min_upload is not None:
input_fields.append(f'minUpload: {min_upload}')

# Format input fields
input_string = ", ".join(input_fields)

return f"""
mutation {{
podFindAndDeployOnDemand(
9 changes: 7 additions & 2 deletions runpod/endpoint/asyncio/asyncio_runner.py
@@ -1,4 +1,4 @@
""" Module for running endpoints asynchronously. """
"""Module for running endpoints asynchronously."""

# pylint: disable=too-few-public-methods,R0801

@@ -89,9 +89,14 @@ async def stream(self) -> Any:
while True:
await asyncio.sleep(1)
stream_partial = await self._fetch_job(source="stream")
if stream_partial["status"] not in FINAL_STATES:
if (
stream_partial["status"] not in FINAL_STATES
or len(stream_partial.get("stream", [])) > 0
):
for chunk in stream_partial.get("stream", []):
yield chunk["output"]
elif stream_partial["status"] in FINAL_STATES:
break

async def cancel(self) -> dict:
"""Cancels current job
49 changes: 6 additions & 43 deletions runpod/http_client.py
@@ -5,13 +5,16 @@
import os

import requests
from aiohttp import ClientSession, ClientTimeout, TCPConnector
from aiohttp import ClientSession, ClientTimeout, TCPConnector, ClientResponseError

from .cli.groups.config.functions import get_credentials
from .tracer import create_aiohttp_tracer, create_request_tracer
from .user_agent import USER_AGENT


class TooManyRequests(ClientResponseError):
pass


def get_auth_header():
"""
Produce a header dict with the `Authorization` key derived from
@@ -33,13 +36,11 @@ def AsyncClientSession(*args, **kwargs): # pylint: disable=invalid-name
"""
Deprecation from aiohttp.ClientSession forbids inheritance.
This is now a factory method
TODO: use httpx
"""
return ClientSession(
connector=TCPConnector(limit=0),
headers=get_auth_header(),
timeout=ClientTimeout(600, ceil_threshold=400),
trace_configs=[create_aiohttp_tracer()],
*args,
**kwargs,
)
@@ -48,43 +49,5 @@ def AsyncClientSession(*args, **kwargs): # pylint: disable=invalid-name
class SyncClientSession(requests.Session):
"""
Inherits requests.Session to override `request()` method for tracing
TODO: use httpx
"""

def request(self, method, url, **kwargs): # pylint: disable=arguments-differ
"""
Override for tracing. Not using super().request()
to capture metrics for connection and transfer times
"""
with create_request_tracer() as tracer:
# Separate out the kwargs that are not applicable to `requests.Request`
request_kwargs = {
k: v
for k, v in kwargs.items()
# contains the names of the arguments
if k in requests.Request.__init__.__code__.co_varnames
}

# Separate out the kwargs that are applicable to `requests.Request`
send_kwargs = {k: v for k, v in kwargs.items() if k not in request_kwargs}

# Create a PreparedRequest object to hold the request details
req = requests.Request(method, url, **request_kwargs)
prepped = self.prepare_request(req)
tracer.request = prepped # Assign the request to the tracer

# Merge environment settings
settings = self.merge_environment_settings(
prepped.url,
send_kwargs.get("proxies"),
send_kwargs.get("stream"),
send_kwargs.get("verify"),
send_kwargs.get("cert"),
)
send_kwargs.update(settings)

# Send the request
response = self.send(prepped, **send_kwargs)
tracer.response = response # Assign the response to the tracer

return response
pass
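
With `TooManyRequests` now a distinct `ClientResponseError` subclass, callers can back off specifically on HTTP 429. A hedged sketch of a caller-side retry (the raising side lives in the SDK's request path):

```python
import asyncio
import random

from runpod.http_client import TooManyRequests


async def call_with_backoff(fetch, max_attempts: int = 5):
    """Run `fetch()` (any coroutine that may raise TooManyRequests),
    retrying with jittered exponential backoff on HTTP 429."""
    for attempt in range(max_attempts):
        try:
            return await fetch()
        except TooManyRequests:
            delay = min(2 ** attempt, 30) + random.random()
            await asyncio.sleep(delay)
    raise RuntimeError("still rate-limited after retries")
```
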
2 changes: 1 addition & 1 deletion runpod/serverless/__init__.py
@@ -10,7 +10,6 @@
import signal
import sys
import time
import typing
from typing import Any, Dict

from runpod.serverless import core
@@ -23,6 +22,7 @@

log = RunPodLogger()


# ---------------------------------------------------------------------------- #
# Run Time Arguments #
# ---------------------------------------------------------------------------- #