Experimental alt-core Cog runtime implementation

replicate/cog-runtime

Cog Runtime

Caution

This project is experimental and should not be handled near open flame.

Alt-core Cog runtime implementation.

The original Cog seeks to be a great developer tool and also an arguably great production runtime. Cog runtime is focused on being a fantastic production runtime only.

How is cog-runtime formed?

sequenceDiagram
    participant r8
    participant server as cog-server
    participant runner as coglet
    participant predictor as Predictor
    r8->>server: Boot
    activate server
    server->>runner: exec("python3", ...)
    activate runner
    runner<<->>predictor: setup()
    activate predictor
    runner--)server: SIGHUP (output)
    runner--)server: SIGUSR1 (ready)
    server->>runner: read("setup-result.json")
    r8->>server: GET /health-check
    r8->>server: POST /predictions
    server->>runner: write("request-{id}.json")
    runner--)server: SIGUSR2 (busy)
    runner<<->>predictor: predict()
    deactivate predictor
    runner--)server: SIGHUP (output)
    runner--)server: SIGUSR1 (ready)
    server->>runner: read("response-{id}.json")
    deactivate runner
    server->>r8: POST /webhook
    deactivate server

This sequence is simplified, but the rough idea is that the Replicate platform (r8) depends on cog-server to provide an HTTP API in front of a coglet that communicates via files and signals.
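
The file half of this exchange can be sketched in a few lines of Python. The file names come from the diagram above; the JSON fields (`input`, `status`, `output`), the helper names, and the write-then-rename step are assumptions made for illustration, not the actual cog-runtime schema.

```python
import json
from pathlib import Path
from typing import Any, Callable


def write_json_atomic(path: Path, payload: dict) -> None:
    """Write to a temp file, then rename, so a reader never sees a partial file."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(payload))
    tmp.replace(path)


def submit_request(workdir: Path, pred_id: str, inputs: dict) -> Path:
    """Server side (hypothetical): drop a request file for the runner to pick up."""
    path = workdir / f"request-{pred_id}.json"
    write_json_atomic(path, {"input": inputs})  # assumed schema, not the real one
    return path


def handle_request(workdir: Path, pred_id: str, predict: Callable[..., Any]) -> Path:
    """Runner side (hypothetical): read the request, run the model, write the response."""
    request = json.loads((workdir / f"request-{pred_id}.json").read_text())
    output = predict(**request["input"])
    path = workdir / f"response-{pred_id}.json"
    write_json_atomic(path, {"status": "succeeded", "output": output})
    return path
```

In the real runtime the runner would also signal the server between these steps; that part is left out here to keep the file handling in focus.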

cog-server

Go-based HTTP server that knows how to spawn and communicate with coglet.

coglet

Python-based model runner with zero dependencies outside of the standard library. The same in-process API provided by Cog is available, e.g.:

from cog import BasePredictor, Input

class MyPredictor(BasePredictor):
    def predict(self, n: int = Input(description="how many", ge=1, le=100)) -> str:
        return "ok" * n

In addition to simple cases like the above, the runner is async by default and supports continuous batching.

Communication with the cog-server parent process is managed via input and output files and the following signals:

  • SIGUSR1: model is ready
  • SIGUSR2: model is busy
  • SIGHUP: output is available
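
A minimal sketch of how these signals might be wired up with Python's standard `signal` module on a POSIX system. The `StatusTracker` class, the `notify_parent` helper, and the status strings are hypothetical names chosen for illustration; the actual parent process is the Go-based cog-server, so this only mirrors the idea.

```python
import os
import signal

# Signal-to-meaning mapping taken from the list above.
STATUS_BY_SIGNAL = {
    signal.SIGUSR1: "ready",
    signal.SIGUSR2: "busy",
    signal.SIGHUP: "output",
}


class StatusTracker:
    """Parent side (hypothetical): records the events a runner reports."""

    def __init__(self):
        self.events = []

    def handle(self, signum, frame):
        # Handlers receive the signal number and the interrupted stack frame.
        self.events.append(STATUS_BY_SIGNAL[signal.Signals(signum)])

    def install(self):
        # Must be called from the main thread of the process.
        for sig in STATUS_BY_SIGNAL:
            signal.signal(sig, self.handle)


def notify_parent(sig):
    """Runner side (hypothetical): tell the parent process about a state change."""
    os.kill(os.getppid(), sig)
```

Signals carry no payload, so in this design they only announce a state change; the data itself travels through the input and output files.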