Skip to content

bytewax/bytewax-clickhouse

Repository files navigation

Actions Status PyPI Bytewax User Guide

Bytewax

bytewax-clickhouse

bytewax-clickhouse is commercially licensed with publicly available source code. Please see the full details in LICENSE.

Installation

pip install bytewax-clickhouse

Usage

Before running any workload, you will need to start ClickHouse if you are not already running it.

docker compose up -d

ClickHouse Sink requires a PyArrow table, Schema and keys for order and partition.

The Sink is eventually consistent based on the keys.

Add the import

from bytewax.clickhouse import operators as chop

Add A schema and order by string to your code.

CH_SCHEMA = """
        metric String,
        value Float64,
        ts DateTime,
        """

ORDER_BY = "metric, ts"

define a pyarrow schema

PA_SCHEMA = pa.schema(
    [
        ("metric", pa.string()),
        ("value", pa.float64()),
        ("ts", pa.timestamp("us")),  # microsecond
    ]
)

Use the ClickHouse Sink to write data to ClickHouse

chop.output(
    "output_clickhouse",
    metrics,
    "metrics",
    "admin",
    "password",
    database="bytewax",
    port=8123,
    ch_schema=CH_SCHEMA,
    order_by=ORDER_BY,
    pa_schema=PA_SCHEMA,
    timeout=timedelta(seconds=1),
    max_size=10,
)

Setting up the project

Install just

We use just as a command runner for actions / recipes related to developing Bytewax. Please follow the installation instructions. There's probably a package for your OS already.

Install pyenv and Python 3.12

I suggest using pyenv to manage python versions. the installation instructions.

You can also use your OS's package manager to get access to different Python versions.

Ensure that you have Python 3.12 installed and available as a "global shim" so that it can be run anywhere. The following will make plain python run your OS-wide interpreter, but will make 3.12 available via python3.12.

$ pyenv global system 3.12

Install uv

We use uv as a virtual environment creator, package installer, and dependency pin-er. There are a few different ways to install it, but I recommend installing it through either brew on macOS or pipx.

Development

We have a just recipe that will:

  1. Set up a venv in venvs/dev/.

  2. Install all dependencies into it in a reproducible way.

Start by adding any dependencies that are needed into pyproject.toml or into requirements/dev.in if they are needed for development.

Next, generate the pinned set of dependencies with

> just venv-compile-all

Create and activate a virtual environment

Once you have compiled your dependencies, run the following:

> just get-started

Activate your development environment and run the development task:

> . venvs/dev/bin/activate
> just develop

License

bytewax-clickhouse is commercially licensed with publicly available source code. You are welcome to prototype using this module for free, but any use on business data requires a paid license. See https://modules.bytewax.io/ for a license. Please see the full details in LICENSE.