Documentation: CLBlast support in Kubernetes, to enable AMD and Intel iGPU #1182
Replies: 3 comments
-
This looks amazing. I am far more familiar with Docker Compose in my homelab but can't seem to find an example of a compose file with CLBlast set up. Any tips (I see you mentioned your setup could be modified for Docker)?
-
I migrated the issue to a discussion because it is really helpful; more people can see it here.
-
Spawned off of #404
This is a runbook for enabling CLBlast in Kubernetes; with a bit of work it can be applied to Docker as well. This will enable AMD GPUs and Intel iGPUs.
The main steps that need to be done are:
1. Enable GPU passthrough
2. Install the OpenCL drivers and CLBlast
3. Configure GPU offloading
4. Set the build environment variables
To jump to the end, this is a working helm release for LocalAI which contains everything except step 1: https://github.com/lenaxia/home-ops-prod/blob/5039ba39489347e2753e7a333d53664dc3f8daf7/cluster/apps/home/localai/app/helm-release.yaml
Step 1: Enable GPU passthrough (Intel iGPU)
This is done through three helm releases which combine to automatically identify what features are available on a given node and label each node accordingly. In the case of Intel iGPUs, it also enables resource requests for GPU. If you have other ways of tagging your nodes with GPU resources, then that should work too.
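As an illustration (not taken from the linked release), once node-feature discovery and the Intel GPU device plugin are in place, a pod can request the iGPU through an extended resource. The resource name below assumes the Intel GPU device plugin, which exposes the iGPU as `gpu.intel.com/i915`; adapt it to however your nodes are labelled.

```yaml
# Hedged sketch: assumes the Intel GPU device plugin is installed and
# exposes the iGPU as the extended resource gpu.intel.com/i915.
apiVersion: v1
kind: Pod
metadata:
  name: localai-gpu-test
spec:
  containers:
    - name: localai
      image: quay.io/go-skynet/local-ai:latest
      resources:
        limits:
          gpu.intel.com/i915: 1  # claim one iGPU on the scheduled node
```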
Step 2: Install OpenCL drivers and CLBlast
Reference the latest Intel OpenCL driver and installation instructions here: https://github.com/intel/compute-runtime/releases
In order to get your helm release to automatically install these drivers, you can utilize the pod lifecycle postStart option:
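A sketch of such a postStart hook is below. The package names are assumptions for a Debian/Ubuntu-based image; substitute the exact driver packages from the intel/compute-runtime release you are targeting.

```yaml
# Hedged sketch of a postStart lifecycle hook that installs the OpenCL
# runtime when the container starts. Package names are illustrative for a
# Debian/Ubuntu base image; use the packages from the releases page above.
lifecycle:
  postStart:
    exec:
      command:
        - /bin/sh
        - -c
        - >-
          apt-get update &&
          apt-get install -y ocl-icd-libopencl1 clinfo
```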
Step 3: Configure GPU offloading
As defined here: https://github.com/go-skynet/LocalAI/blob/cdf0a6e7667e1fb3412951f078aaf017a6fd6437/api/config.go#L35, each model should contain a `gpu_layers` configuration that defines how many layers should be offloaded. In the case of Vicuna, the Model Library yaml can be found here: https://raw.githubusercontent.com/go-skynet/model-gallery/main/vicuna.yaml. Under `config_file`, add a `gpu_layers` entry (e.g. `gpu_layers: 20`; the right layer count depends on the model and your GPU memory).

Step 4: Environment Variables
Set both `BUILD_TYPE` and `LLAMA_CLBLAST` in order to ensure that the pod is built with CLBlast support.

Run a query
In order to verify that the GPU is being used, check your pod logs; you should see lines showing OpenCL/CLBlast initializing and selecting your GPU.
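Putting Steps 2–4 together, the container section of the helm values might look like the sketch below. The image tag, package names, resource name, and env values are assumptions for illustration; compare against the working release linked at the top.

```yaml
# Hedged sketch combining the pieces above; all values are illustrative.
containers:
  - name: localai
    image: quay.io/go-skynet/local-ai:latest
    env:
      - name: BUILD_TYPE      # assumed value selecting the CLBlast backend
        value: clblas
      - name: LLAMA_CLBLAST
        value: "1"
    resources:
      limits:
        gpu.intel.com/i915: 1  # requires the Intel GPU device plugin
    lifecycle:
      postStart:
        exec:
          command:
            - /bin/sh
            - -c
            - apt-get update && apt-get install -y ocl-icd-libopencl1
```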