| 1 | +# Migrate ktransformers to SYCL version |
| 2 | +[SYCLomatic](https://github.com/oneapi-src/SYCLomatic) is a project to assist developers in migrating their existing code written in different programming languages to the SYCL* C++ heterogeneous programming model. It is an open source version of the Intel® DPC++ Compatibility Tool. |
| 3 | + |
| 4 | +This file lists the detailed steps to migrate the CUDA version of [ktransformers](https://github.com/kvcache-ai/ktransformers.git) to SYCL with SYCLomatic. The following table summarizes the migration environment and the required software. |
| 5 | + |
| 6 | + | Optimized for | Description |
| 7 | + |:--- |:--- |
| 8 | + | OS | Linux* Ubuntu* 22.04 |
| 9 | + | Software | Intel® oneAPI Base Toolkit, SYCLomatic |
| 10 | + | What you will learn | How to migrate CUDA code and run the resulting SYCL code with oneAPI on an Intel device |
| 11 | + | Time to complete | TBD |
| 12 | + |
| 13 | + |
| 14 | +## Migrating ktransformers to SYCL |
| 15 | + |
| 16 | +### 1 Prepare the migration |
| 17 | +#### 1.1 Get the source code of ktransformers and install the dependencies |
| 18 | +```sh |
| 19 | + $ git clone https://github.com/kvcache-ai/ktransformers.git |
| 20 | + $ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 |
| 21 | + $ export PATH=/usr/local/cuda:$PATH |
| 22 | + $ export PATH=/usr/local/cuda-12.4/bin:$PATH |
| 23 | +``` |
| 24 | + |
| 25 | +#### 1.2 Prepare migration tool and environment |
| 26 | + |
| 27 | + * Install the SYCL runtime environment, the [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html). The toolkit also provides the Intel® DPC++ Compatibility Tool. Set up the SYCL environment as follows: |
| 28 | + |
| 29 | +``` |
| 30 | + $ source /opt/intel/oneapi/setvars.sh |
| 31 | + $ dpct --version # Intel® DPC++ Compatibility tool version |
| 32 | +``` |
| 33 | + * To try the latest version of the compatibility tool, install SYCLomatic either from a prebuilt [SYCLomatic release](https://github.com/oneapi-src/SYCLomatic/blob/SYCLomatic/README.md#Releases) or by [building from source](https://github.com/oneapi-src/SYCLomatic/blob/SYCLomatic/README.md). The following steps install the prebuilt version: |
| 34 | + ``` |
| 35 | + $ export SYCLomatic_HOME=/path/to/install/SYCLomatic |
| 36 | + $ mkdir $SYCLomatic_HOME |
| 37 | + $ cd $SYCLomatic_HOME |
| 38 | + $ wget https://github.com/oneapi-src/SYCLomatic/releases/download/20240203/linux_release.tgz # Change the timestamp 20240203 to the latest release |
| 39 | + $ tar xzvf linux_release.tgz |
| 40 | + $ source setvars.sh |
| 41 | + $ dpct --version #SYCLomatic version |
| 42 | + ``` |
| 43 | + |
| 44 | +For more information on configuring environment variables, see [Use the setvars Script with Linux*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html). |
| 45 | + |
| 46 | +### 2 Migrate the source code |
| 47 | +Here, we use [custom_gguf](https://github.com/kvcache-ai/ktransformers/tree/main/ktransformers/ktransformers_ext/cuda/custom_gguf) as an example to explain the migration process. |
| 48 | + |
| 49 | +```sh |
| 50 | +$ export custom_gguf_HOME=ktransformers/ktransformers/ktransformers_ext/cuda/custom_gguf/ |
| 51 | +$ export PATH_TO_C2S_INSTALL_FOLDER=~/workspace/c2s_install |
| 52 | +$ source $PATH_TO_C2S_INSTALL_FOLDER/setvars.sh |
| 53 | +$ cd ${custom_gguf_HOME} |
| 54 | +$ c2s dequant.cu \ |
| 55 | + --extra-arg="-I$HOME/.local/lib/python3.10/site-packages/torch/include" \ |
| 56 | + --extra-arg="-I$HOME/.local/lib/python3.10/site-packages/torch/include/torch/csrc/api/include" \ |
| 57 | + --extra-arg="-I/usr/include/python3.10" \ |
| 58 | + --rule-file=$PATH_TO_C2S_INSTALL_FOLDER/extensions/pytorch_api_rules/pytorch_api.yaml |
| 59 | +``` |
| 60 | + |
| 61 | +The migrated files are now in `${custom_gguf_HOME}/dpct_output`. |
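
Migrated sources usually still contain diagnostic comments of the form `DPCT1049:0: ...` that mark places needing manual review. As a rough sketch (the helper name `list_dpct_diagnostics` is hypothetical and not part of SYCLomatic), the remaining diagnostic IDs in the output directory could be summarized like this:

```sh
# Hypothetical helper (not part of SYCLomatic): count the DPCT diagnostic IDs
# that remain in migrated sources, assuming comments such as "DPCT1049:0: ...".
list_dpct_diagnostics() {
    # $1: directory holding the migrated files (defaults to dpct_output)
    grep -rhoE 'DPCT[0-9]{4}' "${1:-dpct_output}" | sort | uniq -c
}

# Example call (assumes the migration above has already been run):
#   list_dpct_diagnostics "${custom_gguf_HOME}/dpct_output"
```

Each output line is a count followed by a diagnostic ID, which gives a quick view of how much manual work is left before building.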
| 62 | + |
| 63 | +### 3 Prepare the running environment |
| 64 | +#### 3.1 Create a virtual environment and source oneAPI |
| 65 | +``` |
| 66 | +$ python3 -m venv ~/workspace/xputorch |
| 67 | +$ source ~/workspace/xputorch/bin/activate |
| 68 | +$ source /opt/intel/oneapi/setvars.sh |
| 69 | +$ export LD_LIBRARY_PATH=~/workspace/xputorch/lib/python3.10/site-packages/torch/lib:$LD_LIBRARY_PATH |
| 70 | +``` |
| 71 | +#### 3.2 Install the XPU build of PyTorch |
| 72 | +Install the XPU build of PyTorch in the virtual environment with: |
| 73 | + |
| 74 | +``` |
| 75 | +$ pip install torch==2.7.0.dev20250305+xpu --extra-index-url https://download.pytorch.org/whl/nightly/xpu |
| 76 | +``` |
| 77 | + |
| 78 | +### 4 Build the migrated ktransformers |
| 79 | +There are 8 tests available at the current stage: |
| 80 | +* 3 SYCL single-kernel tests (passing) in `./migrated/single_kernel_test` |
| 81 | +* 4 SYCL single-kernel tests (results mismatch, still need debugging) in `./migrated/single_kernel_test_need_debug` |
| 82 | +* 1 torch test for `dequantize_q8_0` in `./migrated/torch_test` |
| 83 | + |
| 84 | +Select one test, `${test_directory}/${test_name}`, and compile it with: |
| 85 | +``` |
| 86 | +$ cd ${test_directory} |
| 87 | +$ source /opt/intel/oneapi/setvars.sh |
| 88 | +$ icpx -fsycl -I/opt/intel/oneapi/compiler/latest/include/sycl -I$HOME/workspace/xputorch/lib/python3.10/site-packages/torch/include -I/usr/include/python3.10 -I$HOME/workspace/xputorch/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -L$HOME/workspace/xputorch/lib/python3.10/site-packages/torch/lib -ltorch_xpu -ltorch_cpu -lc10_xpu -lc10 ${test_name} -o ${out_name} |
| 89 | +``` |
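
The compile command above repeats the PyTorch install path several times. As a minimal sketch (the function name `torch_build_flags` is hypothetical, and the directory layout is assumed to match the virtual environment from step 3.1), the torch-related flags could be generated from a single path:

```sh
# Hypothetical helper: print the torch-related include, library-path and
# library flags for icpx, given the root of a PyTorch install.
torch_build_flags() {
    torch_dir=$1   # e.g. $HOME/workspace/xputorch/lib/python3.10/site-packages/torch
    printf '%s\n' "-I${torch_dir}/include -I${torch_dir}/include/torch/csrc/api/include -L${torch_dir}/lib -ltorch_xpu -ltorch_cpu -lc10_xpu -lc10"
}

# Example call (assumes the virtual environment is active, so VIRTUAL_ENV is set):
#   icpx -fsycl $(torch_build_flags "$VIRTUAL_ENV/lib/python3.10/site-packages/torch") \
#       -I/usr/include/python3.10 ${test_name} -o ${out_name}
```

The command substitution is left unquoted on purpose so that the shell splits the output into separate flags.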
| 90 | + |
| 91 | +### 5 Run the migrated SYCL version of ktransformers |
| 92 | +``` |
| 93 | +$ ./${out_name} |
| 94 | +``` |
| 95 | + |
| 96 | + |
| 97 | +## ktransformers License |
| 98 | +[LICENSE](https://github.com/kvcache-ai/ktransformers/blob/main/LICENSE) |
| 99 | + |
| 100 | +## Reference |
| 101 | +* Command Line Options of [SYCLomatic](https://oneapi-src.github.io/SYCLomatic/dev_guide/command-line-options-reference.html) or [Intel® DPC++ Compatibility Tool](https://software.intel.com/content/www/us/en/develop/documentation/intel-dpcpp-compatibility-tool-user-guide/top/command-line-options-reference.html) |
| 102 | +* [oneAPI GPU Optimization Guide](https://www.intel.com/content/www/us/en/docs/oneapi/optimization-guide-gpu/) |
| 103 | +* [SYCLomatic project](https://github.com/oneapi-src/SYCLomatic/) |
| 104 | + |
| 105 | + |
| 106 | +## Trademarks information |
| 107 | +Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. |
| 108 | +\*Other names and brands may be claimed as the property of others. SYCL is a trademark of the Khronos Group Inc. |