whitebaifyy/AlignedServe

AlignedServe

AlignedServe is a prototype serving framework built on top of DistServe. All plugins provided by the framework are located in the alignedserve folder.

It uses SwiftTransformer, a high-performance C++ Transformer inference library, as the execution backend. SwiftTransformer supports features such as model/pipeline parallelism, FlashAttention, continuous batching, and PagedAttention.

Build && Install

# clone the project

# set up and activate the alignedserve conda environment
conda env create -f environment.yml && conda activate alignedserve

# clone and build the SwiftTransformer library
git clone https://github.com/LLMServe/SwiftTransformer.git && cd SwiftTransformer && git submodule update --init --recursive
cmake -B build && cmake --build build -j$(nproc)
cd ..

# install distserve
pip install -e .

Launching

Run example

AlignedServe requires at least two GPUs to run. An inference example is provided in examples/serving_example.py.
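Because the example needs at least two GPUs, it can help to check the machine before launching it. The following is a minimal sketch, not part of AlignedServe: the helper names (`count_gpus_from_listing`, `visible_gpu_count`, `has_enough_gpus`) are hypothetical, and it assumes an NVIDIA driver whose `nvidia-smi -L` command prints one `GPU <index>: ...` line per device. Note that `nvidia-smi` lists all GPUs on the machine and does not take `CUDA_VISIBLE_DEVICES` into account.

```python
import shutil
import subprocess

MIN_GPUS = 2  # the README states that at least two GPUs are required


def count_gpus_from_listing(listing: str) -> int:
    """Count GPUs in the text printed by `nvidia-smi -L`.

    Each GPU appears on a line starting with "GPU <index>:". Indented
    sub-device lines (for example MIG instances) are not counted.
    """
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def visible_gpu_count() -> int:
    """Return the GPU count reported by nvidia-smi, or 0 if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return 0
    return count_gpus_from_listing(result.stdout)


def has_enough_gpus(count: int, minimum: int = MIN_GPUS) -> bool:
    """Check a GPU count against the minimum needed for the example."""
    return count >= minimum


if __name__ == "__main__":
    found = visible_gpu_count()
    if has_enough_gpus(found):
        print(f"Found {found} GPUs; the serving example should be able to start.")
    else:
        print(f"Found {found} GPU(s); at least {MIN_GPUS} are required.")
```

If the check passes, examples/serving_example.py can be launched from the repository root; any arguments the script expects are not covered by this sketch.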
