configurable filter chains #169
Conversation
Hey there @Kuromesi! Thanks for the contribution! This looks like a sizable PR, so it may take some time to review and digest. But I am looking at it. 👍🏻
Thanks for the contribution and many great ideas! A few high level thoughts:
I generally agree with @liu-cong here, especially points 3 & 4. I totally understand using a configmap for rapid dev, as you don't need to rebuild an image and can just update a configmap to set the algo you want to test and iterate on. So from a development and experimentation standpoint I think that is very strong. But from a supportability standpoint I think that could be problematic, for the reasons Cong mentioned. Debugging a live service that has its algorithm specified in config sounds brutal. Could we perhaps make this a flag that can be passed in on startup? Perhaps it can be a

I do want to emphasize that I think this is very powerful for development and experimentation, so I think we want this, but we need a way to keep it separate from the way we configure a production algo.
@kfswain @liu-cong Thanks for your reviews! I will break down my PR. I do agree that configuring filters in a configmap may introduce overhead and instability, so I have made some optimizations to address that. I added:

```go
type FilterOrchestratorImpl struct {
	datastore    *backend.K8sDatastore
	lastUpdated  string
	storedFilter *filterChainImpl
}
```

And whenever an error happens during filter orchestration, I just return the default filter, so it should still work fine:

```go
filter, err := o.orchestrate(f)
if err != nil {
	klog.Errorf("error orchestrating filters: %v", err)
	filter = defaultFilter
}
```

As for the question "do you find modifying the configmap easier than modifying the Go code?": for me it makes testing much easier, since I'm trying various filter strategies to find the one most suitable for my environment. Modifying Go code in an IDE is definitely easier than editing a configmap, but the rebuild and redeploy cycle is time-consuming (and my test environment is deployed at a cloud provider, which makes this process even more difficult). I don't think configuration in a configmap is the final solution (I just use it for tests); if you agree that configurable filters are necessary, maybe we should consider designing a new CRD to standardize and simplify the configuration process.

BTW, I do think we should provide a configurable way to compose filters (like Envoy does) and also provide different filter strategies as candidates, since different environments may require different filter strategies. I'm not sure about that yet, but I'm running tests to prove it. I agree that the solution may be problematic and needs more testing, so I will add a flag to enable this feature and keep testing in my environment! Also, I used llmperf to measure performance in my environment, and I can share my results with you if you need them. I am also trying to integrate this with the Istio service mesh, and so far it works pretty well. Thanks again for your comments! Feel free to ask if you have any questions!
@liu-cong @kfswain Hello guys, I have made some progress since our last discussion. I conducted several experiments with various filter chains on a cluster with 5 nodes; each node has 16 vCPUs, 60 GB of RAM, and an A10 GPU. 10 pods loaded with the LoRA Llama 2 model are running in this cluster. I conducted the following sets of experiments:
The following metrics were used to evaluate performance:
And the filter chains I tested are:
And the results can be summarized as:
In conclusion, I think configurable filter chains are needed not only in the development stage but also in production, since different environments need different filter chains. I can provide the complete test results if you need them!
Honestly, if you don't mind, the complete test results would be great to see just in general!
WRT:
That is very likely true, but we might want a gentler approach. My supposition is that the load-balancing algo will not change on the fly in production. My justification is that there are going to be many workloads all running on the same pool, so altering the way they share resources in real time could lead to quite a bit of foot-shooting. Which is why I think we should rely on flags passed at startup that can specify a set of pre-determined algorithms, and/or a flag that lets someone create a configurable filter chain on the fly (we would have to mark that as "USE AT OWN RISK", as there is no way to test the combinatorial explosion of options that would be possible).
Some of this conversation is discussed here: #162. Would love your insights there!
@Kuromesi Thanks for the detailed benchmarks and insights! I agree that we currently don't have a single algorithm that works best for all environments, and therefore some configurability is likely desired. However, I am very concerned about the configuration interface. IMO, exposing the entire filter tree as a text format is very difficult to use and error-prone, and there is no good way to prevent regressions such as changing filter names. I think the following can be explored:
Motivation
Recently, I have been conducting tests in my environment based on your efforts. I wanted to evaluate the performance of different filter logics, but it is quite time-consuming to modify the filters through source-code changes, followed by rebuilding and redeploying the application. During this process, I also noticed two TODOs in the code that align well with my needs:
and
As a result, I wrote some code to address these points and would like to know whether it meets your approval.
Design
First, I renamed the `Filter` interface to `FilterChain` and `filterFunc` to `filter`, since I think this better describes their functionality (maybe this is not needed). Then I added some interfaces.
Filters generation
`FilterGen` is responsible for generating filters from filter options by calling `Get`; filter options should be validated by calling `Validate` before `Get`. I also added a topk filter, though the performance of this filter has not been tested yet.
Orchestrate filter chains
`FilterOrchestrator` is responsible for orchestrating a filter chain from a `FilterOrchestration` configuration, which is loaded from a configmap for now (maybe use a CRD?). I also added a reconciler for configmaps and store the configmap in the datastore. It is quite simple and needs to be improved.
Configurable filter chains
By doing this, the scheduler can be initialized with a `FilterOrchestrator`, and filters can now be configured dynamically. A configmap demo, which can be orchestrated into the default filter, is shown in `pkg/ext-proc/scheduling/orchestrate_test.go`.