Question: How does the "micro-pipeline parallelism mechanism" execute? #70

sikey647 · 2024-10-29T10:32:21Z

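I'd like to ask whether the "micro-pipeline parallelism mechanism" mentioned in the introduction refers to parallelism across GraphProcessors or parallelism inside a single GraphProcessor. A common scenario: when a service receives a large dataset, it usually splits the data into batches and processes them in parallel. One approach is to spread the batches across multiple GraphProcessors, but in the examples the number of GraphProcessors is fixed at graph-construction time. Can the number of GraphProcessors be adjusted at run time? The other approach is to parallelize inside a GraphProcessor, but from reading the code I could not find support for that.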

oathdruid · 2024-10-29T14:59:52Z

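The typical usage here is to connect the upstream and downstream processors with a channel; the downstream then starts a task/coroutine each time it has consumed one batch size worth of items. The degree of concurrency is tuned through this batch size. The net effect is pipeline parallelism between processors and minibatch parallelism inside a processor.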
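To make the pattern described above concrete, here is a minimal sketch using only the C++ standard library. It is not the project's API: `Channel`, `process_batch`, and `run_pipeline` are hypothetical names invented for illustration, `std::thread` stands in for the upstream processor, and `std::async` stands in for the task/coroutine that the downstream launches per batch.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a channel between two processors:
// a minimal blocking queue that the producer can close.
template <typename T>
class Channel {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        ready_.notify_one();
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until an item is available. Returns false once the channel
    // has been closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> queue_;
    bool closed_ = false;
};

// Placeholder for the downstream processor's per-batch work: sum the items.
inline long process_batch(const std::vector<int>& batch) {
    long sum = 0;
    for (int v : batch) sum += v;
    return sum;
}

// Upstream emits the integers 1..total one at a time. Downstream launches
// one asynchronous task per `batch_size` consumed items, so the batch size
// controls how many tasks run. Returns one result per batch, in batch order.
inline std::vector<long> run_pipeline(int total, std::size_t batch_size) {
    assert(batch_size > 0);
    Channel<int> channel;

    // Upstream "processor": produces while the downstream is consuming.
    std::thread upstream([&channel, total] {
        for (int i = 1; i <= total; ++i) channel.push(i);
        channel.close();
    });

    // Downstream "processor": consume, cut into minibatches, spawn tasks.
    std::vector<std::future<long>> tasks;
    std::vector<int> batch;
    int item = 0;
    while (channel.pop(item)) {
        batch.push_back(item);
        if (batch.size() == batch_size) {
            tasks.push_back(std::async(std::launch::async, process_batch,
                                       std::move(batch)));
            batch.clear();  // reset the moved-from vector to a known state
        }
    }
    if (!batch.empty()) {  // trailing partial batch
        tasks.push_back(std::async(std::launch::async, process_batch,
                                   std::move(batch)));
    }
    upstream.join();

    std::vector<long> results;
    for (auto& task : tasks) results.push_back(task.get());
    return results;
}
```

Calling `run_pipeline(10, 4)` cuts the stream into the batches {1,2,3,4}, {5,6,7,8}, and {9,10}, each handled by its own task. A smaller batch size produces more tasks and therefore more concurrency inside the downstream stage, while the upstream keeps producing in parallel with it.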

oathdruid · 2024-10-29T15:05:12Z

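Internally, because spawning a bthread coroutine is fairly simple, we never wrapped this at the process layer. Business code sometimes mixes ordinary data with channels, and some of it implements multi-stream join mechanisms. Building a generic base processor detached from business use would mean working out how to avoid hurting that flexibility; it should be possible to abstract out some general-purpose builtins.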

sikey647 · 2024-10-29T16:05:37Z

Understood, thanks.
