Skip to content

Commit a8ba225

Browse files
Merge pull request #1504 from roboflow/feature/try-to-beat-the-limitation-of-ee-in-terms-of-singular-elements-pushed-into-batch-inputs
Beat the limitations of EE in terms of singular elements pushed into batch inputs
2 parents 8a7d1ca + 8355698 commit a8ba225

File tree

59 files changed

+7122
-506
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

59 files changed

+7122
-506
lines changed

docs/workflows/create_workflow_block.md

Lines changed: 18 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1528,7 +1528,7 @@ the method signatures.
15281528
In this example, the block visualises crops predictions and creates tiles
15291529
presenting all crops predictions in single output image.
15301530

1531-
```{ .py linenums="1" hl_lines="29-31 48-49 59-60"}
1531+
```{ .py linenums="1" hl_lines="30-32 34-36 53-55 65-66"}
15321532
from typing import List, Literal, Type, Union
15331533

15341534
import supervision as sv
@@ -1556,10 +1556,15 @@ the method signatures.
15561556
crops_predictions: Selector(
15571557
kind=[OBJECT_DETECTION_PREDICTION_KIND]
15581558
)
1559+
scalar_parameter: Union[float, Selector()]
15591560
15601561
@classmethod
15611562
def get_output_dimensionality_offset(cls) -> int:
15621563
return -1
1564+
1565+
@classmethod
1566+
def get_parameters_enforcing_auto_batch_casting(cls) -> List[str]:
1567+
return ["crops", "crops_predictions"]
15631568
15641569
@classmethod
15651570
def describe_outputs(cls) -> List[OutputDefinition]:
@@ -1578,6 +1583,7 @@ the method signatures.
15781583
self,
15791584
crops: Batch[WorkflowImageData],
15801585
crops_predictions: Batch[sv.Detections],
1586+
scalar_parameter: float,
15811587
) -> BlockResult:
15821588
annotator = sv.BoxAnnotator()
15831589
visualisations = []
@@ -1591,18 +1597,22 @@ the method signatures.
15911597
return {"visualisations": tile}
15921598
```
15931599

1594-
* in lines `29-31` manifest class declares output dimensionality
1600+
* in lines `30-32` manifest class declares output dimensionality
15951601
offset - value `-1` should be understood as decreasing dimensionality level by `1`
15961602

1597-
* in lines `48-49` you can see the impact of output dimensionality decrease
1598-
on the method signature. Both inputs are artificially wrapped in `Batch[]` container.
1599-
This is done by Execution Engine automatically on output dimensionality decrease when
1600-
all inputs have the same dimensionality to enable access to all elements occupying
1601-
the last dimensionality level. Obviously, only elements related to the same element
1603+
* in lines `34-36` manifest class declares `run(...)` method inputs that will be subject to auto-batch casting
1604+
ensuring that the signature is always stable. Auto-batch casting was introduced in Execution Engine `v0.1.6.0`
1605+
- refer to [changelog](./execution_engine_changelog.md) for more details.
1606+
1607+
* in lines `53-55` you can see the impact of output dimensionality decrease
1608+
on the method signature. First two inputs (declared in line `36`) are artificially wrapped in `Batch[]`
1609+
container, whereas `scalar_parameter` remains primitive type. This is done by Execution Engine automatically
1610+
on output dimensionality decrease when all inputs have the same dimensionality to enable access to
1611+
all elements occupying the last dimensionality level. Obviously, only elements related to the same element
16021612
from top-level batch will be grouped. For instance, if you had two input images that you
16031613
cropped - crops from those two different images will be grouped separately.
16041614

1605-
* lines `59-60` illustrate how output is constructed - single value is returned and that value
1615+
* lines `65-66` illustrate how output is constructed - single value is returned and that value
16061616
will be indexed by Execution Engine in output batch with reduced dimensionality
16071617

16081618
=== "different input dimensionalities"

docs/workflows/execution_engine_changelog.md

Lines changed: 267 additions & 65 deletions
Large diffs are not rendered by default.

docs/workflows/workflow_execution.md

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -124,6 +124,14 @@ influencing the processing for all elements in the batch and this type of data w
124124
the reference images remain unchanged as you process each input. Thus, the reference images are considered
125125
*scalar* data, while the list of input images is *batch-oriented*.
126126

127+
**Great news!**
128+
129+
Since Execution Engine `v1.6.0`, the practical aspects of dealing with *scalars* and *batches* are offloaded to
130+
the Execution Engine (refer to [changelog](./execution_engine_changelog.md) for more details). As a block
131+
developer, it is still important to understand the difference, but when building blocks you are not forced to
132+
think about the nuances that much.
133+
134+
127135
To illustrate the distinction, Workflow definitions hold inputs of the two categories:
128136

129137
- **Scalar inputs** - like `WorkflowParameter`
@@ -356,6 +364,16 @@ execution excludes steps at higher `dimensionality levels` from producing output
356364
output field selecting that values will be presented as nested list of empty lists, with depth matching
357365
`dimensionality level - 1` of referred output.
358366

367+
Since Execution Engine `v1.6.0`, blocks within a workflow may collapse batches into scalars, as well as create new
368+
batches from scalar inputs. The first scenario is pretty easy to understand - each dictionary in the output list will
369+
simply be populated with the same scalar value. The case of *emergent* batch is slightly more complicated.
370+
In such case we can find batch at dimensionality level 1, which has shape or elements order not compliant
371+
with input batches. To prevent semantic ambiguity, we treat such batch as if it's dimensionality is one level higher
372+
(as if **there is additional batch-oriented input of size one attached to the input of the block creating batch
373+
dynamically**). Such virtually nested outputs are broadcast, such that each dictionary in the output list will be given
374+
new key with the same nested output. This nesting property is preserved even if there is no input-derived outputs
375+
for given workflow - in such case, output is a list of size 1 which contains dictionary with nested output.
376+
359377
Some outputs would require serialisation when Workflows Execution Engine runs behind HTTP API. We use the following
360378
serialisation strategies:
361379

docs/workflows/workflows_execution_engine.md

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -86,6 +86,19 @@ batch-oriented input, it will be treated as a SIMD step.
8686
Non-SIMD steps, by contrast, are expected to deliver a single result for the input data. In the case of non-SIMD
8787
flow-control steps, they affect all downstream steps as a whole, rather than individually for each element in a batch.
8888

89+
Historically, Execution Engine could not handle well all scenarios when non-SIMD steps' outputs were fed into SIMD steps
90+
inputs - causing compilation error due to lack of ability to automatically cast such outputs into batches when feeding
91+
into SIMD seps. Starting with Execution Engine `v1.6.0`, the handling of SIMD and non-SIMD blocks has been improved
92+
through the introduction of **Auto Batch Casting**:
93+
94+
* When a SIMD input is detected but receives scalar data, the Execution Engine automatically casts it into a batch.
95+
96+
* The dimensionality of the batch is determined at compile time, using *lineage* information from other
97+
batch-oriented inputs when available. Missing dimensions are generated in a manner similar to `torch.unsqueeze(...)`.
98+
99+
* Outputs are evaluated against the casting context - leaving them as scalars when block keeps or decreases output
100+
dimensionality or **creating new batches** when increase of dimensionality is expected.
101+
89102

90103
### Preparing step inputs
91104

inference/core/version.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
__version__ = "0.52.2"
1+
__version__ = "0.53.0"
22

33

44
if __name__ == "__main__":

inference/core/workflows/core_steps/fusion/dimension_collapse/v1.py

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -59,6 +59,10 @@ def get_output_dimensionality_offset(
5959
) -> int:
6060
return -1
6161

62+
@classmethod
63+
def get_parameters_enforcing_auto_batch_casting(cls) -> List[str]:
64+
return ["data"]
65+
6266
@classmethod
6367
def describe_outputs(cls) -> List[OutputDefinition]:
6468
return [

inference/core/workflows/errors.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44

55

66
class WorkflowBlockError(BaseModel):
7-
block_id: str
7+
block_id: Optional[str] = None
88
block_type: Optional[str] = None
99
block_details: Optional[str] = None
1010
property_name: Optional[str] = None

inference/core/workflows/execution_engine/constants.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@
22
PARSED_NODE_INPUT_SELECTORS_PROPERTY = "parsed_node_input_selectors"
33
STEP_DEFINITION_PROPERTY = "definition"
44
WORKFLOW_INPUT_BATCH_LINEAGE_ID = "<workflow_input>"
5+
TOP_LEVEL_LINEAGES_KEY = "top_level_lineages"
56
IMAGE_TYPE_KEY = "type"
67
IMAGE_VALUE_KEY = "value"
78
ROOT_PARENT_ID_KEY = "root_parent_id"

inference/core/workflows/execution_engine/introspection/schema_parser.py

Lines changed: 23 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -64,12 +64,16 @@ def parse_block_manifest(
6464
inputs_accepting_batches_and_scalars = set(
6565
manifest_type.get_parameters_accepting_batches_and_scalars()
6666
)
67+
inputs_enforcing_auto_batch_casting = set(
68+
manifest_type.get_parameters_enforcing_auto_batch_casting()
69+
)
6770
return parse_block_manifest_schema(
6871
schema=schema,
6972
inputs_dimensionality_offsets=inputs_dimensionality_offsets,
7073
dimensionality_reference_property=dimensionality_reference_property,
7174
inputs_accepting_batches=inputs_accepting_batches,
7275
inputs_accepting_batches_and_scalars=inputs_accepting_batches_and_scalars,
76+
inputs_enforcing_auto_batch_casting=inputs_enforcing_auto_batch_casting,
7377
)
7478

7579

@@ -79,6 +83,7 @@ def parse_block_manifest_schema(
7983
dimensionality_reference_property: Optional[str],
8084
inputs_accepting_batches: Set[str],
8185
inputs_accepting_batches_and_scalars: Set[str],
86+
inputs_enforcing_auto_batch_casting: Set[str],
8287
) -> BlockManifestMetadata:
8388
primitive_types = retrieve_primitives_from_schema(
8489
schema=schema,
@@ -89,6 +94,7 @@ def parse_block_manifest_schema(
8994
dimensionality_reference_property=dimensionality_reference_property,
9095
inputs_accepting_batches=inputs_accepting_batches,
9196
inputs_accepting_batches_and_scalars=inputs_accepting_batches_and_scalars,
97+
inputs_enforcing_auto_batch_casting=inputs_enforcing_auto_batch_casting,
9298
)
9399
return BlockManifestMetadata(
94100
primitive_types=primitive_types,
@@ -255,6 +261,7 @@ def retrieve_selectors_from_schema(
255261
dimensionality_reference_property: Optional[str],
256262
inputs_accepting_batches: Set[str],
257263
inputs_accepting_batches_and_scalars: Set[str],
264+
inputs_enforcing_auto_batch_casting: Set[str],
258265
) -> Dict[str, SelectorDefinition]:
259266
result = []
260267
for property_name, property_definition in schema[PROPERTIES_KEY].items():
@@ -277,6 +284,7 @@ def retrieve_selectors_from_schema(
277284
is_list_element=True,
278285
inputs_accepting_batches=inputs_accepting_batches,
279286
inputs_accepting_batches_and_scalars=inputs_accepting_batches_and_scalars,
287+
inputs_enforcing_auto_batch_casting=inputs_enforcing_auto_batch_casting,
280288
)
281289
elif property_definition.get(TYPE_KEY) == OBJECT_TYPE and isinstance(
282290
property_definition.get(ADDITIONAL_PROPERTIES_KEY), dict
@@ -290,6 +298,7 @@ def retrieve_selectors_from_schema(
290298
is_dict_element=True,
291299
inputs_accepting_batches=inputs_accepting_batches,
292300
inputs_accepting_batches_and_scalars=inputs_accepting_batches_and_scalars,
301+
inputs_enforcing_auto_batch_casting=inputs_enforcing_auto_batch_casting,
293302
)
294303
else:
295304
selector = retrieve_selectors_from_simple_property(
@@ -300,6 +309,7 @@ def retrieve_selectors_from_schema(
300309
is_dimensionality_reference_property=is_dimensionality_reference_property,
301310
inputs_accepting_batches=inputs_accepting_batches,
302311
inputs_accepting_batches_and_scalars=inputs_accepting_batches_and_scalars,
312+
inputs_enforcing_auto_batch_casting=inputs_enforcing_auto_batch_casting,
303313
)
304314
if selector is not None:
305315
result.append(selector)
@@ -314,6 +324,7 @@ def retrieve_selectors_from_simple_property(
314324
is_dimensionality_reference_property: bool,
315325
inputs_accepting_batches: Set[str],
316326
inputs_accepting_batches_and_scalars: Set[str],
327+
inputs_enforcing_auto_batch_casting: Set[str],
317328
is_list_element: bool = False,
318329
is_dict_element: bool = False,
319330
) -> Optional[SelectorDefinition]:
@@ -323,9 +334,15 @@ def retrieve_selectors_from_simple_property(
323334
)
324335
if declared_points_to_batch == "dynamic":
325336
if property_name in inputs_accepting_batches_and_scalars:
326-
points_to_batch = {True, False}
337+
if property_name in inputs_enforcing_auto_batch_casting:
338+
points_to_batch = {True}
339+
else:
340+
points_to_batch = {True, False}
327341
else:
328-
points_to_batch = {property_name in inputs_accepting_batches}
342+
points_to_batch = {
343+
property_name in inputs_accepting_batches
344+
or property_name in inputs_enforcing_auto_batch_casting
345+
}
329346
else:
330347
points_to_batch = {declared_points_to_batch}
331348
allowed_references = [
@@ -359,6 +376,7 @@ def retrieve_selectors_from_simple_property(
359376
is_dimensionality_reference_property=is_dimensionality_reference_property,
360377
inputs_accepting_batches=inputs_accepting_batches,
361378
inputs_accepting_batches_and_scalars=inputs_accepting_batches_and_scalars,
379+
inputs_enforcing_auto_batch_casting=inputs_enforcing_auto_batch_casting,
362380
is_list_element=True,
363381
)
364382
if property_defines_union(property_definition=property_definition):
@@ -372,6 +390,7 @@ def retrieve_selectors_from_simple_property(
372390
is_dimensionality_reference_property=is_dimensionality_reference_property,
373391
inputs_accepting_batches=inputs_accepting_batches,
374392
inputs_accepting_batches_and_scalars=inputs_accepting_batches_and_scalars,
393+
inputs_enforcing_auto_batch_casting=inputs_enforcing_auto_batch_casting,
375394
)
376395
return None
377396

@@ -394,6 +413,7 @@ def retrieve_selectors_from_union_definition(
394413
is_dimensionality_reference_property: bool,
395414
inputs_accepting_batches: Set[str],
396415
inputs_accepting_batches_and_scalars: Set[str],
416+
inputs_enforcing_auto_batch_casting: Set[str],
397417
) -> Optional[SelectorDefinition]:
398418
union_types = (
399419
union_definition.get(ANY_OF_KEY, [])
@@ -410,6 +430,7 @@ def retrieve_selectors_from_union_definition(
410430
is_dimensionality_reference_property=is_dimensionality_reference_property,
411431
inputs_accepting_batches=inputs_accepting_batches,
412432
inputs_accepting_batches_and_scalars=inputs_accepting_batches_and_scalars,
433+
inputs_enforcing_auto_batch_casting=inputs_enforcing_auto_batch_casting,
413434
is_list_element=is_list_element,
414435
)
415436
if result is None:

inference/core/workflows/execution_engine/v1/compiler/entities.py

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -216,6 +216,12 @@ def iterate_through_definitions(self) -> Generator[StepInputDefinition, None, No
216216
StepInputData = Dict[str, Union[StepInputDefinition, CompoundStepInputDefinition]]
217217

218218

219+
@dataclass
220+
class AutoBatchCastingConfig:
221+
casted_dimensionality: int
222+
lineage_support: Optional[List[str]]
223+
224+
219225
@dataclass
220226
class StepNode(ExecutionGraphNode):
221227
step_manifest: WorkflowBlockManifest
@@ -224,6 +230,9 @@ class StepNode(ExecutionGraphNode):
224230
child_execution_branches: Dict[str, str] = field(default_factory=dict)
225231
execution_branches_impacting_inputs: Set[str] = field(default_factory=set)
226232
batch_oriented_parameters: Set[str] = field(default_factory=set)
233+
auto_batch_casting_lineage_supports: Dict[str, AutoBatchCastingConfig] = field(
234+
default_factory=dict
235+
)
227236
step_execution_dimensionality: int = 0
228237

229238
def controls_flow(self) -> bool:

0 commit comments

Comments
 (0)