Commit 2d4bcb9

Update documentation (#911)
Co-authored-by: daniil-quix <[email protected]>
1 parent a9b4f63 commit 2d4bcb9

2 files changed: +89 -101 lines changed

docs/api-reference/dataframe.md

Lines changed: 47 additions & 51 deletions
Original file line number | Diff line number | Diff line change
@@ -1546,14 +1546,14 @@ sdf_metadata = app.dataframe(app.topic("metadata"))
15461546
sdf_joined = sdf_measurements.join_asof(sdf_metadata, how="inner", grace_ms=timedelta(days=14))
15471547
```
15481548

1549-
<a id="quixstreams.dataframe.dataframe.StreamingDataFrame.lookup_join"></a>
1549+
<a id="quixstreams.dataframe.dataframe.StreamingDataFrame.join_lookup"></a>
15501550

15511551
<br><br>
15521552

1553-
#### StreamingDataFrame.lookup\_join
1553+
#### StreamingDataFrame.join\_lookup
15541554

15551555
```python
1556-
def lookup_join(
1556+
def join_lookup(
15571557
lookup: BaseLookup,
15581558
fields: dict[str, BaseField],
15591559
on: Optional[Union[str, Callable[[dict[str, Any], Any], str]]] = None
@@ -1594,18 +1594,18 @@ StreamingDataFrame: The same StreamingDataFrame instance with the enrichment app
15941594

15951595
**Example**:
15961596

1597+
15971598
```python
15981599
from quixstreams import Application
1599-
from quixstreams.dataframe.joins.lookups import QuixConfigurationService,
1600-
QuixConfigurationServiceField as Field
1600+
from quixstreams.dataframe.joins.lookups import QuixConfigurationService, QuixConfigurationServiceField as Field
16011601

16021602
app = Application()
16031603

16041604
sdf = app.dataframe(app.topic("input"))
16051605
lookup = QuixConfigurationService(app.topic("config"), config=app.config)
16061606

16071607
fields = {
1608-
"test": Field(type="test", default="test_default")
1608+
"test": Field(type="test", default="test_default")
16091609
}
16101610

16111611
sdf = sdf.join_lookup(lookup, fields)
@@ -2142,7 +2142,7 @@ Abstract base class for implementing custom lookup join strategies for data enri
21422142
This class defines the interface for lookup joins, where incoming records are enriched with external data based on a key and
21432143
a set of fields. Subclasses should implement the `join` method to specify how enrichment is performed.
21442144

2145-
Typical usage involves passing an instance of a subclass to `StreamingDataFrame.lookup_join`, along with a mapping of field names
2145+
Typical usage involves passing an instance of a subclass to `StreamingDataFrame.join_lookup`, along with a mapping of field names
21462146
to BaseField instances that describe how to extract or map enrichment data.
21472147

21482148
**Example**:
@@ -2293,34 +2293,32 @@ Rows will be deserialized into a dictionary with column names as keys.
22932293

22942294
**Example**:
22952295

2296+
22962297
```python
22972298
lookup = SQLiteLookup(path="/path/to/db.sqlite")
22982299

2299-
# Select the value in `col1` from the table `my_table` where `col2` matches the `sdf.lookup_join` on parameter.
2300-
fields = {
2301-
"my_field": SQLiteLookupField(table="my_table", columns=["col1", "col2"], on="col2")}
2300+
# Select the value in `col1` from the table `my_table` where `col2` matches the `sdf.join_lookup` on parameter.
2301+
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col1", "col2"], on="col2")}
23022302

2303-
# After the lookup the `my_field` column in the message will contains:
2304-
# {"col1": <row1 col1 value>, "col2": <row1 col2 value>}
2305-
sdf = sdf.join_lookup(lookup, fields)
2303+
# After the lookup the `my_field` column in the message will contains:
2304+
# {"col1": <row1 col1 value>, "col2": <row1 col2 value>}
2305+
sdf = sdf.join_lookup(lookup, fields)
23062306
```
2307-
2307+
23082308
```python
23092309
lookup = SQLiteLookup(path="/path/to/db.sqlite")
23102310

2311-
# Select the value in `col1` from the table `my_table` where `col2` matches the `sdf.lookup_join` on parameter.
2312-
fields = {
2313-
"my_field": SQLiteLookupField(table="my_table", columns=["col1", "col2"], on="col2",
2314-
first_match_only=False)}
2315-
2316-
# After the lookup the `my_field` column in the message will contains:
2317-
# [
2318-
# {"col1": <row1 col1 value>, "col2": <row1 col2 value>},
2319-
# {"col1": <row2 col1 value>, "col2": <row2 col2 value>},
2320-
# ...
2321-
# {"col1": <rowN col1 value>, "col2": <rowN col2 value>,},
2322-
# ]
2323-
sdf = sdf.join_lookup(lookup, fields)
2311+
# Select the value in `col1` from the table `my_table` where `col2` matches the `sdf.join_lookup` on parameter.
2312+
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col1", "col2"], on="col2", first_match_only=False)}
2313+
2314+
# After the lookup the `my_field` column in the message will contains:
2315+
# [
2316+
# {"col1": <row1 col1 value>, "col2": <row1 col2 value>},
2317+
# {"col1": <row2 col1 value>, "col2": <row2 col2 value>},
2318+
# ...
2319+
# {"col1": <rowN col1 value>, "col2": <rowN col2 value>,},
2320+
# ]
2321+
sdf = sdf.join_lookup(lookup, fields)
23242322
```
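
The two result shapes described in the examples above (a single dictionary versus a list of dictionaries) can be reproduced with the standard-library `sqlite3` module alone. This is a minimal sketch under stated assumptions, not the Quix Streams implementation: `fetch_rows` is a hypothetical helper, and the table contents are invented for illustration.

```python
import sqlite3


def fetch_rows(conn, table, columns, on, key, first_match_only=True):
    """Hypothetical helper that mimics the documented result shapes.

    Returns one dict (or None) when first_match_only is True,
    otherwise a list of dicts, one per matching row.
    """
    # Identifiers are interpolated for brevity; only the key is parameterized.
    sql = (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"WHERE {on} = ? ORDER BY rowid"
    )
    rows = [dict(zip(columns, row)) for row in conn.execute(sql, (key,))]
    if first_match_only:
        return rows[0] if rows else None
    return rows


# Invented sample data: two rows share the key "k1".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", "k1"), ("b", "k1"), ("c", "k2")],
)

single = fetch_rows(conn, "my_table", ["col1", "col2"], "col2", "k1")
many = fetch_rows(
    conn, "my_table", ["col1", "col2"], "col2", "k1", first_match_only=False
)
```

Here `single` holds one dictionary for the first matching row, while `many` holds a list with one dictionary per matching row.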
23252323

23262324

@@ -2405,40 +2403,38 @@ Field definition for use with SQLiteLookup in lookup joins.
24052403

24062404
Enables advanced SQL queries with support for parameter substitution from message columns, allowing dynamic lookups.
24072405

2408-
The `sdf.lookup_join` `on` parameter is not used in the query itself, but is important for cache management. When caching is enabled, the query is executed once per TTL for each unique target key.
2406+
The `sdf.join_lookup` `on` parameter is not used in the query itself, but is important for cache management. When caching is enabled, the query is executed once per TTL for each unique target key.
24092407

24102408
Query results are returned as tuples of values, without additional deserialization.
24112409

24122410
**Example**:
24132411

2412+
24142413
```python
24152414
lookup = SQLiteLookup(path="/path/to/db.sqlite")
24162415

2417-
# Select all columns from the first row of `my_table` where `col2` matches the value of `field1` in the message.
2418-
fields = {
2419-
"my_field": SQLiteLookupQueryField("SELECT * FROM my_table WHERE col2 = :field1")}
2416+
# Select all columns from the first row of `my_table` where `col2` matches the value of `field1` in the message.
2417+
fields = {"my_field": SQLiteLookupQueryField("SELECT * FROM my_table WHERE col2 = :field1")}
24202418

2421-
# After the lookup, the `my_field` column in the message will contain:
2422-
# [<row1 col1 value>, <row1 col2 value>, ..., <row1 colN value>]
2423-
sdf = sdf.join_lookup(lookup, fields)
2419+
# After the lookup, the `my_field` column in the message will contain:
2420+
# [<row1 col1 value>, <row1 col2 value>, ..., <row1 colN value>]
2421+
sdf = sdf.join_lookup(lookup, fields)
24242422
```
2425-
2423+
24262424
```python
24272425
lookup = SQLiteLookup(path="/path/to/db.sqlite")
24282426

2429-
# Select all columns from all rows of `my_table` where `col2` matches the value of `field1` in the message.
2430-
fields = {
2431-
"my_field": SQLiteLookupQueryField("SELECT * FROM my_table WHERE col2 = :field1",
2432-
first_match_only=False)}
2433-
2434-
# After the lookup, the `my_field` column in the message will contain:
2435-
# [
2436-
# [<row1 col1 value>, <row1 col2 value>, ..., <row1 colN value>],
2437-
# [<row2 col1 value>, <row2 col2 value>, ..., <row2 colN value>],
2438-
# ...
2439-
# [<rowN col1 value>, <rowN col2 value>, ..., <rowN colN value>],
2440-
# ]
2441-
sdf = sdf.join_lookup(lookup, fields)
2427+
# Select all columns from all rows of `my_table` where `col2` matches the value of `field1` in the message.
2428+
fields = {"my_field": SQLiteLookupQueryField("SELECT * FROM my_table WHERE col2 = :field1", first_match_only=False)}
2429+
2430+
# After the lookup, the `my_field` column in the message will contain:
2431+
# [
2432+
# [<row1 col1 value>, <row1 col2 value>, ..., <row1 colN value>],
2433+
# [<row2 col1 value>, <row2 col2 value>, ..., <row2 colN value>],
2434+
# ...
2435+
# [<rowN col1 value>, <rowN col2 value>, ..., <rowN colN value>],
2436+
# ]
2437+
sdf = sdf.join_lookup(lookup, fields)
24422438
```
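
The `:field1` placeholder in the queries above follows SQLite's named-parameter syntax. As a rough illustration of how a message column could be substituted into such a query, the sketch below uses only the standard-library `sqlite3` module. The `message` dictionary and the table contents are assumptions made for the example, and the code does not claim to mirror how `SQLiteLookupQueryField` is implemented.

```python
import sqlite3

# Invented sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", "k1"), ("b", "k1"), ("c", "k2")],
)

# A hypothetical incoming message; its `field1` column feeds the query.
message = {"field1": "k1"}
query = "SELECT * FROM my_table WHERE col2 = :field1 ORDER BY rowid"

# sqlite3 binds named parameters from a mapping and returns rows as tuples.
first_row = conn.execute(query, message).fetchone()
all_rows = conn.execute(query, message).fetchall()
```

`first_row` corresponds to the first-match case and `all_rows` to the `first_match_only=False` case; in both, each row is a plain tuple with no further deserialization.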
24432439

24442440

@@ -2494,11 +2490,11 @@ based on a configurable TTL. The cache is a least recently used (LRU) cache with
24942490

24952491
**Example**:
24962492

2493+
24972494
```python
24982495
lookup = SQLiteLookup(path="/path/to/db.sqlite")
2499-
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col2"],
2500-
on="primary_key_col")}
2501-
sdf = sdf.join_lookup(lookup, fields)
2496+
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col2"], on="primary_key_col")}
2497+
sdf = sdf.join_lookup(lookup, fields)
25022498
```
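
The caching behaviour mentioned for `SQLiteLookup` (an LRU cache with a configurable TTL, so a query runs at most once per TTL for each unique key) can be sketched in plain Python. `TTLLRUCache`, its parameters, and the injected `clock` are hypothetical names chosen for this illustration; the library's actual cache may be structured differently.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Hypothetical cache: entries expire after `ttl` seconds and the
    least recently used entry is evicted once `maxsize` is exceeded."""

    def __init__(self, maxsize, ttl, clock=time.monotonic):
        self.maxsize = maxsize
        self.ttl = ttl
        self.clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            self._data.move_to_end(key)  # mark as most recently used
            return entry[1]
        value = loader(key)  # cache miss or expired entry: run the query
        self._data[key] = (now + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used
        return value


calls = []


def load(key):
    """Stand-in for a database query; records each invocation."""
    calls.append(key)
    return f"row-for-{key}"


fake_now = [0.0]  # controllable clock so the example is deterministic
cache = TTLLRUCache(maxsize=2, ttl=10.0, clock=lambda: fake_now[0])
cache.get_or_load("k1", load)  # miss: loader runs
cache.get_or_load("k1", load)  # hit: loader is skipped
fake_now[0] = 11.0
cache.get_or_load("k1", load)  # expired: loader runs again
```

With the fake clock, the loader runs twice for `"k1"`: once on the initial miss and once after the TTL has elapsed.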
25032499

25042500

docs/api-reference/quixstreams.md

Lines changed: 42 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -2294,12 +2294,12 @@ sdf_metadata = app.dataframe(app.topic("metadata"))
22942294
sdf_joined = sdf_measurements.join_asof(sdf_metadata, how="inner", grace_ms=timedelta(days=14))
22952295
```
22962296

2297-
<a id="quixstreams.dataframe.dataframe.StreamingDataFrame.lookup_join"></a>
2297+
<a id="quixstreams.dataframe.dataframe.StreamingDataFrame.join_lookup"></a>
22982298

2299-
#### StreamingDataFrame.lookup\_join
2299+
#### StreamingDataFrame.join\_lookup
23002300

23012301
```python
2302-
def lookup_join(
2302+
def join_lookup(
23032303
lookup: BaseLookup,
23042304
fields: dict[str, BaseField],
23052305
on: Optional[Union[str, Callable[[dict[str, Any], Any], str]]] = None
@@ -2337,8 +2337,7 @@ Example:
23372337

23382338
```python
23392339
from quixstreams import Application
2340-
from quixstreams.dataframe.joins.lookups import QuixConfigurationService,
2341-
QuixConfigurationServiceField as Field
2340+
from quixstreams.dataframe.joins.lookups import QuixConfigurationService, QuixConfigurationServiceField as Field
23422341

23432342
app = Application()
23442343

@@ -2900,31 +2899,28 @@ Example:
29002899
```python
29012900
lookup = SQLiteLookup(path="/path/to/db.sqlite")
29022901

2903-
# Select the value in `col1` from the table `my_table` where `col2` matches the `sdf.lookup_join` on parameter.
2904-
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col1", "col2"],
2905-
on="col2")}
2902+
# Select the value in `col1` from the table `my_table` where `col2` matches the `sdf.join_lookup` on parameter.
2903+
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col1", "col2"], on="col2")}
29062904

2907-
# After the lookup the `my_field` column in the message will contains:
2908-
# {"col1": <row1 col1 value>, "col2": <row1 col2 value>}
2909-
sdf = sdf.join_lookup(lookup, fields)
2905+
# After the lookup the `my_field` column in the message will contains:
2906+
# {"col1": <row1 col1 value>, "col2": <row1 col2 value>}
2907+
sdf = sdf.join_lookup(lookup, fields)
29102908
```
29112909

29122910
```python
29132911
lookup = SQLiteLookup(path="/path/to/db.sqlite")
29142912

2915-
# Select the value in `col1` from the table `my_table` where `col2` matches the `sdf.lookup_join` on parameter.
2916-
fields = {
2917-
"my_field": SQLiteLookupField(table="my_table", columns=["col1", "col2"], on="col2",
2918-
first_match_only=False)}
2919-
2920-
# After the lookup the `my_field` column in the message will contains:
2921-
# [
2922-
# {"col1": <row1 col1 value>, "col2": <row1 col2 value>},
2923-
# {"col1": <row2 col1 value>, "col2": <row2 col2 value>},
2924-
# ...
2925-
# {"col1": <rowN col1 value>, "col2": <rowN col2 value>,},
2926-
# ]
2927-
sdf = sdf.join_lookup(lookup, fields)
2913+
# Select the value in `col1` from the table `my_table` where `col2` matches the `sdf.join_lookup` on parameter.
2914+
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col1", "col2"], on="col2", first_match_only=False)}
2915+
2916+
# After the lookup the `my_field` column in the message will contains:
2917+
# [
2918+
# {"col1": <row1 col1 value>, "col2": <row1 col2 value>},
2919+
# {"col1": <row2 col1 value>, "col2": <row2 col2 value>},
2920+
# ...
2921+
# {"col1": <rowN col1 value>, "col2": <rowN col2 value>,},
2922+
# ]
2923+
sdf = sdf.join_lookup(lookup, fields)
29282924
```
29292925

29302926
**Arguments**:
@@ -2995,7 +2991,7 @@ Field definition for use with SQLiteLookup in lookup joins.
29952991

29962992
Enables advanced SQL queries with support for parameter substitution from message columns, allowing dynamic lookups.
29972993

2998-
The `sdf.lookup_join` `on` parameter is not used in the query itself, but is important for cache management. When caching is enabled, the query is executed once per TTL for each unique target key.
2994+
The `sdf.join_lookup` `on` parameter is not used in the query itself, but is important for cache management. When caching is enabled, the query is executed once per TTL for each unique target key.
29992995

30002996
Query results are returned as tuples of values, without additional deserialization.
30012997

@@ -3004,31 +3000,28 @@ Example:
30043000
```python
30053001
lookup = SQLiteLookup(path="/path/to/db.sqlite")
30063002

3007-
# Select all columns from the first row of `my_table` where `col2` matches the value of `field1` in the message.
3008-
fields = {
3009-
"my_field": SQLiteLookupQueryField("SELECT * FROM my_table WHERE col2 = :field1")}
3003+
# Select all columns from the first row of `my_table` where `col2` matches the value of `field1` in the message.
3004+
fields = {"my_field": SQLiteLookupQueryField("SELECT * FROM my_table WHERE col2 = :field1")}
30103005

3011-
# After the lookup, the `my_field` column in the message will contain:
3012-
# [<row1 col1 value>, <row1 col2 value>, ..., <row1 colN value>]
3013-
sdf = sdf.join_lookup(lookup, fields)
3006+
# After the lookup, the `my_field` column in the message will contain:
3007+
# [<row1 col1 value>, <row1 col2 value>, ..., <row1 colN value>]
3008+
sdf = sdf.join_lookup(lookup, fields)
30143009
```
30153010

30163011
```python
30173012
lookup = SQLiteLookup(path="/path/to/db.sqlite")
30183013

3019-
# Select all columns from all rows of `my_table` where `col2` matches the value of `field1` in the message.
3020-
fields = {
3021-
"my_field": SQLiteLookupQueryField("SELECT * FROM my_table WHERE col2 = :field1",
3022-
first_match_only=False)}
3023-
3024-
# After the lookup, the `my_field` column in the message will contain:
3025-
# [
3026-
# [<row1 col1 value>, <row1 col2 value>, ..., <row1 colN value>],
3027-
# [<row2 col1 value>, <row2 col2 value>, ..., <row2 colN value>],
3028-
# ...
3029-
# [<rowN col1 value>, <rowN col2 value>, ..., <rowN colN value>],
3030-
# ]
3031-
sdf = sdf.join_lookup(lookup, fields)
3014+
# Select all columns from all rows of `my_table` where `col2` matches the value of `field1` in the message.
3015+
fields = {"my_field": SQLiteLookupQueryField("SELECT * FROM my_table WHERE col2 = :field1", first_match_only=False)}
3016+
3017+
# After the lookup, the `my_field` column in the message will contain:
3018+
# [
3019+
# [<row1 col1 value>, <row1 col2 value>, ..., <row1 colN value>],
3020+
# [<row2 col1 value>, <row2 col2 value>, ..., <row2 colN value>],
3021+
# ...
3022+
# [<rowN col1 value>, <rowN col2 value>, ..., <rowN colN value>],
3023+
# ]
3024+
sdf = sdf.join_lookup(lookup, fields)
30323025
```
30333026

30343027
**Arguments**:
@@ -3078,9 +3071,8 @@ Example:
30783071

30793072
```python
30803073
lookup = SQLiteLookup(path="/path/to/db.sqlite")
3081-
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col2"],
3082-
on="primary_key_col")}
3083-
sdf = sdf.join_lookup(lookup, fields)
3074+
fields = {"my_field": SQLiteLookupField(table="my_table", columns=["col2"], on="primary_key_col")}
3075+
sdf = sdf.join_lookup(lookup, fields)
30843076
```
30853077

30863078
**Arguments**:
@@ -3443,7 +3435,7 @@ and timestamp.
34433435

34443436
Usage:
34453437
- Instantiate with a configuration topic and (optionally) application config or connection details.
3446-
- Use as the `lookup` argument in `StreamingDataFrame.lookup_join()` with a mapping of field names to Field objects.
3438+
- Use as the `lookup` argument in `StreamingDataFrame.join_lookup()` with a mapping of field names to Field objects.
34473439
- The `join` method is called for each record to enrich, updating the record in-place with configuration data.
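
The usage notes above describe a `join` method that is called per record and updates it in place. The sketch below illustrates that pattern with stand-in classes only: `LookupSketch`, `DictLookup`, and the three-argument `join` signature are assumptions made for this example and are not the real `quixstreams` base class or its signature.

```python
from abc import ABC, abstractmethod


class LookupSketch(ABC):
    """Stand-in for a lookup interface; not the real quixstreams class."""

    @abstractmethod
    def join(self, fields, on, value):
        """Enrich `value` in place using `fields` and the lookup key `on`."""


class DictLookup(LookupSketch):
    """Hypothetical lookup backed by an in-memory dictionary."""

    def __init__(self, table):
        self.table = table  # lookup key -> row of enrichment data

    def join(self, fields, on, value):
        row = self.table.get(on, {})
        # `fields` maps each output column to the default used when missing.
        for name, default in fields.items():
            value[name] = row.get(name, default)


lookup = DictLookup({"device-1": {"unit": "celsius"}})
record = {"reading": 21.5}
lookup.join({"unit": "unknown", "site": "n/a"}, "device-1", record)
```

After the call, `record` carries the looked-up `unit` and the default for `site`, and no new object is returned because the enrichment happens in place.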
34483440

34493441
Features:
@@ -3455,7 +3447,7 @@ Features:
34553447
**Example**:
34563448

34573449
lookup = Lookup(topic, app_config=app.config)
3458-
sdf = sdf.lookup_join(lookup, fields)
3450+
sdf = sdf.join_lookup(lookup, fields)
34593451

34603452
<a id="quixstreams.dataframe.joins.lookups.quix_configuration_service.lookup.Lookup.cache_info"></a>
34613453

@@ -3598,7 +3590,7 @@ Abstract base class for implementing custom lookup join strategies for data enri
35983590
This class defines the interface for lookup joins, where incoming records are enriched with external data based on a key and
35993591
a set of fields. Subclasses should implement the `join` method to specify how enrichment is performed.
36003592

3601-
Typical usage involves passing an instance of a subclass to `StreamingDataFrame.lookup_join`, along with a mapping of field names
3593+
Typical usage involves passing an instance of a subclass to `StreamingDataFrame.join_lookup`, along with a mapping of field names
36023594
to BaseField instances that describe how to extract or map enrichment data.
36033595

36043596
**Example**:
