chore: add json path doc #1279

Merged on Nov 13, 2024 (5 commits)
119 changes: 119 additions & 0 deletions docs/user-guide/logs/pipeline-config.md
@@ -326,6 +326,125 @@ In the above example, the configuration of the `csv` processor includes the following fields:
- `trim`: Whether to trim whitespace. Defaults to `false`.
- `ignore_missing`: Ignores the case when the field is missing. Defaults to `false`. If the field is missing and this configuration is set to `false`, an exception will be thrown.

### `json_path` (experimental)

Note: The `json_path` processor is experimental and may change.

The `json_path` processor is used to extract fields from JSON data. Here's an example configuration:

```yaml
processors:
  - json_path:
      fields:
        - complex_object
      json_path: "$.shop.orders[?(@.active)].id"
      ignore_missing: true
      result_index: 1
```

In the above example, the configuration of the `json_path` processor includes the following fields:

- `fields`: A list of fields to extract from. As the example below shows, an entry of the form `source, target` stores the result under `target`.
- `json_path`: The JSON path expression to evaluate against each field.
- `ignore_missing`: Ignores the case when the field is missing. Defaults to `false`. If the field is missing and this configuration is set to `false`, an exception will be thrown.
- `result_index`: The index of the value to extract from the result. A JSON path query returns an array of matches; if `result_index` is not set, all results are kept.
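
The effect of `result_index` can be sketched in plain Python. This is only an illustration of the behavior described above, not the pipeline engine or the jsonpath-rust library; `select_result` is a hypothetical helper, and the list comprehension is a hand-written stand-in for the path `$.shop.orders[?(@.active)].id`, assuming the filter keeps orders that have an `active` key.

```python
from typing import Any, Optional, Sequence


def select_result(matches: Sequence[Any], result_index: Optional[int] = None) -> Any:
    """Hypothetical helper: keep every match, or only the one at result_index."""
    if result_index is None:
        return list(matches)
    return matches[result_index]


orders = [
    {"id": 1, "active": True},
    {"id": 2},
    {"id": 3},
    {"id": 4, "active": True},
]

# Stand-in for "$.shop.orders[?(@.active)].id": ids of orders with an "active" key.
active_ids = [order["id"] for order in orders if "active" in order]

all_results = select_result(active_ids)       # no result_index: the whole array
second_result = select_result(active_ids, 1)  # result_index: 1

print(all_results)    # [1, 4]
print(second_result)  # 4
```

Without `result_index` the field holds the whole array of matches, which is why the later example types such a field as `json`.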

#### JSON path syntax

The JSON path syntax is based on the [jsonpath-rust](https://github.com/besok/jsonpath-rust) library.

At this stage, we recommend using only simple extraction expressions, for example to lift nested fields to the top level.

#### `json_path` example

For example, given the following log data:

```json
{
  "product_object": {
    "hello": "world"
  },
  "product_array": [
    "hello",
    "world"
  ],
  "complex_object": {
    "shop": {
      "orders": [
        {
          "id": 1,
          "active": true
        },
        {
          "id": 2
        },
        {
          "id": 3
        },
        {
          "id": 4,
          "active": true
        }
      ]
    }
  }
}
```

Using the following configuration:

```yaml
processors:
  - json_path:
      fields:
        - product_object, object_target
      json_path: "$.hello"
      result_index: 0
  - json_path:
      fields:
        - product_array, array_target
      json_path: "$.[1]"
      result_index: 0
  - json_path:
      fields:
        - complex_object, complex_target_1
      json_path: "$.shop.orders[?(@.active)].id"
  - json_path:
      fields:
        - complex_target_1, complex_target_2
      json_path: "$.[1]"
      result_index: 0
  - json_path:
      fields:
        - complex_object, complex_target_3
      json_path: "$.shop.orders[?(@.active)].id"
      result_index: 1
transform:
  - fields:
      - object_target
      - array_target
    type: string
  - fields:
      - complex_target_3
      - complex_target_2
    type: uint32
  - fields:
      - complex_target_1
    type: json
```

The result will be:

```json
{
  "object_target": "world",
  "array_target": "world",
  "complex_target_3": 4,
  "complex_target_2": 4,
  "complex_target_1": [1, 4]
}
```
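
To see how each target value comes about, the five processors can be traced by hand in plain Python. This is a rough walkthrough under stated assumptions, not how the pipeline is implemented: `pick` and `active_order_ids` are hypothetical helpers, each JSON path is replaced by equivalent direct indexing, and the filter is assumed to keep orders that have an `active` key.

```python
log = {
    "product_object": {"hello": "world"},
    "product_array": ["hello", "world"],
    "complex_object": {
        "shop": {
            "orders": [
                {"id": 1, "active": True},
                {"id": 2},
                {"id": 3},
                {"id": 4, "active": True},
            ]
        }
    },
}


def pick(matches, result_index=None):
    """Hypothetical helper: all matches, or only the match at result_index."""
    return matches if result_index is None else matches[result_index]


def active_order_ids(obj):
    """Hand-written stand-in for "$.shop.orders[?(@.active)].id"."""
    return [order["id"] for order in obj["shop"]["orders"] if "active" in order]


row = {}
# "$.hello" on product_object matches one value; result_index 0 selects it.
row["object_target"] = pick([log["product_object"]["hello"]], 0)
# "$.[1]" on product_array matches the second element.
row["array_target"] = pick([log["product_array"][1]], 0)
# No result_index: the whole array of matches is kept.
row["complex_target_1"] = pick(active_order_ids(log["complex_object"]))
# "$.[1]" applied to the previous result [1, 4].
row["complex_target_2"] = pick([row["complex_target_1"][1]], 0)
# Same path as complex_target_1, but result_index 1 selects the second match.
row["complex_target_3"] = pick(active_order_ids(log["complex_object"]), 1)

print(row)
```

Note that `complex_target_2` reads from `complex_target_1`, so the order of the processors matters in this example.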


## Transform

@@ -329,6 +329,124 @@ processors:
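- `trim`: Whether to trim whitespace. Defaults to `false`.
- `ignore_missing`: Ignores the case when the field is missing. Defaults to `false`. If the field is missing and this configuration is set to `false`, an exception will be thrown.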

### `json_path` (experimental)

Note: The `json_path` processor is experimental and may change.

The `json_path` processor is used to extract fields from JSON data. Here's an example configuration:

```yaml
processors:
  - json_path:
      fields:
        - complex_object
      json_path: "$.shop.orders[?(@.active)].id"
      ignore_missing: true
      result_index: 1
```

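In the above example, the configuration of the `json_path` processor includes the following fields:

- `fields`: A list of field names to be extracted.
- `json_path`: The JSON path to extract.
- `ignore_missing`: Ignores the case when the field is missing. Defaults to `false`. If the field is missing and this configuration is set to `false`, an exception will be thrown.
- `result_index`: The index of the result value to extract. Because the result is an array, all results are kept by default.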

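#### JSON path syntax

The JSON path syntax is based on the [jsonpath-rust](https://github.com/besok/jsonpath-rust) library.

At this stage, we recommend using only simple field extraction operations, for example to lift nested fields to the top level.

#### `json_path` example

For example, given the following log data: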

```json
{
  "product_object": {
    "hello": "world"
  },
  "product_array": [
    "hello",
    "world"
  ],
  "complex_object": {
    "shop": {
      "orders": [
        {
          "id": 1,
          "active": true
        },
        {
          "id": 2
        },
        {
          "id": 3
        },
        {
          "id": 4,
          "active": true
        }
      ]
    }
  }
}
```

Using the following configuration:

```yaml
processors:
  - json_path:
      fields:
        - product_object, object_target
      json_path: "$.hello"
      result_index: 0
  - json_path:
      fields:
        - product_array, array_target
      json_path: "$.[1]"
      result_index: 0
  - json_path:
      fields:
        - complex_object, complex_target_1
      json_path: "$.shop.orders[?(@.active)].id"
  - json_path:
      fields:
        - complex_target_1, complex_target_2
      json_path: "$.[1]"
      result_index: 0
  - json_path:
      fields:
        - complex_object, complex_target_3
      json_path: "$.shop.orders[?(@.active)].id"
      result_index: 1
transform:
  - fields:
      - object_target
      - array_target
    type: string
  - fields:
      - complex_target_3
      - complex_target_2
    type: uint32
  - fields:
      - complex_target_1
    type: json
```

The result will be:

```json
{
  "object_target": "world",
  "array_target": "world",
  "complex_target_3": 4,
  "complex_target_2": 4,
  "complex_target_1": [1, 4]
}
```

## Transform
