Ingest pipeline processors: Syntax for explicit access of fields with dots #125841

flash1293 · 2025-03-28T13:27:28Z

Description

Context

It's not uncommon for documents to have fields with dots in them:

{
  "nested": {
     "a.b.c": "This is a test"
  }
}

When specifying a field in a processor (e.g. grok, rename or others), it's currently not possible to target these fields, because dots are always interpreted as nested objects. { "grok": { "field": "nested.a.b.c" }} will only work on { "nested": { "a": { "b": { "c": "This is a test" } } } }.

This is especially relevant for OTel data and the streams project, which plans to transform all incoming data to match the OTel format.

Solution

A new syntax should be introduced to allow accessing these fields in all processors. Dots are interpreted as nested objects except when enclosed in [' and ']:

{ "grok": { "field": "nested['a.b.c']" }}

Some examples:

"resource.attributes['bar.foo']" // matches {"resource": {"attributes": {"bar.foo": "…"}}}
"['resource']['attributes']['bar.foo']" // same as above
"resource.attributes.bar.foo" // matches {"resource": {"attributes": {"bar": {"foo": "…"}}}}
"['resource']['attributes']['bar']['foo']" // matches {"resource": {"attributes": {"bar": {"foo": "…"}}}}
"['resource.attributes']['bar.foo']" // matches {"resource.attributes": {"bar.foo": "…"}}
"['resource.attributes.bar.foo']" // matches {"resource.attributes.bar.foo": "…"}

Quotes inside a bracketed segment can be escaped with \, so field names that themselves contain brackets and quotes remain accessible:

my['weird[\'fieldname\']'] // matches { "my": { "weird['fieldname']": "..." } }
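A rough Python sketch of how such a path could be split into keys, including the \ escape. This is an illustration under assumptions, not the parser from the POC: `parse_field_path` is a hypothetical name, empty segments are silently skipped, and a backslash is assumed to escape whatever single character follows it.

```python
from typing import List

def parse_field_path(path: str) -> List[str]:
    """Split a field path into keys; dots separate keys except inside ['...']."""
    keys: List[str] = []
    buf: List[str] = []
    i, n = 0, len(path)
    while i < n:
        if path.startswith("['", i):
            # A bracketed segment starts: flush any pending plain key first.
            if buf:
                keys.append("".join(buf))
                buf = []
            i += 2
            quoted: List[str] = []
            while True:
                if i >= n:
                    raise ValueError(f"unterminated bracket segment in {path!r}")
                if path[i] == "\\" and i + 1 < n:
                    quoted.append(path[i + 1])  # escaped character is taken literally
                    i += 2
                elif path.startswith("']", i):
                    i += 2
                    break
                else:
                    quoted.append(path[i])
                    i += 1
            keys.append("".join(quoted))
        elif path[i] == ".":
            # A dot outside brackets ends the current plain key.
            if buf:
                keys.append("".join(buf))
                buf = []
            i += 1
        else:
            buf.append(path[i])
            i += 1
    if buf:
        keys.append("".join(buf))
    return keys

print(parse_field_path("resource.attributes['bar.foo']"))
print(parse_field_path(r"my['weird[\'fieldname\']']"))
```

The resulting key list can then be walked one key at a time, so a dotted key such as "bar.foo" is looked up as a single map entry.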

Open questions

How does this syntax interact with mustache templates, which are supported in some cases? For the scope of the observability team, it would be OK not to support them initially; support could be added later.

Breaking change

This feature changes existing behavior: a field name in an ingest pipeline that contains [' followed by '] is currently allowed, and those characters are treated as regular characters. However, such cases are expected to be very rare.
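As an illustration of the behavior change, the following Python sketch compares how a field name containing [' and '] would be split today versus under the proposed syntax. Both functions are hypothetical simplifications (the tokenizer ignores escapes), and "labels['env']" is a made-up field name.

```python
import re

def split_today(path):
    # Current behavior as described above: only dots separate keys.
    return path.split(".")

# Simplified tokenizer for the proposed syntax (no escape handling).
_TOKEN = re.compile(r"\['(.*?)'\]|([^.\[]+)")

def split_proposed(path):
    return [quoted or plain for quoted, plain in _TOKEN.findall(path)]

path = "labels['env']"
print(split_today(path))     # ["labels['env']"]
print(split_proposed(path))  # ['labels', 'env']
```

A pipeline that relies on the literal key "labels['env']" would therefore start addressing {"labels": {"env": ...}} instead.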

Draft for breaking change proposal: https://github.com/elastic/dev/issues/3091

Why not dot_expander?

The dot_expander processor addresses a similar need by normalizing the data instead of letting the user state which interpretation is meant. However, it has downsides that are unacceptable in some cases:

A prefix of a dotted field name cannot be a primitive value (a shape that is common in OTel data):

{
  "host": "abc",
  "host.name": "def"  // can't be dot-expanded without breaking host
}

Possible collisions:

{
  "host": { "name": "abc" },
  "host.name": "def"
}

Different from OTTL, which allows this style of access
Changes the shape of the data, which loses information: after expansion it is impossible to tell dotted field names from nested field names
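The downsides above can be sketched with a naive expansion routine in Python. This is a simplified model for illustration only and does not reproduce the real dot_expander processor's options or error handling; `expand_dots` is a hypothetical helper that treats any already-present leaf as a collision.

```python
def expand_dots(doc):
    """Naively turn every dotted key into nested objects (hypothetical helper)."""
    out = {}
    for key, value in doc.items():
        parts = key.split(".")
        target = out
        for part in parts[:-1]:
            child = target.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"cannot expand {key!r}: {part!r} already holds a primitive")
            target = child
        leaf = parts[-1]
        if isinstance(value, dict):
            value = expand_dots(value)
        if leaf in target:
            raise ValueError(f"collision on {key!r}")
        target[leaf] = value
    return out

# Information loss: two different input shapes become indistinguishable.
print(expand_dots({"a.b": 1}) == expand_dots({"a": {"b": 1}}))  # True

# Primitive prefix and collision cases from the examples above.
for doc in ({"host": "abc", "host.name": "def"},
            {"host": {"name": "abc"}, "host.name": "def"}):
    try:
        expand_dots(doc)
    except ValueError as err:
        print("failed:", err)
```

Explicit bracket access avoids all three problems because the document is never reshaped.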

References

POC: #125566
Discussion: https://github.com/elastic/streams-program/discussions/224


elasticsearchmachine · 2025-03-28T13:27:52Z

Pinging @elastic/es-data-management (Team:Data Management)

flash1293 added the labels :Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP), >enhancement, and needs:triage (Requires assignment of a team area label) on Mar 28, 2025

elasticsearchmachine added the label Team:Data Management (Meta label for data/management team) and removed the label needs:triage (Requires assignment of a team area label) on Mar 28, 2025

flash1293 mentioned this issue Mar 28, 2025

Ingest pipeline processors: Syntax for fields API-style access of fields with dots #125847

Open

felixbarny linked a pull request Mar 28, 2025 that will close this issue

Bracket syntax for accessing dotted field names in ingest processors #125566

Draft
