Skip to content

Conversation

@frisbeeman
Copy link
Member

@frisbeeman frisbeeman commented Jan 5, 2026

Extends test module with various unwrap range aggregation cases, such as handling null values, offsets, quantile parameters, and error tracking. Implements the core unwrap planning in the planner, including date grid generation, grouping, and aggregation enhancements. Updates UDF and UDAF registration to support unwrap functions like parse_numeric, parse_bytes, and parse_duration.

Summary by CodeRabbit

  • New Features

    • Full implementation of unwrap-based range aggregations (including rate counter with counter-reset detection) and new parser/conversion UDFs for numeric, bytes, duration, and map-insert.
  • Tests

    • Extensive unit and integration tests covering unwrap aggregations, conversions, bucketing, error labeling, counter-reset scenarios, and new UDFs (including duplicated test blocks).
  • Documentation

    • README and technical spec updated to reflect implemented unwrap behavior and conversion/counter-reset details.
  • Chores

    • Expanded audit ignore entries.

✏️ Tip: You can customize this high-level summary in your review settings.

… logic

Extends test module with various unwrap range aggregation cases, such as handling null values, offsets, quantile parameters, and error tracking. Implements the core unwrap planning in the planner, including date grid generation, grouping, and aggregation enhancements. Updates UDF and UDAF registration to support unwrap functions like `parse_numeric`, `parse_bytes`, and `parse_duration`.
@frisbeeman frisbeeman requested review from a team and s-prosvirnin January 5, 2026 19:52
@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

Walkthrough

Implements LogQL unwrap-based range aggregations in the DataFusion planner, registers and implements four new scalar UDFs (parse_numeric, parse_bytes, parse_duration, map_insert), adds extensive planner and integration tests, updates UDF registry and docs, and expands .cargo/audit.toml ignore entries.

Changes

Cohort / File(s) Summary
Audit Configuration
\.cargo/audit\.toml
Expanded ignore list to an array and added RUSTSEC-2026-0001, preserving RUSTSEC-2024-0436.
Planner — unwrap range aggregation
crates/icegate-query/src/logql/datafusion/planner\.rs
Implemented unwrap-range-aggregation path with new helpers (adjust_time_range_for_lookback, build_date_grid_args, build_label_grouping_exprs, preserve_attributes_column, extract_unwrapped_value), integrated unwrap into existing range-aggregation flows, added counter-reset handling for rate_counter, and extended DataFusion imports.
UDF implementations & tests
crates/icegate-query/src/logql/datafusion/udf\.rs
Added public UDFs ParseNumeric, ParseBytes, ParseDuration, MapInsert with implementations and extensive unit tests (parsing, conversions, nulls, map insert/update). Added DataFusionError import.
UDF registry
crates/icegate-query/src/logql/datafusion/registry\.rs
Registered new scalar UDFs (map_insert, parse_numeric, parse_bytes, parse_duration) and updated registry documentation/comments.
Planner tests
crates/icegate-query/src/logql/datafusion/planner_tests\.rs
Added a large suite of Unwrap Range Aggregation tests covering many ops (sum/avg/min/max/first/last/quantile/stddev/variance/rate/bytes), offsets, conversions, error propagation, and date_grid behavior; a large test block is duplicated.
Docs / Spec / README
crates/icegate-query/src/logql/README\.md, crates/icegate-query/src/logql/TECHNICAL_SPEC\.md
Marked unwrap-based aggregations as implemented, added conversion and counter-reset details, updated UDF registration notes and recommendations.
Loki integration tests
crates/icegate-query/tests/loki/mod\.rs, crates/icegate-query/tests/loki/unwrap\.rs
Added unwrap test module and extensive integration tests that write test logs to Iceberg/Parquet, spin up a test Loki server, and query unwrap aggregations.
ANTLR parsing & tests
crates/icegate-query/src/logql/antlr/visitors\.rs, crates/icegate-query/src/logql/antlr/tests\.rs
Parse optional numeric parameter for quantile_over_time and added a parameter-parsing test.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Parser as LogQL Parser
    participant Planner as DataFusion Planner
    participant Inner as Inner Log Query Planner
    participant UDFs as parse_/map_ UDFs
    participant DateGrid as date_grid UDF
    participant Agg as Aggregation Engine

    Parser->>Planner: RangeExpr with unwrap clause
    Planner->>Planner: adjust_time_range_for_lookback()
    Planner->>Inner: plan inner log query (adjusted range)
    Inner-->>Planner: inner logical plan (rows + attributes)
    Planner->>UDFs: extract_unwrapped_value() → parse_numeric/parse_bytes/parse_duration
    UDFs-->>Planner: numeric values (NULL on failure)
    Planner->>Planner: coalesce NULLs, record unwrap errors
    Planner->>DateGrid: build_date_grid_args() & call date_grid UDF
    DateGrid-->>Planner: expanded bucket timestamps
    Planner->>Agg: build_label_grouping_exprs(), aggregate per bucket, preserve attributes
    Agg-->>Planner: bucketed aggregates (with __error__ where needed)
    Planner-->>Parser: final aggregated time-series result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • s-prosvirnin

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly summarizes the main changes: implementing unwrap range aggregation planning logic and adding comprehensive test coverage for unwrap-based aggregations.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Fix all issues with AI Agents 🤖
In @crates/icegate-query/src/logql/datafusion/planner.rs:
- Around line 561-566: In RangeAggregationOp::QuantileOverTime, replace the sort
expression passed to approx_percentile_cont — don't call .sort(true, true) on
col("unwrapped_value") because that yields a SortExpr; pass the plain column
expression (col("unwrapped_value")) to approx_percentile_cont instead so the
function receives a regular column Expr as its first argument.

In @crates/icegate-query/src/logql/datafusion/udf.rs:
- Around line 2195-2215: The scalar branch currently extracts the string by
calling scalar.to_string() and manually stripping "Utf8(...)" which is fragile;
instead match the scalar value directly against
datafusion::scalar::ScalarValue::Utf8(opt_s, _) (inside the
ColumnarValue::Scalar arm) to get the Option<String>/null handling, then call
Self::parse_duration_string on the contained &str if Some, convert nanos to
seconds when as_secs is true, and finally map the Option<f64> to a
datafusion::scalar::ScalarValue::Float64(Some/None) as the ColumnarValue::Scalar
result; ensure you preserve null semantics and use the existing
parse_duration_string and as_secs logic.
- Around line 1896-1914: The current scalar branch of the ColumnarValue handling
is fragile because it parses scalar.to_string() for "Utf8(...)"; replace that
with direct pattern matching on the ScalarValue enum: when matching
ColumnarValue::Scalar(scalar) downcast or match scalar to
datafusion::scalar::ScalarValue::Utf8(Some(s)) / Utf8(None) (and handle other
ScalarValue variants as non-parsable), attempt s.parse::<f64>() for Some(s) and
map to datafusion::scalar::ScalarValue::Float64(Some(v)) or Float64(None) on
parse failure or None input; ensure the surrounding code still returns
Ok(ColumnarValue::Scalar(...)) like before.
- Around line 2017-2034: The scalar branch in the ColumnarValue::Scalar arm uses
fragile to_string() parsing; replace that with direct pattern matching on
datafusion::scalar::ScalarValue::Utf8 (e.g., match scalar {
ScalarValue::Utf8(Some(s)) => ... , _ => None }) to extract the string, call
Self::parse_byte_string on it, and then return Ok(ColumnarValue::Scalar(...))
mapping the Option<f64> into ScalarValue::Float64(Some(v)) or Float64(None) as
currently done; update references to ScalarValue in this match instead of using
strip_prefix/strip_suffix string manipulation.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b957670 and 104b0cf.

📒 Files selected for processing (5)
  • .cargo/audit.toml
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/src/logql/datafusion/registry.rs
  • crates/icegate-query/src/logql/datafusion/udf.rs
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{rs,toml}

📄 CodeRabbit inference engine (AGENTS.md)

Use cargo build for debug builds, cargo build --release for release builds, and specific binary builds with cargo build --bin <name>

Files:

  • crates/icegate-query/src/logql/datafusion/registry.rs
  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/src/logql/datafusion/udf.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Run all tests with cargo test, specific tests with cargo test test_name, and use --nocapture flag to show test output
Use make fmt to check code format; DO NOT run via rustup because it doesn't respect rustfmt.toml
Use make clippy to run the linter with warnings as errors
Run make audit to perform security audits and use make install to install cargo-audit
Run make ci to execute all CI checks (check, fmt, clippy, test, audit)
Use rustfmt for code formatting with configuration in rustfmt.toml

Files:

  • crates/icegate-query/src/logql/datafusion/registry.rs
  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/src/logql/datafusion/udf.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
crates/icegate-query/src/logql/**

📄 CodeRabbit inference engine (AGENTS.md)

Regenerate LogQL ANTLR4 parser using make install and make gen in crates/icegate-query/src/logql/ directory (requires Java)

Files:

  • crates/icegate-query/src/logql/datafusion/registry.rs
  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/src/logql/datafusion/udf.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
**/*.{rs,toml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Ensure each file ends with a newline; do not duplicate if it already exists

Files:

  • crates/icegate-query/src/logql/datafusion/registry.rs
  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/src/logql/datafusion/udf.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
🧠 Learnings (2)
📚 Learning: 2025-12-27T13:38:58.955Z
Learnt from: CR
Repo: icegatetech/icegate PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-27T13:38:58.955Z
Learning: Applies to **/*.rs : Run `make audit` to perform security audits and use `make install` to install cargo-audit

Applied to files:

  • .cargo/audit.toml
📚 Learning: 2025-12-27T13:38:58.955Z
Learnt from: CR
Repo: icegatetech/icegate PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-27T13:38:58.955Z
Learning: Applies to Cargo.toml : Configure strict clippy and rustc lints: forbid unsafe_code, deny missing_docs and dead_code, and enable clippy pedantic/nursery

Applied to files:

  • .cargo/audit.toml
🧬 Code graph analysis (3)
crates/icegate-query/src/logql/datafusion/registry.rs (3)
crates/icegate-query/src/logql/datafusion/planner.rs (1)
  • new (60-62)
crates/icegate-query/src/logql/datafusion/udf.rs (7)
  • new (47-51)
  • new (107-111)
  • new (364-368)
  • new (1864-1868)
  • new (1962-1966)
  • new (2084-2088)
  • new (2267-2271)
crates/icegate-query/src/logql/datafusion/udaf.rs (8)
  • new (83-115)
  • new (471-475)
  • new (566-570)
  • new (664-668)
  • new (764-768)
  • new (864-868)
  • new (979-983)
  • new (1030-1034)
crates/icegate-query/src/logql/datafusion/planner_tests.rs (3)
crates/icegate-query/src/logql/datafusion/planner.rs (2)
  • new (60-62)
  • plan (68-88)
crates/icegate-query/src/logql/datafusion/registry.rs (1)
  • new (38-40)
crates/icegate-query/src/logql/datafusion/udaf.rs (9)
  • new (83-115)
  • new (471-475)
  • new (566-570)
  • new (664-668)
  • new (764-768)
  • new (864-868)
  • new (979-983)
  • expr (353-353)
  • expr (370-370)
crates/icegate-query/src/logql/datafusion/planner.rs (2)
crates/icegate-query/src/logql/datafusion/udf.rs (7)
  • new (47-51)
  • new (107-111)
  • new (364-368)
  • new (1864-1868)
  • new (1962-1966)
  • new (2084-2088)
  • new (2267-2271)
crates/icegate-query/src/logql/datafusion/udaf.rs (8)
  • new (83-115)
  • new (471-475)
  • new (566-570)
  • new (664-668)
  • new (764-768)
  • new (864-868)
  • new (979-983)
  • new (1030-1034)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Security Audit
  • GitHub Check: Test beta on ubuntu-latest
  • GitHub Check: Benchmark
  • GitHub Check: Test stable on ubuntu-latest
🔇 Additional comments (11)
crates/icegate-query/src/logql/datafusion/registry.rs (1)

13-13: LGTM! Registry correctly updated to include new UDFs.

The new UDFs (MapInsert, ParseNumeric, ParseBytes, ParseDuration) are properly imported and registered following the established pattern. Documentation accurately describes each function's purpose.

Also applies to: 21-25, 47-51, 64-68

crates/icegate-query/src/logql/datafusion/udf.rs (2)

1541-1836: Excellent test coverage for parsing UDFs.

Comprehensive tests covering:

  • Valid inputs (integers, floats, scientific notation, byte units, duration formats)
  • Invalid inputs returning NULL
  • NULL input propagation
  • Whitespace trimming
  • Edge cases (case insensitivity, alternative formats, compound durations)

2238-2434: LGTM! MapInsert implementation is well-structured.

  • Correctly uses pattern matching for scalar key/value extraction (unlike the parsing UDFs)
  • Properly handles existing key updates vs new key insertions
  • Preserves map metadata and null handling
  • Clear error messages for invalid inputs
crates/icegate-query/src/logql/datafusion/planner_tests.rs (1)

768-1185: Excellent test coverage for unwrap range aggregation.

Comprehensive tests covering:

  • All unwrap aggregation operations (sum, avg, min, max, first, last, quantile, stddev, stdvar, rate_counter)
  • Conversion types (numeric, bytes, duration, duration_seconds)
  • Error handling (missing quantile parameter, conversion error tracking)
  • NULL handling with coalesce
  • Offset modifier interaction
  • Plan structure verification via string inspection

The tests effectively validate the planner produces expected UDF calls (parse_numeric, parse_bytes, parse_duration, date_grid, map_insert) and aggregation functions.

crates/icegate-query/src/logql/datafusion/planner.rs (6)

395-485: LGTM! Well-factored helper functions.

  • adjust_time_range_for_lookback: Correctly calculates lookback window
  • build_date_grid_args: Proper error handling for duration overflow
  • build_label_grouping_exprs: Consistent delimiter usage with plan_series
  • preserve_attributes_column: Clean abstraction

These helpers effectively reduce duplication across range aggregation methods.


636-704: LGTM! Clean refactoring using new helper functions.

The plan_log_range_absent_aggregation now uses the shared helpers:

  • adjust_time_range_for_lookback
  • build_date_grid_args
  • build_label_grouping_exprs
  • preserve_attributes_column

This improves consistency and reduces duplication.


714-800: LGTM! Consistent refactoring with helper functions.

The plan_log_range_aggregation now uses the shared helpers, matching the pattern in other range aggregation methods.


1028-1068: LGTM! Clear and correct unwrap value extraction.

The function properly:

  • Distinguishes between top-level indexed columns and attributes MAP
  • Applies the appropriate conversion UDF based on the UnwrapConversion type
  • Returns NULL on conversion failure (enabling error tracking)

384-393: LGTM! Range aggregation routing is correctly updated.

The plan_range_aggregation method now properly delegates to the new plan_unwrap_range_aggregation implementation when an unwrap expression is present.


13-13: No action needed. The first_value import on line 13 from functions_aggregate::expr_fn does not conflict with the scoped import on line 271 from functions_aggregate::first_last. The scoped import within plan_series correctly shadows the module-level import only within that function's scope, which is standard Rust behavior and intentional here to use different implementations of first_value in different contexts.

.cargo/audit.toml (1)

2-5: Document the risk assessment and justification for ignoring RUSTSEC-2026-0001.

The advisory addresses undefined behavior in rkyv 0.7.45's Arc/Rc implementations during OOM conditions. While the codebase does not directly call from_value, rkyv is used transitively via rust_decimal v1.39.0.

To properly justify ignoring this advisory, please add a comment explaining:

  • Whether rust_decimal's usage of rkyv can trigger the vulnerable code path under OOM
  • Why upgrading rust_decimal (if a patched version exists) is not preferred
  • Confirmation that OOM scenarios are acceptable risk in your deployment environment

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
crates/icegate-query/src/logql/datafusion/udf.rs (4)

1895-1906: Consider handling LargeUtf8 variant for completeness.

The scalar path now correctly uses direct pattern matching instead of fragile to_string() parsing (good fix!). However, it only handles ScalarValue::Utf8 and not ScalarValue::LargeUtf8. Consider adding LargeUtf8 support for better compatibility with different string representations in DataFusion.

🔎 Proposed enhancement
             ColumnarValue::Scalar(scalar) => {
                 let result = match scalar {
                     datafusion::scalar::ScalarValue::Utf8(Some(s)) => s.trim().parse::<f64>().ok(),
+                    datafusion::scalar::ScalarValue::LargeUtf8(Some(s)) => s.trim().parse::<f64>().ok(),
                     _ => None,
                 };

                 Ok(ColumnarValue::Scalar(result.map_or_else(
                     || datafusion::scalar::ScalarValue::Float64(None),
                     |v| datafusion::scalar::ScalarValue::Float64(Some(v)),
                 )))
             }

2008-2019: Consider handling LargeUtf8 variant for completeness.

Same improvement opportunity as ParseNumeric: the scalar path should handle both Utf8 and LargeUtf8 variants.

🔎 Proposed enhancement
             ColumnarValue::Scalar(scalar) => {
                 let result = match scalar {
                     datafusion::scalar::ScalarValue::Utf8(Some(s)) => Self::parse_byte_string(s.trim()),
+                    datafusion::scalar::ScalarValue::LargeUtf8(Some(s)) => Self::parse_byte_string(s.trim()),
                     _ => None,
                 };

                 Ok(ColumnarValue::Scalar(result.map_or_else(
                     || datafusion::scalar::ScalarValue::Float64(None),
                     |v| datafusion::scalar::ScalarValue::Float64(Some(v)),
                 )))
             }

2179-2193: Consider handling LargeUtf8 variant for completeness.

Same improvement opportunity as the other parse UDFs: handle both Utf8 and LargeUtf8 variants.

🔎 Proposed enhancement
             ColumnarValue::Scalar(scalar) => {
                 let result = match scalar {
                     datafusion::scalar::ScalarValue::Utf8(Some(s)) => Self::parse_duration_string(s.trim()).map(
                         |nanos| {
                             if as_secs { nanos / 1_000_000_000.0 } else { nanos }
                         },
                     ),
+                    datafusion::scalar::ScalarValue::LargeUtf8(Some(s)) => Self::parse_duration_string(s.trim()).map(
+                        |nanos| {
+                            if as_secs { nanos / 1_000_000_000.0 } else { nanos }
+                        },
+                    ),
                     _ => None,
                 };

                 Ok(ColumnarValue::Scalar(result.map_or_else(
                     || datafusion::scalar::ScalarValue::Float64(None),
                     |v| datafusion::scalar::ScalarValue::Float64(Some(v)),
                 )))
             }

2282-2296: Consider handling LargeUtf8 variant for completeness.

The key and value extraction only handles Utf8 scalars. Consider adding LargeUtf8 support for consistency with DataFusion's string handling.

🔎 Proposed enhancement
         // Extract scalar key and value
         let insert_key = match key_arg {
             ColumnarValue::Scalar(datafusion::scalar::ScalarValue::Utf8(Some(s))) => s.clone(),
+            ColumnarValue::Scalar(datafusion::scalar::ScalarValue::LargeUtf8(Some(s))) => s.clone(),
             ColumnarValue::Scalar(datafusion::scalar::ScalarValue::Utf8(None)) => {
                 return exec_err!("map_insert key cannot be NULL");
             }
+            ColumnarValue::Scalar(datafusion::scalar::ScalarValue::LargeUtf8(None)) => {
+                return exec_err!("map_insert key cannot be NULL");
+            }
             _ => return exec_err!("map_insert key must be a string scalar"),
         };

         let insert_value = match value_arg {
             ColumnarValue::Scalar(datafusion::scalar::ScalarValue::Utf8(Some(s))) => s.clone(),
+            ColumnarValue::Scalar(datafusion::scalar::ScalarValue::LargeUtf8(Some(s))) => s.clone(),
             ColumnarValue::Scalar(datafusion::scalar::ScalarValue::Utf8(None)) => {
                 return exec_err!("map_insert value cannot be NULL");
             }
+            ColumnarValue::Scalar(datafusion::scalar::ScalarValue::LargeUtf8(None)) => {
+                return exec_err!("map_insert value cannot be NULL");
+            }
             _ => return exec_err!("map_insert value must be a string scalar"),
         };
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 104b0cf and 12bfb70.

📒 Files selected for processing (1)
  • crates/icegate-query/src/logql/datafusion/udf.rs
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{rs,toml}

📄 CodeRabbit inference engine (AGENTS.md)

Use cargo build for debug builds, cargo build --release for release builds, and specific binary builds with cargo build --bin <name>

Files:

  • crates/icegate-query/src/logql/datafusion/udf.rs
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Run all tests with cargo test, specific tests with cargo test test_name, and use --nocapture flag to show test output
Use make fmt to check code format; DO NOT run via rustup because it doesn't respect rustfmt.toml
Use make clippy to run the linter with warnings as errors
Run make audit to perform security audits and use make install to install cargo-audit
Run make ci to execute all CI checks (check, fmt, clippy, test, audit)
Use rustfmt for code formatting with configuration in rustfmt.toml

Files:

  • crates/icegate-query/src/logql/datafusion/udf.rs
crates/icegate-query/src/logql/**

📄 CodeRabbit inference engine (AGENTS.md)

Regenerate LogQL ANTLR4 parser using make install and make gen in crates/icegate-query/src/logql/ directory (requires Java)

Files:

  • crates/icegate-query/src/logql/datafusion/udf.rs
**/*.{rs,toml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Ensure each file ends with a newline; do not duplicate if it already exists

Files:

  • crates/icegate-query/src/logql/datafusion/udf.rs
🧬 Code graph analysis (1)
crates/icegate-query/src/logql/datafusion/udf.rs (1)
crates/icegate-query/src/logql/datafusion/udaf.rs (17)
  • values (525-527)
  • values (620-622)
  • values (719-721)
  • values (723-725)
  • values (818-820)
  • values (822-824)
  • values (918-920)
  • values (1114-1116)
  • values (1401-1401)
  • values (1429-1429)
  • values (1467-1467)
  • as_any (479-481)
  • as_any (574-576)
  • as_any (672-674)
  • as_any (772-774)
  • as_any (872-874)
  • as_any (987-989)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Benchmark
  • GitHub Check: Test beta on ubuntu-latest
  • GitHub Check: Security Audit
  • GitHub Check: Test stable on ubuntu-latest
🔇 Additional comments (1)
crates/icegate-query/src/logql/datafusion/udf.rs (1)

1540-1836: Excellent test coverage!

The test coverage for the new parsing UDFs is comprehensive and well-structured. The tests cover:

  • Valid inputs (integers, floats, scientific notation, various units)
  • Invalid inputs returning NULL
  • NULL input handling
  • Whitespace handling
  • Edge cases (zero values, fractional values, compound durations)

The consistent use of helper functions (execute_parse_numeric, execute_parse_bytes, execute_parse_duration) makes the tests maintainable and easy to follow.

…LAG`

Introduces tests for `rate_counter` scenarios, including single reset, multiple resets, no resets, and label grouping. Updates the planner to support Prometheus-compatible counter reset detection using the `LAG` window function.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
crates/icegate-query/src/logql/TECHNICAL_SPEC.md (1)

502-503: Verify the source line reference accuracy.

The source reference points to planner.rs:487-622, but line numbers can drift as the codebase evolves. Consider using a more stable reference like function names or anchoring to specific function signatures.

crates/icegate-query/src/logql/datafusion/planner.rs (1)

1125-1137: Consider reusing UDF instances.

New UDF instances are created on each call to extract_unwrapped_value. For hot paths with many unwrap operations, consider caching these UDFs or passing them as parameters.

🔎 Alternative: Pass UDFs as parameters or use lazy_static
// Option 1: Pass UDFs from caller
fn extract_unwrapped_value(
    label: &str,
    conversion: Option<UnwrapConversion>,
    _schema: &DFSchema,
    parse_numeric_udf: &ScalarUDF,
    parse_bytes_udf: &ScalarUDF,
    parse_duration_udf: &ScalarUDF,
) -> Expr { ... }

// Option 2: Use lazy_static or once_cell (if UDFs are expensive to create)
crates/icegate-query/tests/loki/unwrap.rs (2)

238-241: Consider using dynamic port allocation for test robustness.

Hardcoded ports (3220-3231) could cause conflicts if:

  • Another process uses these ports
  • Tests are run multiple times in parallel (e.g., in CI with multiple jobs)

Consider using port 0 to let the OS allocate an available port.


454-456: Good practice documenting the ignore reason.

The ignore annotation clearly explains the parser limitation. Consider creating a tracking issue for this parser enhancement.

Would you like me to help draft an issue to track the parser enhancement for extracting quantile parameters?

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12bfb70 and 24a5a53.

📒 Files selected for processing (6)
  • crates/icegate-query/src/logql/README.md
  • crates/icegate-query/src/logql/TECHNICAL_SPEC.md
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/tests/loki/mod.rs
  • crates/icegate-query/tests/loki/unwrap.rs
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{rs,toml}

📄 CodeRabbit inference engine (AGENTS.md)

Use cargo build for debug builds, cargo build --release for release builds, and specific binary builds with cargo build --bin <name>

Files:

  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/tests/loki/unwrap.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/tests/loki/mod.rs
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Run all tests with cargo test, specific tests with cargo test test_name, and use --nocapture flag to show test output
Use make fmt to check code format; DO NOT run via rustup because it doesn't respect rustfmt.toml
Use make clippy to run the linter with warnings as errors
Run make audit to perform security audits and use make install to install cargo-audit
Run make ci to execute all CI checks (check, fmt, clippy, test, audit)
Use rustfmt for code formatting with configuration in rustfmt.toml

Files:

  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/tests/loki/unwrap.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/tests/loki/mod.rs
crates/icegate-query/src/logql/**

📄 CodeRabbit inference engine (AGENTS.md)

Regenerate LogQL ANTLR4 parser using make install and make gen in crates/icegate-query/src/logql/ directory (requires Java)

Files:

  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/TECHNICAL_SPEC.md
  • crates/icegate-query/src/logql/README.md
**/*.{rs,toml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Ensure each file ends with a newline; do not duplicate if it already exists

Files:

  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
  • crates/icegate-query/tests/loki/unwrap.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/TECHNICAL_SPEC.md
  • crates/icegate-query/src/logql/README.md
  • crates/icegate-query/tests/loki/mod.rs
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

Never proactively create documentation files (*.md) or README files unless explicitly requested

Files:

  • crates/icegate-query/src/logql/TECHNICAL_SPEC.md
  • crates/icegate-query/src/logql/README.md
🧬 Code graph analysis (1)
crates/icegate-query/src/logql/datafusion/planner_tests.rs (6)
crates/icegate-query/src/logql/expr.rs (2)
  • log (26-28)
  • metric (32-34)
crates/icegate-query/src/logql/datafusion/planner.rs (2)
  • new (61-63)
  • plan (69-89)
crates/icegate-query/src/logql/datafusion/udf.rs (7)
  • new (47-51)
  • new (107-111)
  • new (364-368)
  • new (1864-1868)
  • new (1954-1958)
  • new (2069-2073)
  • new (2245-2249)
crates/icegate-query/src/logql/planner.rs (1)
  • plan (98-98)
crates/icegate-query/src/logql/log.rs (1)
  • contains (134-139)
crates/icegate-query/src/logql/antlr/visitors.rs (1)
  • err (52-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test beta on ubuntu-latest
  • GitHub Check: Security Audit
  • GitHub Check: Test stable on ubuntu-latest
  • GitHub Check: Benchmark
🔇 Additional comments (12)
crates/icegate-query/src/logql/datafusion/planner.rs (5)

396-410: LGTM!

The time range adjustment helper cleanly encapsulates the lookback window calculation logic and handles the optional offset correctly.


556-595: Counter reset detection implementation looks correct.

The LAG-based counter reset detection follows Prometheus semantics properly:

  • First sample gets delta=0 (correct - no prior value to compare)
  • Reset detection when current < previous
  • Normal increase calculates delta = current - previous

The partition expressions correctly include grid_timestamp and all label columns for per-series tracking.


597-610: LGTM!

The conditional grouping expression logic correctly handles the difference between rate_counter (which already created serialized attribute columns in step 8.5) and other operations (which need them created as part of grouping).


621-629: LGTM!

The first_value and last_value implementations correctly include timestamp ordering to ensure temporal correctness of the aggregation results.


655-664: LGTM!

The error tracking implementation correctly:

  1. Uses bool_or to aggregate error flags per group
  2. Conditionally adds __error__ label only for groups with conversion errors
  3. Uses the MapInsert UDF to add the label to the attributes MAP
crates/icegate-query/tests/loki/mod.rs (1)

7-18: LGTM!

Module organization correctly adds the new unwrap test category with appropriate documentation.

crates/icegate-query/src/logql/README.md (1)

33-36: LGTM!

The implementation status table accurately reflects the newly implemented unwrap aggregations, including the note about rate_counter supporting Prometheus-compatible counter reset detection.

crates/icegate-query/src/logql/datafusion/planner_tests.rs (3)

1032-1061: LGTM!

The rate_counter tests comprehensively cover various counter scenarios:

  • Single reset detection
  • Multiple resets
  • No reset (monotonic increase)
  • Single value edge case
  • Label-based grouping/partitioning

The assertions correctly verify the presence of LAG window function and delta calculation in the query plan.


1249-1283: LGTM!

The error handling test correctly verifies the plan structure includes:

  • NULL checks for conversion errors
  • bool_or for aggregating error flags
  • map_insert for adding __error__ label

942-968: LGTM!

The test correctly verifies that quantile_over_time without a parameter produces an appropriate error message about the missing parameter.

crates/icegate-query/tests/loki/unwrap.rs (2)

106-113: LGTM!

The schema extraction correctly handles the nested Map type structure to build test attributes.


269-280: LGTM!

The assertion pattern correctly verifies:

  1. Response structure (status, resultType)
  2. Presence of metric labels
  3. At least one non-null value in the time series

This approach is appropriate for integration tests where exact values can vary based on time bucketing.

Introduces tests for `quantile_over_time` including boundary cases and invalid parameter handling. Updates the planner to enforce parameter range checks (0.0-1.0) and improve error messaging. Enhances parser to support extracting numeric parameters for range aggregations.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
crates/icegate-query/src/logql/datafusion/planner.rs (1)

674-698: Consider validating range duration for rate calculations.

For rate_counter (line 687), the code divides by range_secs computed from nanoseconds. While extremely small ranges are unlikely in practice, consider adding a validation check to ensure the range duration is reasonable (e.g., > 0) to prevent division by very small numbers.

This is a minor edge case and may be acceptable as-is, since the range comes from user input that would typically be validated earlier.

🔎 Optional validation example
             RangeAggregationOp::RateCounter => {
                 let range_nanos = agg
                     .range_expr
                     .range
                     .num_nanoseconds()
                     .ok_or(QueryError::Config("Range duration too large".to_string()))?;
+                if range_nanos <= 0 {
+                    return Err(QueryError::Config("Range duration must be positive".to_string()));
+                }
                 #[allow(clippy::cast_precision_loss)]
                 let range_secs = range_nanos as f64 / 1_000_000_000.0;
                 (col("_agg_value").cast_to(&DataType::Float64, &schema)? / lit(range_secs)).alias("value")
             }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24a5a53 and fbb2968.

📒 Files selected for processing (4)
  • crates/icegate-query/src/logql/antlr/tests.rs
  • crates/icegate-query/src/logql/antlr/visitors.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/icegate-query/src/logql/datafusion/planner_tests.rs
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{rs,toml}

📄 CodeRabbit inference engine (AGENTS.md)

Use cargo build for debug builds, cargo build --release for release builds, and specific binary builds with cargo build --bin <name>

Files:

  • crates/icegate-query/src/logql/antlr/tests.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/antlr/visitors.rs
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Run all tests with cargo test, specific tests with cargo test test_name, and use --nocapture flag to show test output
Use make fmt to check code format; DO NOT run via rustup because it doesn't respect rustfmt.toml
Use make clippy to run the linter with warnings as errors
Run make audit to perform security audits and use make install to install cargo-audit
Run make ci to execute all CI checks (check, fmt, clippy, test, audit)
Use rustfmt for code formatting with configuration in rustfmt.toml

Files:

  • crates/icegate-query/src/logql/antlr/tests.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/antlr/visitors.rs
crates/icegate-query/src/logql/**

📄 CodeRabbit inference engine (AGENTS.md)

Regenerate LogQL ANTLR4 parser using make install and make gen in crates/icegate-query/src/logql/ directory (requires Java)

Files:

  • crates/icegate-query/src/logql/antlr/tests.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/antlr/visitors.rs
**/*.{rs,toml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Ensure each file ends with a newline; do not duplicate if it already exists

Files:

  • crates/icegate-query/src/logql/antlr/tests.rs
  • crates/icegate-query/src/logql/datafusion/planner.rs
  • crates/icegate-query/src/logql/antlr/visitors.rs
🧠 Learnings (1)
📚 Learning: 2025-12-27T13:38:58.955Z
Learnt from: CR
Repo: icegatetech/icegate PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-27T13:38:58.955Z
Learning: Applies to crates/icegate-query/src/logql/** : Regenerate LogQL ANTLR4 parser using `make install` and `make gen` in crates/icegate-query/src/logql/ directory (requires Java)

Applied to files:

  • crates/icegate-query/src/logql/antlr/tests.rs
🧬 Code graph analysis (1)
crates/icegate-query/src/logql/datafusion/planner.rs (2)
crates/icegate-query/src/logql/datafusion/udaf.rs (7)
  • new (83-115)
  • new (471-475)
  • new (566-570)
  • new (664-668)
  • new (764-768)
  • new (864-868)
  • new (979-983)
crates/icegate-query/src/logql/datafusion/udf.rs (7)
  • new (47-51)
  • new (107-111)
  • new (364-368)
  • new (1864-1868)
  • new (1954-1958)
  • new (2069-2073)
  • new (2245-2249)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test beta on ubuntu-latest
  • GitHub Check: Test stable on ubuntu-latest
  • GitHub Check: Benchmark
  • GitHub Check: Security Audit
🔇 Additional comments (5)
crates/icegate-query/src/logql/antlr/visitors.rs (1)

735-743: LGTM! Clean parameter extraction logic.

The conditional extraction of the NUMBER parameter for quantile_over_time is well-implemented. The fallback to the helper's param value ensures backward compatibility with operations that don't require a parameter.

crates/icegate-query/src/logql/antlr/tests.rs (1)

402-414: LGTM! Good test coverage for quantile parameter parsing.

The test correctly validates that the parser accepts numeric parameters without enforcing range constraints. I verified that the planner (in planner.rs lines 636-640) does validate the 0.0–1.0 range as claimed in the comment on line 410.

crates/icegate-query/src/logql/datafusion/planner.rs (3)

630-643: LGTM! Proper quantile parameter validation.

The parameter range validation (0.0–1.0) is correctly implemented and matches the test expectations in tests.rs. The error messages are clear and helpful.


555-595: LGTM! Prometheus-compatible counter reset detection.

The counter reset logic correctly implements Prometheus semantics:

  • First value: delta = 0 (no previous to compare)
  • Reset detected: delta = current (assumes restart from 0)
  • Normal: delta = current - previous

The partitioning by grid_timestamp + all label columns ensures LAG operates within the same time series.


1105-1145: LGTM! Clean unwrapped value extraction.

The extraction logic properly handles both indexed columns and MAP attributes, with appropriate conversion UDFs. The NULL return on conversion failure integrates well with the error tracking strategy in plan_unwrap_range_aggregation.

@frisbeeman frisbeeman self-assigned this Jan 6, 2026
@frisbeeman frisbeeman merged commit 02a16aa into main Jan 6, 2026
8 checks passed
@frisbeeman frisbeeman linked an issue Jan 6, 2026 that may be closed by this pull request
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LogQL Unwrap Aggregations (sum_over_time, avg_over_time, etc.)

3 participants