
Conversation


@haileyesus2433 haileyesus2433 commented Jan 20, 2026

Prepare


Description

Target issue

closes #13008


Test and Document the changes

  • Static code analysis has been run locally and issues have been fixed
  • Relevant unit and integration tests have been added/updated
  • Relevant documentation has been updated if any (i.e. user guides, installation and configuration guides, technical design docs etc)

Please check the item below before submitting your PR. The PR will not be merged if there are no commits starting with `docs:` to indicate documentation changes, or if the checklist below is not checked.

  • I confirm that there is no impact on the docs due to the code changes in this PR.

Summary by CodeRabbit

Release Notes

  • New Features
    • Data entries now include configurable time-to-live (TTL) with consistency validation to prevent invalid configurations.
    • Metrics tracking capability added with optional toggle for monitoring data access patterns.
    • New method to retrieve complete entry metadata including creation timestamp, expiration time, and access frequency.


… validation

- Added `enable_metrics` field to `DataStoreConfig` to control metrics tracking.
- Implemented `DataEntry` struct to encapsulate data with metadata including access count and timestamps.
- Introduced `ConfigValidationError` for validation errors in configuration.
- Updated `DataStore` to validate configuration on initialization and track access counts for entries when metrics are enabled.
- Added tests for configuration validation, metrics tracking, and Cedar type inference.

This update improves the DataStore by enabling better monitoring and management of data entries.
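The configuration pieces listed above can be sketched roughly as follows. The field and type names (`enable_metrics`, `default_ttl`, `max_ttl`, `ConfigValidationError::DefaultTtlExceedsMax`, `validate()`) are taken from this PR's walkthrough, but the shapes are illustrative assumptions, not the crate's actual code; `std::time::Duration` is used here to keep the sketch self-contained:

```rust
use std::time::Duration;

// Hypothetical sketch of the config described above; the real definitions
// live in jans-cedarling/cedarling/src/data/config.rs.
#[derive(Debug)]
struct DataStoreConfig {
    enable_metrics: bool,          // toggles access-count tracking
    default_ttl: Option<Duration>, // applied when no explicit TTL is given
    max_ttl: Option<Duration>,     // hard cap on any entry's TTL
}

#[derive(Debug)]
enum ConfigValidationError {
    DefaultTtlExceedsMax,
}

impl DataStoreConfig {
    // Reject configurations whose default TTL could never be honored.
    fn validate(&self) -> Result<(), ConfigValidationError> {
        if let (Some(default), Some(max)) = (self.default_ttl, self.max_ttl) {
            if default > max {
                return Err(ConfigValidationError::DefaultTtlExceedsMax);
            }
        }
        Ok(())
    }
}

fn main() {
    let bad = DataStoreConfig {
        enable_metrics: true,
        default_ttl: Some(Duration::from_secs(120)),
        max_ttl: Some(Duration::from_secs(60)),
    };
    // default_ttl > max_ttl is the inconsistent case the PR rejects.
    assert!(bad.validate().is_err());
}
```

With `DataStore::new()` returning `Result`, an invalid configuration like the one above fails at construction time rather than surfacing later as entries that expire unexpectedly.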

Signed-off-by: haileyesus2433 <haileyesusbe@gmail.com>
@haileyesus2433 haileyesus2433 added this to the 1.16.0 milestone Jan 20, 2026
@haileyesus2433 haileyesus2433 self-assigned this Jan 20, 2026

coderabbitai bot commented Jan 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Introduces a DataEntry structure for storing values with metadata (creation time, expiration, access count), adds configuration validation to DataStoreConfig with TTL constraint checking, and refactors DataStore to store DataEntry objects instead of raw values, making new() return Result with validation.

Changes

  • Configuration & Validation (jans-cedarling/cedarling/src/data/config.rs, jans-cedarling/cedarling/src/data/mod.rs): Added an `enable_metrics: bool` field to `DataStoreConfig` (default `true`); introduced a `ConfigValidationError` enum with a `DefaultTtlExceedsMax` variant; added a `validate()` method enforcing `default_ttl` ≤ `max_ttl` when both are present; updated module re-exports to include the new error type.
  • Data Entry Model (jans-cedarling/cedarling/src/data/entry.rs): New module implementing a `CedarType` enum (String, Long, Bool, Set, Record, Entity) with type inference; a `DataEntry` struct with `key`, `value`, `data_type`, `created_at`, `expires_at`, and `access_count`; RFC3339 datetime helpers; methods `new()`, `increment_access()`, and `is_expired()`; comprehensive test coverage for serialization and expiration logic.
  • Data Store Refactoring (jans-cedarling/cedarling/src/data/store.rs): Changed storage from `SparKV<Value>` to `SparKV<DataEntry>`; `new()` now returns `Result<Self, ConfigValidationError>` with upfront validation; added a `get_entry()` method for full metadata access with metrics increment support; refactored `push()` to construct `DataEntry` objects with TTL logic; added a `get_effective_ttl()` helper; updated `get()` and `get_all()` to extract values from entries; size calculation now accounts for `DataEntry`.
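The TTL-priority rule implied by the walkthrough (explicit `ttl` wins over `config.default_ttl`, with an "infinite" ~10-year fallback, all capped by `max_ttl`) can be illustrated with a small sketch. The function name `get_effective_ttl` and the `INFINITE_TTL_SECS` constant mirror the walkthrough, but the exact signature and logic in store.rs may differ:

```rust
use std::time::Duration;

// Illustrative fallback used when neither an explicit TTL nor a default
// TTL is configured; the PR describes this as roughly 10 years.
const INFINITE_TTL_SECS: u64 = 10 * 365 * 24 * 60 * 60;

// Sketch of the TTL resolution described in the walkthrough:
// explicit ttl > default_ttl > infinite fallback, then cap at max_ttl.
fn get_effective_ttl(
    ttl: Option<Duration>,
    default_ttl: Option<Duration>,
    max_ttl: Option<Duration>,
) -> Duration {
    let requested = ttl
        .or(default_ttl)
        .unwrap_or(Duration::from_secs(INFINITE_TTL_SECS));
    match max_ttl {
        Some(max) => requested.min(max),
        None => requested,
    }
}

fn main() {
    let max = Some(Duration::from_secs(60));
    // An explicit TTL takes priority but is still capped at max_ttl.
    assert_eq!(
        get_effective_ttl(Some(Duration::from_secs(120)), None, max),
        Duration::from_secs(60)
    );
    // With nothing configured, the ~10-year fallback applies.
    assert_eq!(
        get_effective_ttl(None, None, None),
        Duration::from_secs(INFINITE_TTL_SECS)
    );
}
```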

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

kind-feature, comp-jans-cedarling

Suggested reviewers

  • olehbozhok
  • dagregi
🚥 Pre-merge checks: ✅ 5 of 5 passed

  • Title check: Passed. The title "feat(data): enhance DataStore with metrics tracking and configuration validation" accurately reflects the main changes: adding metrics tracking and config validation.
  • Description check: Passed. The description follows the template, with target issue #13008 linked, test items checked, and confirmation of no documentation impact.
  • Linked Issues check: Passed. The code implements all core requirements of #13008: a DataEntry struct with metadata tracking, DataStoreConfig with validation, serialization support, TTL logic, and a metrics toggle.
  • Out of Scope Changes check: Passed. All changes are directly aligned with the #13008 requirements (DataEntry, DataStoreConfig, validation, metadata tracking, and metrics); no out-of-scope modifications detected.
  • Docstring Coverage: Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.





mo-auto commented Jan 20, 2026

Snyk checks have passed. No issues have been found so far.

  • Open Source Security: 0 critical, 0 high, 0 medium, 0 low (0 issues total)


@haileyesus2433
Contributor Author

@coderabbitai review


coderabbitai bot commented Jan 20, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
jans-cedarling/cedarling/src/data/store.rs (1)

103-150: Inconsistency between DataEntry.expires_at and actual storage TTL.

The DataEntry is created at line 104 with the original ttl parameter, computing expires_at based on that value. However, the entry is stored at line 150 with chrono_ttl which is the effective TTL (potentially capped by max_ttl).

This causes a mismatch: entry.expires_at reflects the original TTL while the actual storage expiration uses the capped TTL. DataEntry::is_expired() will return incorrect results when TTL capping occurs.

Proposed fix: Create DataEntry with effective TTL
```diff
+        // Determine the effective TTL to use
+        // Priority: explicit ttl > config.default_ttl > infinite (10 years)
+        let requested_ttl = ttl
+            .or(self.config.default_ttl)
+            .unwrap_or(StdDuration::from_secs(INFINITE_TTL_SECS as u64));
+
+        // If an explicit TTL was provided, validate it against max_ttl
+        if ttl.is_some() {
+            if let Some(max_ttl) = self.config.max_ttl {
+                if requested_ttl > max_ttl {
+                    return Err(DataError::TTLExceeded {
+                        requested: requested_ttl,
+                        max: max_ttl,
+                    });
+                }
+            }
+        }
+
+        // Cap the effective TTL at max_ttl if set
+        let effective_ttl = if let Some(max_ttl) = self.config.max_ttl {
+            requested_ttl.min(max_ttl)
+        } else {
+            requested_ttl
+        };
+
         // Create DataEntry with metadata
-        let entry = DataEntry::new(key.to_string(), value, ttl);
+        let entry = DataEntry::new(key.to_string(), value, Some(effective_ttl));
 
         // Check entry size before storing (including metadata)
         let entry_size = serde_json::to_string(&entry)
             .map_err(DataError::from)?
             .len();
 
         // ... size check ...
-
-        // Determine the effective TTL to use
-        // Priority: explicit ttl > config.default_ttl > infinite (10 years)
-        let requested_ttl = ttl
-            .or(self.config.default_ttl)
-            .unwrap_or(StdDuration::from_secs(INFINITE_TTL_SECS as u64));
-        // ... rest of TTL logic moved above ...
```
🤖 Fix all issues with AI agents
In `@jans-cedarling/cedarling/src/data/config.rs`:
- Around line 117-125: The test test_default_ttl_exceeds_max currently uses
assert!(config.validate().is_err()) without an explicit message; update it to
use the matches! pattern per guidelines by asserting that config.validate()
matches Err(_) and include a clear failure message (e.g., "expected error when
default_ttl exceeds max_ttl") so the assertion reads as
assert!(matches!(config.validate(), Err(_)), "<message>"); reference
DataStoreConfig and its validate() method when making the change.

In `@jans-cedarling/cedarling/src/data/entry.rs`:
- Around line 143-147: The new() constructor in entry.rs currently uses
chrono::Duration::from_std(duration).unwrap_or_default(), which silently
converts oversized StdDuration to zero and causes immediate expiration; update
new(key: String, value: Value, ttl: Option<StdDuration>) to perform a
safe/saturating conversion instead (e.g., reuse the
std_duration_to_chrono_duration logic from store.rs or implement a saturating
conversion that caps to chrono::Duration::max_value()), and when conversion
fails either log the oversized TTL or return an Err/result variant as
appropriate for your API; ensure you reference and call the safe converter where
expires_at is computed so expires_at is never set to created_at due to a silent
unwrap_or_default.
- Around line 241-251: Update the test_serialization unit test for DataEntry to
also assert that created_at and expires_at survive a serde_json round-trip:
after serializing with serde_json::to_string and deserializing with
serde_json::from_str, compare the original entry.created_at and entry.expires_at
against deserialized.created_at and deserialized.expires_at (use direct equality
or a stable representation like to_rfc3339()/timestamp to avoid
timezone/precision issues) so the custom serde handling for datetimes is
explicitly verified.
- Around line 100-106: The current heuristic in entry.rs that classifies a
Value::Object as Self::Entity when it contains exactly "type" and "id"
(otherwise Self::Record) is brittle; to avoid misclassification of normal
records that coincidentally have those two fields, update the detection in the
code that sets DataEntry.data_type to require an explicit marker or validate the
"type" value against a whitelist of known Cedar entity types before returning
Self::Entity; otherwise leave as Self::Record. Reference the Value::Object
branch and the places where DataEntry.data_type is assigned so you can add the
marker check or whitelist validation rather than relying solely on
obj.contains_key("type") && obj.contains_key("id") && obj.len() == 2.

In `@jans-cedarling/cedarling/src/data/mod.rs`:
- Around line 10-17: The four types ConfigValidationError, DataStoreConfig,
CedarType, and DataEntry are declared pub in data::mod.rs but are not
re-exported from lib.rs; either remove their pub in data::mod.rs to make them
module-private or add re-exports in lib.rs (following the existing pattern used
for PolicyStoreLoadError from init::policy_store) to expose DataStoreConfig,
ConfigValidationError, CedarType, and DataEntry at the crate root; update
whichever file (lib.rs or data/mod.rs) accordingly so the public visibility and
crate exports are consistent.

In `@jans-cedarling/cedarling/src/data/store.rs`:
- Around line 302-321: Refactor the TTL calculation in push() to call
get_effective_ttl(ttl, default_ttl, max_ttl) instead of duplicating logic:
locate the TTL computation in push() (where it currently builds
requested_ttl/effective/capped), replace that block with a single call to
get_effective_ttl using the same ttl/default_ttl/max_ttl variables, and use the
returned ChronoDuration for expiry handling and any conversions; ensure behavior
(10-year INFINITE_TTL_SECS fallback and max_ttl capping) remains identical and
remove the now-redundant local TTL variables.
- Around line 714-717: Replace verbose uses of super::super::entry::CedarType in
the tests with the re-exported type from the parent module (use super::CedarType
or the parent-module re-export) to simplify imports; update the occurrences in
the test that asserts entry.data_type (in the test around the
assert_eq!(entry.data_type, ...)) and the similar reference in
test_cedar_type_inference so both use the re-exported CedarType symbol instead
of super::super::entry::CedarType.
- Around line 186-229: In get_entry(), avoid taking a write lock
unconditionally: acquire a read lock on self.storage first, locate and clone the
DataEntry (using storage.get and entry.clone()), check expiration via
entry.expires_at and if already expired treat it as absent (release read lock
and return None, optionally acquire a write lock only to remove the expired key
from storage); only when self.config.enable_metrics is true, upgrade by
acquiring a write lock and perform entry.increment_access() and
storage.set_with_ttl(key, entry.clone(), remaining_ttl, &[]) where remaining_ttl
is computed only for non-expired entries (do not recalculate and reset TTL for
already-expired entries), referencing get_entry, self.config.enable_metrics,
DataEntry.expires_at, storage.get, storage.set_with_ttl, and get_effective_ttl.
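The locking refactor suggested in the last prompt (read lock on the common path, write lock only for evicting expired keys or bumping metrics) can be sketched with plain std types. `Store`, `Entry`, and the `HashMap` storage below are simplified stand-ins for `DataStore`, `DataEntry`, and `SparKV`, not the crate's actual types:

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::SystemTime;

// Simplified stand-in for DataEntry.
struct Entry {
    value: String,
    expires_at: Option<SystemTime>,
    access_count: u64,
}

// Simplified stand-in for DataStore.
struct Store {
    enable_metrics: bool,
    storage: RwLock<HashMap<String, Entry>>,
}

impl Store {
    // Returns (value, access_count), treating expired entries as absent.
    fn get_entry(&self, key: &str) -> Option<(String, u64)> {
        // Fast path: read lock only, just to check existence and expiry.
        let expired = {
            let guard = self.storage.read().unwrap();
            let entry = guard.get(key)?;
            match entry.expires_at {
                Some(t) => t <= SystemTime::now(),
                None => false,
            }
        };
        if expired {
            // Upgrade to a write lock only to evict the stale key;
            // do not reset its TTL.
            self.storage.write().unwrap().remove(key);
            return None;
        }
        if self.enable_metrics {
            // Write lock only when we actually mutate the access count.
            let mut guard = self.storage.write().unwrap();
            let entry = guard.get_mut(key)?;
            entry.access_count += 1;
            Some((entry.value.clone(), entry.access_count))
        } else {
            // Metrics disabled: the read lock suffices.
            let guard = self.storage.read().unwrap();
            let entry = guard.get(key)?;
            Some((entry.value.clone(), entry.access_count))
        }
    }
}

fn main() {
    let store = Store {
        enable_metrics: true,
        storage: RwLock::new(HashMap::from([(
            "k".to_string(),
            Entry { value: "v".to_string(), expires_at: None, access_count: 0 },
        )])),
    };
    assert_eq!(store.get_entry("k"), Some(("v".to_string(), 1)));
    assert_eq!(store.get_entry("missing"), None);
}
```

Note the sketch avoids holding the read lock across the write-lock acquisition (the read guard is dropped at the end of the expiry-check block), which sidesteps the self-deadlock that a naive "upgrade" would cause with `std::sync::RwLock`.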

@haileyesus2433

@coderabbitai please review


coderabbitai bot commented Jan 20, 2026

@haileyesus2433: I'll review the changes in this PR right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Signed-off-by: haileyesus2433 <haileyesusbe@gmail.com>
…g and validation

- Added `std_duration_to_chrono_duration` function for safe conversion between `std::time::Duration` and `chrono::Duration`.
- Updated `DataStore` to validate explicit TTL against `max_ttl` before calculating effective TTL.
- Enhanced `DataEntry` creation to use saturating conversion for TTL to prevent overflow.
- Improved entity reference validation in `CedarType` to ensure correct type and id fields.
- Updated tests to verify new TTL handling and serialization behavior.

This update improves the robustness of data handling in the DataStore and enhances the overall data integrity.

Signed-off-by: haileyesus2433 <haileyesusbe@gmail.com>
@olehbozhok
Contributor

Please consider this comment: #12349 (comment).
We don't need to serialize and deserialize the Entity constantly, or am I wrong?

Also, SparKV uses `data: BTreeMap<String, KvEntry<T>>` as its storage. Perhaps we can refactor SparKV to use `cedar_policy::Entities` as storage, utilizing generics (only if it gives some benefit).

Or maybe we can use `cedar_policy_core::Entities`, because it stores `Arc<Entity>`. But first, we need to investigate whether we can clone the `Arc` instead of the Entity itself.

Comment on lines 128 to 139
```rust
fn std_duration_to_chrono_duration(d: StdDuration) -> ChronoDuration {
    let secs = d.as_secs();
    let nanos = d.subsec_nanos();

    // Saturating conversion: i64::MAX seconds is ~292 billion years
    // We cap at a safe maximum to prevent chrono panics
    // i64::MAX / 1000 (milliseconds per second) is a safe upper bound
    const MAX_SAFE_SECS: u64 = (i64::MAX / 1000) as u64;
    let secs_capped = secs.min(MAX_SAFE_SECS);

    ChronoDuration::seconds(secs_capped as i64) + ChronoDuration::nanoseconds(nanos as i64)
}
```
Contributor


I think this function might be duplicated since I previously saw it in store.rs

Contributor Author


resolved in bde59bd

…ore module and update error enum

Signed-off-by: haileyesus2433 <haileyesusbe@gmail.com>
@haileyesus2433 haileyesus2433 merged commit a0849e7 into jans-cedarling-12349 Jan 22, 2026
4 checks passed
@haileyesus2433 haileyesus2433 deleted the jans-cedarling-13008 branch January 22, 2026 07:09
