feat(data): enhance DataStore with metrics tracking and configuration validation #13042
Conversation
feat(data): enhance DataStore with metrics tracking and configuration validation

- Added `enable_metrics` field to `DataStoreConfig` to control metrics tracking.
- Implemented `DataEntry` struct to encapsulate data with metadata, including access count and timestamps.
- Introduced `ConfigValidationError` for validation errors in configuration.
- Updated `DataStore` to validate its configuration on initialization and track access counts for entries when metrics are enabled.
- Added tests for configuration validation, metrics tracking, and Cedar type inference.

This update improves the DataStore's functionality by allowing better monitoring and management of data entries.

Signed-off-by: haileyesus2433 <haileyesusbe@gmail.com>
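To make the shape concrete, here is a rough sketch of the configuration surface this commit describes. Only `enable_metrics`, `default_ttl`, and `max_ttl` are named in the PR and the review below; everything else is illustrative, not the PR's actual code.

```rust
use std::time::Duration;

/// Hypothetical shape of the config described in this PR; illustrative only.
pub struct DataStoreConfig {
    /// When true, DataStore tracks per-entry access counts and timestamps.
    pub enable_metrics: bool,
    /// TTL applied when the caller does not pass one explicitly.
    pub default_ttl: Option<Duration>,
    /// Upper bound that caps any requested TTL.
    pub max_ttl: Option<Duration>,
}

impl DataStoreConfig {
    /// Validation on initialization: default_ttl must not exceed max_ttl.
    pub fn validate(&self) -> Result<(), String> {
        if let (Some(default), Some(max)) = (self.default_ttl, self.max_ttl) {
            if default > max {
                return Err(format!("default_ttl {default:?} exceeds max_ttl {max:?}"));
            }
        }
        Ok(())
    }
}
```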
Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI. You can disable this status message in your review settings.

📝 Walkthrough

Introduces an `enable_metrics` flag on `DataStoreConfig`, a `DataEntry` struct carrying access-count and timestamp metadata, a `ConfigValidationError` type, configuration validation on `DataStore` initialization, and per-entry access tracking when metrics are enabled, along with tests covering validation, metrics tracking, and Cedar type inference.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
✅ Snyk checks have passed. No issues have been found so far.
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
jans-cedarling/cedarling/src/data/store.rs (1)
103-150: Inconsistency between `DataEntry.expires_at` and the actual storage TTL.

The `DataEntry` is created at line 104 with the original `ttl` parameter, computing `expires_at` from that value. However, the entry is stored at line 150 with `chrono_ttl`, which is the effective TTL (potentially capped by `max_ttl`). This causes a mismatch: `entry.expires_at` reflects the original TTL while the actual storage expiration uses the capped TTL, so `DataEntry::is_expired()` will return incorrect results when TTL capping occurs.

Proposed fix: create the `DataEntry` with the effective TTL.

```diff
+    // Determine the effective TTL to use
+    // Priority: explicit ttl > config.default_ttl > infinite (10 years)
+    let requested_ttl = ttl
+        .or(self.config.default_ttl)
+        .unwrap_or(StdDuration::from_secs(INFINITE_TTL_SECS as u64));
+
+    // If an explicit TTL was provided, validate it against max_ttl
+    if ttl.is_some() {
+        if let Some(max_ttl) = self.config.max_ttl {
+            if requested_ttl > max_ttl {
+                return Err(DataError::TTLExceeded {
+                    requested: requested_ttl,
+                    max: max_ttl,
+                });
+            }
+        }
+    }
+
+    // Cap the effective TTL at max_ttl if set
+    let effective_ttl = if let Some(max_ttl) = self.config.max_ttl {
+        requested_ttl.min(max_ttl)
+    } else {
+        requested_ttl
+    };
+
     // Create DataEntry with metadata
-    let entry = DataEntry::new(key.to_string(), value, ttl);
+    let entry = DataEntry::new(key.to_string(), value, Some(effective_ttl));

     // Check entry size before storing (including metadata)
     let entry_size = serde_json::to_string(&entry)
         .map_err(DataError::from)?
         .len();
     // ... size check ...
-
-    // Determine the effective TTL to use
-    // Priority: explicit ttl > config.default_ttl > infinite (10 years)
-    let requested_ttl = ttl
-        .or(self.config.default_ttl)
-        .unwrap_or(StdDuration::from_secs(INFINITE_TTL_SECS as u64));
-    // ... rest of TTL logic moved above ...
```
🤖 Fix all issues with AI agents
In `@jans-cedarling/cedarling/src/data/config.rs`:
- Around line 117-125: The test test_default_ttl_exceeds_max currently uses
assert!(config.validate().is_err()) without an explicit message; update it to
use the matches! pattern per guidelines by asserting that config.validate()
matches Err(_) and include a clear failure message (e.g., "expected error when
default_ttl exceeds max_ttl") so the assertion reads as
assert!(matches!(config.validate(), Err(_)), "<message>"); reference
DataStoreConfig and its validate() method when making the change.
In `@jans-cedarling/cedarling/src/data/entry.rs`:
- Around line 143-147: The new() constructor in entry.rs currently uses
chrono::Duration::from_std(duration).unwrap_or_default(), which silently
converts oversized StdDuration to zero and causes immediate expiration; update
new(key: String, value: Value, ttl: Option<StdDuration>) to perform a
safe/saturating conversion instead (e.g., reuse the
std_duration_to_chrono_duration logic from store.rs or implement a saturating
conversion that caps to chrono::Duration::max_value()), and when conversion
fails either log the oversized TTL or return an Err/result variant as
appropriate for your API; ensure you reference and call the safe converter where
expires_at is computed so expires_at is never set to created_at due to a silent
unwrap_or_default.
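A sketch of what the safer constructor could look like, reusing the saturating converter; the stand-in types below are trimmed to the TTL-relevant fields and are assumptions, not the crate's real definitions:

```rust
use chrono::{DateTime, Duration as ChronoDuration, Utc};
use serde_json::Value;
use std::time::Duration as StdDuration;

// Minimal stand-in: only the TTL-relevant fields of the real DataEntry.
struct DataEntry {
    key: String,
    value: Value,
    created_at: DateTime<Utc>,
    expires_at: Option<DateTime<Utc>>,
}

// Saturating conversion shared with store.rs, instead of
// Duration::from_std(d).unwrap_or_default(), which collapses oversized
// durations to zero and expires the entry immediately.
fn std_duration_to_chrono_duration(d: StdDuration) -> ChronoDuration {
    const MAX_SAFE_SECS: u64 = (i64::MAX / 1000) as u64;
    ChronoDuration::seconds(d.as_secs().min(MAX_SAFE_SECS) as i64)
        + ChronoDuration::nanoseconds(d.subsec_nanos() as i64)
}

impl DataEntry {
    fn new(key: String, value: Value, ttl: Option<StdDuration>) -> Self {
        let created_at = Utc::now();
        // checked_add_signed avoids a panic if the capped duration still
        // overflows the DateTime range; None then means "never expires",
        // so expires_at can no longer silently equal created_at.
        let expires_at =
            ttl.and_then(|d| created_at.checked_add_signed(std_duration_to_chrono_duration(d)));
        Self { key, value, created_at, expires_at }
    }
}
```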
- Around line 241-251: Update the test_serialization unit test for DataEntry to
also assert that created_at and expires_at survive a serde_json round-trip:
after serializing with serde_json::to_string and deserializing with
serde_json::from_str, compare the original entry.created_at and entry.expires_at
against deserialized.created_at and deserialized.expires_at (use direct equality
or a stable representation like to_rfc3339()/timestamp to avoid
timezone/precision issues) so the custom serde handling for datetimes is
explicitly verified.
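A sketch of the strengthened test, assuming `DataEntry` derives `Serialize`/`Deserialize`; `to_rfc3339()` gives a stable representation for the comparison:

```rust
#[test]
fn test_serialization_roundtrip_preserves_datetimes() {
    let entry = DataEntry::new(
        "key1".to_string(),
        serde_json::json!({"v": 1}),
        Some(std::time::Duration::from_secs(60)),
    );
    let json = serde_json::to_string(&entry).expect("serialize");
    let back: DataEntry = serde_json::from_str(&json).expect("deserialize");

    // Compare via RFC 3339 strings to sidestep precision/timezone drift.
    assert_eq!(entry.created_at.to_rfc3339(), back.created_at.to_rfc3339());
    assert_eq!(
        entry.expires_at.map(|t| t.to_rfc3339()),
        back.expires_at.map(|t| t.to_rfc3339())
    );
}
```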
- Around line 100-106: The current heuristic in entry.rs that classifies a
Value::Object as Self::Entity when it contains exactly "type" and "id"
(otherwise Self::Record) is brittle; to avoid misclassification of normal
records that coincidentally have those two fields, update the detection in the
code that sets DataEntry.data_type to require an explicit marker or validate the
"type" value against a whitelist of known Cedar entity types before returning
Self::Entity; otherwise leave as Self::Record. Reference the Value::Object
branch and the places where DataEntry.data_type is assigned so you can add the
marker check or whitelist validation rather than relying solely on
obj.contains_key("type") && obj.contains_key("id") && obj.len() == 2.
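A hedged sketch of the whitelist variant; the entity-type list is purely illustrative, since in the real crate the known types would come from the Cedar schema:

```rust
use serde_json::Value;

#[derive(Debug, PartialEq)]
enum CedarType {
    Entity,
    Record,
}

// Illustrative whitelist; the real source of truth would be the schema.
const KNOWN_ENTITY_TYPES: &[&str] = &["User", "Role", "Application"];

fn infer_object_type(obj: &serde_json::Map<String, Value>) -> CedarType {
    // Shape check as before: exactly the two fields "type" and "id" ...
    if obj.len() == 2 && obj.contains_key("id") {
        // ... but only classify as Entity when "type" names a known Cedar
        // entity type, so ordinary records are not misclassified.
        if let Some(Value::String(t)) = obj.get("type") {
            if KNOWN_ENTITY_TYPES.contains(&t.as_str()) {
                return CedarType::Entity;
            }
        }
    }
    CedarType::Record
}
```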
In `@jans-cedarling/cedarling/src/data/mod.rs`:
- Around line 10-17: The four types ConfigValidationError, DataStoreConfig,
CedarType, and DataEntry are declared pub in data::mod.rs but are not
re-exported from lib.rs; either remove their pub in data::mod.rs to make them
module-private or add re-exports in lib.rs (following the existing pattern used
for PolicyStoreLoadError from init::policy_store) to expose DataStoreConfig,
ConfigValidationError, CedarType, and DataEntry at the crate root; update
whichever file (lib.rs or data/mod.rs) accordingly so the public visibility and
crate exports are consistent.
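If the re-export route is chosen, lib.rs would gain a line along these lines, mirroring the `PolicyStoreLoadError` pattern the guideline cites; the exact module paths are assumptions:

```rust
// lib.rs: expose the data module's public types at the crate root.
pub use crate::data::{CedarType, ConfigValidationError, DataEntry, DataStoreConfig};
```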
In `@jans-cedarling/cedarling/src/data/store.rs`:
- Around line 302-321: Refactor the TTL calculation in push() to call
get_effective_ttl(ttl, default_ttl, max_ttl) instead of duplicating logic:
locate the TTL computation in push() (where it currently builds
requested_ttl/effective/capped), replace that block with a single call to
get_effective_ttl using the same ttl/default_ttl/max_ttl variables, and use the
returned ChronoDuration for expiry handling and any conversions; ensure behavior
(10-year INFINITE_TTL_SECS fallback and max_ttl capping) remains identical and
remove the now-redundant local TTL variables.
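The shared helper could look roughly like this; the signature is inferred from the review text, so treat it as a sketch rather than the PR's actual code:

```rust
use chrono::Duration as ChronoDuration;
use std::time::Duration as StdDuration;

// ~10 years, used as the "infinite" fallback per the PR's convention.
const INFINITE_TTL_SECS: u64 = 10 * 365 * 24 * 60 * 60;

// Single source of truth for TTL resolution, shared by push() and get_entry().
fn get_effective_ttl(
    ttl: Option<StdDuration>,
    default_ttl: Option<StdDuration>,
    max_ttl: Option<StdDuration>,
) -> ChronoDuration {
    // Priority: explicit ttl > default_ttl > infinite fallback.
    let requested = ttl
        .or(default_ttl)
        .unwrap_or(StdDuration::from_secs(INFINITE_TTL_SECS));

    // Cap at max_ttl when configured, exactly as push() did inline.
    let effective = match max_ttl {
        Some(max) => requested.min(max),
        None => requested,
    };

    ChronoDuration::from_std(effective).unwrap_or_else(|_| ChronoDuration::max_value())
}
```

push() would then shrink its TTL block to a single call such as `let chrono_ttl = get_effective_ttl(ttl, self.config.default_ttl, self.config.max_ttl);`, with the explicit-TTL-versus-max_ttl validation (the `TTLExceeded` error) kept where it is so behavior is unchanged.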
- Around line 714-717: Replace verbose uses of super::super::entry::CedarType in
the tests with the re-exported type from the parent module (use super::CedarType
or the parent-module re-export) to simplify imports; update the occurrences in
the test that asserts entry.data_type (in the test around the
assert_eq!(entry.data_type, ...)) and the similar reference in
test_cedar_type_inference so both use the re-exported CedarType symbol instead
of super::super::entry::CedarType.
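In practice this is a one-line import change in the test module, e.g.:

```rust
// Before: reaching through the sibling module.
// use super::super::entry::CedarType;

// After: rely on the parent module's re-export.
use super::CedarType;
```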
- Around line 186-229: In get_entry(), avoid taking a write lock
unconditionally: acquire a read lock on self.storage first, locate and clone the
DataEntry (using storage.get and entry.clone()), check expiration via
entry.expires_at and if already expired treat it as absent (release read lock
and return None, optionally acquire a write lock only to remove the expired key
from storage); only when self.config.enable_metrics is true, upgrade by
acquiring a write lock and perform entry.increment_access() and
storage.set_with_ttl(key, entry.clone(), remaining_ttl, &[]) where remaining_ttl
is computed only for non-expired entries (do not recalculate and reset TTL for
already-expired entries), referencing get_entry, self.config.enable_metrics,
DataEntry.expires_at, storage.get, storage.set_with_ttl, and get_effective_ttl.
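A sketch of the suggested flow, assuming an `RwLock`-backed map for brevity; the real store would go through SparKV's `set_with_ttl(key, entry, remaining_ttl, &[])` on the metrics path, which is elided here:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Minimal stand-ins to show the locking order only.
#[derive(Clone)]
struct DataEntry {
    access_count: u64,
    expired: bool, // the real type computes this from expires_at
}

impl DataEntry {
    fn is_expired(&self) -> bool {
        self.expired
    }
    fn increment_access(&mut self) {
        self.access_count += 1;
    }
}

struct DataStore {
    storage: RwLock<HashMap<String, DataEntry>>,
    enable_metrics: bool,
}

impl DataStore {
    fn get_entry(&self, key: &str) -> Option<DataEntry> {
        // 1) Read lock only for the lookup: the common path (metrics off,
        //    entry present) never blocks concurrent readers.
        let entry = self.storage.read().ok()?.get(key).cloned()?;

        // 2) An expired entry is treated as absent; take a write lock only
        //    to evict it, and never recompute or reset its TTL.
        if entry.is_expired() {
            self.storage.write().ok()?.remove(key);
            return None;
        }

        // 3) Upgrade to a write lock only when metrics are enabled.
        if self.enable_metrics {
            if let Some(e) = self.storage.write().ok()?.get_mut(key) {
                e.increment_access();
            }
        }
        Some(entry)
    }
}
```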
@coderabbitai please review

✅ Actions performed: Review triggered.
Signed-off-by: haileyesus2433 <haileyesusbe@gmail.com>
…g and validation

- Added `std_duration_to_chrono_duration` function for safe conversion between `std::time::Duration` and `chrono::Duration`.
- Updated `DataStore` to validate an explicit TTL against `max_ttl` before calculating the effective TTL.
- Enhanced `DataEntry` creation to use a saturating conversion for TTL to prevent overflow.
- Improved entity reference validation in `CedarType` to ensure correct `type` and `id` fields.
- Updated tests to verify the new TTL handling and serialization behavior.

This update improves the robustness of data handling in the DataStore and enhances overall data integrity.

Signed-off-by: haileyesus2433 <haileyesusbe@gmail.com>
Please consider this comment: #12349 (comment). Also, in SparKV, as storage is used … Or maybe we can use …
```rust
fn std_duration_to_chrono_duration(d: StdDuration) -> ChronoDuration {
    let secs = d.as_secs();
    let nanos = d.subsec_nanos();

    // Saturating conversion: i64::MAX seconds is ~292 billion years.
    // We cap at a safe maximum to prevent chrono panics;
    // i64::MAX / 1000 (milliseconds per second) is a safe upper bound.
    const MAX_SAFE_SECS: u64 = (i64::MAX / 1000) as u64;
    let secs_capped = secs.min(MAX_SAFE_SECS);

    ChronoDuration::seconds(secs_capped as i64) + ChronoDuration::nanoseconds(nanos as i64)
}
```
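For context, a quick check of the saturating behavior (values are illustrative); note that `subsec_nanos()` is always below 1_000_000_000, so the `as i64` cast on the nanosecond part cannot overflow:

```rust
use chrono::Duration as ChronoDuration;
use std::time::Duration as StdDuration;

fn main() {
    // An absurdly large TTL saturates at the cap instead of panicking.
    let huge = StdDuration::from_secs(u64::MAX);
    assert_eq!(
        std_duration_to_chrono_duration(huge),
        ChronoDuration::seconds(i64::MAX / 1000)
    );

    // Ordinary TTLs convert exactly.
    let one_hour = StdDuration::from_secs(3600);
    assert_eq!(
        std_duration_to_chrono_duration(one_hour),
        ChronoDuration::seconds(3600)
    );
}
```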
I think this function might be duplicated since I previously saw it in store.rs
resolved in bde59bd
…ore module and update error enum

Signed-off-by: haileyesus2433 <haileyesusbe@gmail.com>
Prepare
Description
Target issue
closes #13008
Test and Document the changes
Please check the below before submitting your PR. The PR will not be merged if there are no commits that start with `docs:` to indicate documentation changes or if the below checklist is not selected.

Summary by CodeRabbit
Release Notes