Skip to content

Conversation

@capcom6
Copy link
Member

@capcom6 capcom6 commented Oct 16, 2025

Summary by CodeRabbit

  • New Features

    • 5-minute caching for message states (caches found and not-found)
    • Prometheus metrics for message totals and cache hits/misses
    • Background hashing task runs on startup
    • Grafana "SMS Messages Monitoring" dashboard
  • Improvements

    • Cache-first retrieval; automatic caching on enqueue/update and improved instrumentation
    • Message payloads include JSON tags; cache TTL now configurable via config
  • Breaking Changes

    • Cache API now uses byte payloads and adds Get options (TTL/update/delete)
    • Service wiring and some method signatures changed
  • Other

    • Third-party message GET path updated

@coderabbitai
Copy link

coderabbitai bot commented Oct 16, 2025

Walkthrough

Refactors messages DI into a namespaced fx.Module; adds an internal message cache and Prometheus metrics; changes cache payloads from string → []byte and introduces GetOption semantics; makes service cache-first (GetState) and UpdateState accept device pointer; unexports hashing task; updates online last-seen handling; adds Grafana dashboard and an API example tweak.

Changes

Cohort / File(s) Summary
API example
api/requests.http
Updated example GET request message ID in the request path.
App wiring & handlers
internal/sms-gateway/app.go, internal/sms-gateway/handlers/messages/3rdparty.go, internal/sms-gateway/handlers/messages/mobile.go
App now calls messages.Module(); 3rdparty handler dereferences returned state (MessageStateToDTO(*state)); mobile handler passes &device to UpdateState (signature changed).
Messages module wiring
internal/sms-gateway/modules/messages/module.go
Replaced exported Module var with Module() fx.Option; added private providers (newMetrics, newRepository, newHashingTask, newCache); registers NewService and cleaner; removed prior FxResult type and old fx.Provide pattern.
Messages service & tasks
internal/sms-gateway/modules/messages/service.go, internal/sms-gateway/modules/messages/tasks.go
Service now takes explicit deps (config, metrics, cache, repo, events, hashingTask, logger, idgen); adds metrics and cache fields; GetState is cache-first and returns *MessageStateOut; UpdateState accepts *models.Device; RunBackgroundTasks added. Hashing task unexported (hashingTask), queue added, constructor renamed, initial hashing run added.
Messages domain, cache wrapper & metrics
internal/sms-gateway/modules/messages/domain.go, .../messages/cache.go, .../messages/metrics.go
Added JSON tags; MessageStateOut embeds MessageStateIn and exposes device_id, is_hashed, is_encrypted; new internal cache wrapper serializes MessageStateOut with 100ms bounded ctx; new Prometheus metrics type with IncTotal and IncCache.
Online service
internal/sms-gateway/online/service.go
SetOnline now stores last-seen as []byte; drain/persist reads []byte and converts to string for parsing/logging.
Cache core & options
pkg/cache/cache.go, pkg/cache/options.go
Cache API changed: Set/SetOrFail accept []byte; Get/GetAndDelete return []byte and Get accepts variadic GetOption; Drain returns map[string][]byte; added GetOption functional API (AndSetTTL, AndUpdateTTL, AndSetValidUntil, AndDefaultTTL, AndDelete).
Cache implementations (memory & redis)
pkg/cache/memory.go, pkg/cache/redis.go
Memory and Redis adapted to []byte payloads and new Get/GetAndDelete/Set/SetOrFail/Drain signatures; constructors now return concrete types; Redis adds TTL-aware Lua script and option-driven Get; interface compliance assertions added.
Cache tests & benches
pkg/cache/memory_*.go (tests & benches)
Tests and benchmarks updated to use []byte(value) for Set/SetOrFail, compare retrieved bytes via string(retrieved), and expect Drain to return map[string][]byte.
Config & module config
internal/config/config.go, internal/config/module.go, internal/sms-gateway/modules/messages/config.go
Added Messages config block (CacheTTLSeconds, ProcessedLifetimeHours) and wired CacheTTL / processed lifetime into messages module; messages.Config gains CacheTTL time.Duration.
Observability dashboard
deployments/grafana/dashboards/messages.json
Added "SMS Messages Monitoring" Grafana dashboard (state distribution, error rate, cache hit ratio, throughput panels).

Sequence Diagram(s)

sequenceDiagram
    participant Handler
    participant Service
    participant Cache
    participant Repo
    participant Metrics

    rect rgb(230,240,255)
    Note over Service,Cache: GetState — cache-first flow
    end

    Handler->>Service: GetState(user, messageID)
    Service->>Cache: Get(ctx, "userID:messageID")
    alt cache hit (found)
        Cache-->>Service: []byte (marshalled DTO)
        Service->>Metrics: IncCache(true)
        Service->>Service: unmarshal -> DTO
        Service-->>Handler: *MessageStateOut
    else cache hit (cached not-found)
        Cache-->>Service: nil
        Service->>Metrics: IncCache(true)
        Service-->>Handler: ErrMessageNotFound
    else cache miss
        Cache-->>Service: not found
        Service->>Metrics: IncCache(false)
        Service->>Repo: GetState(user, messageID)
        alt repo found
            Repo-->>Service: MessageStateOut
            Service->>Cache: Set(ctx, key, marshalledBytes)
            Service->>Metrics: IncTotal(state)
            Service-->>Handler: *MessageStateOut
        else repo not found
            Repo-->>Service: not found
            Service->>Cache: Set(ctx, key, nil)
            Service-->>Handler: ErrMessageNotFound
        end
    end
Loading
sequenceDiagram
    participant Handler
    participant Service
    participant Repo
    participant Cache
    participant Metrics
    participant Events

    rect rgb(235,255,235)
    Note over Service,Repo: Enqueue flow (create → cache → metrics → events)
    end

    Handler->>Service: Enqueue(user, message)
    Service->>Repo: Create(message)
    Repo-->>Service: MessageStateOut
    Service->>Cache: Set(ctx, key, marshalledBytes)
    Service->>Metrics: IncTotal(state)
    Service->>Events: Notify(created)
    Service-->>Handler: success
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • [messages] explicit device selection #150 — modifies messages domain/service APIs (MessageStateIn/MessageStateOut and GetState/UpdateState signatures); strongly related to the service and domain signature changes in this PR.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "[messages] cache message state" is concise and clearly identifies the primary objective of the changeset. The title is supported by substantial changes that introduce a dedicated caching layer for message state, including a new cache utility, cache-first retrieval logic in GetState/UpdateState methods, cache hit/miss metrics, and configuration support for cache TTL. While the PR includes secondary changes such as module refactoring, metrics collection, and cache interface modifications, the title accurately captures the core architectural change from the developer's perspective. A teammate scanning the git history would understand this PR introduces message state caching functionality.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch messages/cache-state

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
internal/sms-gateway/modules/messages/cache.go (1)

22-34: Consider initializing data as nil instead of empty slice.

At line 24, data := []byte{} is initialized but immediately reassigned if message != nil. Initializing as var data []byte (nil) would be clearer and more idiomatic, as it accurately represents "no data yet".

Apply this diff:

-	var err error
-	data := []byte{}
+	var (
+		err error
+		data []byte
+	)
internal/sms-gateway/modules/messages/service.go (3)

41-41: Fix field naming to follow Go conventions.

The field idgen should be idGen to maintain camelCase naming conventions in Go.

Apply this diff:

-	idgen  func() string
+	idGen  func() string

And update all usages (line 65, line 225):

-		idgen:  idgen,
+		idGen:  idgen,
-		msg.ExtID = s.idgen()
+		msg.ExtID = s.idGen()

114-114: Log cache errors for consistency and observability.

Cache errors are logged in GetState (lines 150, 159) and Enqueue (line 234), but not here in UpdateState. For consistent observability, cache failures should be logged even though they don't fail the operation.

Apply this diff:

-	s.cache.Set(context.Background(), device.UserID, existing.ExtID, anys.AsPointer(modelToMessageState(existing)))
+	if err := s.cache.Set(context.Background(), device.UserID, existing.ExtID, anys.AsPointer(modelToMessageState(existing))); err != nil {
+		s.logger.Warn("can't cache message", zap.String("id", existing.ExtID), zap.Error(err))
+	}

136-139: Consider documenting the nil-caching strategy.

Caching nil to represent "message not found" is an efficient pattern that prevents repeated database lookups for non-existent messages. However, this non-obvious behavior could benefit from a brief inline comment explaining the intent.

Example comment:

 	if err == nil {
 		s.metrics.IncCache(true)
+		// Cache nil entries represent "not found" and prevent repeated lookups
 		if dto == nil {
 			return nil, ErrMessageNotFound
 		}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fad3c1 and 4104505.

📒 Files selected for processing (19)
  • api/requests.http (1 hunks)
  • internal/sms-gateway/app.go (1 hunks)
  • internal/sms-gateway/handlers/messages/3rdparty.go (1 hunks)
  • internal/sms-gateway/handlers/messages/mobile.go (1 hunks)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
  • internal/sms-gateway/modules/messages/domain.go (1 hunks)
  • internal/sms-gateway/modules/messages/metrics.go (1 hunks)
  • internal/sms-gateway/modules/messages/module.go (1 hunks)
  • internal/sms-gateway/modules/messages/service.go (5 hunks)
  • internal/sms-gateway/modules/messages/tasks.go (4 hunks)
  • internal/sms-gateway/online/service.go (2 hunks)
  • pkg/cache/cache.go (2 hunks)
  • pkg/cache/memory.go (10 hunks)
  • pkg/cache/memory_bench_test.go (18 hunks)
  • pkg/cache/memory_concurrency_test.go (13 hunks)
  • pkg/cache/memory_edge_test.go (19 hunks)
  • pkg/cache/memory_profile_test.go (7 hunks)
  • pkg/cache/memory_test.go (28 hunks)
  • pkg/cache/redis.go (4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.

Applied to files:

  • pkg/cache/redis.go
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
🧬 Code graph analysis (13)
internal/sms-gateway/handlers/messages/3rdparty.go (1)
internal/sms-gateway/handlers/converters/messages.go (1)
  • MessageStateToDTO (45-55)
pkg/cache/memory_edge_test.go (1)
pkg/cache/options.go (1)
  • WithTTL (22-30)
pkg/cache/memory_test.go (1)
pkg/cache/options.go (2)
  • WithTTL (22-30)
  • WithValidUntil (34-38)
internal/sms-gateway/modules/messages/module.go (3)
internal/sms-gateway/modules/cleaner/module.go (1)
  • AsCleanable (8-14)
internal/sms-gateway/cache/factory.go (1)
  • Factory (17-19)
internal/sms-gateway/modules/messages/service.go (2)
  • NewService (44-67)
  • Service (30-42)
internal/sms-gateway/modules/messages/cache.go (2)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (1)
  • WithTTL (22-30)
pkg/cache/cache.go (1)
pkg/cache/options.go (1)
  • Option (6-6)
internal/sms-gateway/app.go (1)
internal/sms-gateway/modules/messages/module.go (1)
  • Module (11-35)
internal/sms-gateway/modules/messages/service.go (8)
internal/sms-gateway/modules/devices/service.go (2)
  • Service (30-39)
  • NewService (157-165)
internal/sms-gateway/modules/events/service.go (2)
  • Service (13-24)
  • NewService (26-38)
internal/sms-gateway/modules/sse/service.go (2)
  • Service (17-25)
  • NewService (38-47)
internal/sms-gateway/modules/db/module.go (1)
  • IDGen (8-8)
internal/sms-gateway/models/models.go (2)
  • Device (25-36)
  • User (17-23)
internal/sms-gateway/modules/messages/domain.go (2)
  • MessageStateIn (31-36)
  • MessageStateOut (38-44)
internal/sms-gateway/modules/messages/repository_filter.go (2)
  • MessagesSelectFilter (16-23)
  • MessagesSelectOptions (25-36)
internal/sms-gateway/modules/messages/repository.go (1)
  • ErrMessageNotFound (18-18)
pkg/cache/memory_bench_test.go (1)
pkg/cache/options.go (1)
  • WithTTL (22-30)
pkg/cache/memory_concurrency_test.go (1)
pkg/cache/options.go (1)
  • WithTTL (22-30)
internal/sms-gateway/modules/messages/domain.go (1)
internal/sms-gateway/modules/messages/models.go (1)
  • ProcessingState (12-12)
pkg/cache/redis.go (3)
pkg/cache/errors.go (1)
  • ErrKeyNotFound (7-7)
pkg/cache/options.go (1)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
pkg/cache/memory.go (2)
pkg/cache/options.go (1)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: E2E
  • GitHub Check: Test
  • GitHub Check: Benchmark
  • GitHub Check: Lint
  • GitHub Check: Analyze (go)
🔇 Additional comments (24)
api/requests.http (1)

74-74: Verify the new message ID is valid.

The message ID has been updated from K56aIsVsQ2rECdv_ajzTd to Fc10ZyTRDVlqPjIm9Jbly. Since this is a test/example HTTP requests file, ensure the new ID references a valid message in your test environment.

pkg/cache/memory_profile_test.go (1)

29-29: LGTM! Cache value type migration looks correct.

The test updates correctly reflect the cache API migration from string to []byte values. All Set calls now pass []byte(value), and the retrieval comparison appropriately converts back with string(retrieved).

Also applies to: 79-79, 133-133, 179-179, 211-211, 237-237, 290-290

internal/sms-gateway/handlers/messages/mobile.go (1)

97-97: LGTM! Updated to pass device pointer.

The UpdateState call now correctly passes a pointer to the device object instead of the device ID string, aligning with the updated service signature that accepts *models.Device.

internal/sms-gateway/handlers/messages/3rdparty.go (1)

235-235: LGTM! Correctly dereferences state pointer.

The converter call now appropriately dereferences the state pointer, which aligns with the updated GetState method that returns *MessageStateOut. The converter function expects a value, not a pointer.

internal/sms-gateway/app.go (1)

49-49: LGTM! Proper fx module initialization pattern.

The change from messages.Module to messages.Module() correctly adopts the function-based module initialization pattern, which is the idiomatic approach for fx dependency injection.

pkg/cache/memory_edge_test.go (1)

22-22: LGTM! Comprehensive test updates for byte slice values.

All test cases correctly updated to use []byte for cache values, including:

  • Standard Set/Get operations
  • TTL and expiration scenarios
  • Edge cases (nil context, empty keys, overwrites)
  • Drain operations with expired items

The conversions are consistent and preserve all test semantics.

Also applies to: 35-35, 49-49, 69-69, 79-79, 91-91, 101-101, 115-115, 121-121, 131-131, 143-143, 165-167, 237-237, 274-274, 284-284, 295-295, 301-301, 319-319, 339-339, 349-349, 364-364

pkg/cache/memory_test.go (1)

19-19: LGTM! Complete test suite migration to byte slice values.

All core cache operations thoroughly updated:

  • Basic Set/Get operations
  • TTL variants (WithTTL, WithValidUntil, default TTL)
  • SetOrFail scenarios (new and existing keys)
  • Delete operations
  • GetAndDelete operations
  • Drain operations with proper map type handling (map[string][]byte)
  • Cleanup operations
  • Edge cases (empty values, special characters, large values)

The migration is comprehensive and correct.

Also applies to: 30-30, 44-44, 55-55, 69-69, 80-80, 94-94, 105-105, 130-130, 141-141, 155-155, 161-161, 172-172, 185-185, 224-224, 235-235, 271-272, 294-294, 333-333, 368-368, 385-385, 399-399, 405-405, 416-416, 429-429, 440-440, 453-453, 464-464, 477-477, 488-488

pkg/cache/memory_bench_test.go (1)

25-25: LGTM! Benchmark suite updated for byte slice performance.

All benchmarks correctly updated to use []byte values:

  • Basic operations (Set, Get, SetOrFail, Delete, GetAndDelete)
  • Cleanup and Drain operations
  • Concurrent operation benchmarks (reads, writes, mixed workloads)
  • Scaling benchmarks with various loads
  • TTL overhead comparisons
  • Large value performance tests (directly using []byte)
  • Random and hot/cold key access patterns

The updates ensure accurate performance measurements with the new value type.

Also applies to: 38-38, 61-61, 77-77, 95-95, 114-114, 129-129, 149-149, 168-168, 219-219, 260-260, 290-290, 330-330, 332-332, 366-366, 398-398, 415-415, 437-437, 457-457

pkg/cache/memory_concurrency_test.go (1)

23-23: LGTM! Concurrency tests properly updated for byte slices.

All concurrency scenarios correctly updated:

  • Concurrent reads/writes with proper string conversions
  • Read-write mixed operations
  • SetOrFail contention tests
  • Concurrent Drain with type assertion to map[string][]byte
  • Concurrent Cleanup operations
  • GetAndDelete concurrency tests
  • Race condition detection across all operations

The tests maintain thread-safety verification while using the new value type.

Also applies to: 43-43, 70-70, 92-92, 137-137, 170-170, 183-183, 220-220, 260-260, 285-285, 317-317, 364-364, 424-424

internal/sms-gateway/modules/messages/cache.go (1)

36-52: LGTM! Cache retrieval implementation is correct.

The method properly handles:

  • Missing keys (returns nil, nil)
  • Empty data (returns nil, nil)
  • Valid data (unmarshals and returns)

Error wrapping is appropriate.

internal/sms-gateway/modules/messages/metrics.go (1)

1-58: LGTM! Metrics implementation follows Prometheus best practices.

The metrics module correctly:

  • Uses namespace/subsystem organization
  • Provides descriptive help text
  • Implements appropriate metric types (CounterVec with labels, Counters)
  • Uses promauto for automatic registration
internal/sms-gateway/online/service.go (2)

56-77: LGTM! Correctly adapts to []byte cache values.

The string-to-[]byte conversion for cache storage is appropriate and aligns with the updated Cache interface.


79-128: LGTM! Correctly converts []byte values back to string.

The persist method properly:

  • Converts []byte values to string before parsing timestamps
  • Handles parsing errors with appropriate logging
  • Maintains the existing drain-and-persist logic
internal/sms-gateway/modules/messages/module.go (1)

11-35: LGTM! Clean refactoring to constructor-based dependency injection.

The new Module() function properly:

  • Uses fx.Module for scoping
  • Decorates logger with module name
  • Provides cache factory as private dependency
  • Encapsulates metrics, repository, hashingTask, and cache as private
  • Exposes Service and Cleanable publicly

This is a solid improvement over FxResult-based wiring.

internal/sms-gateway/modules/messages/tasks.go (1)

25-88: LGTM! Consistent refactoring to unexported type.

The change from exported HashingTask to unexported hashingTask is appropriate for an internal implementation detail. All methods and constructor updated correctly.

internal/sms-gateway/modules/messages/domain.go (1)

31-44: LGTM! JSON tags properly added for serialization.

The JSON tags are:

  • Consistently formatted in snake_case
  • Appropriately named
  • Enable the cache wrapper to serialize/deserialize MessageStateOut objects

This supports the new caching functionality introduced in cache.go.

pkg/cache/cache.go (1)

5-36: LGTM! Cache interface updated to support binary data.

The change from string to []byte for values:

  • Enables caching of binary data (e.g., JSON-encoded objects)
  • Is consistently applied across all methods
  • Maintains clear documentation

The added Delete and Cleanup methods complete the cache interface.

pkg/cache/redis.go (4)

73-92: LGTM! Drain correctly converts Redis strings to []byte.

The method properly:

  • Uses the hgetallAndDeleteScript for atomic drain-and-delete
  • Converts string values from Redis to []byte
  • Returns an empty map when no items exist
  • Wraps errors appropriately

Based on learnings, Redis's native TTL ensures only non-expired items are returned.


95-106: LGTM! Get and GetAndDelete correctly return []byte.

Both methods properly:

  • Convert Redis string responses to []byte
  • Return ErrKeyNotFound for missing keys
  • Wrap errors appropriately

Also applies to: 109-120


123-142: LGTM! Set and SetOrFail correctly accept []byte.

Both methods properly:

  • Accept []byte values
  • Handle TTL via HExpireAt (leveraging Redis's native expiration)
  • Use appropriate Redis commands (HSet/HSetNX)
  • Return ErrKeyExists when SetOrFail fails

Also applies to: 145-168


44-56: ****

Returning a concrete type that implements an interface does not break existing code in Go. Code assigning NewRedis() to a Cache interface variable will continue to work due to Go's implicit interface implementation:

var c Cache = NewRedis(...)  // Still valid

The concrete return type is actually more permissive—it allows both interface and concrete type assignments, whereas a pure interface return would only allow the former. The compile-time assertion at line 170 (var _ Cache = (*redisCache)(nil)) confirms the type properly implements the interface.

Likely an incorrect or invalid review comment.

pkg/cache/memory.go (1)

16-172: LGTM! Clean migration to byte-slice values.

The string-to-[]byte migration is consistent throughout the file. The error handling in getValue (line 152) correctly returns nil instead of an empty string. The interface compliance check at line 172 is good practice.

internal/sms-gateway/modules/messages/service.go (2)

69-75: LGTM! Proper WaitGroup usage.

The RunBackgroundTasks method correctly manages the goroutine lifecycle with WaitGroup and defer.


233-236: LGTM! Proper caching and error handling.

The caching of newly enqueued messages with appropriate error logging aligns with the cache-aside pattern used throughout the service.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
internal/sms-gateway/modules/messages/cache.go (1)

35-35: Consider a delimiter-safe key construction.

The key concatenation userID+":"+ID could theoretically collide if either component contains a colon. While this is unlikely given typical ID formats, consider using a more robust approach for production systems.

Apply this diff to use a safer delimiter or hashing approach:

-	return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(5*time.Minute))
+	return c.storage.Set(ctx, fmt.Sprintf("%s:%s", userID, ID), data, cacheImpl.WithTTL(5*time.Minute))

Or for guaranteed collision avoidance, use URL encoding or a similar escaping mechanism if IDs could contain special characters.

internal/sms-gateway/modules/messages/service.go (3)

114-116: Consider using request context for cache operations.

The cache operation uses context.Background(), which means it won't be cancelled if the request is cancelled. While this is acceptable for fire-and-forget caching, it could lead to unnecessary work if requests are frequently cancelled.

Consider accepting a context parameter and using it for cache operations:

-func (s *Service) UpdateState(device *models.Device, message MessageStateIn) error {
+func (s *Service) UpdateState(ctx context.Context, device *models.Device, message MessageStateIn) error {
 	existing, err := s.messages.Get(MessagesSelectFilter{ExtID: message.ID, DeviceID: device.ID}, MessagesSelectOptions{})
 	if err != nil {
 		return err
 	}
 
 	// ... existing logic ...
 
-	if err := s.cache.Set(context.Background(), device.UserID, existing.ExtID, anys.AsPointer(modelToMessageState(existing))); err != nil {
+	if err := s.cache.Set(ctx, device.UserID, existing.ExtID, anys.AsPointer(modelToMessageState(existing))); err != nil {
 		s.logger.Warn("can't cache message", zap.String("id", existing.ExtID), zap.Error(err))
 	}

Note: This would require updating all call sites to pass a context.


152-156: Consider shorter TTL for cached "not found" entries.

Caching nil entries to represent "not found" prevents repeated database lookups, which is good. However, the 5-minute TTL (inherited from cache.Set) might be too long. If a message is created shortly after a failed lookup, the cached nil will prevent finding it for up to 5 minutes.

Consider using a shorter TTL for nil entries:

// In cache.go, modify the Set method to accept optional TTL
func (c *cache) Set(ctx context.Context, userID, ID string, message *MessageStateOut, ttl ...time.Duration) error {
	var (
		err  error
		data []byte
	)

	if message != nil {
		data, err = json.Marshal(message)
		if err != nil {
			return fmt.Errorf("can't marshal message: %w", err)
		}
	}

	cacheTTL := 5 * time.Minute
	if len(ttl) > 0 {
		cacheTTL = ttl[0]
	}

	return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(cacheTTL))
}

Then use a shorter TTL for nil entries:

-		if err := s.cache.Set(context.Background(), user.ID, ID, nil); err != nil {
+		if err := s.cache.Set(context.Background(), user.ID, ID, nil, 30*time.Second); err != nil {
 			s.logger.Warn("can't cache message", zap.String("id", ID), zap.Error(err))
 		}

237-239: Inconsistent cache error logging level.

Cache errors here are logged at Error level (line 238), while similar cache failures in UpdateState (line 115) and GetState (lines 154, 163) use Warn level. Since cache failures are not critical and don't fail the operation, consider using consistent Warn level throughout.

Apply this diff for consistency:

 	if err := s.cache.Set(context.Background(), device.UserID, message.ID, anys.AsPointer(modelToMessageState(msg))); err != nil {
-		s.logger.Error("can't cache message", zap.Error(err))
+		s.logger.Warn("can't cache message", zap.String("id", msg.ExtID), zap.Error(err))
 	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4104505 and cdec0e9.

📒 Files selected for processing (3)
  • deployments/grafana/dashboards/messages.json (1 hunks)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
  • internal/sms-gateway/modules/messages/service.go (5 hunks)
✅ Files skipped from review due to trivial changes (1)
  • deployments/grafana/dashboards/messages.json
🧰 Additional context used
🧬 Code graph analysis (2)
internal/sms-gateway/modules/messages/service.go (5)
internal/sms-gateway/modules/devices/service.go (2)
  • Service (30-39)
  • NewService (157-165)
internal/sms-gateway/models/models.go (2)
  • Device (25-36)
  • User (17-23)
internal/sms-gateway/modules/messages/domain.go (2)
  • MessageStateIn (31-36)
  • MessageStateOut (38-44)
internal/sms-gateway/modules/messages/repository.go (1)
  • ErrMessageNotFound (18-18)
internal/sms-gateway/handlers/middlewares/deviceauth/deviceauth.go (1)
  • WithDevice (83-87)
internal/sms-gateway/modules/messages/cache.go (2)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (1)
  • WithTTL (22-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
internal/sms-gateway/modules/messages/cache.go (2)

16-20: LGTM!

The constructor is straightforward and correctly initializes the cache wrapper.


38-54: LGTM!

The Get method correctly handles cache misses and unmarshals cached data. The pattern of returning (nil, nil) for empty data aligns with the caching strategy for "not found" entries used in the service layer.

internal/sms-gateway/modules/messages/service.go (2)

44-67: LGTM!

The refactored constructor properly accepts and initializes all dependencies, following good dependency injection practices.


135-145: LGTM! Cache-first strategy with good metrics tracking.

The cache-first approach with hit/miss metrics is well-implemented. The pattern of returning cached nil entries to represent "not found" is valid and helps prevent repeated lookups.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
internal/sms-gateway/modules/messages/service.go (1)

134-167: LGTM! Cache-first retrieval pattern is well-implemented.

The method implements a sound caching strategy:

  • Cache hits (including nil entries) return immediately with metrics
  • Cache misses fetch from the database and populate the cache
  • Nil entries are cached to prevent repeated lookups for non-existent messages (line 153)
  • Cache errors are logged but don't fail the operation

The return type change to *MessageStateOut aligns with the caching layer where nil can represent a cached "not found" state.

Optional: Consider accepting context as a parameter.

Methods like GetState currently use context.Background() for cache operations. Accepting a context.Context parameter would enable request-scoped cancellation and tracing. However, this would require changes across multiple callers and is a broader architectural decision.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cdec0e9 and 8b0bfc8.

📒 Files selected for processing (1)
  • internal/sms-gateway/modules/messages/service.go (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/sms-gateway/modules/messages/service.go (7)
internal/sms-gateway/modules/devices/service.go (2)
  • Service (30-39)
  • NewService (157-165)
internal/sms-gateway/modules/events/service.go (2)
  • Service (13-24)
  • NewService (26-38)
internal/sms-gateway/modules/db/module.go (1)
  • IDGen (8-8)
internal/sms-gateway/models/models.go (2)
  • Device (25-36)
  • User (17-23)
internal/sms-gateway/modules/messages/domain.go (2)
  • MessageStateIn (31-36)
  • MessageStateOut (38-44)
internal/sms-gateway/modules/messages/repository.go (1)
  • ErrMessageNotFound (18-18)
internal/sms-gateway/handlers/middlewares/deviceauth/deviceauth.go (1)
  • WithDevice (83-87)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build / Docker image (linux/arm64)
  • GitHub Check: Build / Docker image (linux/amd64)
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)
🔇 Additional comments (5)
internal/sms-gateway/modules/messages/service.go (5)

33-34: LGTM! New struct fields align with explicit dependency wiring.

The addition of metrics and cache fields, along with making idgen explicit, supports the module's migration from fx-based DI to explicit constructor wiring.

Also applies to: 41-41


44-67: LGTM! Explicit dependency injection is correctly implemented.

The constructor properly wires all dependencies, replacing the fx-based DI pattern with explicit parameter passing. All parameters are correctly assigned to their respective struct fields.


69-75: LGTM! Background task lifecycle is correctly managed.

The method properly launches the hashing task in a background goroutine with appropriate WaitGroup synchronization for graceful shutdown.


90-121: LGTM! Caching integration in UpdateState is well-implemented.

The method correctly:

  • Uses the device pointer to access both device.ID and device.UserID for caching
  • Caches the updated state after persisting to the database (correct ordering)
  • Logs cache errors without failing the operation (appropriate for cache layer)
  • Increments metrics for the final state

The signature change from deviceID string to device *models.Device is a breaking change but provides necessary access to additional device fields.


237-240: LGTM! Enqueue caching is correctly implemented.

The caching logic properly:

  • Caches the newly created message after successful database insertion
  • Uses consistent cache key structure (device.UserID, message.ID)
  • Logs cache errors without failing the enqueue operation
  • Increments metrics after the cache operation

@capcom6 capcom6 force-pushed the messages/cache-state branch from 8b0bfc8 to 9a24386 Compare October 17, 2025 07:17
@capcom6
Copy link
Member Author

capcom6 commented Oct 17, 2025

Deployed...

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/cache/memory.go (1)

30-37: Avoid external mutation: copy []byte on Set/Get/Drain

Slices escape to callers as aliasing references; callers can mutate cached data or vice‑versa. Copy at boundaries.

 func newItem(value []byte, opts options) *memoryItem {
   item := &memoryItem{
-    value:      value,
+    value:      append([]byte(nil), value...),
     validUntil: opts.validUntil,
   }
   return item
 }
@@
-  items := make(map[string][]byte, len(cpy))
+  items := make(map[string][]byte, len(cpy))
   for key, item := range cpy {
-    items[key] = item.value
+    items[key] = append([]byte(nil), item.value...)
   }
@@
 func (m *memoryCache) getValue(getter func() (*memoryItem, bool)) ([]byte, error) {
   item, err := m.getItem(getter)
   if err != nil {
     return nil, err
   }
-  return item.value, nil
+  return append([]byte(nil), item.value...), nil
 }

Also applies to: 68-71, 173-180

pkg/cache/redis.go (1)

209-213: Pass field to HExpireAt in SetOrFail

The call lacks the field argument; TTL won’t be set. Align with Set.

-	if !options.validUntil.IsZero() {
-		if err := r.client.HExpireAt(ctx, r.key, options.validUntil).Err(); err != nil {
+	if !options.validUntil.IsZero() {
+		if err := r.client.HExpireAt(ctx, r.key, options.validUntil, key).Err(); err != nil {
 			return fmt.Errorf("can't set cache item ttl: %w", err)
 		}
 	}

Based on learnings

🧹 Nitpick comments (2)
pkg/cache/options.go (1)

40-84: Clarify AndUpdateTTL semantics and edge cases

Please document and enforce behavior when:

  • AndUpdateTTL is used on items without TTL (should it set from now?).
  • Negative/zero durations are passed to AndSetTTL/AndUpdateTTL (normalize or reject).
    This avoids divergent impls between memory/redis.

Would you like a small test matrix added for Get with: no opts, AndDelete, AndSetTTL, AndUpdateTTL (with/without prior TTL)?

pkg/cache/memory.go (1)

113-121: Remove commented legacy code

Dead commented block for GetAndDelete; delete it to reduce noise.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a24386 and 7337507.

📒 Files selected for processing (4)
  • pkg/cache/cache.go (2 hunks)
  • pkg/cache/memory.go (8 hunks)
  • pkg/cache/options.go (1 hunks)
  • pkg/cache/redis.go (5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.

Applied to files:

  • pkg/cache/redis.go
🧬 Code graph analysis (3)
pkg/cache/cache.go (1)
pkg/cache/options.go (2)
  • Option (6-6)
  • GetOption (47-47)
pkg/cache/memory.go (2)
pkg/cache/options.go (3)
  • GetOption (47-47)
  • AndDelete (80-84)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
pkg/cache/redis.go (3)
pkg/cache/options.go (3)
  • GetOption (47-47)
  • AndDelete (80-84)
  • Option (6-6)
pkg/cache/errors.go (1)
  • ErrKeyNotFound (7-7)
pkg/cache/cache.go (1)
  • Cache (5-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: E2E
  • GitHub Check: Benchmark
  • GitHub Check: Lint
  • GitHub Check: Analyze (go)
🔇 Additional comments (3)
pkg/cache/cache.go (1)

5-35: API migration to []byte and GetOption looks good

Interface surface is coherent: binary-safe values, Delete/Cleanup added, Drain returns map[string][]byte, and Get’s options are future‑proof. LGTM.

pkg/cache/redis.go (1)

63-75: No DI binding issues found—code is correctly wired

Verification confirms that NewRedis returns *redisCache, which implements the Cache interface via pointer receiver methods (Set, SetOrFail, Get, etc.). The factory at internal/sms-gateway/cache/factory.go:49 correctly returns the result as Cache (the interface). No changes needed.

pkg/cache/memory.go (1)

16-23: Ignore this review comment. The return type change is correct.

NewMemory returning *memoryCache poses no DI binding issues. The type implements all Cache interface methods with pointer receivers, and the explicit interface satisfaction check at line 192 (var _ Cache = (*memoryCache)(nil)) confirms this. Go's implicit interface satisfaction ensures the concrete type satisfies the interface contract automatically—no changes needed.

Likely an incorrect or invalid review comment.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
pkg/cache/memory.go (1)

90-106: Fix nil dereference and TTL semantics.

Multiple critical issues remain:

  1. Nil pointer dereference: Line 94 calls item.isExpired() without first checking ok. If the key doesn't exist, item is nil and this panics.
  2. Incorrect AndUpdateTTL: Line 100 adds to item.validUntil directly. If the item has no TTL (zero time), this produces a timestamp in epoch 1970 instead of "now + duration".
  3. Unsafe delete check: Line 92 checks o.delete but doesn't verify ok before proceeding to line 94.

Apply this diff to fix:

 m.mux.Lock()
 item, ok := m.items[key]
+if !ok {
+	m.mux.Unlock()
+	return nil, false
+}
+if item.isExpired(time.Now()) {
+	if o.delete {
+		delete(m.items, key)
+	}
+	m.mux.Unlock()
+	return nil, false
+}
+
 if o.delete {
 	delete(m.items, key)
-} else if !item.isExpired(time.Now()) {
+} else {
 	if o.validUntil != nil {
 		item.validUntil = *o.validUntil
 	} else if o.setTTL != nil {
 		item.validUntil = time.Now().Add(*o.setTTL)
 	} else if o.updateTTL != nil {
-		item.validUntil = item.validUntil.Add(*o.updateTTL)
+		if item.validUntil.IsZero() {
+			item.validUntil = time.Now().Add(*o.updateTTL)
+		} else {
+			item.validUntil = item.validUntil.Add(*o.updateTTL)
+		}
 	}
 }
 m.mux.Unlock()
pkg/cache/redis.go (2)

24-48: Fix Lua script bugs: undefined variable, wrong Redis command, incorrect TTL logic.

Multiple critical issues in the script:

  1. Line 44: Uses undefined variable exp — should be newTtl.
  2. Line 44: HExpire expects a relative duration in seconds, but the code treats it as absolute time.
  3. Line 42: HTTL returns seconds but may not exist in all Redis versions; use HPTTL (milliseconds) for compatibility.
  4. Lines 42-44: TTL delta logic is broken:
    • If field has no TTL (ttl == -1), adding ttlDelta to -1 gives incorrect result.
    • Should set TTL to "now + ttlDelta" when no TTL exists.
    • When TTL exists, should extend it, not replace it.

Apply this diff to fix:

 if ttlTs > 0 then
   redis.call('HExpireAt', KEYS[1], ttlTs, field)
 elseif ttlDelta > 0 then
-  local ttl = redis.call('HTTL', KEYS[1], field)
-  local newTtl = ttl + ttlDelta
-  redis.call('HExpire', KEYS[1], exp, field)
+  local pttl = redis.call('HPTTL', KEYS[1], field)
+  if pttl == -1 then
+    -- No TTL set, set relative from now
+    redis.call('HExpire', KEYS[1], ttlDelta, field)
+  elseif pttl > 0 then
+    -- Extend existing TTL
+    local currentSec = math.floor(pttl / 1000)
+    redis.call('HExpire', KEYS[1], currentSec + ttlDelta, field)
+  end
+  -- If pttl == -2, field doesn't exist (shouldn't happen after HGET succeeded)
 end

128-146: AndDelete bypassed when no TTL options provided.

Critical logic flaw: Lines 137-146 fall back to plain HGet when no TTL options are set, even if o.delete is true. This means Get(ctx, key, AndDelete()) won't delete the key.

Root cause: The else at line 136 executes when all TTL options are nil, but doesn't check whether o.delete is set. The plain HGet fallback at line 138 ignores the delete flag entirely.

Apply this diff to fix:

 // Handle TTL options atomically using Lua script
 var ttlTimestamp, ttlDelta int64
 if o.validUntil != nil {
 	ttlTimestamp = o.validUntil.Unix()
 } else if o.setTTL != nil {
 	ttlTimestamp = time.Now().Add(*o.setTTL).Unix()
 } else if o.updateTTL != nil {
 	ttlDelta = int64(o.updateTTL.Seconds())
-} else {
-	// No TTL options, fallback to simple get
-	val, err := r.client.HGet(ctx, r.key, key).Result()
-	if err != nil {
-		if err == redis.Nil {
-			return nil, ErrKeyNotFound
-		}
-		return nil, fmt.Errorf("can't get cache item: %w", err)
-	}
-	return []byte(val), nil
 }

 delArg := "0"
 if o.delete {
 	delArg = "1"
 }

 // Use atomic get and TTL update script
 result, err := r.client.Eval(ctx, getAndUpdateTTLScript, []string{r.key}, key, delArg, ttlTimestamp, ttlDelta).Result()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7337507 and 002fa6e.

📒 Files selected for processing (2)
  • pkg/cache/memory.go (8 hunks)
  • pkg/cache/redis.go (5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.

Applied to files:

  • pkg/cache/redis.go
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
🧬 Code graph analysis (2)
pkg/cache/redis.go (3)
pkg/cache/options.go (3)
  • GetOption (47-47)
  • AndDelete (80-84)
  • Option (6-6)
pkg/cache/errors.go (1)
  • ErrKeyNotFound (7-7)
pkg/cache/cache.go (1)
  • Cache (5-36)
pkg/cache/memory.go (2)
pkg/cache/options.go (3)
  • GetOption (47-47)
  • AndDelete (80-84)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Benchmark
  • GitHub Check: E2E
  • GitHub Check: Analyze (go)
🔇 Additional comments (14)
pkg/cache/memory.go (6)

16-23: LGTM: Concrete type return aligns with API pattern.

Returning *memoryCache instead of the Cache interface is consistent with NewRedis returning *redisCache and allows callers flexibility.


26-37: LGTM: Defensive copy prevents external mutations.

The []byte migration and defensive copy in newItem correctly prevent callers from mutating cached values.


109-112: LGTM: Clean delegation pattern.

GetAndDelete correctly delegates to Get with the AndDelete() option. Note this depends on the Get implementation being fixed (see previous comment).


115-121: LGTM: []byte migration consistent.

Both Set and SetOrFail correctly migrate to []byte parameters and rely on newItem for defensive copying.

Also applies to: 124-136


164-171: LGTM: Defensive copy in getValue.

Line 170 correctly returns a defensive copy to prevent external mutations.


187-187: LGTM: Interface compliance check.

The type assertion ensures memoryCache correctly implements the Cache interface.

pkg/cache/redis.go (8)

15-22: LGTM: Safe UNLINK with DEL fallback.

Lines 18-19 correctly use pcall to attempt UNLINK (non-blocking) and fall back to DEL for older Redis versions.


59-71: LGTM: Concrete type return consistent with memory cache.

Returning *redisCache aligns with the API pattern and allows caller flexibility.


88-107: LGTM: []byte conversion in Drain.

Line 103 correctly converts Redis string values to []byte, with Go's conversion allocating new slices preventing mutation concerns.


148-164: LGTM: Lua script invocation correct (pending fallback removal).

Lines 148-164 correctly invoke the Lua script with proper argument formatting (delArg as "0"/"1" string) and handle the result. This will work correctly once the fallback at lines 137-146 is removed (see previous comment).


166-169: LGTM: Clean delegation to Get.

GetAndDelete correctly delegates to Get with the AndDelete() option.


172-191: LGTM: []byte migration in Set.

The signature migration to []byte is correct, and Redis pipeline usage with HExpireAt properly handles field-level TTLs.


194-217: LGTM: SetOrFail with non-atomic TTL.

The []byte migration and HSetNX usage are correct. Note that the TTL is set after the value (lines 211-213), so there's a brief window where the value exists without TTL if the HExpireAt call fails. This is acceptable for the current design but could be improved with a Lua script if atomic set+TTL is required.


219-219: LGTM: Interface compliance check.

The type assertion ensures redisCache implements the Cache interface.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
pkg/cache/memory.go (2)

68-71: Missing defensive copy in Drain return values.

Line 70 assigns item.value directly without copying. Callers could mutate the returned byte slices, violating immutability expectations.

Apply this diff to add defensive copies:

 items := make(map[string][]byte, len(cpy))
 for key, item := range cpy {
-	items[key] = item.value
+	items[key] = append([]byte{}, item.value...)
 }

77-107: Fix nil pointer dereference and incorrect TTL update semantics.

Critical issues in the Get implementation:

  1. Nil pointer dereference: When the key doesn't exist (Line 91, ok == false), item is nil. Line 94 calls item.isExpired(time.Now()) without checking ok, causing a panic.

  2. Incorrect TTL update: Line 100 uses item.validUntil.Add(*o.updateTTL) which fails when validUntil is zero (no TTL set). Adding to a zero time yields Unix epoch + duration instead of now + duration.

  3. Incomplete delete semantics: When delete is requested but the key doesn't exist, the code still attempts to return the nil item.

Apply this diff to fix all issues:

 func (m *memoryCache) Get(_ context.Context, key string, opts ...GetOption) ([]byte, error) {
+	o := getOptions{}
+	o.apply(opts...)
+
+	// Read-only fast path
+	if o.isEmpty() {
+		m.mux.RLock()
+		item, ok := m.items[key]
+		if !ok {
+			m.mux.RUnlock()
+			return nil, ErrKeyNotFound
+		}
+		if item.isExpired(time.Now()) {
+			m.mux.RUnlock()
+			return nil, ErrKeyExpired
+		}
+		val := append([]byte(nil), item.value...)
+		m.mux.RUnlock()
+		return val, nil
+	}
+
+	// Mutating path (delete/TTL adjustments)
+	m.mux.Lock()
+	item, ok := m.items[key]
+	if !ok {
+		m.mux.Unlock()
+		return nil, ErrKeyNotFound
+	}
+	if item.isExpired(time.Now()) {
+		if o.delete {
+			delete(m.items, key)
+		}
+		m.mux.Unlock()
+		return nil, ErrKeyExpired
+	}
+
+	if o.delete {
+		val := append([]byte(nil), item.value...)
+		delete(m.items, key)
+		m.mux.Unlock()
+		return val, nil
+	}
+	if o.validUntil != nil {
+		item.validUntil = *o.validUntil
+	} else if o.setTTL != nil {
+		item.validUntil = time.Now().Add(*o.setTTL)
+	} else if o.updateTTL != nil {
+		if item.validUntil.IsZero() {
+			item.validUntil = time.Now().Add(*o.updateTTL)
+		} else {
+			item.validUntil = item.validUntil.Add(*o.updateTTL)
+		}
+	}
+	val := append([]byte(nil), item.value...)
+	m.mux.Unlock()
+	return val, nil
-	return m.getValue(func() (*memoryItem, bool) {
-		if len(opts) == 0 {
-			m.mux.RLock()
-			item, ok := m.items[key]
-			m.mux.RUnlock()
-
-			return item, ok
-		}
-
-		o := getOptions{}
-		o.apply(opts...)
-
-		m.mux.Lock()
-		item, ok := m.items[key]
-		if o.delete {
-			delete(m.items, key)
-		} else if !item.isExpired(time.Now()) {
-			if o.validUntil != nil {
-				item.validUntil = *o.validUntil
-			} else if o.setTTL != nil {
-				item.validUntil = time.Now().Add(*o.setTTL)
-			} else if o.updateTTL != nil {
-				item.validUntil = item.validUntil.Add(*o.updateTTL)
-			}
-		}
-		m.mux.Unlock()
-
-		return item, ok
-	})
 }
🧹 Nitpick comments (1)
pkg/cache/memory.go (1)

16-23: Consider returning the Cache interface instead of the concrete type.

Returning *memoryCache instead of Cache reduces flexibility and makes it harder to swap implementations. Unless there's a specific reason to expose the concrete type, prefer returning the interface.

Apply this diff to return the interface:

-func NewMemory(ttl time.Duration) *memoryCache {
+func NewMemory(ttl time.Duration) Cache {
 	return &memoryCache{
 		items: make(map[string]*memoryItem),
 		ttl:   ttl,
 
 		mux: sync.RWMutex{},
 	}
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 002fa6e and 0a46aec.

📒 Files selected for processing (1)
  • pkg/cache/memory.go (8 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
🧬 Code graph analysis (1)
pkg/cache/memory.go (2)
pkg/cache/options.go (3)
  • GetOption (47-47)
  • AndDelete (80-84)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)
🔇 Additional comments (3)
pkg/cache/memory.go (3)

109-112: LGTM! Clean wrapper implementation.

The GetAndDelete implementation correctly delegates to Get with the AndDelete option, providing a clean and type-safe API.


115-136: LGTM! Correct migration to []byte values.

Both Set and SetOrFail correctly handle []byte values with defensive copies via newItem, maintaining data integrity.


187-187: LGTM! Interface compliance check is good practice.

The compile-time assertion ensures that *memoryCache implements the Cache interface, catching any drift early.

@capcom6 capcom6 added the deployed The PR is deployed on staging label Oct 18, 2025
@capcom6 capcom6 force-pushed the messages/cache-state branch from 4a8e681 to c52c565 Compare October 19, 2025 00:07
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/sms-gateway/modules/messages/tasks.go (2)

34-38: Guard ticker interval; time.NewTicker panics on ≤ 0.

Validate and default before creating the ticker.

- ticker := time.NewTicker(t.Config.Interval)
+ interval := t.Config.Interval
+ if interval <= 0 {
+   t.Logger.Warn("invalid hashing interval; defaulting to 1m", zap.Duration("interval", interval))
+   interval = time.Minute
+ }
+ ticker := time.NewTicker(interval)

47-53: Flush pending queue on shutdown.

Avoid dropping enqueued IDs on graceful stop.

case <-ctx.Done():
  t.Logger.Info("Stopping hashing task...")
- return
+ t.process() // final flush
+ return
♻️ Duplicate comments (4)
pkg/cache/memory.go (3)

68-71: Missing defensive copy in Drain values.

Line 70 assigns item.value directly into the returned map. Even though the cache is cleared, callers can mutate the returned slices, which may violate immutability expectations.

Apply this diff:

 	items := make(map[string][]byte, len(cpy))
 	for key, item := range cpy {
-		items[key] = item.value
+		items[key] = append([]byte(nil), item.value...)
 	}

100-100: AndUpdateTTL on zero validUntil produces incorrect expiration.

Line 100 adds *o.updateTTL to item.validUntil. If validUntil.IsZero() (no TTL set), adding a duration to the zero time results in a timestamp far in the past (1970-01-01 + duration), not a future expiration.

Apply this diff:

 		} else if !item.isExpired(time.Now()) {
 			if o.validUntil != nil {
 				item.validUntil = *o.validUntil
 			} else if o.setTTL != nil {
 				item.validUntil = time.Now().Add(*o.setTTL)
 			} else if o.updateTTL != nil {
-				item.validUntil = item.validUntil.Add(*o.updateTTL)
+				if item.validUntil.IsZero() {
+					item.validUntil = time.Now().Add(*o.updateTTL)
+				} else {
+					item.validUntil = item.validUntil.Add(*o.updateTTL)
+				}
 			} else if o.defaultTTL {
 				item.validUntil = time.Now().Add(m.ttl)
 			}
 		}

77-108: Nil pointer dereference at line 94 when key not found with options.

Line 91 retrieves item, ok under lock. If !ok, item is nil. Line 94 calls item.isExpired(time.Now()) without checking ok first, causing a panic when the key doesn't exist and options are provided.

Apply this diff:

 func (m *memoryCache) Get(_ context.Context, key string, opts ...GetOption) ([]byte, error) {
 	return m.getValue(func() (*memoryItem, bool) {
 		if len(opts) == 0 {
 			m.mux.RLock()
 			item, ok := m.items[key]
 			m.mux.RUnlock()
 
 			return item, ok
 		}
 
 		o := getOptions{}
 		o.apply(opts...)
 
 		m.mux.Lock()
 		item, ok := m.items[key]
+		if !ok {
+			m.mux.Unlock()
+			return nil, false
+		}
+
 		if o.delete {
 			delete(m.items, key)
 		} else if !item.isExpired(time.Now()) {
 			if o.validUntil != nil {
 				item.validUntil = *o.validUntil
 			} else if o.setTTL != nil {
 				item.validUntil = time.Now().Add(*o.setTTL)
 			} else if o.updateTTL != nil {
 				item.validUntil = item.validUntil.Add(*o.updateTTL)
 			} else if o.defaultTTL {
 				item.validUntil = time.Now().Add(m.ttl)
 			}
 		}
 		m.mux.Unlock()
 
 		return item, ok
 	})
 }
pkg/cache/redis.go (1)

110-166: AndDelete bypassed when defaultTTL is the only option set.

Lines 136-147 provide a fallback path that executes a plain HGet when defaultTTL is set but no other TTL options are present. This bypasses the Lua script, so if o.delete is also true, the field won't be deleted atomically. The fallback should only occur when truly no side-effects are needed.

Apply this diff to ensure the script is invoked for any mutation:

 func (r *redisCache) Get(ctx context.Context, key string, opts ...GetOption) ([]byte, error) {
 	o := getOptions{}
 	o.apply(opts...)
 
 	if o.isEmpty() {
 		// No options, simple get
 		val, err := r.client.HGet(ctx, r.key, key).Result()
 		if err != nil {
 			if err == redis.Nil {
 				return nil, ErrKeyNotFound
 			}
 
 			return nil, fmt.Errorf("can't get cache item: %w", err)
 		}
 
 		return []byte(val), nil
 	}
 
 	// Handle TTL options atomically using Lua script
 	var ttlTimestamp, ttlDelta int64
 	if o.validUntil != nil {
 		ttlTimestamp = o.validUntil.Unix()
 	} else if o.setTTL != nil {
 		ttlTimestamp = time.Now().Add(*o.setTTL).Unix()
 	} else if o.updateTTL != nil {
 		ttlDelta = int64(o.updateTTL.Seconds())
 	} else if o.defaultTTL {
 		ttlTimestamp = time.Now().Add(r.ttl).Unix()
-	} else {
-		// No TTL options, fallback to simple get
-		val, err := r.client.HGet(ctx, r.key, key).Result()
-		if err != nil {
-			if err == redis.Nil {
-				return nil, ErrKeyNotFound
-			}
-			return nil, fmt.Errorf("can't get cache item: %w", err)
-		}
-		return []byte(val), nil
 	}
 
 	delArg := "0"
 	if o.delete {
 		delArg = "1"
 	}
 
 	// Use atomic get and TTL update script
 	result, err := r.client.Eval(ctx, getAndUpdateTTLScript, []string{r.key}, key, delArg, ttlTimestamp, ttlDelta).Result()
 	if err != nil {
 		return nil, fmt.Errorf("can't get cache item: %w", err)
 	}
 
 	if value, ok := result.(string); ok {
 		return []byte(value), nil
 	}
 
 	return nil, ErrKeyNotFound
 }
🧹 Nitpick comments (6)
internal/sms-gateway/modules/messages/metrics.go (2)

33-45: Prometheus naming: use _total suffix (or a single CounterVec).

Counters should end with _total. Either:

  • Rename to cache_hits_total and cache_misses_total; or
  • Prefer a single cache_requests_total{result="hit|miss"} for simpler dashboards and fewer series.

Suggested minimal rename:

- Name:      "cache_hits",
+ Name:      "cache_hits_total",
...
- Name:      "cache_misses",
+ Name:      "cache_misses_total",

24-32: Avoid global registry: inject a prometheus.Registerer.

Using promauto without a custom registerer binds to the global registry, making tests flaky (duplicate registration) and complicating multi-registry setups.

Consider:

-func newMetrics() *metrics {
+func newMetrics(reg prometheus.Registerer) *metrics {
-    totalCounter: promauto.NewCounterVec(
+    totalCounter: promauto.With(reg).NewCounterVec(
...
-    cacheHits:    promauto.NewCounter(
+    cacheHits:    promauto.With(reg).NewCounter(
...
-    cacheMisses:  promauto.NewCounter(
+    cacheMisses:  promauto.With(reg).NewCounter(

Wire prometheus.Registerer in the module via Fx.

internal/sms-gateway/modules/messages/module.go (1)

21-24: If you inject a Prometheus registerer, adjust wiring.

Example:

- fx.Provide(newMetrics, fx.Private),
+ fx.Provide(func(reg prometheus.Registerer) *metrics {
+   return newMetrics(reg)
+ }, fx.Private),
internal/sms-gateway/modules/messages/tasks.go (1)

56-61: Minor: prefer defer to ensure unlock on future edits.

func (t *hashingTask) Enqueue(id uint64) {
- t.mux.Lock()
- t.queue[id] = struct{}{}
- t.mux.Unlock()
+ t.mux.Lock()
+ defer t.mux.Unlock()
+ t.queue[id] = struct{}{}
}
internal/sms-gateway/modules/messages/service.go (2)

114-116: Use request-scoped context instead of context.Background().

Thread a ctx context.Context into Service methods (or accept one on cache ops) to honor cancellations/timeouts.

Example:

-func (s *Service) UpdateState(device *models.Device, message MessageStateIn) error {
+func (s *Service) UpdateState(ctx context.Context, device *models.Device, message MessageStateIn) error {
...
- if err := s.cache.Set(context.Background(), device.UserID, existing.ExtID, ...); err != nil { ... }
+ if err := s.cache.Set(ctx, device.UserID, existing.ExtID, ...); err != nil { ... }

Also applies to: 162-164, 237-239


90-121: Nil-device guard for new pointer API.

Add a fast check to avoid panics and improve error reporting:

if device == nil || device.ID == "" {
  return ErrValidation("missing device")
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a8e681 and c52c565.

📒 Files selected for processing (21)
  • api/requests.http (1 hunks)
  • deployments/grafana/dashboards/messages.json (1 hunks)
  • internal/sms-gateway/app.go (1 hunks)
  • internal/sms-gateway/handlers/messages/3rdparty.go (1 hunks)
  • internal/sms-gateway/handlers/messages/mobile.go (1 hunks)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
  • internal/sms-gateway/modules/messages/domain.go (1 hunks)
  • internal/sms-gateway/modules/messages/metrics.go (1 hunks)
  • internal/sms-gateway/modules/messages/module.go (1 hunks)
  • internal/sms-gateway/modules/messages/service.go (5 hunks)
  • internal/sms-gateway/modules/messages/tasks.go (4 hunks)
  • internal/sms-gateway/online/service.go (2 hunks)
  • pkg/cache/cache.go (2 hunks)
  • pkg/cache/memory.go (8 hunks)
  • pkg/cache/memory_bench_test.go (18 hunks)
  • pkg/cache/memory_concurrency_test.go (13 hunks)
  • pkg/cache/memory_edge_test.go (19 hunks)
  • pkg/cache/memory_profile_test.go (7 hunks)
  • pkg/cache/memory_test.go (28 hunks)
  • pkg/cache/options.go (1 hunks)
  • pkg/cache/redis.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • api/requests.http
  • deployments/grafana/dashboards/messages.json
  • internal/sms-gateway/handlers/messages/mobile.go
  • internal/sms-gateway/online/service.go
  • internal/sms-gateway/modules/messages/domain.go
  • internal/sms-gateway/handlers/messages/3rdparty.go
  • pkg/cache/options.go
  • pkg/cache/cache.go
  • pkg/cache/memory_test.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.

Applied to files:

  • pkg/cache/redis.go
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
🧬 Code graph analysis (8)
pkg/cache/memory_bench_test.go (1)
pkg/cache/options.go (1)
  • WithTTL (22-30)
pkg/cache/redis.go (3)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (87-91)
  • Option (6-6)
pkg/cache/errors.go (1)
  • ErrKeyNotFound (7-7)
pkg/cache/cache.go (1)
  • Cache (5-36)
internal/sms-gateway/modules/messages/module.go (3)
internal/sms-gateway/modules/cleaner/module.go (1)
  • AsCleanable (8-14)
internal/sms-gateway/cache/factory.go (1)
  • Factory (17-19)
internal/sms-gateway/modules/messages/service.go (2)
  • NewService (44-67)
  • Service (30-42)
pkg/cache/memory.go (2)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (87-91)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
internal/sms-gateway/app.go (1)
internal/sms-gateway/modules/messages/module.go (1)
  • Module (11-35)
internal/sms-gateway/modules/messages/cache.go (2)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (2)
  • WithTTL (22-30)
  • AndSetTTL (63-67)
internal/sms-gateway/modules/messages/service.go (8)
internal/sms-gateway/modules/events/service.go (2)
  • Service (13-24)
  • NewService (26-38)
internal/sms-gateway/modules/messages/config.go (1)
  • Config (5-7)
internal/sms-gateway/modules/db/module.go (1)
  • IDGen (8-8)
internal/sms-gateway/models/models.go (2)
  • Device (25-36)
  • User (17-23)
internal/sms-gateway/modules/messages/domain.go (2)
  • MessageStateIn (31-36)
  • MessageStateOut (38-44)
internal/sms-gateway/modules/messages/repository_filter.go (2)
  • MessagesSelectFilter (16-23)
  • MessagesSelectOptions (25-36)
internal/sms-gateway/modules/messages/repository.go (1)
  • ErrMessageNotFound (18-18)
internal/sms-gateway/handlers/middlewares/deviceauth/deviceauth.go (1)
  • WithDevice (83-87)
pkg/cache/memory_concurrency_test.go (1)
pkg/cache/options.go (1)
  • WithTTL (22-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)
🔇 Additional comments (12)
pkg/cache/memory_edge_test.go (1)

22-350: LGTM! Test updates correctly reflect []byte API.

All test changes are mechanical conversions from string to []byte values and assertions updated to use string(retrieved). The test logic and coverage remain intact.

pkg/cache/redis.go (2)

88-106: LGTM! Drain correctly migrated to []byte.

The return type and internal handling properly convert Redis strings to byte slices, and the empty-result path returns an empty map as expected.


174-193: LGTM! Set correctly handles []byte values.

The pipelined Set operation properly accepts []byte and applies TTL via HExpireAt when configured.

pkg/cache/memory.go (1)

16-23: LGTM! Constructor correctly returns concrete type.

Returning *memoryCache instead of Cache aligns with the pattern used in Redis and supports explicit interface compliance checking.

pkg/cache/memory_profile_test.go (1)

29-211: LGTM! Profile tests correctly updated for []byte API.

All Set calls and assertions mechanically updated to use byte slices. Test logic unchanged.

pkg/cache/memory_concurrency_test.go (1)

23-424: LGTM! Concurrency tests correctly updated for []byte API.

All test updates are mechanical conversions. Concurrency semantics unchanged.

pkg/cache/memory_bench_test.go (1)

25-457: LGTM! Benchmarks correctly updated for []byte API.

All benchmark Set calls updated to pass byte slices. Benchmark logic preserved.

internal/sms-gateway/app.go (1)

49-49: LGTM! Module wiring correctly updated.

Calling messages.Module() aligns with the new constructor pattern that returns fx.Option.

internal/sms-gateway/modules/messages/cache.go (1)

22-36: LGTM! Set method correctly marshals and stores with TTL.

The Set implementation properly handles nil messages and applies a 5-minute TTL consistently.

internal/sms-gateway/modules/messages/tasks.go (1)

39-44: Confirm semantics of HashProcessed([]uint64{}).

Empty slice may mean “hash nothing” or “hash all processed.” If it triggers a full scan, gate it via config or document intent; otherwise this can be heavy on startup.

internal/sms-gateway/modules/messages/service.go (1)

69-75: Background task wiring looks good.

Run via WaitGroup with context—clean and straightforward.

internal/sms-gateway/modules/messages/module.go (1)

11-19: No issues found.

The repository pins Fx v1.24.0 in go.mod, and fx.Private has been available since v1.19.0. The codebase correctly uses fx.Private across multiple modules, and the version constraint is satisfied.

@capcom6 capcom6 force-pushed the messages/cache-state branch from 291d211 to ac815ba Compare October 19, 2025 06:33
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/cache/memory.go (1)

166-173: Return a copy from getValue to avoid external mutation.

Expose a cloned []byte.

 func (m *memoryCache) getValue(getter func() (*memoryItem, bool)) ([]byte, error) {
   item, err := m.getItem(getter)
   if err != nil {
     return nil, err
   }
-  return item.value, nil
+  return append([]byte(nil), item.value...), nil
 }
♻️ Duplicate comments (7)
internal/sms-gateway/modules/messages/cache.go (1)

38-44: Treat cache miss as (nil, nil); don’t propagate ErrKeyNotFound.

Current code returns storage errors directly; callers likely expect (nil, nil) on miss.

Apply:

 func (c *cache) Get(ctx context.Context, userID, ID string) (*MessageStateOut, error) {
   data, err := c.storage.Get(ctx, userID+":"+ID, cacheImpl.AndSetTTL(5*time.Minute))
   if err != nil {
-    return nil, err
+    if err == cacheImpl.ErrKeyNotFound {
+      return nil, nil
+    }
+    return nil, err
   }
pkg/cache/memory.go (2)

77-108: Fix nil deref, atomic AndDelete/TTL, zero-TTL handling, and remove data race.

Current path can nil-deref item, mutates without existence/expiry checks, and mis-handles updateTTL on zero validUntil; also not atomic for AndDelete. Rewrite Get as below.

 func (m *memoryCache) Get(_ context.Context, key string, opts ...GetOption) ([]byte, error) {
-  return m.getValue(func() (*memoryItem, bool) {
-    if len(opts) == 0 {
-      m.mux.RLock()
-      item, ok := m.items[key]
-      m.mux.RUnlock()
-      return item, ok
-    }
-
-    o := getOptions{}
-    o.apply(opts...)
-
-    m.mux.Lock()
-    item, ok := m.items[key]
-    if o.delete {
-      delete(m.items, key)
-    } else if !item.isExpired(time.Now()) {
-      if o.validUntil != nil {
-        item.validUntil = *o.validUntil
-      } else if o.setTTL != nil {
-        item.validUntil = time.Now().Add(*o.setTTL)
-      } else if o.updateTTL != nil {
-        item.validUntil = item.validUntil.Add(*o.updateTTL)
-      } else if o.defaultTTL {
-        item.validUntil = time.Now().Add(m.ttl)
-      }
-    }
-    m.mux.Unlock()
-    return item, ok
-  })
+  o := getOptions{}
+  o.apply(opts...)
+
+  // Read‑only fast path
+  if o.isEmpty() {
+    m.mux.RLock()
+    item, ok := m.items[key]
+    if !ok {
+      m.mux.RUnlock()
+      return nil, ErrKeyNotFound
+    }
+    if item.isExpired(time.Now()) {
+      m.mux.RUnlock()
+      return nil, ErrKeyExpired
+    }
+    val := append([]byte(nil), item.value...)
+    m.mux.RUnlock()
+    return val, nil
+  }
+
+  // Mutating path (delete/TTL adjustments) under exclusive lock
+  m.mux.Lock()
+  item, ok := m.items[key]
+  if !ok {
+    m.mux.Unlock()
+    return nil, ErrKeyNotFound
+  }
+  if item.isExpired(time.Now()) {
+    if o.delete {
+      delete(m.items, key)
+    }
+    m.mux.Unlock()
+    return nil, ErrKeyExpired
+  }
+
+  if o.delete {
+    val := append([]byte(nil), item.value...)
+    delete(m.items, key)
+    m.mux.Unlock()
+    return val, nil
+  }
+  if o.validUntil != nil {
+    item.validUntil = *o.validUntil
+  } else if o.setTTL != nil {
+    item.validUntil = time.Now().Add(*o.setTTL)
+  } else if o.updateTTL != nil {
+    if item.validUntil.IsZero() {
+      item.validUntil = time.Now().Add(*o.updateTTL)
+    } else {
+      item.validUntil = item.validUntil.Add(*o.updateTTL)
+    }
+  } else if o.defaultTTL && m.ttl > 0 {
+    item.validUntil = time.Now().Add(m.ttl)
+  }
+  val := append([]byte(nil), item.value...)
+  m.mux.Unlock()
+  return val, nil
 }

68-71: Make defensive copies in Drain.

Prevent callers from mutating internal slices after drain.

 items := make(map[string][]byte, len(cpy))
 for key, item := range cpy {
-  items[key] = item.value
+  items[key] = append([]byte(nil), item.value...)
 }
internal/sms-gateway/modules/messages/service.go (2)

134-146: Fix cache hit/miss counting; miss currently recorded as hit.

cache.Get returns (nil, nil) on miss; only increment hit when dto != nil.

 dto, err := s.cache.Get(context.Background(), user.ID, ID)
-if err == nil {
-  s.metrics.IncCache(true)
-  // Cache nil entries represent "not found" and prevent repeated lookups
-  if dto == nil {
-    return nil, ErrMessageNotFound
-  }
-  return dto, nil
-}
-s.metrics.IncCache(false)
+if err != nil {
+  s.metrics.IncCache(false)
+} else {
+  if dto != nil {
+    s.metrics.IncCache(true)
+    return dto, nil
+  }
+  // negative cache hit: known "not found"
+  s.metrics.IncCache(false)
+  return nil, ErrMessageNotFound
+}

237-239: Bug: wrong cache key when ExtID is generated.

You cache under message.ID, but when empty you generate msg.ExtID; reads use ExtID.

- if err := s.cache.Set(context.Background(), device.UserID, message.ID, anys.AsPointer(modelToMessageState(msg))); err != nil {
-   s.logger.Warn("can't cache message", zap.String("id", msg.ExtID), zap.Error(err))
- }
+ cacheKey := msg.ExtID
+ if err := s.cache.Set(context.Background(), device.UserID, cacheKey, anys.AsPointer(modelToMessageState(msg))); err != nil {
+   s.logger.Warn("can't cache message", zap.String("id", cacheKey), zap.Error(err))
+ }
pkg/cache/redis.go (2)

114-148: Ensure AndDelete always uses the Lua path; remove fallback that bypasses delete.

When only AndDelete is set, current code falls back to HGet and never deletes.

 o := getOptions{}
 o.apply(opts...)
 
- if o.isEmpty() {
+ if o.isEmpty() {
   // No options, simple get
   val, err := r.client.HGet(ctx, r.key, key).Result()
   ...
   return []byte(val), nil
 }
 
- // Handle TTL options atomically using Lua script
- var ttlTimestamp, ttlDelta int64
+ // Determine if we need the script path (delete or any TTL change)
+ var ttlTimestamp, ttlDelta int64
  if o.validUntil != nil {
    ttlTimestamp = o.validUntil.Unix()
  } else if o.setTTL != nil {
    ttlTimestamp = time.Now().Add(*o.setTTL).Unix()
  } else if o.updateTTL != nil {
    ttlDelta = int64(o.updateTTL.Seconds())
  } else if o.defaultTTL {
    ttlTimestamp = time.Now().Add(r.ttl).Unix()
- } else {
-   // No TTL options, fallback to simple get
-   val, err := r.client.HGet(ctx, r.key, key).Result()
-   ...
-   return []byte(val), nil
  }
 
- delArg := "0"
+ // If no side-effects requested (no delete, no TTL changes), do a simple get
+ if !o.delete && ttlTimestamp == 0 && ttlDelta == 0 {
+   val, err := r.client.HGet(ctx, r.key, key).Result()
+   if err != nil {
+     if err == redis.Nil {
+       return nil, ErrKeyNotFound
+     }
+     return nil, fmt.Errorf("can't get cache item: %w", err)
+   }
+   return []byte(val), nil
+ }
+
+ delArg := "0"
  if o.delete {
    delArg = "1"
  }
 
 // Use atomic get and TTL update script
 result, err := r.client.Eval(ctx, getAndUpdateTTLScript, []string{r.key}, key, delArg, ttlTimestamp, ttlDelta).Result()

41-45: Guard TTL-delta math for no-TTL items in script.

If HTTL returns <=0, set relative TTL to ttlDelta instead of computing a negative newTtl.

Based on learnings

 elseif ttlDelta > 0 then
-  local ttl = redis.call('HTTL', KEYS[1], field)
-  local newTtl = ttl + ttlDelta
-  redis.call('HExpire', KEYS[1], newTtl, field)
+  local ttl = redis.call('HTTL', KEYS[1], field)
+  if ttl and ttl > 0 then
+    redis.call('HExpire', KEYS[1], ttl + ttlDelta, field)
+  else
+    redis.call('HExpire', KEYS[1], ttlDelta, field)
+  end
 end
🧹 Nitpick comments (10)
deployments/grafana/dashboards/messages.json (3)

309-321: Use rates for counter ratios; guard against 0/0.

Counters should be ratioed via rate() over $__rate_interval. Also clamp denominator to avoid NaN at startup.

Apply on Line 314:

- "expr": "sms_messages_cache_hits_total / (sms_messages_cache_hits_total + sms_messages_cache_misses_total)",
+ "expr": "sum(rate(sms_messages_cache_hits_total[$__rate_interval])) / clamp_min(sum(rate(sms_messages_cache_hits_total[$__rate_interval])) + sum(rate(sms_messages_cache_misses_total[$__rate_interval])), 1)",

431-483: Hook templating variables into queries.

instance and job variables are defined but unused in PromQL, so filtering in the UI has no effect. Add label matchers to all queries.

Examples:

- "expr": "sum by (state) (sms_messages_total)",
+ "expr": "sum by (state) (sms_messages_total{instance=~\"$instance\",job=~\"$job\"})",

- "expr": "sum(rate(sms_messages_total{state=\"Failed\"}[$__rate_interval])) / sum(rate(sms_messages_total[$__rate_interval]))",
+ "expr": "sum(rate(sms_messages_total{state=\"Failed\",instance=~\"$instance\",job=~\"$job\"}[$__rate_interval])) / sum(rate(sms_messages_total{instance=~\"$instance\",job=~\"$job\"}[$__rate_interval]))",

- "expr": "sum by (state) (rate(sms_messages_total{state=~\"Processed|Sent|Delivered|Failed\"}[$__rate_interval]))",
+ "expr": "sum by (state) (rate(sms_messages_total{state=~\"Processed|Sent|Delivered|Failed\",instance=~\"$instance\",job=~\"$job\"}[$__rate_interval]))",

Also applies to: 95-107, 200-213, 411-425


36-39: Unify datasource references.

Panels specify datasource uid "edqp0a73uh2bka" while targets use "Prometheus". This inconsistency can break variable scoping and export. Standardize to a single UID (or a DS variable).

For example, set all target blocks to:

"datasource": { "type": "prometheus", "uid": "edqp0a73uh2bka" }

Also applies to: 113-116, 216-219, 324-327, 97-99, 202-204, 310-312, 413-415

internal/sms-gateway/modules/messages/cache.go (2)

22-36: Delete on nil input instead of storing empty bytes.

Persisting a nil/empty payload then special‑casing len(data)==0 couples callers to storage quirks. Prefer explicit Delete.

 func (c *cache) Set(ctx context.Context, userID, ID string, message *MessageStateOut) error {
-  var (
-    err  error
-    data []byte
-  )
-
-  if message != nil {
-    data, err = json.Marshal(message)
-    if err != nil {
-      return fmt.Errorf("can't marshal message: %w", err)
-    }
-  }
-
-  return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(5*time.Minute))
+  if message == nil {
+    return c.storage.Delete(ctx, userID+":"+ID)
+  }
+  data, err := json.Marshal(message)
+  if err != nil {
+    return fmt.Errorf("can't marshal message: %w", err)
+  }
+  return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(5*time.Minute))
 }

35-36: Avoid hardcoded TTL; use config.

5 minutes is embedded in code in both Set and Get. Surface a messages.Config or cache TTL field and inject it.

Also applies to: 39-40

internal/sms-gateway/modules/messages/metrics.go (1)

26-48: Inject a Registerer; avoid default global to prevent duplicate regs in tests.

Use promauto.With(reg).New... and accept prometheus.Registerer in constructor; wire it via your metrics module.

-import (
-  "github.com/prometheus/client_golang/prometheus"
-  "github.com/prometheus/client_golang/prometheus/promauto"
-)
+import (
+  "github.com/prometheus/client_golang/prometheus"
+  "github.com/prometheus/client_golang/prometheus/promauto"
+)

-func newMetrics() *metrics {
+func newMetrics(reg prometheus.Registerer) *metrics {
+  if reg == nil {
+    reg = prometheus.DefaultRegisterer
+  }
   return &metrics{
-    totalCounter: promauto.NewCounterVec(prometheus.CounterOpts{
+    totalCounter: promauto.With(reg).NewCounterVec(prometheus.CounterOpts{
       Namespace: metricsNamespace,
       Subsystem: metricsSubsystem,
       Name:      metricMessagesTotal,
       Help:      "Total number of messages by state",
     }, []string{labelState}),
-    cacheHits: promauto.NewCounter(prometheus.CounterOpts{
+    cacheHits: promauto.With(reg).NewCounter(prometheus.CounterOpts{
       Namespace: metricsNamespace,
       Subsystem: metricsSubsystem,
       Name:      metricCacheHits,
       Help:      "Number of cache hits",
     }),
-    cacheMisses: promauto.NewCounter(prometheus.CounterOpts{
+    cacheMisses: promauto.With(reg).NewCounter(prometheus.CounterOpts{
       Namespace: metricsNamespace,
       Subsystem: metricsSubsystem,
       Name:      metricCacheMisses,
       Help:      "Number of cache misses",
     }),
   }
 }

Wire in module.go with a provider that injects prometheus.Registerer.

internal/sms-gateway/modules/messages/module.go (1)

11-34: DI wiring looks good; logger scoping and private providers are appropriate.

Consider updating the metrics provider to pass a prometheus.Registerer if you adopt the suggested newMetrics(reg) signature.

pkg/cache/memory.go (1)

16-16: Optional: return the interface from constructor.

Prefer hiding concrete types behind Cache for flexibility.

-func NewMemory(ttl time.Duration) *memoryCache {
+func NewMemory(ttl time.Duration) Cache {
   return &memoryCache{
internal/sms-gateway/modules/messages/service.go (1)

114-116: Prefer request-scoped contexts for cache ops; consider TTL for negative cache.

Use handler/request ctx instead of context.Background() for cache.Set to propagate deadlines/cancellation; also ensure a bounded TTL for nil (not-found) entries to avoid sticky negatives.

Also applies to: 162-164, 237-239

pkg/cache/redis.go (1)

88-106: Optional: use EVALSHA and preload scripts.

Cache Lua scripts with SCRIPT LOAD and use EvalSha to reduce eval overhead on hot paths.

Also applies to: 155-166

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 291d211 and ac815ba.

📒 Files selected for processing (13)
  • api/requests.http (1 hunks)
  • deployments/grafana/dashboards/messages.json (1 hunks)
  • internal/sms-gateway/handlers/messages/3rdparty.go (1 hunks)
  • internal/sms-gateway/handlers/messages/mobile.go (1 hunks)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
  • internal/sms-gateway/modules/messages/domain.go (1 hunks)
  • internal/sms-gateway/modules/messages/metrics.go (1 hunks)
  • internal/sms-gateway/modules/messages/module.go (1 hunks)
  • internal/sms-gateway/modules/messages/service.go (5 hunks)
  • pkg/cache/cache.go (2 hunks)
  • pkg/cache/memory.go (8 hunks)
  • pkg/cache/options.go (1 hunks)
  • pkg/cache/redis.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • internal/sms-gateway/handlers/messages/mobile.go
  • internal/sms-gateway/handlers/messages/3rdparty.go
  • api/requests.http
  • pkg/cache/options.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.

Applied to files:

  • pkg/cache/redis.go
🧬 Code graph analysis (7)
internal/sms-gateway/modules/messages/service.go (5)
internal/sms-gateway/modules/devices/service.go (2)
  • Service (30-39)
  • NewService (157-165)
internal/sms-gateway/modules/push/service.go (2)
  • Service (36-46)
  • Config (16-23)
internal/sms-gateway/modules/db/module.go (1)
  • IDGen (8-8)
internal/sms-gateway/models/models.go (2)
  • Device (25-36)
  • User (17-23)
internal/sms-gateway/modules/messages/repository.go (1)
  • ErrMessageNotFound (18-18)
internal/sms-gateway/modules/messages/cache.go (2)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (2)
  • WithTTL (22-30)
  • AndSetTTL (63-67)
pkg/cache/memory.go (2)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (87-91)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
pkg/cache/cache.go (1)
pkg/cache/options.go (2)
  • Option (6-6)
  • GetOption (48-48)
internal/sms-gateway/modules/messages/module.go (5)
internal/sms-gateway/app.go (1)
  • Module (35-58)
internal/config/module.go (1)
  • Module (21-124)
internal/sms-gateway/modules/cleaner/module.go (2)
  • Module (28-36)
  • AsCleanable (8-14)
internal/sms-gateway/cache/factory.go (1)
  • Factory (17-19)
internal/sms-gateway/modules/messages/service.go (2)
  • NewService (44-67)
  • Service (30-42)
internal/sms-gateway/modules/messages/domain.go (1)
internal/sms-gateway/modules/messages/models.go (1)
  • ProcessingState (12-12)
pkg/cache/redis.go (3)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (87-91)
  • Option (6-6)
pkg/cache/errors.go (1)
  • ErrKeyNotFound (7-7)
pkg/cache/cache.go (1)
  • Cache (5-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: E2E
  • GitHub Check: Benchmark
  • GitHub Check: Lint
  • GitHub Check: Analyze (go)
🔇 Additional comments (3)
deployments/grafana/dashboards/messages.json (1)

428-429: Verify Grafana supports "refresh": "auto" for your version.

Some versions expect concrete intervals ("5s", "10s"). Please confirm this renders correctly in your Grafana (12.x+).

pkg/cache/cache.go (1)

5-36: Migration to []byte is complete and correctly implemented.

All cache implementations and call sites have been properly migrated:

  • The Cache interface (pkg/cache/cache.go) consistently uses []byte for values across Set, Get, GetAndDelete, and Drain methods.
  • Direct callers (online/service.go) pass []byte values directly.
  • Domain-specific adapters (messages/cache.go) wrap the interface and convert domain objects (MessageStateOut) to/from []byte using JSON marshaling before delegating to the underlying storage.
  • No old string-based signatures remain in the codebase.
internal/sms-gateway/modules/messages/domain.go (1)

32-35: JSON tags and embedding look good; check client compatibility.

Struct embedding + tags will flatten fields as intended. Please confirm downstream clients expect snake_case keys and the embedded shape hasn’t broken decoding.

Also applies to: 41-43

@capcom6 capcom6 force-pushed the messages/cache-state branch from 0983c17 to d7bebe8 Compare October 19, 2025 23:08
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (5)
pkg/cache/redis.go (2)

39-44: ** Critical Lua script bug persists: TTL delta logic mixes units.**

Line 43 computes newTtl = ttl + ttlDelta where ttl is the result of HTTL (remaining seconds), but line 44 calls HExpire with newTtl. The issue is that HExpire expects a relative TTL (seconds from now), not an absolute timestamp. Adding the current TTL to a delta and passing it as a relative value produces incorrect expiration times.

Apply this diff to fix the TTL delta branch:

 elseif ttlDelta > 0 then
   local ttl = redis.call('HTTL', KEYS[1], field)
-  local newTtl = ttl + ttlDelta
-  redis.call('HExpire', KEYS[1], newTtl, field)
+  if ttl > 0 then
+    redis.call('HExpire', KEYS[1], ttl + ttlDelta, field)
+  else
+    redis.call('HExpire', KEYS[1], ttlDelta, field)
+  end
 end

136-148: ** Remove second fallback path that bypasses AndDelete.**

Lines 136–148 introduce a second fallback to plain HGet when o.defaultTTL is the only option set. This path bypasses the Lua script, so if the caller passes both AndDefaultTTL() and AndDelete(), the delete is silently ignored. The first isEmpty() check at line 114 is sufficient; any non-empty options (including defaultTTL or delete alone) should flow through the script path at lines 150–166.

Apply this diff:

 	var ttlTimestamp, ttlDelta int64
 	if o.validUntil != nil {
 		ttlTimestamp = o.validUntil.Unix()
 	} else if o.setTTL != nil {
 		ttlTimestamp = time.Now().Add(*o.setTTL).Unix()
 	} else if o.updateTTL != nil {
 		ttlDelta = int64(o.updateTTL.Seconds())
-	} else if o.defaultTTL {
+	} else if o.defaultTTL && r.ttl > 0 {
 		ttlTimestamp = time.Now().Add(r.ttl).Unix()
-	} else {
-		// No TTL options, fallback to simple get
-		val, err := r.client.HGet(ctx, r.key, key).Result()
-		if err != nil {
-			if err == redis.Nil {
-				return nil, ErrKeyNotFound
-			}
-			return nil, fmt.Errorf("can't get cache item: %w", err)
-		}
-		return []byte(val), nil
 	}

 	delArg := "0"
pkg/cache/memory.go (2)

77-108: ** Critical: nil pointer dereference and broken AndDelete/TTL logic.**

Line 91 fetches item, ok := m.items[key], which can be nil when ok is false. Line 94 then accesses item.isExpired(time.Now()) without checking ok first, causing a nil pointer dereference.

Additionally, the logic at lines 92–104 is flawed:

  • When o.delete is true (line 92), the item is deleted but expiration and TTL updates are skipped entirely.
  • When o.delete is false, line 94 checks expiration and applies TTL updates, but this happens after potentially deleting the item, and without verifying ok.

Apply this diff to fix the nil dereference and ensure correct semantics:

 func (m *memoryCache) Get(_ context.Context, key string, opts ...GetOption) ([]byte, error) {
+	o := getOptions{}
+	o.apply(opts...)
+
+	// Read-only fast path
+	if o.isEmpty() {
+		m.mux.RLock()
+		item, ok := m.items[key]
+		if !ok {
+			m.mux.RUnlock()
+			return nil, ErrKeyNotFound
+		}
+		if item.isExpired(time.Now()) {
+			m.mux.RUnlock()
+			return nil, ErrKeyExpired
+		}
+		val := append([]byte(nil), item.value...)
+		m.mux.RUnlock()
+		return val, nil
+	}
+
+	// Mutating path (delete/TTL adjustments)
+	m.mux.Lock()
+	item, ok := m.items[key]
+	if !ok {
+		m.mux.Unlock()
+		return nil, ErrKeyNotFound
+	}
+	if item.isExpired(time.Now()) {
+		if o.delete {
+			delete(m.items, key)
+		}
+		m.mux.Unlock()
+		return nil, ErrKeyExpired
+	}
+
+	if o.delete {
+		val := append([]byte(nil), item.value...)
+		delete(m.items, key)
+		m.mux.Unlock()
+		return val, nil
+	}
+	if o.validUntil != nil {
+		item.validUntil = *o.validUntil
+	} else if o.setTTL != nil {
+		item.validUntil = time.Now().Add(*o.setTTL)
+	} else if o.updateTTL != nil {
+		if item.validUntil.IsZero() {
+			item.validUntil = time.Now().Add(*o.updateTTL)
+		} else {
+			item.validUntil = item.validUntil.Add(*o.updateTTL)
+		}
+	} else if o.defaultTTL && m.ttl > 0 {
+		item.validUntil = time.Now().Add(m.ttl)
+	}
+	val := append([]byte(nil), item.value...)
+	m.mux.Unlock()
 	return m.getValue(func() (*memoryItem, bool) {
-		if len(opts) == 0 {
-			m.mux.RLock()
-			item, ok := m.items[key]
-			m.mux.RUnlock()
-
-			return item, ok
-		}
-
-		o := getOptions{}
-		o.apply(opts...)
-
-		m.mux.Lock()
-		item, ok := m.items[key]
-		if o.delete {
-			delete(m.items, key)
-		} else if !item.isExpired(time.Now()) {
-			if o.validUntil != nil {
-				item.validUntil = *o.validUntil
-			} else if o.setTTL != nil {
-				item.validUntil = time.Now().Add(*o.setTTL)
-			} else if o.updateTTL != nil {
-				item.validUntil = item.validUntil.Add(*o.updateTTL)
-			} else if o.defaultTTL {
-				item.validUntil = time.Now().Add(m.ttl)
-			}
-		}
-		m.mux.Unlock()
-
-		return item, ok
-	})
+	return val, nil
 }

68-71: ** Add defensive copy in Drain return values.**

Line 70 assigns item.value directly into the returned map without copying. Although the cache is cleared, callers could mutate the returned byte slices, potentially affecting lingering internal references or violating immutability expectations.

Apply this diff:

 items := make(map[string][]byte, len(cpy))
 for key, item := range cpy {
-	items[key] = item.value
+	items[key] = append([]byte(nil), item.value...)
 }
internal/sms-gateway/modules/messages/service.go (1)

237-240: ** Critical: wrong cache key when ID is auto-generated.**

Line 237 uses message.ID as the cache key, but when the client doesn't provide an ID, line 229 generates msg.ExtID. At this point, message.ID is still empty, so the cache is keyed under an empty string, breaking subsequent reads via GetState (line 135) which expects the actual generated ID.

Apply this diff to use the correct key:

-	if err := s.cache.Set(context.Background(), device.UserID, message.ID, anys.AsPointer(modelToMessageState(msg))); err != nil {
+	if err := s.cache.Set(context.Background(), device.UserID, msg.ExtID, anys.AsPointer(modelToMessageState(msg))); err != nil {
 		s.logger.Warn("can't cache message", zap.String("id", msg.ExtID), zap.Error(err))
 	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0983c17 and d7bebe8.

📒 Files selected for processing (7)
  • deployments/grafana/dashboards/messages.json (1 hunks)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
  • internal/sms-gateway/modules/messages/metrics.go (1 hunks)
  • internal/sms-gateway/modules/messages/service.go (5 hunks)
  • pkg/cache/memory.go (8 hunks)
  • pkg/cache/options.go (1 hunks)
  • pkg/cache/redis.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • internal/sms-gateway/modules/messages/cache.go
  • deployments/grafana/dashboards/messages.json
  • internal/sms-gateway/modules/messages/metrics.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.

Applied to files:

  • pkg/cache/redis.go
📚 Learning: 2025-09-10T23:54:16.828Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:118-121
Timestamp: 2025-09-10T23:54:16.828Z
Learning: Redis 7.4.0 introduced HEXPIRE, HEXPIREAT, and related commands that support per-field TTL on hash data structures. go-redis v9.6.0+ includes support for these commands with methods like HExpireAt.

Applied to files:

  • pkg/cache/redis.go
🧬 Code graph analysis (3)
internal/sms-gateway/modules/messages/service.go (5)
internal/sms-gateway/modules/db/module.go (1)
  • IDGen (8-8)
internal/sms-gateway/models/models.go (2)
  • Device (25-36)
  • User (17-23)
internal/sms-gateway/modules/messages/domain.go (2)
  • MessageStateIn (31-36)
  • MessageStateOut (38-44)
internal/sms-gateway/modules/messages/repository.go (1)
  • ErrMessageNotFound (18-18)
internal/sms-gateway/handlers/middlewares/deviceauth/deviceauth.go (1)
  • WithDevice (83-87)
pkg/cache/memory.go (2)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (87-91)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
pkg/cache/redis.go (3)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (87-91)
  • Option (6-6)
pkg/cache/errors.go (1)
  • ErrKeyNotFound (7-7)
pkg/cache/cache.go (1)
  • Cache (5-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)

@capcom6 capcom6 force-pushed the messages/cache-state branch from f3c4ccb to 7e0fa8d Compare October 20, 2025 01:52
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (5)
internal/sms-gateway/modules/messages/cache.go (1)

38-54: Critical: Unresolved error handling issue from previous review.

The error handling issue flagged in the previous review has not been addressed. Currently, ErrKeyNotFound from storage.Get is propagated to callers (lines 40-42), but the subsequent check at lines 44-46 suggests cache misses should return (nil, nil) instead.

This breaks the cache abstraction—callers should not need to handle storage-level errors for cache misses.

Apply this diff to fix the error handling:

 func (c *cache) Get(ctx context.Context, userID, ID string) (*MessageStateOut, error) {
 	data, err := c.storage.Get(ctx, userID+":"+ID, cacheImpl.AndSetTTL(5*time.Minute))
 	if err != nil {
+		if errors.Is(err, cacheImpl.ErrKeyNotFound) {
+			return nil, nil
+		}
 		return nil, err
 	}
 
 	if len(data) == 0 {
 		return nil, nil
 	}
 
 	message := new(MessageStateOut)
 	if err := json.Unmarshal(data, message); err != nil {
 		return nil, fmt.Errorf("can't unmarshal message: %w", err)
 	}
 
 	return message, nil
 }

Note: You'll need to add "errors" to your imports.

pkg/cache/redis.go (2)

136-148: Fix AndDelete bypass: fallback path ignores delete flag.

When AndDelete() is the only option set:

  • Line 114: isEmpty() returns false (due to delete flag)
  • Lines 130-135: All TTL checks fail
  • Line 136: else if o.defaultTTL fails
  • Line 138: Falls into else block with plain HGet
  • Result: The field is not deleted

Apply this diff to remove the fallback and always use the script when options are present:

-	} else if o.defaultTTL {
-		ttlTimestamp = time.Now().Add(r.ttl).Unix()
-	} else {
-		// No TTL options, fallback to simple get
-		val, err := r.client.HGet(ctx, r.key, key).Result()
-		if err != nil {
-			if err == redis.Nil {
-				return nil, ErrKeyNotFound
-			}
-			return nil, fmt.Errorf("can't get cache item: %w", err)
-		}
-		return []byte(val), nil
-	}
+	} else if o.defaultTTL {
+		ttlTimestamp = time.Now().Add(r.ttl).Unix()
+	}

Based on learnings


41-44: Fix TTL delta logic: adding to remaining TTL resets expiration.

The current logic adds the delta to the remaining TTL then sets a new relative expiration from now:

  • Line 42: HTTL returns remaining seconds (e.g., 100)
  • Line 43: newTtl = ttl + ttlDelta (e.g., 100 + 50 = 150)
  • Line 44: HExpire(150) sets expiration to 150 seconds from now

This effectively gives the item remaining + delta seconds from now, not extending the current expiration by delta. If an item has 100 seconds left and you add 50, it gets 150 seconds from now (net gain of ~150 seconds), not 150 seconds total (net gain of 50 seconds).

Apply this diff to extend by delta only:

 elseif ttlDelta > 0 then
   local ttl = redis.call('HTTL', KEYS[1], field)
-  local newTtl = ttl + ttlDelta
-  redis.call('HExpire', KEYS[1], newTtl, field)
+  if ttl > 0 then
+    redis.call('HExpire', KEYS[1], ttlDelta, field)
+  else
+    redis.call('HExpire', KEYS[1], ttlDelta, field)
+  end

Wait, that's still wrong. To extend an existing TTL by delta seconds, you need:

 elseif ttlDelta > 0 then
-  local ttl = redis.call('HTTL', KEYS[1], field)
-  local newTtl = ttl + ttlDelta
-  redis.call('HExpire', KEYS[1], newTtl, field)
+  redis.call('HExpire', KEYS[1], ttlDelta, field)

Actually, if the semantic intent is "add delta to the current expiration time", you need to compute the absolute target and use HExpireAt. But HExpire always sets relative to now, so you cannot preserve the existing expiration and add to it with HExpire alone. You need HExpireAt(now + remaining + delta).

Clarify the intended semantics: should AndUpdateTTL(50s) mean "extend expiration by 50s" or "set TTL to 50s from now"?

pkg/cache/memory.go (2)

68-71: Add defensive copy to prevent mutation of returned values.

Line 70 assigns item.value directly without copying. Callers could mutate the returned byte slices, violating the immutability expectation of cached data.

Apply this diff:

 items := make(map[string][]byte, len(cpy))
 for key, item := range cpy {
-	items[key] = item.value
+	items[key] = append([]byte(nil), item.value...)
 }

77-109: Fix nil deref, zero-time bug, and data race in Get.

Critical issues:

  1. Nil dereference (line 94): When ok == false, item is nil, but !item.isExpired() is called unconditionally in the else-if, causing a panic.

  2. Zero-time bug (line 100): item.validUntil.Add(*o.updateTTL) on a zero validUntil produces a time in 1970 + duration, not now + duration. Should initialize from time.Now() when validUntil is zero.

  3. Data race: Lines 94-103 mutate item.validUntil under Lock, but after releasing the lock (line 105), the same item pointer is returned and its validUntil is read at line 159 in getItem without synchronization. Concurrent Gets can race on validUntil reads/writes.

Apply this diff to fix all three issues:

 func (m *memoryCache) Get(_ context.Context, key string, opts ...GetOption) ([]byte, error) {
+	o := getOptions{}
+	o.apply(opts...)
+
+	// Read-only fast path
+	if o.isEmpty() {
+		m.mux.RLock()
+		item, ok := m.items[key]
+		if !ok {
+			m.mux.RUnlock()
+			return nil, ErrKeyNotFound
+		}
+		if item.isExpired(time.Now()) {
+			m.mux.RUnlock()
+			return nil, ErrKeyExpired
+		}
+		val := append([]byte(nil), item.value...)
+		m.mux.RUnlock()
+		return val, nil
+	}
+
+	// Mutating path
+	m.mux.Lock()
+	item, ok := m.items[key]
+	if !ok {
+		m.mux.Unlock()
+		return nil, ErrKeyNotFound
+	}
+	now := time.Now()
+	if item.isExpired(now) {
+		if o.delete {
+			delete(m.items, key)
+		}
+		m.mux.Unlock()
+		return nil, ErrKeyExpired
+	}
+
+	if o.delete {
+		val := append([]byte(nil), item.value...)
+		delete(m.items, key)
+		m.mux.Unlock()
+		return val, nil
+	}
+
+	// Apply TTL mutations
+	if o.validUntil != nil {
+		item.validUntil = *o.validUntil
+	} else if o.setTTL != nil {
+		item.validUntil = now.Add(*o.setTTL)
+	} else if o.updateTTL != nil {
+		if item.validUntil.IsZero() {
+			item.validUntil = now.Add(*o.updateTTL)
+		} else {
+			item.validUntil = item.validUntil.Add(*o.updateTTL)
+		}
+	} else if o.defaultTTL {
+		item.validUntil = now.Add(m.ttl)
+	}
+
+	val := append([]byte(nil), item.value...)
+	m.mux.Unlock()
-	return m.getValue(func() (*memoryItem, bool) {
-		if len(opts) == 0 {
-			m.mux.RLock()
-			item, ok := m.items[key]
-			m.mux.RUnlock()
-
-			return item, ok
-		}
-
-		o := getOptions{}
-		o.apply(opts...)
-
-		m.mux.Lock()
-		item, ok := m.items[key]
-		if o.delete {
-			delete(m.items, key)
-		} else if !item.isExpired(time.Now()) {
-			if o.validUntil != nil {
-				item.validUntil = *o.validUntil
-			} else if o.setTTL != nil {
-				item.validUntil = time.Now().Add(*o.setTTL)
-			} else if o.updateTTL != nil {
-				item.validUntil = item.validUntil.Add(*o.updateTTL)
-			} else if o.defaultTTL {
-				item.validUntil = time.Now().Add(m.ttl)
-			}
-		}
-		m.mux.Unlock()
-
-		return item, ok
-	})
+	return val, nil
 }

Based on learnings

🧹 Nitpick comments (1)
internal/sms-gateway/modules/messages/cache.go (1)

22-36: Consider adding input validation and extracting the TTL constant.

The Set logic is correct, but consider these optional improvements:

  1. Validate that userID and ID are non-empty to prevent malformed cache keys.
  2. Extract the 5*time.Minute TTL as a package-level constant to avoid duplication with the Get method.

Example refactor:

+const cacheTTL = 5 * time.Minute
+
 func (c *cache) Set(ctx context.Context, userID, ID string, message *MessageStateOut) error {
+	if userID == "" || ID == "" {
+		return fmt.Errorf("userID and ID must be non-empty")
+	}
+
 	var (
 		err  error
 		data []byte
 	)
 
 	if message != nil {
 		data, err = json.Marshal(message)
 		if err != nil {
 			return fmt.Errorf("can't marshal message: %w", err)
 		}
 	}
 
-	return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(5*time.Minute))
+	return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(cacheTTL))
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3c4ccb and 7e0fa8d.

📒 Files selected for processing (4)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
  • pkg/cache/memory.go (8 hunks)
  • pkg/cache/options.go (1 hunks)
  • pkg/cache/redis.go (5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.

Applied to files:

  • pkg/cache/redis.go
🧬 Code graph analysis (3)
internal/sms-gateway/modules/messages/cache.go (2)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (2)
  • WithTTL (22-30)
  • AndSetTTL (64-68)
pkg/cache/memory.go (2)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (88-92)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
pkg/cache/redis.go (3)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (88-92)
  • Option (6-6)
pkg/cache/errors.go (1)
  • ErrKeyNotFound (7-7)
pkg/cache/cache.go (1)
  • Cache (5-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: E2E
  • GitHub Check: Lint
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)
🔇 Additional comments (2)
internal/sms-gateway/modules/messages/cache.go (1)

16-20: LGTM!

The constructor is straightforward and idiomatic.

pkg/cache/options.go (1)

56-62: LGTM! The isEmpty() fix has been applied correctly.

The missing !o.defaultTTL check flagged in the previous review has been correctly added on line 60. The method now properly returns false when defaultTTL is set, ensuring the options path is not skipped.

@capcom6
Copy link
Member Author

capcom6 commented Oct 21, 2025

Deployed...

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
internal/sms-gateway/modules/messages/cache.go (1)

47-50: Incorrect error handling: ErrKeyNotFound should return (nil, nil).

The code returns any error from storage.Get, including ErrKeyNotFound. However, the subsequent check at lines 52-54 (len(data) == 0) suggests cache misses should return (nil, nil) rather than an error. Currently, ErrKeyNotFound propagates to callers unexpectedly.

This issue was previously flagged and remains unresolved.

Apply this diff:

 	data, err := c.storage.Get(ctx, userID+":"+ID, cacheImpl.AndSetTTL(5*time.Minute))
 	if err != nil {
+		if err == cacheImpl.ErrKeyNotFound {
+			return nil, nil
+		}
 		return nil, err
 	}
🧹 Nitpick comments (2)
internal/sms-gateway/modules/messages/cache.go (2)

24-41: Extract TTL to a constant.

The 5-minute TTL is duplicated in both Set (line 40) and Get (line 47). Extract it to a named constant for consistency and maintainability.

 const cacheTimeout = 10 * time.Millisecond
+const cacheTTL = 5 * time.Minute

 type cache struct {

Then update both methods:

-	return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(5*time.Minute))
+	return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(cacheTTL))

40-40: Consider escaping or validating key components.

The key is constructed as userID + ":" + ID. If either component contains a colon, this could lead to key ambiguity or collisions (e.g., "user:123" + ":" + "456" produces the same key as "user" + ":" + "123:456").

While unlikely if IDs are controlled, consider using a delimiter-safe approach or validating inputs.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e0fa8d and e38a5a4.

📒 Files selected for processing (1)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/sms-gateway/modules/messages/cache.go (2)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (2)
  • WithTTL (22-30)
  • AndSetTTL (64-68)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)
🔇 Additional comments (1)
internal/sms-gateway/modules/messages/cache.go (1)

18-22: LGTM!

Constructor is straightforward and correctly initializes the cache wrapper.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
internal/sms-gateway/modules/messages/cache.go (1)

46-65: Critical: ErrKeyNotFound should return (nil, nil) instead of propagating as error.

Lines 51-53 return any error from storage.Get, including ErrKeyNotFound. However, cache misses should be treated as (nil, nil) rather than an error. The subsequent check at lines 55-57 (len(data) == 0) will never execute for missing keys because ErrKeyNotFound is returned earlier.

This issue was previously flagged but remains unresolved.

Apply this diff to fix the error handling:

 func (c *cache) Get(ctx context.Context, userID, ID string) (*MessageStateOut, error) {
 	ctx, cancel := context.WithTimeout(ctx, cacheTimeout)
 	defer cancel()
 
 	data, err := c.storage.Get(ctx, userID+":"+ID, cacheImpl.AndSetTTL(cacheTTL))
 	if err != nil {
+		if err == cacheImpl.ErrKeyNotFound {
+			return nil, nil
+		}
 		return nil, err
 	}
 
 	if len(data) == 0 {
 		return nil, nil
 	}
 
 	message := new(MessageStateOut)
 	if err := json.Unmarshal(data, message); err != nil {
 		return nil, fmt.Errorf("can't unmarshal message: %w", err)
 	}
 
 	return message, nil
 }
🧹 Nitpick comments (1)
internal/sms-gateway/modules/messages/cache.go (1)

12-15: Consider making cacheTimeout configurable for production tuning.

The 100ms timeout is an improvement over the previous value, but for network-based caches like Redis under load or with network latency, this may still cause occasional timeouts. Consider making this configurable via environment variable or config file so it can be tuned per deployment environment.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de9b108 and d800cf7.

📒 Files selected for processing (1)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/sms-gateway/modules/messages/cache.go (2)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (2)
  • WithTTL (22-30)
  • AndSetTTL (64-68)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)
🔇 Additional comments (2)
internal/sms-gateway/modules/messages/cache.go (2)

17-25: LGTM!

The cache struct and constructor follow a clean dependency injection pattern. The implementation is straightforward and correct.


27-44: Verify the intent of storing empty data when message is nil.

When message is nil, data remains as an empty []byte (line 30) and is stored in the cache (line 43). This behavior could be intentional for cache invalidation or clearing entries. However, if the intent is to simply skip the cache operation for nil messages, the method should return early.

Please confirm whether storing empty data for nil messages is the intended behavior for cache invalidation.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
internal/sms-gateway/modules/messages/cache.go (1)

49-68: Critical: ErrKeyNotFound should return (nil, nil), not an error.

Lines 54-56 return any error from storage.Get, including ErrKeyNotFound. However, the check at lines 58-60 (len(data) == 0) suggests cache misses should be treated as (nil, nil) rather than propagating an error. This issue was flagged in a previous review but remains unaddressed.

Apply this diff:

+import "errors"
+
 func (c *cache) Get(ctx context.Context, userID, ID string) (*MessageStateOut, error) {
 	ctx, cancel := context.WithTimeout(ctx, cacheTimeout)
 	defer cancel()
 
 	data, err := c.storage.Get(ctx, userID+":"+ID, cacheImpl.AndSetTTL(c.ttl))
 	if err != nil {
+		if errors.Is(err, cacheImpl.ErrKeyNotFound) {
+			return nil, nil
+		}
 		return nil, err
 	}
 
 	if len(data) == 0 {
 		return nil, nil
 	}
 
 	message := new(MessageStateOut)
 	if err := json.Unmarshal(data, message); err != nil {
 		return nil, fmt.Errorf("can't unmarshal message: %w", err)
 	}
 
 	return message, nil
 }

Also, same key collision vulnerability as in Set method (see previous comment).

🧹 Nitpick comments (1)
internal/sms-gateway/modules/messages/cache.go (1)

13-14: Consider making cacheTimeout configurable.

While 100ms is an improvement over the previously flagged 10ms, it may still be insufficient for network-based caches (Redis) under load or with network latency. Consider making this configurable via the Config struct to allow tuning per environment.

Apply this diff to make the timeout configurable:

-const (
-	cacheTimeout = 100 * time.Millisecond
-)
-
 type cache struct {
 	ttl time.Duration
+	timeout time.Duration
 
 	storage cacheImpl.Cache
 }
 
 func newCache(config Config, storage cacheImpl.Cache) *cache {
 	return &cache{
 		ttl: config.CacheTTL,
+		timeout: 100 * time.Millisecond, // Or add to Config
 
 		storage: storage,
 	}
 }

Then replace cacheTimeout with c.timeout in Set and Get methods.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d800cf7 and 94cda62.

📒 Files selected for processing (4)
  • internal/config/config.go (3 hunks)
  • internal/config/module.go (1 hunks)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
  • internal/sms-gateway/modules/messages/config.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
internal/sms-gateway/modules/messages/cache.go (3)
internal/sms-gateway/modules/messages/config.go (1)
  • Config (5-8)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (2)
  • WithTTL (22-30)
  • AndSetTTL (64-68)
internal/config/module.go (1)
internal/config/config.go (1)
  • Messages (75-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Benchmark
  • GitHub Check: E2E
  • GitHub Check: Analyze (go)
🔇 Additional comments (5)
internal/config/module.go (1)

106-107: LGTM! Configuration wiring is correct.

The config values are properly converted from hours/seconds to time.Duration and wired into the messages module configuration.

internal/sms-gateway/modules/messages/config.go (1)

7-7: LGTM! Clean field addition.

The CacheTTL field properly extends the configuration to support cache TTL settings.

internal/config/config.go (3)

17-17: LGTM! Messages configuration added.

The Messages field properly extends the Config struct to support message-specific settings.


75-78: LGTM! Messages struct definition is correct.

The struct properly defines cache TTL and processed lifetime fields with appropriate YAML and environment variable bindings.


109-112: LGTM! Reasonable default values.

The defaults of 5 minutes for cache TTL and 30 days for processed lifetime are appropriate for typical message state management.

@capcom6 capcom6 force-pushed the messages/cache-state branch from 94cda62 to 5cd39d0 Compare October 21, 2025 13:47
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (7)
internal/sms-gateway/modules/messages/cache.go (2)

49-68: Treat cache miss as (nil, nil); don’t propagate ErrKeyNotFound.

Currently returns the backend miss error; callers likely expect nil,nil. Also generate the key via the same encoder.

@@
-func (c *cache) Get(ctx context.Context, userID, ID string) (*MessageStateOut, error) {
+func (c *cache) Get(ctx context.Context, userID, ID string) (*MessageStateOut, error) {
 	ctx, cancel := context.WithTimeout(ctx, cacheTimeout)
 	defer cancel()
 
-	data, err := c.storage.Get(ctx, userID+":"+ID, cacheImpl.AndSetTTL(c.ttl))
-	if err != nil {
-		return nil, err
-	}
+	key := cacheKey(userID, ID)
+	data, err := c.storage.Get(ctx, key, cacheImpl.AndSetTTL(c.ttl))
+	if err != nil {
+		// Treat miss/expired as empty
+		if errors.Is(err, cacheImpl.ErrKeyNotFound) || errors.Is(err, cacheImpl.ErrKeyExpired) {
+			return nil, nil
+		}
+		return nil, err
+	}

30-47: Nil message writes and composite key risks.

  • Writing nil bytes on message==nil is ambiguous; prefer Delete or an explicit tombstone.
  • Key "userID+":"+ID" risks collisions if components contain ':'; encode components.
@@
-import (
+import (
 	"context"
 	"encoding/json"
+	"errors"
 	"fmt"
+	"net/url"
 	"time"
@@
 func newCache(config Config, storage cacheImpl.Cache) *cache {
@@
 }
 
+func cacheKey(userID, ID string) string {
+	return url.QueryEscape(userID) + ":" + url.QueryEscape(ID)
+}
@@
 func (c *cache) Set(ctx context.Context, userID, ID string, message *MessageStateOut) error {
@@
-	if message != nil {
+	if message != nil {
 		data, err = json.Marshal(message)
 		if err != nil {
 			return fmt.Errorf("can't marshal message: %w", err)
 		}
 	}
@@
-	return c.storage.Set(ctx, userID+":"+ID, data, cacheImpl.WithTTL(c.ttl))
+	key := cacheKey(userID, ID)
+	if message == nil {
+		return c.storage.Delete(ctx, key)
+	}
+	return c.storage.Set(ctx, key, data, cacheImpl.WithTTL(c.ttl))
internal/sms-gateway/modules/messages/service.go (1)

228-240: Wrong cache key when ID is auto-generated.

When message.ID is empty, Line 229 generates a new msg.ExtID, but Line 237 still caches using the empty message.ID. This creates an invalid cache entry that can't be retrieved.

Apply this diff:

- if err := s.cache.Set(context.Background(), device.UserID, message.ID, anys.AsPointer(modelToMessageState(msg))); err != nil {
+ if err := s.cache.Set(context.Background(), device.UserID, msg.ExtID, anys.AsPointer(modelToMessageState(msg))); err != nil {
   s.logger.Warn("can't cache message", zap.String("id", msg.ExtID), zap.Error(err))
 }
pkg/cache/redis.go (2)

110-166: Fallback path bypasses AndDelete behavior.

The second fallback (lines 136-148) executes a plain HGet even when o.delete is true, causing the delete operation to be skipped. The script should be invoked whenever any mutation (delete or TTL update) is requested.

Based on learnings


41-45: Handle negative TTL values in delta calculation.

When HTTL returns -2 (no expiry set) or -1 (key doesn't exist or field expired), adding ttlDelta directly produces incorrect results. The code should check ttl > 0 before computing the new expiration.

Apply this diff:

 elseif ttlDelta > 0 then
   local ttl = redis.call('HTTL', KEYS[1], field)
-  local newTtl = ttl + ttlDelta
-  redis.call('HExpire', KEYS[1], newTtl, field)
+  if ttl > 0 then
+    redis.call('HExpire', KEYS[1], ttl + ttlDelta, field)
+  else
+    redis.call('HExpire', KEYS[1], ttlDelta, field)
+  end
 end
pkg/cache/memory.go (2)

105-109: Handle zero validUntil when applying AndUpdateTTL.

When item.validUntil.IsZero() (no TTL set), calling item.validUntil.Add(*o.updateTTL) adds the duration to the zero time, producing an incorrect timestamp. For items without an existing TTL, set validUntil to time.Now().Add(*o.updateTTL) instead.

Apply this diff:

 } else if o.updateTTL != nil {
-  item.validUntil = item.validUntil.Add(*o.updateTTL)
+  if item.validUntil.IsZero() {
+    item.validUntil = time.Now().Add(*o.updateTTL)
+  } else {
+    item.validUntil = item.validUntil.Add(*o.updateTTL)
+  }
 } else if o.defaultTTL {
   item.validUntil = time.Now().Add(m.ttl)
 }

72-75: Missing defensive copy in Drain return values.

Line 74 assigns item.value directly without copying. Callers could mutate the returned byte slices, affecting internal references. Make defensive copies of each value.

Apply this diff:

 items := make(map[string][]byte, len(cpy))
 for key, item := range cpy {
-  items[key] = item.value
+  items[key] = append([]byte(nil), item.value...)
 }
🧹 Nitpick comments (6)
deployments/grafana/dashboards/messages.json (3)

36-39: Unify Prometheus datasource references (avoid mixed UID and name).

Panels use panel.datasource.uid "edqp0a73uh2bka" but target.datasource.uid "Prometheus". Pick one consistently (prefer a DS variable or the same UID) to prevent import/runtime binding issues.

Example change:

-          "datasource": { "uid": "Prometheus" }
+          "datasource": { "uid": "edqp0a73uh2bka" }

Also applies to: 97-99, 113-116, 202-205, 216-219, 310-313, 324-327, 414-415


101-107: Use increase() for distribution over the selected time range.

Summing raw counters yields total since process start, not distribution for now-24h.

- "expr": "sum by (state) (sms_messages_total{instance=~\"$instance\",job=~\"$job\"})"
+ "expr": "sum by (state) (increase(sms_messages_total{instance=~\"$instance\",job=~\"$job\"}[$__range]))"

206-209: Error-rate denominator likely too broad; restrict to relevant states.

Current total includes all events; prefer terminal/processing states to avoid skew.

- "expr": "sum(rate(sms_messages_total{state=\"Failed\",instance=~\"$instance\",job=~\"$job\"}[$__rate_interval])) / sum(rate(sms_messages_total{instance=~\"$instance\",job=~\"$job\"}[$__rate_interval]))"
+ "expr": "sum(rate(sms_messages_total{state=\"Failed\",instance=~\"$instance\",job=~\"$job\"}[$__rate_interval])) / clamp_min(sum(rate(sms_messages_total{state=~\"Processed|Sent|Delivered|Failed\",instance=~\"$instance\",job=~\"$job\"}[$__rate_interval])), 1)"
internal/sms-gateway/modules/messages/cache.go (1)

22-28: Validate TTL at construction.

Guard against zero/negative TTL to avoid immortal entries by accident.

 func newCache(config Config, storage cacheImpl.Cache) *cache {
-	return &cache{
-		ttl: config.CacheTTL,
+	ttl := config.CacheTTL
+	if ttl <= 0 {
+		ttl = 5 * time.Minute
+	}
+	return &cache{
+		ttl: ttl,
 		storage: storage,
 	}
 }
pkg/cache/cache.go (1)

17-18: Clarify immutability/copy semantics for []byte results.

Document whether implementations must return a copy (recommended) to avoid aliasing/mutation issues.

-// Otherwise, it returns the value and nil.
+// Otherwise, it returns the value and nil.
+// The returned slice MUST NOT be mutated by callers.
+// Implementations SHOULD return a copy to avoid aliasing shared buffers.
 Get(ctx context.Context, key string, opts ...GetOption) ([]byte, error)
internal/sms-gateway/modules/messages/config.go (1)

5-8: Validate/default CacheTTL.

Ensure non-positive TTLs are rejected or defaulted during config load to prevent immortal cache entries.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 94cda62 and 5cd39d0.

📒 Files selected for processing (16)
  • api/requests.http (1 hunks)
  • deployments/grafana/dashboards/messages.json (1 hunks)
  • internal/config/config.go (3 hunks)
  • internal/config/module.go (1 hunks)
  • internal/sms-gateway/handlers/messages/3rdparty.go (1 hunks)
  • internal/sms-gateway/handlers/messages/mobile.go (1 hunks)
  • internal/sms-gateway/modules/messages/cache.go (1 hunks)
  • internal/sms-gateway/modules/messages/config.go (1 hunks)
  • internal/sms-gateway/modules/messages/domain.go (1 hunks)
  • internal/sms-gateway/modules/messages/metrics.go (1 hunks)
  • internal/sms-gateway/modules/messages/module.go (1 hunks)
  • internal/sms-gateway/modules/messages/service.go (5 hunks)
  • pkg/cache/cache.go (2 hunks)
  • pkg/cache/memory.go (9 hunks)
  • pkg/cache/options.go (1 hunks)
  • pkg/cache/redis.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • pkg/cache/options.go
  • api/requests.http
  • internal/sms-gateway/modules/messages/domain.go
  • internal/config/module.go
  • internal/sms-gateway/handlers/messages/mobile.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T23:53:42.006Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/redis.go:63-75
Timestamp: 2025-09-10T23:53:42.006Z
Learning: In pkg/cache/redis.go, the Redis cache implementation uses Redis's native field-level TTL via HExpireAt, meaning expired fields are automatically removed by Redis itself. This differs from the memory implementation where the application tracks expiration times. As a result, operations like Drain that call HGetAll will only return non-expired items since expired fields are already removed by Redis.

Applied to files:

  • pkg/cache/redis.go
📚 Learning: 2025-09-10T11:11:43.462Z
Learnt from: capcom6
PR: android-sms-gateway/server#178
File: pkg/cache/memory.go:59-74
Timestamp: 2025-09-10T11:11:43.462Z
Learning: In pkg/cache/memory.go, the cleanup method removes expired items first, then executes the provided callback function. This means methods like Drain that use cleanup don't need additional expiration filtering since expired items are already removed before the callback runs.

Applied to files:

  • pkg/cache/memory.go
🧬 Code graph analysis (7)
internal/sms-gateway/handlers/messages/3rdparty.go (1)
internal/sms-gateway/handlers/converters/messages.go (1)
  • MessageStateToDTO (45-55)
internal/sms-gateway/modules/messages/cache.go (3)
internal/sms-gateway/modules/messages/config.go (1)
  • Config (5-8)
internal/sms-gateway/modules/messages/domain.go (1)
  • MessageStateOut (38-44)
pkg/cache/options.go (2)
  • WithTTL (22-30)
  • AndSetTTL (64-68)
pkg/cache/redis.go (3)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (88-92)
  • Option (6-6)
pkg/cache/errors.go (1)
  • ErrKeyNotFound (7-7)
pkg/cache/cache.go (1)
  • Cache (5-36)
pkg/cache/cache.go (1)
pkg/cache/options.go (2)
  • Option (6-6)
  • GetOption (48-48)
internal/sms-gateway/modules/messages/service.go (8)
internal/sms-gateway/modules/devices/service.go (2)
  • Service (30-39)
  • NewService (157-165)
internal/sms-gateway/modules/sse/service.go (2)
  • Service (17-25)
  • NewService (38-47)
internal/sms-gateway/modules/events/service.go (2)
  • Service (13-24)
  • NewService (26-38)
internal/sms-gateway/modules/messages/config.go (1)
  • Config (5-8)
internal/sms-gateway/modules/db/module.go (1)
  • IDGen (8-8)
internal/sms-gateway/models/models.go (2)
  • Device (25-36)
  • User (17-23)
internal/sms-gateway/modules/messages/repository.go (1)
  • ErrMessageNotFound (18-18)
internal/sms-gateway/handlers/middlewares/deviceauth/deviceauth.go (1)
  • WithDevice (83-87)
pkg/cache/memory.go (2)
pkg/cache/options.go (3)
  • GetOption (48-48)
  • AndDelete (88-92)
  • Option (6-6)
pkg/cache/cache.go (1)
  • Cache (5-36)
internal/sms-gateway/modules/messages/module.go (3)
internal/sms-gateway/modules/cleaner/module.go (1)
  • AsCleanable (8-14)
internal/sms-gateway/cache/factory.go (1)
  • Factory (17-19)
internal/sms-gateway/modules/messages/service.go (2)
  • NewService (44-67)
  • Service (30-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Benchmark
  • GitHub Check: E2E
  • GitHub Check: Analyze (go)
🔇 Additional comments (6)
pkg/cache/cache.go (1)

5-36: Interface update verified—all implementations and callers consistent.

Verification confirms both implementations (redis.go and memory.go) and all callsites correctly use the updated interface with []byte and GetOption parameters. Set, Get, GetAndDelete, Drain, and SetOrFail methods are consistently implemented across all cache providers with matching signatures.

internal/sms-gateway/handlers/messages/3rdparty.go (1)

235-235: Review comment is incorrect. No nil deference guard needed.

The GetState implementation shows that all code paths return either an error (non-nil) or a valid *MessageStateOut pointer. Specifically:

  • When dto == nil from cache, it returns (nil, ErrMessageNotFound) with error
  • When message lookup fails, it returns (nil, err) with error
  • When successful, it returns (dto, nil) with a pointer from anys.AsPointer(modelToMessageState(message))

The existing error check (if err != nil) already prevents any nil pointer dereference. The proposed guard is redundant and unnecessary.

Likely an incorrect or invalid review comment.

internal/config/config.go (1)

17-17: LGTM! Clean configuration additions.

The new Messages configuration struct with cache TTL and processed lifetime settings is well-structured with appropriate YAML and environment variable tags. Default values (5 minutes cache, 30 days lifetime) are reasonable.

Also applies to: 75-78, 109-112

internal/sms-gateway/modules/messages/metrics.go (1)

1-60: LGTM! Clean Prometheus metrics implementation.

The metrics struct provides a focused API for tracking message states and cache performance. Using promauto ensures automatic registration, and the naming conventions follow Prometheus best practices.

internal/sms-gateway/modules/messages/module.go (1)

11-35: LGTM! Clean refactor to function-based module.

The shift from a public Module variable to a Module() function improves the DI wiring. The use of fx.Private for internal providers (metrics, repository, hashing task, cache) properly encapsulates implementation details while exposing only the public Service interface.

internal/sms-gateway/modules/messages/service.go (1)

44-75: LGTM! Clean service refactor with explicit dependencies.

The updated constructor and RunBackgroundTasks method provide clear dependency injection and proper goroutine lifecycle management using WaitGroup.

@capcom6 capcom6 force-pushed the messages/cache-state branch from 1b8e0f4 to de4af16 Compare October 22, 2025 01:34
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
internal/sms-gateway/modules/messages/service.go (2)

228-240: Critical: Wrong cache key when ID is auto-generated.

When the input message.ID is empty, the code generates a new ID in msg.ExtID (line 229) but still caches using the empty message.ID (line 237). This creates an invalid cache key, breaking subsequent reads.

Trace:

  1. Input message.ID is empty
  2. Line 229: msg.ExtID = s.idgen() generates a new ID
  3. Line 231: state.ID = msg.ExtID correctly sets output
  4. Line 237: s.cache.Set(..., message.ID, ...) uses empty message.ID as key ❌

Fix:

-	if err := s.cache.Set(context.Background(), device.UserID, message.ID, anys.AsPointer(modelToMessageState(msg))); err != nil {
+	if err := s.cache.Set(context.Background(), device.UserID, msg.ExtID, anys.AsPointer(modelToMessageState(msg))); err != nil {
 		s.logger.Warn("can't cache message", zap.String("id", msg.ExtID), zap.Error(err))
 	}

Use msg.ExtID which contains either the provided ID or the auto-generated one.


134-167: Critical: Cache hit/miss metrics are incorrectly counted.

Based on the past review analysis, the cache implementation returns (nil, nil) for both "key not in cache" (true miss) and "key cached as nil" (cached not-found). This causes the following issues:

  1. Line 136-137: When err == nil, the code increments cache hits. However, the cache returns (nil, nil) for true misses (key never cached), so these are incorrectly counted as hits.

  2. Indistinguishable states: The cache stores empty bytes []byte{} for nil values and returns (nil, nil) when len(data) == 0, making "never cached" indistinguishable from "cached as nil".

Impact: First-time lookups of non-existent messages are counted as cache hits, corrupting your metrics and hiding the actual cache performance.

Recommended fix: Update the cache interface to distinguish these three states explicitly:

-func (s *Service) GetState(user models.User, ID string) (*MessageStateOut, error) {
-	dto, err := s.cache.Get(context.Background(), user.ID, ID)
-	if err == nil {
-		s.metrics.IncCache(true)
-
-		// Cache nil entries represent "not found" and prevent repeated lookups
-		if dto == nil {
-			return nil, ErrMessageNotFound
-		}
-		return dto, nil
-	}
-	s.metrics.IncCache(false)
+func (s *Service) GetState(user models.User, ID string) (*MessageStateOut, error) {
+	dto, found, err := s.cache.Get(context.Background(), user.ID, ID)
+	if err != nil {
+		// Cache error - log and fall through to DB
+		s.logger.Warn("cache error", zap.String("id", ID), zap.Error(err))
+	} else if found {
+		s.metrics.IncCache(true)
+		if dto == nil {
+			return nil, ErrMessageNotFound
+		}
+		return dto, nil
+	}
+	// Not found in cache (true miss)
+	s.metrics.IncCache(false)

This requires updating the cache interface to return a found boolean (similar to Go's map lookup pattern).

🧹 Nitpick comments (1)
deployments/grafana/dashboards/messages.json (1)

37-38: Consider using datasource variable for portability.

The datasource UID "edqp0a73uh2bka" is hardcoded throughout the dashboard. This may require manual updates when importing into different Grafana instances. Consider using a datasource variable or the default Prometheus datasource reference for better portability.

Also applies to: 97-98, 114-115, 217-218, 310-311, 325-326, 413-414, 444-445, 467-468

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b8e0f4 and de4af16.

📒 Files selected for processing (3)
  • deployments/grafana/dashboards/messages.json (1 hunks)
  • internal/sms-gateway/modules/messages/metrics.go (1 hunks)
  • internal/sms-gateway/modules/messages/service.go (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/sms-gateway/modules/messages/service.go (7)
internal/sms-gateway/modules/devices/service.go (2)
  • Service (30-39)
  • NewService (157-165)
internal/sms-gateway/modules/messages/config.go (1)
  • Config (5-8)
internal/sms-gateway/modules/db/module.go (1)
  • IDGen (8-8)
internal/sms-gateway/modules/messages/domain.go (2)
  • MessageStateIn (31-36)
  • MessageStateOut (38-44)
internal/sms-gateway/modules/messages/repository_filter.go (2)
  • MessagesSelectFilter (16-23)
  • MessagesSelectOptions (25-36)
internal/sms-gateway/modules/messages/repository.go (1)
  • ErrMessageNotFound (18-18)
internal/sms-gateway/handlers/middlewares/deviceauth/deviceauth.go (1)
  • WithDevice (83-87)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: E2E
  • GitHub Check: Lint
  • GitHub Check: Benchmark
  • GitHub Check: Analyze (go)
🔇 Additional comments (3)
internal/sms-gateway/modules/messages/metrics.go (1)

1-60: LGTM! Clean Prometheus metrics implementation.

The metrics module follows Prometheus best practices with proper naming conventions, automatic registration via promauto, and clear help text. The implementation is straightforward and correct.

internal/sms-gateway/modules/messages/service.go (2)

30-67: LGTM! Good refactoring for explicit dependency injection.

The Service struct and constructor now use explicit dependencies, making the code more testable and following better DI practices.


90-121: LGTM! UpdateState correctly caches with existing.ExtID.

The method properly caches the updated state using existing.ExtID as the key, which ensures consistency. Metrics are correctly incremented.

@capcom6 capcom6 merged commit 3107d85 into master Oct 22, 2025
11 checks passed
@capcom6 capcom6 deleted the messages/cache-state branch October 22, 2025 06:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deployed The PR is deployed on staging

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant