Variable length key support and major template refactoring to support key as a template parameter #675

@thompsonbry thompsonbry commented Feb 2, 2025

See #600
See #634
See #613
See #610

Note: test_art_span.cpp is disabled. This has the tests for variable length keys. It will remain disabled until we merge and then branch again. Right now the focus is to get the key template refactor back to mainline.

@coderabbitai pause
@coderabbitai resolve
@coderabbitai ignore

Note: I had a local repository snafu. Creating this new PR replacing #661. All edits should already be applied in this PR per #661 and per DM chat.

@laurynas-biveinis

Summary by CodeRabbit

  • Refactor
    • Standardized key handling with explicit 64-bit keys to streamline data operations and improve performance.
    • Revised internal APIs and memory management for adaptive data structures.
  • New Features
    • Introduced extensive tests for key encoding/decoding and span-based operations to enhance system robustness.
  • Chores
    • Updated build configurations and documentation, including enhanced test targets and copyright notices.

coderabbitai bot commented Feb 2, 2025

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

The changes update numerous source files by standardizing the key type from unodb::key to std::uint64_t. Function signatures, template classes, and type aliases across ART implementation files, benchmarks, tests, and examples are refactored accordingly. Several source files have been removed or consolidated, build configurations and CMake targets adjusted, and new tests (e.g., for key encoding/decoding) added. Overall, the updates ensure type safety and consistency throughout the codebase.

Changes

File(s) | Change Summary
.gitignore, CMakeLists.txt | Removed ignored shell scripts (run-cmake.sh, loop-test.sh) and /docs/; updated CMake warning flags, library source lists, and added new test target (test_key_encode_decode) with adjusted dependencies.
ART Implementation Files (art.cpp, art_common.cpp, art_internal*.hpp, mutex_art.hpp, olc_art.cpp) | Removed redundant ART source files; refactored headers to use templates for key type support; changed key type to std::uint64_t; updated utility functions and internal APIs.
Benchmark Files (benchmark/micro_benchmark*.cpp, ...concurrency.hpp, etc.) | Changed loop counters and function signatures from unodb::key to std::uint64_t; updated namespace qualifiers (e.g., to unodb::benchmark) for consistency.
Test Files (test/*) | Modified test cases and type aliases to use std::uint64_t; added new tests (e.g., test_key_encode_decode.cpp); updated ART test frameworks and instantiations for type safety.
Examples & Fuzz Files (examples/*, fuzz_deepstate/*) | Updated database instantiations and key types to explicitly use std::uint64_t.
Utilities (portability_builtins.hpp, db_test_utils.*) | Introduced templated byte-swap (bswap) functions with specializations; refined template instantiations for improved type specificity.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant KeyEncoder
    participant Database
    participant KeyDecoder

    User->>KeyEncoder: Provide raw key input
    KeyEncoder->>Database: Encode key (std::uint64_t) and process request
    Database->>KeyDecoder: Return key view for verification
    KeyDecoder->>User: Decode and deliver key result

Poem

I’m a rabbit hopping through the code delight,
Keys now shine in a 64-bit light.
Templates revamped, tests on a spree,
Bugs run away as changes set free.
With a twitch of my nose and a joyful leap,
I celebrate changes that forever keep!
🐰✨

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 25

🔭 Outside diff range comments (5)
art_internal_impl.hpp (5)

339-565: basic_art_policy Class

  1. Unified Policy
    Parameterizing by Key, Db, CriticalSectionPolicy, etc., provides flexibility. Orchestrating them might become complex. Ensure unit tests thoroughly cover each policy combination.
  2. Atomics & Concurrency
    Watch for concurrency issues if multiple threads modify the same policy or same data structures, especially with read_critical_section. Revisit these sections when enabling concurrency or OLC.
  3. delete_subtree
    Recursively deletes children. In very large or deeply skewed trees, this can produce a deep recursion stack. Consider an iterative traversal with an explicit stack if recursion depth is a concern.
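
The iterative alternative mentioned for delete_subtree can be sketched as follows. This is only an illustration under stated assumptions: `node`, `delete_subtree_iterative`, and `make_chain` are hypothetical names, and the node layout is not the unodb one. An explicit `std::vector` stands in for the call stack, so traversal depth is limited by heap memory rather than by thread stack size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type for illustration; not the unodb node layout.
struct node {
  std::vector<node*> children;
};

// Frees every node reachable from root without recursion.
// Returns the number of nodes freed.
inline std::size_t delete_subtree_iterative(node* root) {
  std::size_t freed = 0;
  std::vector<node*> pending;  // explicit stack replacing the call stack
  if (root != nullptr) pending.push_back(root);
  while (!pending.empty()) {
    node* current = pending.back();
    pending.pop_back();
    // Queue the children before the parent is destroyed.
    for (node* child : current->children) {
      if (child != nullptr) pending.push_back(child);
    }
    delete current;
    ++freed;
  }
  return freed;
}

// Builds a degenerate single-child chain of the given depth (demo helper).
inline node* make_chain(std::size_t depth) {
  assert(depth > 0);
  node* root = new node;
  node* tail = root;
  for (std::size_t i = 1; i < depth; ++i) {
    node* next = new node;
    tail->children.push_back(next);
    tail = next;
  }
  return root;
}
```

In an ART the depth is bounded by the key length in bytes, so with fixed 64-bit keys the recursive form is unlikely to be a problem; the sketch matters mainly once long variable-length keys are enabled.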

603-734: key_prefix Union

  1. Prefix Compression
    The design of storing up to 7 bytes in a single machine word is neat. Just ensure you handle corner cases (like empty prefix or very short keys) gracefully.
  2. cut and prepend
    These methods rely heavily on correct bit manipulations. Additional unit tests for boundary conditions (cutting 7 bytes, adding partial bytes) will be beneficial.
  3. Potential Overflow
    Confirm that shifting by prefix1_bit_length + 8U does not exceed 63 bits if prefix1 is near capacity. The existing assertions help, but be sure to keep them updated if key_prefix_capacity changes.
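
To make the boundary conditions for cut and prepend concrete, here is a minimal sketch of a 7-byte prefix packed into one 64-bit word. The layout (prefix bytes in the low 56 bits, length in the top byte) and the name `packed_prefix` are assumptions for illustration and may differ from the real key_prefix union. Because the capacity is capped at 7 bytes, every shift count stays at or below 56 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: bytes 0..6 hold prefix bytes, byte 7 holds the length.
class packed_prefix {
 public:
  static constexpr unsigned capacity = 7;

  packed_prefix(const std::uint8_t* bytes, unsigned length) {
    assert(length <= capacity);
    for (unsigned i = 0; i < length; ++i)
      word_ |= static_cast<std::uint64_t>(bytes[i]) << (8U * i);
    set_length(length);
  }

  unsigned length() const { return static_cast<unsigned>(word_ >> 56U); }

  std::uint8_t operator[](unsigned i) const {
    assert(i < length());
    return static_cast<std::uint8_t>(word_ >> (8U * i));
  }

  // Drops the first n prefix bytes.
  void cut(unsigned n) {
    assert(n <= length());
    const unsigned new_length = length() - n;
    // n <= 7, so the shift count is at most 56 and stays well defined.
    word_ = (word_ & byte_mask) >> (8U * n);
    set_length(new_length);
  }

  // Inserts one byte in front of the existing prefix.
  void prepend(std::uint8_t b) {
    assert(length() < capacity);
    const unsigned new_length = length() + 1;
    // Mask again after shifting so nothing spills into the length byte.
    word_ = ((word_ & byte_mask) << 8U) & byte_mask;
    word_ |= b;
    set_length(new_length);
  }

 private:
  static constexpr std::uint64_t byte_mask = (std::uint64_t{1} << 56U) - 1;

  void set_length(unsigned length) {
    word_ = (word_ & byte_mask) | (static_cast<std::uint64_t>(length) << 56U);
  }

  std::uint64_t word_ = 0;
};
```

Cutting all 7 bytes and prepending into an empty prefix are the two edge cases worth dedicated unit tests in this layout.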

1156-1234: basic_inode: Parameterizing Node Capacities

  1. Min/Max Children
    The template parameters MinSize, Capacity, etc., are well-defined, making the code self-documenting.
  2. Constructor Overloads
    The separate constructors for building this node from smaller or larger node types follow the classical ART growth/shrink approach. Check for edge cases when node transitions from 4 children to 5, etc.
  3. OLC-Specific Path
    The doc mentions building nodes "optimistically". Ensure that if an OLC transaction gets aborted or retried, these partial node expansions do not leak memory or break the tree structure.

1692-2073: basic_inode_16

  1. Key Sorting with SIMD
    The insertion logic uses lower_bound or SSE to find an insertion point. This is well-optimized but must remain in sync with children_count.
  2. Downsizing from Node48
    The constructor that reclaims an inode48 is interesting; carefully ensure we do not skip or incorrectly handle partial children in a half-filled Node48.
  3. Lexicographic Iteration
    The iteration logic is consistent with an ordered array of children. Unit tests for forward and backward iteration are essential.
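
The invariant behind points 1 and 3 (sorted key bytes that stay in step with the child slots and the child count) can be sketched with the scalar `std::lower_bound` path only. `node16_sketch` is a hypothetical simplification with `int` payloads, not the basic_inode_16 implementation, and it omits the SIMD search.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical simplified Node16: sorted key bytes with parallel child slots.
class node16_sketch {
 public:
  static constexpr std::size_t capacity = 16;

  // Inserts key_byte at its sorted position.
  // Returns false if the byte is already present or the node is full.
  bool add(std::uint8_t key_byte, int child) {
    if (count_ == capacity) return false;
    std::uint8_t* const first = keys_.data();
    std::uint8_t* const last = first + count_;
    std::uint8_t* const pos = std::lower_bound(first, last, key_byte);
    if (pos != last && *pos == key_byte) return false;
    const std::size_t idx = static_cast<std::size_t>(pos - first);
    // Shift the tails of both arrays one slot right, keeping them in step.
    std::move_backward(pos, last, last + 1);
    std::move_backward(children_.data() + idx, children_.data() + count_,
                       children_.data() + count_ + 1);
    *pos = key_byte;
    children_[idx] = child;
    ++count_;
    return true;
  }

  std::size_t size() const { return count_; }

  std::uint8_t key_at(std::size_t i) const {
    assert(i < count_);
    return keys_[i];
  }

  int child_at(std::size_t i) const {
    assert(i < count_);
    return children_[i];
  }

 private:
  std::array<std::uint8_t, capacity> keys_{};
  std::array<int, capacity> children_{};
  std::size_t count_ = 0;
};
```

Forward iteration is then an index walk from 0 to size() - 1, and backward iteration is the same walk reversed.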

2074-2600: basic_inode_48

  1. Sparse Child Indexing
    Indexing child pointers with child_indexes[256] is a good trade-off for medium fullness. Validate that empty_child (0xFF) is handled properly, especially in concurrency scenarios.
  2. Removal & Compaction
    remove_child_pointer sets the pointer to nullptr and sets child_indexes[...] = empty_child. Check that no stale pointer remains.
  3. Performance
    Searching for the next or previous child requires scanning up or down an array of 256 entries. This is acceptable for mid-range node fullness, but keep an eye on the performance profile if the structure is heavily used.
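
The sparse indexing and the "no stale pointer" check can be illustrated with a small single-threaded sketch. `node48_sketch` is hypothetical: it uses plain `const int*` children and a linear free-slot search, neither of which is claimed to match basic_inode_48.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical simplified Node48; child payloads are plain ints here.
class node48_sketch {
 public:
  static constexpr std::uint8_t empty_child = 0xFF;
  static constexpr unsigned capacity = 48;

  node48_sketch() {
    child_indexes_.fill(empty_child);
    children_.fill(nullptr);
  }

  // Returns false if the key byte is already present or the node is full.
  bool add(std::uint8_t key_byte, const int* child) {
    assert(child != nullptr);
    if (child_indexes_[key_byte] != empty_child) return false;
    for (unsigned slot = 0; slot < capacity; ++slot) {
      if (children_[slot] == nullptr) {
        children_[slot] = child;
        child_indexes_[key_byte] = static_cast<std::uint8_t>(slot);
        return true;
      }
    }
    return false;
  }

  // One index lookup, then one pointer load.
  const int* find(std::uint8_t key_byte) const {
    const std::uint8_t slot = child_indexes_[key_byte];
    return slot == empty_child ? nullptr : children_[slot];
  }

  // Clears both the slot and the index entry so no stale pointer remains.
  bool remove(std::uint8_t key_byte) {
    const std::uint8_t slot = child_indexes_[key_byte];
    if (slot == empty_child) return false;
    children_[slot] = nullptr;
    child_indexes_[key_byte] = empty_child;
    return true;
  }

 private:
  std::array<std::uint8_t, 256> child_indexes_;
  std::array<const int*, capacity> children_;
};
```

Since valid slots are 0..47, the sentinel 0xFF can never collide with a real slot number.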
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b4bae1 and 02d2127.

📒 Files selected for processing (39)
  • .gitignore (0 hunks)
  • CMakeLists.txt (2 hunks)
  • art.cpp (0 hunks)
  • art.hpp (9 hunks)
  • art_common.cpp (0 hunks)
  • art_common.hpp (1 hunks)
  • art_internal.cpp (0 hunks)
  • art_internal.hpp (3 hunks)
  • art_internal_impl.hpp (45 hunks)
  • benchmark/micro_benchmark.cpp (13 hunks)
  • benchmark/micro_benchmark_concurrency.hpp (6 hunks)
  • benchmark/micro_benchmark_key_prefix.cpp (8 hunks)
  • benchmark/micro_benchmark_mutex.cpp (2 hunks)
  • benchmark/micro_benchmark_n16.cpp (2 hunks)
  • benchmark/micro_benchmark_n256.cpp (2 hunks)
  • benchmark/micro_benchmark_n4.cpp (5 hunks)
  • benchmark/micro_benchmark_n48.cpp (2 hunks)
  • benchmark/micro_benchmark_node_utils.hpp (14 hunks)
  • benchmark/micro_benchmark_olc.cpp (2 hunks)
  • benchmark/micro_benchmark_utils.cpp (2 hunks)
  • benchmark/micro_benchmark_utils.hpp (7 hunks)
  • examples/example_art.cpp (3 hunks)
  • examples/example_art_stats.cpp (1 hunks)
  • examples/example_olc_art.cpp (1 hunks)
  • fuzz_deepstate/test_art_fuzz_deepstate.cpp (7 hunks)
  • mutex_art.hpp (6 hunks)
  • olc_art.cpp (0 hunks)
  • portability_builtins.hpp (1 hunks)
  • test/CMakeLists.txt (3 hunks)
  • test/db_test_utils.cpp (2 hunks)
  • test/db_test_utils.hpp (12 hunks)
  • test/test_art.cpp (1 hunks)
  • test/test_art_concurrency.cpp (8 hunks)
  • test/test_art_iter.cpp (21 hunks)
  • test/test_art_oom.cpp (5 hunks)
  • test/test_art_scan.cpp (37 hunks)
  • test/test_art_span.cpp (1 hunks)
  • test/test_key_encode_decode.cpp (1 hunks)
  • test/test_qsbr.cpp (0 hunks)
💤 Files with no reviewable changes (6)
  • .gitignore
  • art_common.cpp
  • art_internal.cpp
  • test/test_qsbr.cpp
  • art.cpp
  • olc_art.cpp
🧰 Additional context used
🪛 Cppcheck (2.10-2)
test/test_key_encode_decode.cpp

[error] 83-83: There is an unknown macro here somewhere. Configuration is required. If UNODB_DETAIL_DISABLE_MSVC_WARNING is a macro then please configure it.

(unknownMacro)

test/test_art_iter.cpp

[error] 84-84: There is an unknown macro here somewhere. Configuration is required. If UNODB_DETAIL_DISABLE_CLANG_WARNING is a macro then please configure it.

(unknownMacro)

benchmark/micro_benchmark_utils.cpp

[style] 26-26: The function 'destroy_tree' is never used.

(unusedFunction)

test/test_art_scan.cpp

[error] 84-84: There is an unknown macro here somewhere. Configuration is required. If UNODB_DETAIL_DISABLE_CLANG_WARNING is a macro then please configure it.

(unknownMacro)

test/test_art_span.cpp

[error] 84-84: There is an unknown macro here somewhere. Configuration is required. If UNODB_DETAIL_DISABLE_CLANG_WARNING is a macro then please configure it.

(unknownMacro)

🔇 Additional comments (103)
art_common.hpp (7)

21-28: Includes look appropriate
These headers are needed for string operations (<cstring>), formatting (<iomanip>), I/O streams (<iostream>), and aligned memory operations (heap.hpp). No issues found here.


32-33: Forward declarations are fine
Forward-declaring db and olc_db here prevents circular dependencies. Looks good.


46-91: Visitor design is clear
The visitor interface provides read-only access to key and value, preventing references from escaping. This is a clean approach to scanning. Just keep concurrency in mind if multiple scans run in parallel, ensuring no data races on the underlying iterator.


95-106: Documentation clarity
The constant INITIAL_BUFFER_CAPACITY is well documented, and the rationale for a larger capacity is understandable. This is approved.


127-133: Dumping value
Dumping the entire value byte-by-byte is consistent with the approach above. No concerns here.


138-150: Shift-or function for 32-bit
Logic for merging bits into the next power-of-two boundary is correct. No issues found.


152-172: 32-bit and 64-bit next_power_of_two
Selecting the 32-bit or 64-bit implementation based on sizeof(T) is a neat approach. The overflow note is correctly documented. This code is solid.
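
For reference, the shift-or technique reads as follows when written with `if constexpr` on sizeof(T). This is a sketch of the general bit trick, not a copy of the functions in art_common.hpp; the name `next_power_of_two_sketch` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Rounds v up to the next power of two; powers of two are returned unchanged.
// If the result does not fit in T, the unsigned arithmetic wraps to 0.
template <typename T>
constexpr T next_power_of_two_sketch(T v) {
  static_assert(std::is_unsigned_v<T>, "unsigned types only");
  static_assert(sizeof(T) == 4 || sizeof(T) == 8, "32- or 64-bit only");
  assert(v != 0);
  --v;
  // Smear the highest set bit into every lower position.
  v |= v >> 1U;
  v |= v >> 2U;
  v |= v >> 4U;
  v |= v >> 8U;
  v |= v >> 16U;
  if constexpr (sizeof(T) == 8) {
    v |= v >> 32U;  // only needed, and only valid, for 64-bit operands
  }
  return static_cast<T>(v + 1U);
}
```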

art_internal.hpp (7)

17-23: New includes
Needed for algorithms (<algorithm>), I/O streaming (<iomanip>, <iostream>), memory allocations (heap.hpp), and builtins (portability_builtins.hpp). No issues here.

Also applies to: 29-29, 31-31


36-37: Forward declarations
No concerns—these are standard forward declarations for internal classes.

Also applies to: 39-41


64-72: get_u64(key_view)
Copies up to 8 bytes from the key into u. This is a straightforward approach for partial or full 64-bit extraction.
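
A hedged sketch of that extraction, taking a pointer and size instead of the real key_view type (`get_u64_sketch` is a hypothetical name): short keys leave the trailing bytes zero, and long keys contribute only their first 8 bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies at most 8 bytes from data into a zero-initialized 64-bit word.
// The numeric value depends on host byte order; only the byte image is fixed.
inline std::uint64_t get_u64_sketch(const std::byte* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  std::uint64_t u = 0;
  const std::size_t n = std::min(size, sizeof(u));
  if (n > 0) std::memcpy(&u, data, n);
  return u;
}
```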


246-274: tree_depth
Wraps a 32-bit depth counter. Straightforward and correct.


275-293: basic_db_leaf_deleter
Manages leaf deletion with a reference to the DB. This is a typical RAII deleter pattern, no issues.


303-318: basic_db_inode_deleter
Similar concept to leaf deleter, no concerns here.


320-405: basic_node_ptr
Implements a tagged pointer with node type bits stored in the lower address bits. Implementation is consistent with the alignment-based tagging approach.
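
The alignment-based tagging idea, reduced to a sketch: `tagged_ptr_sketch` and `node_kind` are hypothetical, and the 3-bit tag assumes that every pointee is at least 8-byte aligned, which the constructor asserts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical node kinds; the values must fit in the tag bits.
enum class node_kind : std::uint8_t { leaf = 0, i4 = 1, i16 = 2, i48 = 3, i256 = 4 };

class tagged_ptr_sketch {
 public:
  // Assumes at least 8-byte alignment, leaving the low 3 bits free.
  static constexpr std::uintptr_t tag_mask = 0x7;

  tagged_ptr_sketch(void* p, node_kind kind)
      : bits_(reinterpret_cast<std::uintptr_t>(p) |
              static_cast<std::uintptr_t>(kind)) {
    assert((reinterpret_cast<std::uintptr_t>(p) & tag_mask) == 0);
  }

  node_kind kind() const { return static_cast<node_kind>(bits_ & tag_mask); }

  // Strips the tag to recover the original address.
  void* raw() const { return reinterpret_cast<void*>(bits_ & ~tag_mask); }

 private:
  std::uintptr_t bits_;
};
```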

test/db_test_utils.hpp (8)

28-28: Switch to std::map
Allows ordering by a custom comparator for key_view keys. This is a valid alternative to a hash-based container, for which key_view has no hash support.


50-52: Extern template usage
Explicit instantiation to reduce compile times is beneficial. No issues.


56-60: Select correct thread type
Using unodb::qsbr_thread for olc_db, otherwise std::thread, is a neat approach to unify concurrency usage.


73-80: Predefined test_values
Simple arrays that provide sample data for testing. Straightforward usage, no concerns.


90-100: assert_value_eq
Lock checking for mutex_db vs. direct check for other DB types is handled cleanly. Implementation is consistent.


105-122: do_assert_result_eq
Generates helpful debugging output if key is not found. Implementation is fine.


127-135: assert_result_eq with OLC
Applies a quiescent state exit for olc_db. This usage is correct to ensure QSBR sees a stable state.


142-505: tree_verifier
Manages a local map of inserted keys to verify correctness vs. the DB.

  • Uses specialized inserts/removes for olc_db vs. others.
  • The custom comparator for key_view is correct via lexicographic comparison.
  • Overall design is robust and thorough for testing.
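
An ordered oracle keyed by byte views might look like the sketch below. `key_view_sketch` is a stand-in struct used so the example compiles without assuming the real key_view definition; `key_view_less` and `oracle_sketch` are likewise hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical non-owning view over key bytes (stand-in for key_view).
struct key_view_sketch {
  const std::byte* data;
  std::size_t size;
};

// Orders views lexicographically by byte value; a shorter key that is a
// prefix of a longer one sorts first.
struct key_view_less {
  bool operator()(const key_view_sketch& a, const key_view_sketch& b) const {
    return std::lexicographical_compare(a.data, a.data + a.size,
                                        b.data, b.data + b.size);
  }
};

// Oracle map from key bytes to a test value.
using oracle_sketch = std::map<key_view_sketch, int, key_view_less>;
```

Because the views do not own their bytes, the backing storage has to outlive the map entries, which a verifier would need to guarantee separately.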
test/test_art_span.cpp (11)

1-41: Header & includes appear consistent
No obvious issues spotted with project headers and GTest includes.


42-50: Initialization of test_keys and span_db
The constexpr usage is good for compile-time initialization. The span_db alias succinctly sets up a test DB for variable-length keys. No issues noted; this is well-structured.


57-68: Test fixture setup
The ARTSpanCorrectnessTest fixture carefully templated over Db provides a powerful mechanism to test multiple DB implementations. The approach is clean and extensible.


303-329: DbInsertNodeRecursion
This test verifies deeper recursion on insertion (e.g., inserting a key whose prefix partially matches). The approach tests structural correctness in multiple expansions. Implementation looks sound:

  • Good use of verifier for state validation.
  • The test might benefit from a direct assertion on node depth if available.

330-344: Node16
Checks insertion into Node16 capacity. Properly ensures that the node transitions from Node4 to Node16. No structural or logical concerns detected.


416-428: Node48
Similar pattern: ensures Node48 creation and correct presence/absence checks. The usage of verifier.assert_node_counts(...) is consistent and maintains coverage.


468-480: Node256
The test ensures that once Node48 is saturated, the data structure grows to Node256. The approach thoroughly verifies final node counts. Code logic is consistent with the other node expansions.


520-528: TryDeleteFromEmpty
A test confirming that attempts to delete absent keys don’t alter the DB. This is a valuable negative test case. No issues found.


560-574: Node4AttemptDeleteAbsent
Properly checks that deleting non-existent keys from Node4 is handled gracefully without allocations. Good negative path coverage. Implementation is correct.


1078-1091: Clear
Ensures clearing after a single insertion results in an empty DB. The test logic is straightforward and correct.


1092-1112: TwoInstances
Demonstrates a concurrency-like scenario with two DB “instances” in separate threads. The test uses a minimal synchronization approach with thread_syncs. It does not test concurrency safety, but it verifies that separate DBs remain isolated.

benchmark/micro_benchmark_node_utils.hpp (6)

1-1: Header compliance
Copyright notice updated to 2025, matching project timeline.


53-54: next_key function refactoring
Switching to use std::uint64_t for the key is consistent with the broader changes. The function’s asserts are well-structured. Maintains correctness for zero-bit manipulations.


237-238: generate_keys_to_limit parameter
Replacing unodb::key with std::uint64_t for key_limit aligns with the PR’s objective of standardizing key types. The design remains straightforward.


254-255: generate_random_keys_over_full_smaller_tree
Template adjusted for std::uint64_t. The function collects random keys below the limit, ensuring graceful handling if a generated key surpasses the limit. The logic looks robust.


523-524: Refactored insert_sequentially
Now using std::uint64_t to track the inserted key. The approach is consistent with other changes. The function still increments keys until completion.


539-540: insert_keys
Replacing const std::vector<unodb::key> & with const std::vector<std::uint64_t> & is consistent. Implementation remains an unrolled loop of simple inserts.

art.hpp (8)

36-50: Template-based inode classes
Introducing template parameters like inode<Key>, inode_4<Key>, etc., significantly boosts flexibility for different key types. The arrangement is well-organized.


53-61: node_ptr & inode_defs
Adapting to typed inode definitions ensures internal pointers remain type-safe. This is a neat extension of the existing basic_art_policy.


67-74: art_policy template
Integrates the new Key template parameter deeply into the ART policy. This centralizes node creation, locking stubs, and custom deleters. Great reusability.


84-96: Non-thread-safe db
Explicitly restricting other key types via static_assert(std::is_same_v<Key, std::uint64_t>) is a strong safety net before finalizing multi-type support. No issues found.


104-119: Internal key logic
Refactored methods (get_internal, insert_internal, remove_internal) rely on art_key_type. This ensures a uniform approach to key transformations. The design is consistent with the new template approach.


195-221: Partial prefix matching
Exemplified in insert_internal: splitting into a new inode_4 if the prefix diverges. The dynamic expansion is well-handled with checks and expansions. Comments on prefix-split logic are thorough.


303-329: iterator::next()
Uses a straightforward approach to pop or push inodes from the stack to move to the next leaf. The code is logically consistent, preserving the reverse- or forward-traversal state.


396-413: Stats instrumentation
Functions like account_growing_inode guard logic for node expansions. The assertion ensures growth counts do not exceed actual node counts. Provides safety checks in debug scenarios.

art_internal_impl.hpp (10)

45-50: Forward Declarations Introduced

These new forward declarations for template <typename Key> class db and template <typename Key> class olc_db improve modularity by allowing references to these classes without requiring their full definitions. Ensure that every translation unit that uses these forward-declared classes as complete types includes the corresponding headers, to avoid incomplete-type compilation errors.


80-104: Extensive Leaf Documentation

This block of comments describes planned work for variable-length keys and outlines the design for storing key prefixes and values in the leaf. The TODO notes are clearly spelled out. Verify that these pending design decisions (for example, partial key storage on leaves, optional key sizes, and space optimizations) are being tracked in an issue or backlog, so they can be implemented or removed in future cleanups.


254-289: Factory Function make_db_leaf_ptr

The function allocates memory aligned for a leaf, constructs the leaf in place with placement new, and returns a unique pointer with a custom deleter. Overall design is sound:

  • Exception Safety: Throws std::length_error for oversize keys, which is good.
  • Memory Ownership: The usage of basic_db_leaf_unique_ptr ensures cleanup on destruction—this is correct for RAII.
  • Alignment: Using allocate_aligned is a good practice, but consider verifying that the requested alignment is suitable for all potential CPU instructions (SSE, AVX, etc.).
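
The factory-plus-deleter pattern described here can be sketched with standard C++17 aligned allocation in place of allocate_aligned. All names ending in `_sketch` are hypothetical, the leaf layout and the size limit are made up for the example, and the stats handling only mirrors the increment/decrement pairing noted in the review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <stdexcept>

struct stats_sketch {
  std::size_t leaf_count = 0;
};

// Hypothetical over-aligned leaf header.
struct alignas(16) leaf_sketch {
  std::uint64_t key;
  std::size_t value_size;
};

constexpr std::size_t max_value_size_sketch = 1024;  // arbitrary demo limit

// Custom deleter: undoes the stats increment, then destroys and frees.
struct leaf_deleter_sketch {
  stats_sketch* stats;

  void operator()(leaf_sketch* leaf) const noexcept {
    assert(stats->leaf_count > 0);
    --stats->leaf_count;
    leaf->~leaf_sketch();
    ::operator delete(leaf, std::align_val_t{alignof(leaf_sketch)});
  }
};

using leaf_ptr_sketch = std::unique_ptr<leaf_sketch, leaf_deleter_sketch>;

inline leaf_ptr_sketch make_leaf_sketch(stats_sketch& stats, std::uint64_t key,
                                        std::size_t value_size) {
  // Validate before allocating so nothing leaks on the error path.
  if (value_size > max_value_size_sketch)
    throw std::length_error("value too long");
  void* raw = ::operator new(sizeof(leaf_sketch),
                             std::align_val_t{alignof(leaf_sketch)});
  // The leaf constructor cannot throw, so ownership transfers immediately.
  leaf_ptr_sketch result{new (raw) leaf_sketch{key, value_size},
                         leaf_deleter_sketch{&stats}};
  ++stats.leaf_count;
  return result;
}
```

The aligned `operator delete` overload must receive the same alignment that was passed to `operator new`, which is why both use alignof(leaf_sketch).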

309-323: basic_db_leaf_deleter Implementation

  1. Stats Decrement
    The code decrements leaf count and size before freeing memory, which is consistent with the increment in make_db_leaf_ptr.
  2. Exception Safety
    Deletion is noexcept, which is correct.
  3. Potential Overhead
    Keep an eye on performance overhead if many leaves are being created/destroyed in quick succession, though the overhead might be offset by the clarity of ownership.

324-334: basic_db_inode_deleter Implementation

  1. Trivially Destructible Constraint
    The assert on std::is_trivially_destructible_v<INode> ensures that no custom destructor logic is missed, which is important for concurrency and lock states.
  2. No Stats Decrement for Freed Memory?
    You do decrement inode_count, but ensure it mirrors the usage patterns of leaf_count for consistent memory usage tracking.

566-601: key_prefix_snapshot Union

Storing prefix data in a union with a single std::uint64_t is an intriguing approach:

  • Atomic Requirements
    If reading/writing this union is done concurrently, confirm that atomic operations are used consistently to avoid data races.
  • Endianness
    The usage of u64 merges prefix bits into a single word. On big-endian architectures, re-check the correctness of the bit shifting.

735-764: iter_result and iter_result_opt

This small structure is crucial for iteration in the ART:

  • Key/Child Mismatch
    Storing both key_byte and child_index is a good optimization for skipping extra lookups. Make sure references to them remain valid as the node possibly mutates in multi-threaded scenarios.
  • Optional Usage
    Using std::optional signals “end” conditions. This is a clean approach.

784-1155: basic_inode_impl: Base Inode Logic

  1. Dispatch on node_type
    The code uses large switch statements across node_type. If new node types are added or existing ones are refactored, these dispatch blocks can become error-prone. Keep them in sync with tests.
  2. Key Prefixing
    Many methods rely on partial or shared prefix. Maintaining consistency with basic_leaf is essential.
  3. Atomic children_count
    This concurrency mechanism is meaningful for OLC or multi-threaded updates. Confirm that partial reads or writes in OLC transactions do not lead to missed updates.

1235-1691: basic_inode_4

  1. Sorted Keys
    Maintaining sorted keys in an array for up to 4 children is straightforward. The SSE-based path comparisons are good for performance, but confirm no out-of-bounds usage on the _mm_ intrinsics if children_count is small.
  2. Insert & Remove
    The code carefully shifts keys and child pointers. This is typically correct, but watch for concurrency changes.
  3. Downsizing
    Transitions to a leaf or a smaller node type happen only when a node has 2 children. Confirm that edge cases under concurrency are tested thoroughly.

2601-2850: basic_inode_256

  1. Full Byte Index
    Using a direct array of 256 child pointers is the simplest approach for the maximum ART node capacity.
  2. Memory Footprint
    This node is quite large in memory. If the workload rarely saturates nodes, consider how often we might do expansions.
  3. Iteration
    The logic for begin() and last() scanning from 0 to 255 or 255 down to 0 is correct. Keep an eye on concurrency around children_count so that you do not skip newly inserted or pending deletions.
benchmark/micro_benchmark_utils.cpp (1)

33-38: Template Instantiations for Specific Key Types

These explicit template instantiations for db<std::uint64_t>, mutex_db<std::uint64_t>, and olc_db<std::uint64_t> unify the benchmark build process and avoid implicit instantiations.

test/db_test_utils.cpp (3)

17-19: Additional Internal Headers

Importing art_common.hpp and art_internal.hpp here suggests that these test utilities now directly manipulate internal ART structures. Ensure that the test code remains stable if internal APIs are refactored, or consider if only partial exposure is needed.


23-35: Instantiation of db<std::uint64_t> / mutex_db<std::uint64_t> / olc_db<std::uint64_t>

Explicitly instantiating these templates ensures that the linker has the implementations for the test environment, especially if they aren’t used by normal compilation units. This is standard practice in “template repository” patterns.
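
For readers unfamiliar with the pattern, the pairing of an explicit instantiation declaration (`extern template`) with a single explicit instantiation definition looks roughly like this. `db_sketch` is a hypothetical toy class, and the header/source split is collapsed into one file with comments marking where each part would live.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// --- would live in a header ---
template <typename Key>
class db_sketch {
 public:
  // Returns true if the key was newly inserted.
  bool insert(Key k) { return keys_.insert(k).second; }
  std::size_t size() const { return keys_.size(); }

 private:
  std::set<Key> keys_;
};

// Explicit instantiation declaration: including translation units rely on
// an instantiation provided elsewhere instead of generating their own.
extern template class db_sketch<std::uint64_t>;

// --- would live in exactly one .cpp file ---
// Explicit instantiation definition: emits all members for this key type.
template class db_sketch<std::uint64_t>;
```

The definition has to follow the declaration when both appear in the same translation unit, as they do in this collapsed sketch.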


50-65: dump Specializations for basic_art_key

Providing specialized dump() methods for debugging is helpful:

  1. GNU Attributes
    [[gnu::cold]] and UNODB_DETAIL_NOINLINE keep these debug helpers out of line and off the hot paths.
  2. Key Type Differences
    One variant for std::uint64_t and one for unodb::key_view. Be sure to unify logging output so that debugging is consistent for both key types.
benchmark/micro_benchmark_mutex.cpp (1)

25-27: LGTM! Improved namespace organization.

Moving mutex_db into the benchmark namespace better reflects its role as a benchmark-specific implementation.

examples/example_art_stats.cpp (1)

31-31: LGTM! Proper template parameter usage.

The change to unodb::db<std::uint64_t> aligns with the codebase's transition to templated key types.

examples/example_art.cpp (1)

34-34: LGTM! Proper template parameter usage.

The change to unodb::db<std::uint64_t> aligns with the codebase's transition to templated key types.

portability_builtins.hpp (1)

30-58: LGTM! Well-designed template implementation for byte swapping.

The template implementation with specializations provides:

  • Type safety through compile-time specializations
  • Efficient compiler-specific implementations
  • Clear and maintainable code structure
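
A portable sketch of the "template with per-width specializations" shape, using shifts and masks instead of compiler builtins so that it does not depend on any particular intrinsic. `bswap_sketch` is a hypothetical name and this is not the code in portability_builtins.hpp.

```cpp
#include <cassert>
#include <cstdint>

// Primary template is deleted: only the widths specialized below compile.
template <typename T>
constexpr T bswap_sketch(T x) noexcept = delete;

template <>
constexpr std::uint16_t bswap_sketch<std::uint16_t>(std::uint16_t x) noexcept {
  return static_cast<std::uint16_t>((x << 8U) | (x >> 8U));
}

template <>
constexpr std::uint32_t bswap_sketch<std::uint32_t>(std::uint32_t x) noexcept {
  return ((x & 0x000000FFU) << 24U) | ((x & 0x0000FF00U) << 8U) |
         ((x & 0x00FF0000U) >> 8U) | ((x & 0xFF000000U) >> 24U);
}

template <>
constexpr std::uint64_t bswap_sketch<std::uint64_t>(std::uint64_t x) noexcept {
  // Swap each 32-bit half, then exchange the halves.
  const auto low = static_cast<std::uint32_t>(x);
  const auto high = static_cast<std::uint32_t>(x >> 32U);
  return (static_cast<std::uint64_t>(bswap_sketch<std::uint32_t>(low)) << 32U) |
         bswap_sketch<std::uint32_t>(high);
}
```

Calling the function with an unsupported type, such as a signed integer, fails at compile time because it selects the deleted primary template.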
benchmark/micro_benchmark_olc.cpp (1)

24-26: LGTM!

The namespace change for the base class improves code organization by properly scoping benchmark-specific database implementations.

examples/example_olc_art.cpp (1)

37-37: LGTM!

The explicit template parameter improves type safety and clarity by specifying the key type directly.

benchmark/micro_benchmark_concurrency.hpp (1)

75-75: LGTM!

The consistent use of std::uint64_t for key-related variables improves type safety and aligns with the key template refactoring.

Also applies to: 79-79, 95-95, 114-114, 120-120, 149-149, 170-173, 176-179, 182-185

benchmark/micro_benchmark_n256.cpp (1)

79-87: LGTM!

The consistent namespace changes across all benchmark templates improve code organization by properly scoping benchmark-specific database implementations.

Also applies to: 89-97, 99-107, 109-117, 119-127, 129-137, 139-147, 149-157, 159-167, 169-177

benchmark/micro_benchmark_n48.cpp (1)

89-207: LGTM!

The namespace changes from unodb to unodb::benchmark are consistent across all benchmark templates, aligning with the broader refactoring effort. The functionality remains unchanged.

benchmark/micro_benchmark_n16.cpp (1)

89-207: LGTM!

The namespace changes from unodb to unodb::benchmark are consistent across all benchmark templates, aligning with the broader refactoring effort. The functionality remains unchanged.

mutex_art.hpp (1)

26-32: Great documentation improvements!

The added documentation clearly explains the purpose and behavior of the class, including the locking strategy and the reference to the alternative implementation.

test/test_art_concurrency.cpp (3)

37-37: LGTM: Type change aligns with standardization effort.

The parameter type change from unodb::key to std::uint64_t is consistent with the broader effort to standardize key types across the codebase.


102-108: LGTM: Well-implemented key decoding function.

The decode function provides a clean way to convert key_view to uint64_t, with proper use of the decoder class.


231-232: LGTM: Updated type alias enhances consistency.

The change to use unodb::test::u64_mutex_db and unodb::test::u64_olc_db aligns with the new templated key type system.

fuzz_deepstate/test_art_fuzz_deepstate.cpp (3)

37-37: LGTM: Oracle type update maintains type safety.

The change to use std::uint64_t in the oracle map aligns with the standardized key type while maintaining type safety.


69-79: LGTM: Key generation function properly adapted.

The key generation function has been correctly updated to use std::uint64_t, maintaining the same functionality with the new type.


222-222: LGTM: Database instantiation updated correctly.

The test database instantiation now properly uses the templated type with std::uint64_t.

benchmark/micro_benchmark_n4.cpp (2)

33-45: LGTM: Key sequence generation properly adapted.

The make_n_key_sequence function has been correctly updated to use std::uint64_t, maintaining the same sequence generation logic.


252-373: LGTM: Benchmark templates consistently updated.

All benchmark templates have been systematically updated to use the benchmark namespace types (unodb::benchmark::db, unodb::benchmark::mutex_db, unodb::benchmark::olc_db).

benchmark/micro_benchmark_key_prefix.cpp (3)

75-76: LGTM: Vector type updates maintain functionality.

The change to use std::vector<std::uint64_t> for key storage aligns with the standardized key type while preserving the original functionality.


303-304: LGTM: Size calculations properly adjusted.

The size calculations have been correctly updated to work with the new vector type while maintaining the same logic.


445-473: LGTM: Benchmark templates systematically updated.

All benchmark templates have been consistently updated to use the benchmark namespace types, maintaining the benchmarking functionality.

test/test_key_encode_decode.cpp (2)

158-361: LGTM! Comprehensive test coverage for integer types.

The test cases thoroughly cover:

  • Exact byte order verification
  • Round-trip encoding/decoding
  • Edge cases (min/max values)
  • Lexicographic ordering
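
The ordering property in the last bullet can be sketched as follows, assuming a big-endian encoding for unsigned values and a sign-bit flip for signed ones. The names `encode_u64_be`, `encode_i64`, and `bytes_less` are hypothetical and exist only for this example; the project's encoder may differ.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

using encoded_u64 = std::array<std::byte, 8>;

// Hypothetical encoder: writes the most significant byte first so that
// byte-wise comparison agrees with numeric comparison.
inline encoded_u64 encode_u64_be(std::uint64_t v) {
  encoded_u64 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = static_cast<std::byte>((v >> (8 * (7 - i))) & 0xFFU);
  }
  return out;
}

// Assumed signed variant: flipping the sign bit makes negative values
// sort before non-negative ones under unsigned byte comparison.
inline encoded_u64 encode_i64(std::int64_t v) {
  return encode_u64_be(static_cast<std::uint64_t>(v) ^
                       (std::uint64_t{1} << 63));
}

// Byte-wise lexicographic comparison of two encoded keys.
inline bool bytes_less(const encoded_u64& a, const encoded_u64& b) {
  return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
}
```

Under these assumptions, 255 encodes to a byte string that compares less than the one for 256, which would not hold for a little-endian layout.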

26-37: TODO comments indicate future work for variable-length keys.

The TODO comments outline important future enhancements:

  1. Add coverage for lexicographic ordering of variable-length keys
  2. Add microbenchmarks for key encoder & decoder
  3. Compare performance between fixed-width and variable-length keys


test/test_art_oom.cpp (3)

28-40: Well-documented test brittleness.

The comments effectively explain:

  • The dependency on heap allocation counts
  • The impact of code changes on test stability
  • How to handle test failures
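
To illustrate why a test keyed to allocation counts is brittle, here is a minimal sketch of count-based failure injection. The class `alloc_failure_injector` and the function `insert_with_two_allocations` are hypothetical stand-ins, not the project's test harness.

```cpp
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical failure injector: throws std::bad_alloc on the n-th
// tracked allocation after being armed.
class alloc_failure_injector {
 public:
  // Arm the injector so the n-th allocation (1-based) fails; 0 disables it.
  void fail_on_nth(std::uint64_t n) noexcept {
    fail_on_ = n;
    count_ = 0;
  }

  // Called before each tracked allocation.
  void on_allocation() {
    ++count_;
    if (fail_on_ != 0 && count_ == fail_on_) throw std::bad_alloc{};
  }

 private:
  std::uint64_t fail_on_{0};
  std::uint64_t count_{0};
};

// Stand-in for an operation that performs two tracked allocations.
// Returns true if it completed, false if an allocation failed.
inline bool insert_with_two_allocations(alloc_failure_injector& injector,
                                        std::vector<int>& store) {
  try {
    injector.on_allocation();  // first allocation
    injector.on_allocation();  // second allocation
    store.push_back(42);       // state changes only after both succeed
    return true;
  } catch (const std::bad_alloc&) {
    return false;
  }
}
```

A test that arms the injector with 2 expects a failure here, and one that arms it with 3 expects success. If a code change adds or removes an allocation in the operation, those expected counts shift and the test needs updating.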

74-75: Key type change looks correct.

The function signatures have been properly updated to use std::uint64_t instead of unodb::key.

Also applies to: 89-90


110-112: Type alias update aligns with the broader refactoring.

The ARTTypes alias now uses the new test database types that support std::uint64_t keys.

benchmark/micro_benchmark.cpp (2)

50-51: Key type changes look correct.

All loop counters and key variables have been properly updated to use std::uint64_t.

Also applies to: 84-85, 113-114, 148-149, 184-185, 235-236, 319-320, 347-348, 352-353


369-377: Benchmark templates updated correctly.

The benchmark templates now use the appropriate database types that support std::uint64_t keys.

Also applies to: 379-387, 389-397, 399-407, 409-417, 419-430, 432-440, 442-453, 455-463

test/test_art_iter.cpp (2)

49-51: Type alias update aligns with the broader refactoring.

The ARTTypes alias now uses the new test database types that support std::uint64_t keys.


83-85: Key comparison updates look correct.

All key comparisons have been updated to use the decoded std::uint64_t values.

Also applies to: 100-102, 107-109, 125-127, 132-134, 156-159, 163-165, 170-172, 194-196, 201-203, 208-210, 232-234, 239-241, 246-248, 270-272, 277-279, 284-286

🧰 Tools
🪛 Cppcheck (2.10-2)

[error] 84-84: There is an unknown macro here somewhere. Configuration is required. If UNODB_DETAIL_DISABLE_CLANG_WARNING is a macro then please configure it.

(unknownMacro)

test/test_art.cpp (2)

41-43: LGTM! Type change aligns with uint64_t standardization.

The update to use u64_db variants instead of raw db types is consistent with the codebase-wide standardization to std::uint64_t keys.


41-43: LGTM!

The update to use u64_db, u64_mutex_db, and u64_olc_db test types aligns with the broader refactor to standardize key types across the codebase.

test/test_art_scan.cpp (5)

47-53: LGTM! Clean key decoding abstraction.

The new decode function provides a clear and consistent way to handle key conversion, improving code maintainability.


170-172: LGTM! Type change aligns with uint64_t standardization.

The update to use u64_db variants is consistent with the codebase-wide standardization.


39-39: LGTM!

The TODO comment properly documents the need for unit tests to support variable length keys, aligning with the PR objectives.


170-172: LGTM!

The update to use u64_db, u64_mutex_db, and u64_olc_db test types is consistent with the changes in test_art.cpp and aligns with the key type standardization refactor.


84-85: LGTM!

The key type changes from unodb::key to std::uint64_t are consistently applied across all test cases, maintaining proper test coverage. The decode function is correctly used to handle key conversions.

Also applies to: 139-141, 221-222, 262-264, 285-286, 306-307, 327-328, 591-592, 612-613, 633-634, 655-656, 676-677, 697-698

🧰 Tools
🪛 Cppcheck (2.10-2)

[error] 84-84: There is an unknown macro here somewhere. Configuration is required. If UNODB_DETAIL_DISABLE_CLANG_WARNING is a macro then please configure it.

(unknownMacro)

test/CMakeLists.txt (2)

67-69: Build configuration aligns with PR objectives.

The addition of test_key_encode_decode and temporary disabling of test_art_span matches the PR description. The test_art_span will be re-enabled in a future PR.


88-89: LGTM! Test dependencies updated correctly.

Dependencies are properly updated to include the new key encode/decode tests.

CMakeLists.txt (2)

705-707: LGTM! Source files removed as part of template refactor.

The removal of implementation files (art.cpp, art_common.cpp, olc_art.cpp) aligns with the move to a template-based implementation where the code needs to be in headers.
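
As a small sketch of why templating on the key type pushes code into headers: the member definitions of a class template must be visible wherever it is instantiated, so they typically live in the header next to the declaration. The names `tiny_index` and `u64_index` below are hypothetical and only mirror the pattern; they are not the project's classes.

```cpp
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical key-templated container. Because it is a template, the
// member function bodies are defined here rather than in a .cpp file.
template <typename Key>
class tiny_index {
 public:
  // Returns false if the key was already present.
  bool insert(const Key& key, int value) {
    return map_.emplace(key, value).second;
  }

  // Returns the stored value, or std::nullopt if the key is absent.
  std::optional<int> get(const Key& key) const {
    const auto it = map_.find(key);
    if (it == map_.end()) return std::nullopt;
    return it->second;
  }

 private:
  std::map<Key, int> map_;
};

// A convenience alias for the 64-bit key instantiation, in the spirit of
// the u64_* test aliases mentioned in the review.
using u64_index = tiny_index<std::uint64_t>;
```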


95-95: Verify impact of removing -Wmissing-declarations.

Removing this GCC warning flag means that global functions defined without a previous declaration (for example, ones missing from any header) will no longer be reported. Consider keeping it or documenting why it was removed.

✅ Verification successful

Removing -Wmissing-declarations does not appear to have let any such functions slip through.

An examination of the function definitions across the codebase did not reveal any problematic cases where declaration mismatches would have been flagged by GCC with the warning enabled. In our code, functions are either defined in source files or properly declared in headers, so the removal does not seem to compromise code quality. However, it is important to document the rationale behind this removal for future maintainability.
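
For context, the sketch below shows the kind of definition GCC reports under -Wmissing-declarations in C++. The function names are made up for the example.

```cpp
// Declared first (as a header normally would), so the definition that
// follows does not trigger the warning.
int declared_first(int x);
int declared_first(int x) { return x + 1; }

namespace {
// Functions in an anonymous namespace are exempt in C++.
int internal_helper(int x) { return x * 2; }
}  // namespace

// No previous declaration is visible: with -Wmissing-declarations
// enabled, GCC warns about this definition.
int no_prior_declaration(int x) { return internal_helper(x) + 1; }
```

GCC also does not issue this warning for function templates or inline functions, which is relevant when most of the code moves into header-only templates.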

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if any functions are missing declarations in header files
rg -A 5 "^[^/]*\b(void|int|char|double|float|long|short|unsigned|bool|auto)\b.*\([^)]*\)\s*\{" | grep -v "static"

Length of output: 65650

@thompsonbry (Author)

@coderabbitai pause
@coderabbitai resolve
@coderabbitai ignore


coderabbitai bot commented Feb 2, 2025

@thompsonbry I'll pause the automated reviews, resolve all review comments, and ignore this PR.

[ignore]

✅ Actions performed

Comments resolved.


Reviews paused.


codecov bot commented Feb 3, 2025

Codecov Report

Attention: Patch coverage is 97.64808% with 27 lines in your changes missing coverage. Please review.

Project coverage is 98.35%. Comparing base (3185758) to head (d051648).
Report is 38 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| art.hpp | 95.59% | 18 Missing ⚠️ |
| test/db_test_utils.cpp | 0.00% | 6 Missing ⚠️ |
| art_internal.hpp | 94.91% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #675      +/-   ##
==========================================
- Coverage   98.46%   98.35%   -0.12%     
==========================================
  Files          33       33              
  Lines        6651     7043     +392     
==========================================
+ Hits         6549     6927     +378     
- Misses        102      116      +14     

| Flag | Coverage Δ | |
|---|---|---|
| Debug | 98.33% <97.64%> (+0.66%) | ⬆️ |
| Release | 96.83% <97.58%> (-0.99%) | ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


sonarqubecloud bot commented Feb 4, 2025

Quality Gate failed

Failed conditions
12.7% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@thompsonbry force-pushed the varkeys-template-restored branch from f2fc621 to d051648 on February 4, 2025, 17:04
@laurynas-biveinis merged commit 2fd62f3 into laurynas-biveinis:master on February 5, 2025
138 of 149 checks passed