-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Rename next
to dest_state
and next()
to get_dest_state()
.
#86
base: main
Are you sure you want to change the base?
Conversation
…hat result from the change.
c866da0
to
2ff9f3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
src/log_surgeon/finite_automata/NfaState.hpp (2)
88-88
: Doc string alignment.
The new comment referencesSpontaneousTransition::serialize
. Double-check that the intended usage matches the updated method names and that the doc lines remain accurate.
92-92
: Returning a reference tobool
is risky.
Returningbool const&
can be confusing, as copying abool
is typically preferred. Consider returning by value instead to avoid possible misuse.- [[nodiscard]] auto is_accepting() const -> bool const& { return m_accepting; } + [[nodiscard]] auto is_accepting() const -> bool { return m_accepting; }src/log_surgeon/Lexer.hpp (1)
172-191
: Consider simplifying the register ID retrieval logic.While the implementation is correct, it could be more concise by combining the optional checks:
auto get_reg_ids_from_capture_id(capture_id_t const capture_id ) const -> std::optional<std::pair<reg_id_t, reg_id_t>> { auto const optional_tag_id_pair{get_tag_id_pair_from_capture_id(capture_id)}; - if (false == optional_tag_id_pair.has_value()) { - return std::nullopt; - } - auto const [start_tag_id, end_tag_id]{optional_tag_id_pair.value()}; - - auto const optional_start_reg_id{get_reg_id_from_tag_id(start_tag_id)}; - if (false == optional_start_reg_id.has_value()) { - return std::nullopt; - } - - auto const optional_end_reg_id{get_reg_id_from_tag_id(end_tag_id)}; - if (false == optional_end_reg_id.has_value()) { - return std::nullopt; - } - - return {optional_start_reg_id.value(), optional_end_reg_id.value()}; + if (optional_tag_id_pair) { + auto const [start_tag_id, end_tag_id]{optional_tag_id_pair.value()}; + auto const optional_start_reg_id{get_reg_id_from_tag_id(start_tag_id)}; + auto const optional_end_reg_id{get_reg_id_from_tag_id(end_tag_id)}; + if (optional_start_reg_id && optional_end_reg_id) { + return {optional_start_reg_id.value(), optional_end_reg_id.value()}; + } + } + return std::nullopt; }tests/CMakeLists.txt (1)
12-14
: Good architectural improvement!The replacement of
TaggedTransition.hpp
withSpontaneousTransition.hpp
andTagOperation.hpp
suggests a better separation of concerns, aligning with previous feedback about tagged transitions not sharing enough functionality.CMakeLists.txt (1)
81-81
: LGTM! The build system changes reflect a good architectural improvement.The replacement of
TaggedTransition.hpp
withSpontaneousTransition.hpp
andTagOperation.hpp
suggests a better separation of concerns, which should make the codebase more maintainable.This change aligns well with the Single Responsibility Principle by separating transition logic from tag operations.
Also applies to: 83-83
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
CMakeLists.txt
(1 hunks)src/log_surgeon/Lexer.hpp
(2 hunks)src/log_surgeon/Lexer.tpp
(8 hunks)src/log_surgeon/LexicalRule.hpp
(2 hunks)src/log_surgeon/SchemaParser.cpp
(1 hunks)src/log_surgeon/finite_automata/Dfa.hpp
(1 hunks)src/log_surgeon/finite_automata/DfaState.hpp
(2 hunks)src/log_surgeon/finite_automata/DfaStatePair.hpp
(1 hunks)src/log_surgeon/finite_automata/Nfa.hpp
(7 hunks)src/log_surgeon/finite_automata/NfaState.hpp
(6 hunks)src/log_surgeon/finite_automata/RegexAST.hpp
(4 hunks)src/log_surgeon/finite_automata/SpontaneousTransition.hpp
(1 hunks)src/log_surgeon/finite_automata/TagOperation.hpp
(1 hunks)tests/CMakeLists.txt
(1 hunks)tests/test-nfa.cpp
(2 hunks)tests/test-register-handler.cpp
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==
**/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/Dfa.hpp
src/log_surgeon/finite_automata/DfaStatePair.hpp
src/log_surgeon/finite_automata/DfaState.hpp
tests/test-register-handler.cpp
tests/test-nfa.cpp
src/log_surgeon/finite_automata/SpontaneousTransition.hpp
src/log_surgeon/Lexer.hpp
src/log_surgeon/LexicalRule.hpp
src/log_surgeon/finite_automata/TagOperation.hpp
src/log_surgeon/finite_automata/RegexAST.hpp
src/log_surgeon/finite_automata/NfaState.hpp
src/log_surgeon/SchemaParser.cpp
src/log_surgeon/finite_automata/Nfa.hpp
**/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/Dfa.hpp
src/log_surgeon/finite_automata/DfaStatePair.hpp
src/log_surgeon/finite_automata/DfaState.hpp
tests/test-register-handler.cpp
tests/test-nfa.cpp
src/log_surgeon/finite_automata/SpontaneousTransition.hpp
src/log_surgeon/Lexer.hpp
src/log_surgeon/LexicalRule.hpp
src/log_surgeon/finite_automata/TagOperation.hpp
src/log_surgeon/finite_automata/RegexAST.hpp
src/log_surgeon/finite_automata/NfaState.hpp
src/log_surgeon/SchemaParser.cpp
src/log_surgeon/finite_automata/Nfa.hpp
🧠 Learnings (5)
tests/CMakeLists.txt (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/TaggedTransition.hpp:16-37
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In `src/log_surgeon/finite_automata/TaggedTransition.hpp`, the classes `PositiveTaggedTransition` and `NegativeTaggedTransition` currently do not share enough functionality to justify refactoring into a common base class.
CMakeLists.txt (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/TaggedTransition.hpp:16-37
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In `src/log_surgeon/finite_automata/TaggedTransition.hpp`, the classes `PositiveTaggedTransition` and `NegativeTaggedTransition` currently do not share enough functionality to justify refactoring into a common base class.
src/log_surgeon/Lexer.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:37-90
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In this codebase, prefer code clarity over efficiency optimizations unless efficiency is a critical concern.
src/log_surgeon/finite_automata/RegexAST.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexAST.hpp:700-700
Timestamp: 2024-11-13T22:38:19.472Z
Learning: In `RegexASTCapture`, `m_tag` must always be non-null.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: build (ubuntu-22.04, Release)
- GitHub Check: build (ubuntu-22.04, Debug)
- GitHub Check: build (macos-latest, Release)
🔇 Additional comments (48)
src/log_surgeon/finite_automata/NfaState.hpp (12)
4-24
: Header inclusions updated.
The new headers support spontaneous transitions and tagging logic. There are no obvious issues with ordering or redundant includes.
40-44
: Constructor setsm_accepting
.
This constructor explicitly marks the state as accepting. Ensure that any usage aligns with your design intent (i.e., that all states created with this constructor should indeed be accepting).
46-52
: Chained constructor usage.
Callingadd_spontaneous_transition
within the constructor is clear and reduces duplication. This matches the new spontaneous transition model well.
54-65
: Validate pointer lifetime.
add_spontaneous_transition
storesdest_state
in a vector. Confirm that the lifetime ofdest_state
is at least as long as the lifetime of thisNfaState
to avoid dangling references.
98-101
: Providing read-only access to spontaneous transitions.
Returning a const reference is appropriate if the caller only needs inspection. This design is consistent with the rest of the interface.
103-105
: Consistent style for byte transitions.
The getter mirrors the spontaneous transitions approach and returns a reference for outside inspection. This appears coherent with the new code structure.
107-107
: Getter for tree transitions.
Exposingm_tree_transitions
via a const reference is consistent with the pattern used for other transitions.
112-112
: New memberm_spontaneous_transitions
.
This vector neatly encapsulates the transitions. No concerns regarding the container choice.
178-179
: Epsilon closure with spontaneous transitions.
This replaces older tagged or epsilon transitions effectively. The logic of pushingget_dest_state()
onto the stack is straightforward and consistent.
188-190
: Conditional acceptance string.
Buildingaccepting_tag_string
ifm_accepting
is true is a convenient approach, ensuring minimal overhead when the state is not accepting.
201-208
: Serializing spontaneous transitions.
All transitions are collected inserialized_spontaneous_transitions
. Returningstd::nullopt
upon failure is consistent with the rest of the design.
211-215
: Overall serial format.
Including both byte transitions and spontaneous transitions in the same format string is clear and conforms to the new approach.src/log_surgeon/finite_automata/Nfa.hpp (12)
4-21
: Refreshed includes and doc clarifications.
The updated includes and doc lines align with the new spontaneous transitions model and usage of optional return types.
43-44
: New doc block fornew_state
.
The explanation that the returned state has no spontaneous transitions is helpful.
47-52
:new_accepting_state
creation
Creating an accepting state with a dedicated method is more explicit, improving readability. Confirm that all references to your older approach are removed.
54-62
: Handling negative captures.
The doc string describing how tags are negated is consistent, andnew_state_from_negative_captures
helps clarify usage.
65-76
: Positive capture method documentation.
Renamed to reflect “start and end states from positive capture.” The new name is clearer for future maintainers.
85-88
: Serialization withstd::optional<std::string>
.
Returning an optional indicates possible failure, which is a good practice. Double-check that all call sites handle the empty case gracefully.
96-96
: Retrievingm_root
.
The direct getter is concise. No issues found so far.
148-152
: Implementation ofnew_accepting_state
.
The approach of constructing theTypedNfaState
withmatching_variable_id
insidem_states
is consistent and well contained.
155-170
:new_state_from_negative_captures
implementation details.
Moves all relevant tag IDs into the new state. This logic looks correct for negating multiple captures.
173-185
:new_start_and_end_states_from_positive_capture
usage.
Attaching transitions for start and end tags in a single step is a clean approach, avoiding confusion from older tagged transitions.
188-216
: BFS traversal including spontaneous transitions.
Expanding BFS to queue spontaneous transitions ensures no states remain inaccessible. This approach closely mirrors the old epsilon concept.
221-238
: Optional-basedserialize
.
Aborting on the first serialization failure is an appropriate strategy. The rest of the method remains straightforward.src/log_surgeon/finite_automata/RegexAST.hpp (3)
24-24
: Additional include forTagOperation.hpp
.
All references to tag operations appear consolidated, which aids clarity in the capturing logic.
117-125
: Refined negative captures approach.
The doc clarifies that negative captures generate a spontaneous transition to apply negation. This consistent terminology aligns well with the updated code.
911-942
: Capture group NFA documentation.
The updated notes on spontaneous transitions clearly illustrate the new approach, from setting start tags to negating alternate tags, then ending with capturing final tags. This is an excellent update for maintainability.src/log_surgeon/LexicalRule.hpp (2)
4-5
: LGTM!The addition of the
<vector>
include is necessary as it's used in theget_captures()
method.
45-48
: LGTM!The refactoring improves code clarity by consolidating state creation and variable ID assignment into a single method call.
src/log_surgeon/finite_automata/TagOperation.hpp (2)
13-17
: LGTM!The enum class is well-defined with clear, descriptive values.
25-31
: LGTM!Efficient implementation of comparison operators using std::tie for member-wise comparison.
src/log_surgeon/finite_automata/SpontaneousTransition.hpp (2)
26-30
: LGTM!Well-designed constructors with clear parameter handling and move semantics.
53-67
: LGTM!The implementation is robust with:
- Proper error handling using std::optional
- Efficient use of C++20 ranges for transformation
- Clear string formatting using fmt
src/log_surgeon/finite_automata/DfaState.hpp (2)
49-49
: LGTM!The method rename improves clarity and the const return type enhances const correctness.
60-75
: LGTM!The implementation:
- Preserves the existing logic while improving const correctness
- Maintains efficient handling of both Byte and Utf8 state types
src/log_surgeon/finite_automata/DfaStatePair.hpp (2)
72-73
: LGTM! Method renaming improves clarity.The renaming from
next
toget_dest_state
makes the method's purpose more explicit and self-documenting.
74-74
: LGTM! Condition checks follow coding guidelines.The conditions have been updated to follow the coding guidelines:
- Using
nullptr != x
pattern for pointer checks- Using
false == x
pattern for boolean expressionsAlso applies to: 76-76
tests/test-nfa.cpp (1)
65-67
: LGTM! Improved error handling with optional type.The change to use
std::optional
for the serialization result improves error handling by explicitly checking for success before using the value.tests/test-register-handler.cpp (1)
38-38
: LGTM! Added const qualifier for immutability.Adding the
const
qualifier toempty_handler
better expresses the intent that this test variable should not be modified.src/log_surgeon/finite_automata/Dfa.hpp (2)
8-8
: LGTM! Added required include for std::stack.The addition of
<stack>
header is necessary for usingstd::stack
in the implementation.
62-62
: LGTM! Condition checks follow coding guidelines.The conditions have been updated to follow the coding guidelines by using the
false == x
pattern for boolean expressions.Also applies to: 117-117
src/log_surgeon/Lexer.hpp (2)
6-17
: LGTM! The new header inclusions are appropriate.The addition of
<map>
andNfaState.hpp
headers supports the new functionality while maintaining proper header organization.
132-138
: Well-implemented optional return with safe map access!The method properly checks for key existence before access and returns an optional value, following best practices.
src/log_surgeon/Lexer.tpp (4)
87-88
: LGTM! The state transition renaming improves clarity.The change from
next
toget_dest_state
better describes the method's purpose.
218-219
: Good consistency in state transition handling!The explicit type declaration and consistent method naming improve code clarity.
280-283
: Well-implemented modern C++ practices!The use of
contains
for set membership and consistent state transition naming improve code quality.
340-340
: LGTM! Consistent method naming.The change maintains consistency with the new state transition naming convention.
src/log_surgeon/SchemaParser.cpp (1)
3-23
:❓ Verification inconclusive
Verify the impact of replacing Lexer with Reader.
The switch from
Lexer.hpp
toReader.hpp
suggests a significant change in input handling. Please ensure all lexer functionality previously used is available through the Reader interface.
🏁 Script executed:
#!/bin/bash # Search for any remaining Lexer usage that might need updating rg -l "Lexer" --type cpp --type hppLength of output: 63
Action Required: Re-verify the Lexer Replacement Impact
The recent switch from
Lexer.hpp
toReader.hpp
appears to be a significant refactoring of input handling. Our initial automated search for the term “Lexer” ran into issues with header file type recognition. Please re-run your search using the following command to accurately target both header and source files:#!/bin/bash # Re-run the search for "Lexer" in all .hpp and .cpp files rg -l Lexer -g '*.hpp' -g '*.cpp'
- Confirm that there are no lingering references to
Lexer
in the codebase.- Verify that all functionality previously provided by the lexer is now properly supported by the
Reader
interface.
[[nodiscard]] auto serialize() const -> std::string { | ||
switch (m_type) { | ||
case TagOperationType::Set: | ||
return fmt::format("{}{}", m_tag_id, "p"); | ||
case TagOperationType::Negate: | ||
return fmt::format("{}{}", m_tag_id, "n"); | ||
case TagOperationType::None: | ||
return "none"; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a default case to the switch statement.
The switch statement should handle all possible enum values, including future additions.
Apply this diff:
case TagOperationType::None:
return "none";
+ default:
+ return "unknown";
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
[[nodiscard]] auto serialize() const -> std::string { | |
switch (m_type) { | |
case TagOperationType::Set: | |
return fmt::format("{}{}", m_tag_id, "p"); | |
case TagOperationType::Negate: | |
return fmt::format("{}{}", m_tag_id, "n"); | |
case TagOperationType::None: | |
return "none"; | |
} | |
} | |
[[nodiscard]] auto serialize() const -> std::string { | |
switch (m_type) { | |
case TagOperationType::Set: | |
return fmt::format("{}{}", m_tag_id, "p"); | |
case TagOperationType::Negate: | |
return fmt::format("{}{}", m_tag_id, "n"); | |
case TagOperationType::None: | |
return "none"; | |
default: | |
return "unknown"; | |
} | |
} |
References
Description
next
andnext()
todest_state
andget_dest_state()
, respectively.Validation performed
Summary by CodeRabbit
New Features
Refactor
Tests