-
Notifications
You must be signed in to change notification settings - Fork 29
Add http-path protocol in reference with go-multiaddr #94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
multiaddr/codecs/http_path.py
Outdated
IS_PATH = True | ||
SIZE = -1 # LengthPrefixedVarSize | ||
|
||
|
||
class Codec(CodecBase): | ||
SIZE = SIZE | ||
IS_PATH = IS_PATH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Analysis of http-path Codec ConfigurationQuestion Summary@lla-dane asked for confirmation on the correctness of the http-path codec configuration: IS_PATH = True
SIZE = -1 # LengthPrefixedVarSize
class Codec(CodecBase):
SIZE = SIZE
IS_PATH = IS_PATH And requested guidance on writing tests for multiaddr + http-path. Analysis ResultsCodec Configuration: ✅ CORRECTThe codec configuration in
Go Implementation ReferenceFrom protoHTTPPath = Protocol{
Name: "http-path",
Code: P_HTTP_PATH,
VCode: CodeToVarint(P_HTTP_PATH),
Size: LengthPrefixedVarSize, // This is -1
Transcoder: TranscoderHTTPPath,
} The Go transcoder implementation ( var TranscoderHTTPPath = NewTranscoderFromFunctions(httpPathStB, httpPathBtS, validateHTTPPath)
func httpPathStB(s string) ([]byte, error) {
unescaped, err := url.QueryUnescape(s)
if err != nil {
return nil, err
}
if len(unescaped) == 0 {
return nil, fmt.Errorf("empty http path is not allowed")
}
return []byte(unescaped), err
}
func httpPathBtS(b []byte) (string, error) {
if len(b) == 0 {
return "", fmt.Errorf("empty http path is not allowed")
}
return url.QueryEscape(string(b)), nil
}
func validateHTTPPath(b []byte) error {
if len(b) == 0 {
return fmt.Errorf("empty http path is not allowed")
}
return nil
} Issue Found: Protocol Registration❌ PROBLEM: The protocol is incorrectly registered in Protocol(P_HTTP_PATH, "http-path", None), # ❌ Should be "http_path" ✅ SOLUTION: Should be: Protocol(P_HTTP_PATH, "http-path", "http_path"), This means the protocol currently doesn't use the http_path codec, which explains why the codec configuration might seem unused. Testing Guidance for multiaddr + http-path1. Basic Roundtrip Testsdef test_http_path_multiaddr_roundtrip():
"""Test basic http-path in multiaddr string roundtrip"""
test_cases = [
"/http-path/foo",
"/http-path/foo/bar",
"/http-path/api/v1/users",
]
for addr_str in test_cases:
m = Multiaddr(addr_str)
assert str(m) == addr_str
# Verify protocol value extraction
path_value = m.value_for_protocol(P_HTTP_PATH)
expected_path = addr_str.replace("/http-path", "")
assert path_value == expected_path 2. URL Encoding Testsdef test_http_path_url_encoding():
"""Test special characters and URL encoding behavior"""
test_cases = [
("/foo bar", "/foo%20bar"),
("/path/with/special!@#", "/path/with/special%21%40%23"),
("/こんにちは", "/%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF"),
("/tmp/bar", "/tmp%2Fbar"), # Forward slash encoding
]
for input_path, expected_encoded in test_cases:
addr_str = f"/http-path{input_path}"
m = Multiaddr(addr_str)
# The string representation should show URL-encoded path
assert str(m) == f"/http-path{expected_encoded}" 3. Complex Multiaddr Testsdef test_http_path_in_complex_multiaddr():
"""Test http-path as part of larger multiaddr chains"""
test_cases = [
"/ip4/127.0.0.1/tcp/443/tls/http/http-path/api/v1",
"/ip4/127.0.0.1/tcp/80/http/http-path/static/css",
"/dns/example.com/tcp/443/tls/http/http-path/docs",
]
for addr_str in test_cases:
m = Multiaddr(addr_str)
assert str(m) == addr_str
# Verify we can extract the http-path value
path_value = m.value_for_protocol(P_HTTP_PATH)
assert path_value.startswith("/") 4. Error Handling Testsdef test_http_path_error_cases():
"""Test error handling for invalid http-path values"""
# Empty path should raise error
with pytest.raises(StringParseError):
Multiaddr("/http-path/")
# Missing path value should raise error
with pytest.raises(StringParseError):
Multiaddr("/http-path")
# Invalid URL encoding should raise error
with pytest.raises(StringParseError):
Multiaddr("/http-path/invalid%zz") 5. Protocol Value Extraction Testsdef test_http_path_value_extraction():
"""Test extracting http-path values from multiaddr"""
test_cases = [
("/http-path/foo", "foo"),
("/http-path/foo/bar", "foo/bar"),
("/http-path/api/v1/users", "api/v1/users"),
("/ip4/127.0.0.1/tcp/80/http/http-path/docs", "docs"),
]
for addr_str, expected_path in test_cases:
m = Multiaddr(addr_str)
path_value = m.value_for_protocol(P_HTTP_PATH)
assert path_value == expected_path 6. Binary Roundtrip Testsdef test_http_path_binary_roundtrip():
"""Test binary encoding/decoding roundtrip"""
test_cases = [
"/http-path/foo",
"/http-path/foo/bar",
"/http-path/api/v1/users",
]
for addr_str in test_cases:
m1 = Multiaddr(addr_str)
binary = m1.to_bytes()
m2 = Multiaddr.from_bytes(binary)
assert str(m1) == str(m2)
assert m1 == m2 7. Edge Cases and Special Charactersdef test_http_path_edge_cases():
"""Test edge cases and special character handling"""
# Test with various special characters
special_paths = [
"/path with spaces",
"/path/with/multiple/slashes",
"/path/with/unicode/测试",
"/path/with/symbols!@#$%^&*()",
]
for path in special_paths:
addr_str = f"/http-path{path}"
m = Multiaddr(addr_str)
# Should handle encoding properly
assert m.value_for_protocol(P_HTTP_PATH) == path Implementation NotesCurrent Test CoverageThe existing tests in
Missing Test CoverageAfter fixing the protocol registration, add tests for:
Recommendations
ConclusionThe codec configuration is correct, but the protocol registration needs to be fixed to properly use the http_path codec. Once fixed, the testing patterns provided above will ensure comprehensive coverage of http-path functionality in multiaddr contexts. |
Thanks @acul71, for the detailed review. |
@acul71 @seetadev: So |
@lla-dane : HI Abhinav. Thank you for sharing this update and the clear explanation around the http-path parsing issue. You’ve articulated the problem very well — I can see how treating all protocols uniformly as /proto/value would break down when handling multi-segment values like /http-path/api/v1/users. Your reasoning to handle http-path in a way similar to unix, by joining the remaining segments into a single value, makes complete sense. It’s a neat and practical approach that aligns with how other implementations solve similar cases. This not only resolves a parsing bug but also improves developer experience, since http-path support is quite relevant for modern web-based use cases built on libp2p. We look forward to reviewing your implementation and future commits as you expand protocol coverage. |
@acul71 @seetadev : There are a few errors happening in the speacial character encoding parts in http-path addresses in two specific tests in the
2.
I dont understand how to we fix this. Most probably we have to change something in Only this case is failing, other than these all cases are working correcly. |
- Update http_path codec to use quote(s, safe=) for consistent URL encoding - Remove redundant URL encoding from transforms.py to prevent double encoding - Update all HTTP path tests to expect URL-encoded values consistently - Fix protocol tests to use same URL encoding approach as codec - Ensure cross-language compatibility with Go multiaddr implementation Fixes test failures in: - test_http_path_url_encoding - test_http_path_edge_cases - test_http_path_bytes_string_roundtrip - test_http_path_special_characters All 251 tests now pass.
HTTP Path URL Encoding FixProblemThe HTTP path codec had inconsistent URL encoding behavior causing test failures:
Root CauseThe issue was in the SolutionUpdated def to_string(self, buf: bytes) -> str:
if len(buf) == 0:
raise ValueError("empty http path is not allowed")
# Return URL-encoded string to match Go implementation
return quote(buf.decode("utf-8"), safe="") Key Changes
Result
The fix ensures HTTP paths are properly URL-encoded throughout the multiaddr system, maintaining compatibility with the broader libp2p ecosystem. |
- Add test_http_path_only_reads_http_path_part: Test that http-path only reads its own part, not subsequent protocols - Add test_http_path_malformed_percent_escape: Test rejection of malformed percent-escapes like %f - Add test_http_path_raw_value_access: Test accessing raw unescaped values (similar to Go's SplitLast/RawValue) - Fix http-path protocol parsing: Remove http-path from special path handling in _from_string - Fix IS_PATH = False for http-path codec (should not consume all remaining parts) - Fix string_to_bytes to handle protocols with SIZE=0 (like p2p-circuit) - Fix codec SIZE attribute check to handle codecs without SIZE attribute These changes ensure the Python implementation matches Go behavior exactly: - http-path only consumes its immediate value, not subsequent protocols - Proper handling of flag protocols (SIZE=0) in string-to-bytes conversion - Complete test coverage matching Go implementation All 254 tests pass, ensuring cross-language compatibility.
@lla-dane Please review the changes |
Tracks multiformats/multiaddr#181
This is a WIP PR, which aims to add the remaining protocols in py-libp2p in reference with go-libp2p as mentioned in the above issue.
Protocols added in this PR:
More protocols to be added in the same in future commits.