Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed missing CRLF at end of a chunk in malformed chunked encoding me… #10359

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGES/10355.bugfix.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
Fixed missing CRLF in a malformed chunked encoded message body -- by
:user:`xdegaye`.

A missing CRLF in a chunked encoded message now raises an exception instead of
silently discarding the following part of the message.
13 changes: 13 additions & 0 deletions aiohttp/http_parser.py
Original file line number Diff line number Diff line change
Expand Up @@ -880,6 +880,19 @@ def feed_data(
chunk = chunk[len(SEP) :]
self._chunk = ChunkState.PARSE_CHUNKED_SIZE
else:
length = len(chunk)
if (
self._lax
and length
or (
not self._lax
and (length and chunk[:1] != SEP[:1] or length > 1)
)
):
exc = TransferEncodingError("Missing CRLF at chunk end")
set_exception(self.payload, exc)
raise exc
# Get the CRLF or the missing LF in the next chunk.
self._chunk_tail = chunk
return False, b""

Expand Down
63 changes: 63 additions & 0 deletions tests/test_http_parser.py
Original file line number Diff line number Diff line change
Expand Up @@ -1761,6 +1761,69 @@
assert out.is_eof()
assert b"asdf" == b"".join(out._buffer)

async def test_parse_chunked_payload_split_CRLF(
Copy link
Member

@Dreamsorcerer Dreamsorcerer Jan 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get a test that uses the response: HttpResponseParser fixture, rather than using HttpPayloadParser directly?
I think we'd want to ensure that both parsers work the same way, this only tests the Python parser (which is not used by most users).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is this second parser ?
Please bear with me, I have only been looking at aiohttp source code for few days.

The tests check the changes made in HttpPayloadParser, not in the second parser.

Copy link
Member

@Dreamsorcerer Dreamsorcerer Jan 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if you use the fixture as mentioned above, the test will be run against both parsers (i.e. run twice).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did not answer my question. Please, what is the name of the second parser ?

Why should changes that affect HttpPayloadParser be tested against the second parser ?

What if the tests against the second parser fail ?
In that case:

  • do you consider that these failures must be fixed within this PR ?
  • would'nt it be better to update the title of the PR and the issue to restrict their scope to HttpPayloadParser ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a Python parser and a C parser (llhttp). Both should behave the same way, hence why we have a fixture to run the same tests against both parsers.

self, protocol: BaseProtocol
) -> None:
loop = asyncio.get_running_loop()
for lax, SEP in ((False, b"\r\n"), (True, b"\n")):
out = aiohttp.StreamReader(protocol, 2**16, loop=loop)
p = HttpPayloadParser(out, chunked=True, lax=lax)
p.feed_data(b"4\r\nasdf", SEP=SEP) # type: ignore[arg-type]
p.feed_data(b"\r\n0\r\n\r\n", SEP=SEP) # type: ignore[arg-type]

assert out.is_eof()
assert b"asdf" == b"".join(out._buffer)

async def test_parse_chunked_payload_split_CRLF2(
self, protocol: BaseProtocol
) -> None:
loop = asyncio.get_running_loop()
for lax, SEP in ((False, b"\r\n"), (True, b"\n")):
out = aiohttp.StreamReader(protocol, 2**16, loop=loop)
p = HttpPayloadParser(out, chunked=True, lax=lax)
p.feed_data(b"4\r\nasdf\r", SEP=SEP) # type: ignore[arg-type]
p.feed_data(b"\n0\r\n\r\n", SEP=SEP) # type: ignore[arg-type]

assert out.is_eof()
assert b"asdf" == b"".join(out._buffer)

async def test_parse_chunked_payload_split_CRLF3(
self, protocol: BaseProtocol
) -> None:
loop = asyncio.get_running_loop()
for lax, SEP in ((False, b"\r\n"), (True, b"\n")):
out = aiohttp.StreamReader(protocol, 2**16, loop=loop)
p = HttpPayloadParser(out, chunked=True, lax=lax)
with pytest.raises(http_exceptions.TransferEncodingError):
p.feed_data(b"4\r\nasdf", SEP=SEP) # type: ignore[arg-type]
p.feed_data(b"X\n0\r\n\r\n", SEP=SEP) # type: ignore[arg-type]
exc = out.exception()
assert isinstance(exc, http_exceptions.TransferEncodingError)
assert "Missing CRLF" in exc.args[0]

async def test_parse_chunked_payload_split_CRLF4(
self, protocol: BaseProtocol
) -> None:
out = aiohttp.StreamReader(protocol, 2**16, loop=asyncio.get_running_loop())
p = HttpPayloadParser(out, chunked=True)
with pytest.raises(http_exceptions.TransferEncodingError):
p.feed_data(b"4\r\nasdfX")
p.feed_data(b"\n0\r\n\r\n")

Check warning on line 1811 in tests/test_http_parser.py

View check run for this annotation

Codecov / codecov/patch

tests/test_http_parser.py#L1811

Added line #L1811 was not covered by tests
exc = out.exception()
assert isinstance(exc, http_exceptions.TransferEncodingError)
assert "Missing CRLF" in exc.args[0]

async def test_parse_chunked_payload_missing_CRLF(
self, protocol: BaseProtocol
) -> None:
out = aiohttp.StreamReader(protocol, 2**16, loop=asyncio.get_running_loop())
p = HttpPayloadParser(out, chunked=True)
with pytest.raises(http_exceptions.TransferEncodingError):
p.feed_data(b"4\r\nasdf\rX0\r\n\r\n")
exc = out.exception()
assert isinstance(exc, http_exceptions.TransferEncodingError)
assert "Missing CRLF" in exc.args[0]

async def test_http_payload_parser_length(self, protocol: BaseProtocol) -> None:
out = aiohttp.StreamReader(protocol, 2**16, loop=asyncio.get_running_loop())
p = HttpPayloadParser(out, length=2)
Expand Down
Loading