Skip broken HTML preview test case with libxml >= 2.14 #18413

Open
wants to merge 1 commit into
base: develop
from

Conversation

hardfalcon
@hardfalcon hardfalcon commented May 8, 2025

The test_no_tree test case is known to fail with libxml >= 2.14, so skip it for the time being when libxml >= 2.14 is used.

Reported-by: Ivan Shapovalov [email protected]
Ref: https://gitlab.gnome.org/GNOME/libxml2/-/issues/908
Tested-by: Pascal Ernster [email protected]
Closes: #18406

@hardfalcon hardfalcon requested a review from a team as a code owner May 8, 2025 10:37
@CLAassistant

CLAassistant commented May 8, 2025

CLA assistant check
All committers have signed the CLA.

@hardfalcon hardfalcon force-pushed the pr/fix-tests-with-libxml-2.14 branch 3 times, most recently from 12ba8b2 to 5897fd6 Compare May 13, 2025 08:13
@hardfalcon hardfalcon force-pushed the pr/fix-tests-with-libxml-2.14 branch from 5897fd6 to 4926c90 Compare May 24, 2025 13:39
@@ -324,6 +326,9 @@ def test_empty(self) -> None:

    def test_no_tree(self) -> None:
        """A valid body with no tree in it."""
        if etree.LIBXML_VERSION >= (2, 14):
Contributor

Which versions specifically is this broken in, and in which is it fixed?

These tests seem to pass fine for me locally with libxml2 2.14.3-1 on Manjaro Linux (Arch-based).

$ SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.media.test_html_preview
The "poetry.dev-dependencies" section is deprecated and will be removed in a future version. Use "poetry.group.dev.dependencies" instead.
tests.media.test_html_preview
  MediaEncodingTestCase
    test_content_type ...                                                  [OK]
    test_duplicates ...                                                    [OK]
    test_fallback ...                                                      [OK]
    test_meta_charset ...                                                  [OK]
    test_meta_charset_underscores ...                                      [OK]
    test_meta_xml_encoding ...                                             [OK]
    test_unknown_invalid ...                                               [OK]
    test_xml_encoding ...                                                  [OK]
  OpenGraphFromHtmlTestCase
    test_comment ...                                                       [OK]
    test_comment2 ...                                                      [OK]
    test_empty ...                                                         [OK]
    test_empty_description ...                                             [OK]
    test_h1_as_title ...                                                   [OK]
    test_invalid_encoding ...                                              [OK]
    test_invalid_encoding2 ...                                             [OK]
    test_missing_title ...                                                 [OK]
    test_missing_title_and_broken_h1 ...                                   [OK]
    test_nested_nodes ...                                                  [OK]
    test_no_tree ...                                                       [OK]
    test_script ...                                                        [OK]
    test_simple ...                                                        [OK]
    test_twitter_tag ...                                                   [OK]
    test_windows_1252 ...                                                  [OK]
    test_xml ...                                                           [OK]
  SummarizeTestCase
    test_long_summarize ...                                                [OK]
    test_short_summarize ...                                               [OK]
    test_small_then_large_summarize ...                                    [OK]

-------------------------------------------------------------------------------
Ran 27 tests in 0.025s

PASSED (successes=27)

I don't see the fix commit mentioned in the issue in the diff for the latest tag (https://gitlab.gnome.org/GNOME/libxml2/-/compare/v2.14.2...v2.14.3?from_project_id=1665), so I'm confused about why it's working for me.

We may want to adjust this condition depending on the answer here.
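
A minimal sketch of how the condition could be narrowed once the affected range is known. The helper name `is_affected` and its `first_fixed` parameter are hypothetical and not part of this PR. No fixed version has been established in this thread, so the upper bound is left as a parameter rather than a concrete number.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]


def is_affected(
    version: Version,
    first_broken: Version,
    first_fixed: Optional[Version] = None,
) -> bool:
    """Return True if `version` lies in the range [first_broken, first_fixed).

    Versions are compared as tuples of integers, so (2, 14, 3) >= (2, 14)
    holds. If `first_fixed` is None, every version from `first_broken`
    onwards counts as affected (the behaviour of the current condition).
    """
    if version < first_broken:
        return False
    return first_fixed is None or version < first_fixed
```

In the test, this could be called with `etree.LIBXML_VERSION`, which lxml exposes as a tuple of integers, for example `is_affected(etree.LIBXML_VERSION, (2, 14))`.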

Author

I've just re-tested, and building synapse 1.130.0 with libxml 2.14.3 and lxml 5.4.0 still fails for me. I'm using the PKGBUILD of the matrix-synapse 1.130.0-1 package from Arch Linux, but with rm-faling-test.patch removed, and I'm running both the build and the tests in a clean chroot.

Contributor

Given we're using the same versions, there seems to be another factor at play here and I don't think we've gotten to the root issue yet.

Perhaps it would be good to see the exact test failure output you're seeing. Although, based on the issue you linked, I could see how the test might fail.

If I look closer at the source and add some debug logs to the decode_body(...) function, I can see that it goes all the way to the end of the function and uses utf-8 encoding.

def test_no_tree(self) -> None:
    """A valid body with no tree in it."""
    html = b"\x00"
    tree = decode_body(html, "http://example.com/test.html")
    self.assertIsNone(tree)


I'm testing with Synapse from source. My previous reply was with [email protected] and testing again now with [email protected] since it just got updated.

# `pamac` is the package manager on my system, Manjaro Linux (Arch-based).
# Note: The `python-lxml` listed here isn't relevant to my local Synapse from source
# which will use a Poetry virtual environment
$ pamac search libxml2 --installed
python-lxml  5.4.0-1                                           extra
    Python3 binding for the libxml2 and libxslt libraries
perl-alien-libxml2  0.20-1                                     extra
    Install the C libxml2 library on your system
lib32-libxml2  2.14.3-1                                     multilib
    XML C parser and toolkit (32-bit)
libxml2  2.14.3-1                                               core
    XML C parser and toolkit

All the tests pass:

$ SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.media.test_html_preview
The "poetry.dev-dependencies" section is deprecated and will be removed in a future version. Use "poetry.group.dev.dependencies" instead.
tests.media.test_html_preview
  MediaEncodingTestCase
    test_content_type ...                                                  [OK]
    test_duplicates ...                                                    [OK]
    test_fallback ...                                                      [OK]
    test_meta_charset ...                                                  [OK]
    test_meta_charset_underscores ...                                      [OK]
    test_meta_xml_encoding ...                                             [OK]
    test_unknown_invalid ...                                               [OK]
    test_xml_encoding ...                                                  [OK]
  OpenGraphFromHtmlTestCase
    test_comment ...                                                       [OK]
    test_comment2 ...                                                      [OK]
    test_empty ...                                                         [OK]
    test_empty_description ...                                             [OK]
    test_h1_as_title ...                                                   [OK]
    test_invalid_encoding ...                                              [OK]
    test_invalid_encoding2 ...                                             [OK]
    test_missing_title ...                                                 [OK]
    test_missing_title_and_broken_h1 ...                                   [OK]
    test_nested_nodes ...                                                  [OK]
    test_no_tree ...                                                       [OK]
    test_script ...                                                        [OK]
    test_simple ...                                                        [OK]
    test_twitter_tag ...                                                   [OK]
    test_windows_1252 ...                                                  [OK]
    test_xml ...                                                           [OK]
  SummarizeTestCase
    test_long_summarize ...                                                [OK]
    test_short_summarize ...                                               [OK]
    test_small_then_large_summarize ...                                    [OK]

-------------------------------------------------------------------------------
Ran 27 tests in 0.015s

PASSED (successes=27)
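
Since the system `python-lxml` package is noted above as not being what the Poetry environment uses, one way to compare the two setups is to ask lxml itself which libxml2 it runs against. This is a sketch, not something from the thread: the function name `lxml_libxml_versions` is hypothetical, and it assumes only the `LIBXML_VERSION` and `LIBXML_COMPILED_VERSION` constants that `lxml.etree` provides. An lxml build inside a virtual environment may ship its own libxml2 instead of using the system library, in which case the version reported here can differ from what the package manager lists.

```python
from typing import Dict, Optional, Tuple


def lxml_libxml_versions() -> Optional[Dict[str, Tuple[int, ...]]]:
    """Report the libxml2 versions seen by the lxml in this interpreter.

    Returns None if lxml is not importable in the current environment.
    """
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        # libxml2 version in use at runtime.
        "runtime": tuple(etree.LIBXML_VERSION),
        # libxml2 version lxml was compiled against.
        "compiled": tuple(etree.LIBXML_COMPILED_VERSION),
    }


if __name__ == "__main__":
    print(lxml_libxml_versions())
```

Running this both under `poetry run python` and in the clean chroot build would show whether the two environments are really exercising the same libxml2.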

@@ -0,0 +1 @@
Skip broken HTML preview test case with libxml >= 2.14.
Contributor

The issue (#18406) mentions that this isn't reproducible with [email protected]. Could an alternative fix be to simply update that dependency instead? -> #18480

The `test_no_tree` test case is known to fail with libxml >= 2.14, so
skip it for the time being when libxml >= 2.14 is used.

Reported-by: Ivan Shapovalov <[email protected]>
Ref: https://gitlab.gnome.org/GNOME/libxml2/-/issues/908
Tested-by: Pascal Ernster <[email protected]>
Closes: element-hq#18406
Signed-off-by: Pascal Ernster <[email protected]>
@hardfalcon hardfalcon force-pushed the pr/fix-tests-with-libxml-2.14 branch from 4926c90 to 8aad329 Compare May 30, 2025 05:30
Successfully merging this pull request may close these issues.

HTML preview unit tests fail with libxml2 2.14.2
3 participants