addie9800
Collaborator

This PR:

  • Adds TheMexicoNews, the first Mexican publisher
  • Fixes the decoding of HTML-escaped Unicode characters when parsing LinkedDataMapping
  • Fixes an issue in text_content() where text was omitted if it sat in the tail of an excluded tag. To reproduce, check this article: https://mexiconewsdaily.com/news/bad-idea-mexico-pushes-back-trump-universal-steel-tariffs/ Without the change, the sentence "The tariffs are set to take effect on March 12 and, as things stand, will apply to all countries’ steel and aluminum exports to the United States." is not extracted.
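
For readers unfamiliar with the two fixes, here is a minimal sketch of the underlying ideas. It is not the code from this PR: `text_content_sketch`, its parameters, and the sample markup are hypothetical, and it uses the standard library's `xml.etree.ElementTree` and `html.unescape` rather than the project's own parser utilities. The point it illustrates is that an element's tail (the text after its closing tag) belongs to the parent, so it should be kept even when the element itself is excluded.

```python
import html
import xml.etree.ElementTree as ET


def text_content_sketch(element, excluded_tags):
    """Hypothetical helper: collect text under `element`, skipping the
    content of excluded tags but keeping the text that follows them."""
    parts = []

    def walk(node, is_root):
        if node.tag not in excluded_tags:
            # Only descend into nodes that are not excluded.
            if node.text:
                parts.append(node.text)
            for child in node:
                walk(child, False)
        # The tail sits after the node's closing tag, so it is part of the
        # parent's text and is kept even if the node itself is excluded.
        if not is_root and node.tail:
            parts.append(node.tail)

    walk(element, True)
    return "".join(parts)


# Made-up markup: the sentence after </script> is the script's tail.
paragraph = ET.fromstring(
    "<p>Intro. <script>ignored()</script>The tariffs take effect on March 12.</p>"
)
extracted = text_content_sketch(paragraph, excluded_tags={"script"})

# Decoding HTML character references, as one might do for linked-data strings.
decoded = html.unescape("countries&#8217; steel &amp; aluminum")
```

In this sketch, dropping the tail together with the excluded node would lose the sentence after `</script>`, which mirrors the symptom described in the last bullet.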

addie9800 added 2 commits July 9, 2025 23:14
# Conflicts:
#	src/fundus/parser/utility.py
#	src/fundus/publishers/__init__.py
@addie9800 addie9800 requested a review from MaxDall July 9, 2025 21:18
Collaborator

@MaxDall MaxDall left a comment

@addie9800 Thanks a lot for adding the first Mexican publisher and fixing the unescaped HTML 🚀

@MaxDall MaxDall mentioned this pull request Jul 17, 2025
@addie9800 addie9800 merged commit 1f88941 into master Jul 17, 2025
5 checks passed
@addie9800 addie9800 deleted the add-mx-with-language-chagens branch July 17, 2025 20:16
2 participants