fix: check that the previous xref is not the just processed xref #727

Merged
merged 4 commits into from
Aug 16, 2024

Conversation

Contributor

@tkegan tkegan commented Jul 25, 2024

Type of pull request

  • Bug fix (involves code and configuration changes)

About

A customer uploaded several files to my employer's product that caused one of our backend processes, which uses smalot/pdfparser (thanks), to fail with a fatal out-of-memory error. I tried the documented configuration, but it did not help. Based on the metadata, this PDF is a scan made on a copier (PDF version 1.7). I traced the problem and found that memory was exhausted in an infinite loop. It seems the copier adds Prev 0 in the trailing portion of an xref. The proposed change checks for this specific case, where the previous xref is the one just processed, and ignores it. This works for my employer's product.
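To illustrate the failure mode described above, here is a minimal standalone sketch in Python. It is not pdfparser's code: the `walk_xref_chain` function, the `sections` mapping, and the offsets are hypothetical, and the toy model simply assumes that a trailer's Prev ends up pointing back at the section that was just read.

```python
def walk_xref_chain(sections, start):
    """Return the offsets visited while following Prev pointers from `start`.

    `sections` maps an xref offset to its trailer dictionary (toy model).
    """
    visited = []
    offset = start
    while offset is not None:
        visited.append(offset)
        prev = sections[offset].get("Prev")
        # Guard: a Prev that points at the section just processed would
        # loop forever, so treat it as "no previous section".
        if prev == offset:
            break
        offset = prev
    return visited


# Hypothetical data: the section at offset 300 names itself as its predecessor.
sections = {
    900: {"Prev": 300},
    300: {"Prev": 300},
}
print(walk_xref_chain(sections, 900))  # [900, 300]
```

Without the guard, the loop would keep appending 300 until memory runs out. Note that this check only catches a direct self-reference; a longer cycle would need something like a set of already-visited offsets, which is outside what this PR describes.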

Checklist for code / configuration changes

In case you changed the code/configuration, please read each of the following checkboxes as they contain valuable information:

  • Please add at least one test case (unit test, system test, ...) to demonstrate that the change is working. If existing code was changed, make sure your tests cover these code parts as well.
    By the way, you don't have to provide a full-fledged PDF file to demonstrate a fix. Sometimes a unit test is sufficient;
    please have a look at FontTest for example code.
    Code changes without any tests are likely to be rejected. If you don't know how to write tests, no problem: tell us upfront and we may add them ourselves or discuss other ways.
  • Please run PHP-CS-Fixer before committing, to conform with our coding styles. See https://github.com/smalot/pdfparser/blob/master/.php-cs-fixer.php for more information about our coding styles.

@tkegan tkegan marked this pull request as ready for review July 25, 2024 18:12
@k00ni k00ni added the fix label Jul 26, 2024
@k00ni k00ni changed the title fix: check that that previous xref is not the just processed xref fix: check that the previous xref is not the just processed xref Jul 26, 2024
Collaborator

@k00ni k00ni left a comment

Thank you for the PR. This really sounds like an edge case. Have you experienced anything like that before, @j0k3r @GreyWyvern?

At first glance the changes look good. I ran the code locally and it fails without the fix.

Review thread on tests/PHPUnit/Integration/RawData/RawDataParserTest.php (outdated, resolved)
@k00ni k00ni merged commit ac8e667 into smalot:master Aug 16, 2024
29 checks passed
Collaborator

k00ni commented Aug 16, 2024

Thank you @tkegan
