Print improvements: don't print unprintable filetypes, improve PDF conversion and mimetype detection #2166

Merged
merged 12 commits into from
Aug 22, 2024

Conversation

rocodes
Contributor

@rocodes rocodes commented Aug 13, 2024

Status

Ready for review

Documentation changes to come as well

Description

Printer-focused UX improvements/support: (#918)

  • Detect file mimetypes with the mimetype tool instead of by file extension before printing
  • Add new status codes to report mimetype detection/conversion/unprintable types back to the user
  • Update print dialog to display a message if the user tries to print an unprintable type

Broadening printable filetype support: (#1725)

  • Use libreoffice headless mode instead of unoconv for file conversion
  • Attempt to convert for printing any file type that is supported by libreoffice, by parsing its .desktop entries on sd-devices. This broadens our printable filetypes from ~10 to ... a lot more than 10 :) ... although some may not print well. (This isn't quite an accurate list; I will add a docs PR).
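
The .desktop parsing described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from this PR: the function names and the `libreoffice*.desktop` glob pattern are hypothetical, and the only thing it relies on is the semicolon-separated `MimeType=` key in the `[Desktop Entry]` section of a .desktop file.

```python
import configparser
from pathlib import Path


def mimetypes_from_desktop_text(text: str) -> set[str]:
    """Return the mimetypes listed under MimeType= in one .desktop file."""
    # interpolation=None: values may contain "%" (e.g. Exec=... %U).
    # strict=False: tolerate duplicate keys rather than raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep keys case-sensitive, as in .desktop files
    parser.read_string(text)
    raw = parser.get("Desktop Entry", "MimeType", fallback="")
    return {item.strip() for item in raw.split(";") if item.strip()}


def supported_mimetypes(desktop_dir: Path, pattern: str = "libreoffice*.desktop") -> set[str]:
    """Union of the mimetypes declared by every matching .desktop file.

    The glob pattern is an assumption for illustration only.
    """
    found: set[str] = set()
    for path in sorted(desktop_dir.glob(pattern)):
        found |= mimetypes_from_desktop_text(path.read_text(encoding="utf-8"))
    return found
```

A caller would pass the directory that holds the entries, for example `supported_mimetypes(Path("/usr/share/applications"))`, and treat an empty result as a discovery error.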

Dependency management:

  • Specify libreoffice as a dependency of the securedrop-export and securedrop-viewer packages in the debian/control file. This is a change for two reasons:
    • Removes the no-install-recommends flag, meaning Java (openjdk-17-jdk at present time) will be installed, which it currently is not. This is intentional; it will allow us to support viewing and converting more filetypes. When updated client packages have been released, we can remove the Salt logic that installs libreoffice: see Remove Salt logic installing libreoffice securedrop-workstation#1162
  • Adjust paxctld.conf to allow openjdk-17 (needed for a dependency of libreoffice, ca-certificates-java).
  • Deprecate unoconv dependency

Fixes #918
Fixes #1725 (although we will still want to look at an overall consistent way of handling file types in the client, and there have been a few suggestions)
Refs (probably fixes, but needs testing) #1731

Test Plan

  • Visual review
  • CI passing
  • Manual testing (below)

Manual testing

  • You will need a supported USB printer.
  • Start from SecureDrop Workstation 1.0.0 rpm or tag and provision SDW. Build packages from the tip of this branch, and install them in the large template. You will need to install at minimum securedrop-client, securedrop-export, securedrop-workstation-viewer, and securedrop-workstation-config.
  • Packages install successfully
  • Shut down the template, then start SDW. Submit the following file types to your instance for testing: a multi-page PDF; a plain .txt file; one or more video/audio files; one or more image files; one file with the wrong extension; and one or more files supported by libreoffice.
    Viewing
  • Files that are currently supported by securedrop-workstation viewer can be viewed in viewer Dispvm (no change, no added filetype support, just regression testing)
    Printing
  • Printing a multi-page PDF is successful
  • Printing a .txt file is successful
  • Printing a non-PDF file is successful; you can choose from any of the filetypes supported by libreoffice ( 😮 ), although definitions of "success" may be stretched for some filetypes.
  • Attempting to print a video or audio file or any other "unprintable" mimetype (see below) results in the printer modal error that says "This file type cannot be printed" (Disable "Print" when file format is not conducive to printing #918). You don't have to try every single type.
Unprintable types: "audio/mp4", "audio/mpeg", "audio/x-vorbis", "audio/x-wav", "video/quicktime", "video/x-theora", "video/mp4", "video/webm", "video/x-msvideo", "video/x-ms-wmv", "application/vnd.djvu", "application/vnd.rar", "application/zip", "application/x-7z-compressed"

Checklist

TK

If these changes modify code paths involving cryptography, the opening of files in VMs or network (via the RPC service) traffic, Qubes testing in the staging environment is required. For fine tuning of the graphical user interface, testing in any environment in Qubes is required. Please check as applicable:

  • I have tested these changes in the appropriate Qubes environment
  • I do not have an appropriate Qubes OS workstation set up (the reviewer will need to test these changes)
  • These changes should not need testing in Qubes

If these changes add or remove files other than client code, the AppArmor profile may need to be updated. Please check as applicable:

  • I have updated the AppArmor profile
  • No update to the AppArmor profile is required for these changes
  • I don't know and would appreciate guidance

If these changes modify the database schema, you should include a database migration. Please check as applicable:

  • I have written a migration and upgraded a test database based on main and confirmed that the migration is self-contained and applies cleanly
  • I have written a migration but have not upgraded a test database based on main and would like the reviewer to do so
  • I need help writing a database migration
  • No database schema changes are needed

@rocodes rocodes force-pushed the 918-dont-print-garbage branch 3 times, most recently from 5562877 to 07bb841 Compare August 14, 2024 15:03
@rocodes rocodes added this to the 0.13.0 milestone Aug 14, 2024
Member

@legoktm legoktm left a comment

I just did a quick skim!

export/securedrop_export/print/service.py Outdated Show resolved Hide resolved
export/securedrop_export/print/service.py Outdated Show resolved Hide resolved
export/securedrop_export/print/service.py Outdated Show resolved Hide resolved
@rocodes rocodes force-pushed the 918-dont-print-garbage branch from 07bb841 to 5890ce4 Compare August 18, 2024 16:49
@rocodes rocodes marked this pull request as ready for review August 18, 2024 17:33
@rocodes rocodes requested a review from a team as a code owner August 18, 2024 17:33
@rocodes
Contributor Author

rocodes commented Aug 18, 2024

I was hoping to have this all finished and green before tomorrow, and to do a little more testing, but I'm a little short on time and still need to fix up one or two tests. I'm taking it out of draft mode anyway to invite more feedback and testing. So far I've tested the export component with locally-built packages, and printing and mime detection are working as expected.

I will be back to address review feedback (such as #2166 (comment) and anything else that arises) and do test fixup on Wed.

…E_DISCOVERY status enums to client and export.
Raise ExportException instead of trying to print file with unprintable
mimetype.

Parse mimetypes from LibreOffice .desktop files and try to convert files of any mimetype that LibreOffice supports to PDF before printing. This dramatically broadens the range of printable types.

Use LibreOffice headless mode for PDF conversion rather than
unoconv. In future, if batch printing is supported, consider unoserver
instead.
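
For readers unfamiliar with headless conversion, a minimal sketch of the idea follows. It is not the code from this PR: the function names are hypothetical, and calling the binary as `soffice` is an assumption (some systems expose it as `libreoffice`). The `--headless`, `--convert-to` and `--outdir` options are LibreOffice's own command-line parameters.

```python
import subprocess
from pathlib import Path


def build_convert_command(source: Path, outdir: Path) -> list[str]:
    """Build the argument list for a one-shot headless PDF conversion."""
    return [
        "soffice",  # assumed binary name
        "--headless",
        "--convert-to",
        "pdf",
        "--outdir",
        str(outdir),
        str(source),
    ]


def expected_pdf_path(source: Path, outdir: Path) -> Path:
    """LibreOffice writes <source stem>.pdf into the output directory."""
    return outdir / (source.stem + ".pdf")


def convert_to_pdf(source: Path, outdir: Path, timeout: int = 60) -> Path:
    """Run the conversion; raises CalledProcessError or TimeoutExpired on failure."""
    subprocess.run(build_convert_command(source, outdir), check=True, timeout=timeout)
    return expected_pdf_path(source, outdir)
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with submitted filenames.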

Use -M (magic bytes only) flag for mimeinfo detection.
Show error message in Print dialog when attempting to print unprintable filetype.
Use CALLED_PROCESS_ERROR code for QProcess errors instead of generic
print error code, and log error messages.
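
As an illustration of magic-bytes detection, here is a minimal sketch that shells out to the `mimetype` command from libfile-mimeinfo-perl. The function name, the injectable `run` parameter, and the use of `-b` (brief output) are assumptions for the example; only the `-M` flag is taken from the commit message above.

```python
import subprocess
from typing import Callable, Sequence


def detect_mimetype(
    path: str,
    run: Callable[[Sequence[str]], bytes] = subprocess.check_output,
) -> str:
    """Return the mimetype reported for `path`, ignoring its file extension.

    -M asks `mimetype` to use magic bytes only; -b prints just the type.
    `run` is injectable so the function can be tested without the tool.
    """
    output = run(["mimetype", "-b", "-M", path])
    return output.decode("utf-8").strip()
```

With this approach a webp image saved as `notes.txt` would be reported by its content rather than by its misleading extension.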

Add message in Print Dialog if user tries to print non-printable filetype.
…op-viewer packages. This was previously not possible due to the necessary paxctl configuration not being available soon enough (see #205), but now that we install the grsec kernel and our configuration packages in a template and reboot it before provisioning the (shared) export/viewer template, the PaX conf file is in place and the installation will succeed.

Include libfile-mimeinfo-perl package (installed by default) in securedrop-export package dependencies.
parsing tests and update _needs_pdf_conversion and other print tests.
@rocodes rocodes force-pushed the 918-dont-print-garbage branch from 5890ce4 to 1bcf7d1 Compare August 18, 2024 17:45
Member

@legoktm legoktm left a comment

Excited for this, I've done a code review pass and now am going to actually test it.

export/securedrop_export/print/service.py Outdated Show resolved Hide resolved
export/securedrop_export/print/service.py Outdated Show resolved Hide resolved
export/securedrop_export/print/service.py Outdated Show resolved Hide resolved
export/securedrop_export/print/service.py Outdated Show resolved Hide resolved
# sample .desktop files in test directory, but the service checks /usr/share/applications
testdir_libreoffice_desktop = Path(Path.cwd(), "tests", "files")
supported = self.service._get_supported_mimetypes_libreoffice(testdir_libreoffice_desktop)
assert len(supported) > 0, len(supported)
Member

Since we're testing against fixed sample files, can we assert the real number? Or even just the full list of mime types?

export/securedrop_export/print/service.py Show resolved Hide resolved
Member

@legoktm legoktm left a comment

Test plan checks out. One note: I tried "one file with the wrong extension" by giving a webp image the .txt extension, and it opened in the wrong application (text editor instead of eog). I don't think that's a regression here, right?

Marking as requesting changes based on previous CR

@legoktm
Member

legoktm commented Aug 20, 2024

I pushed 292b230 in a separate branch to show how we can install LibreOffice in CI and use it for integration tests; there could be others that run various files through headless libreoffice to see how the conversion goes and error handling instead of trying to mock it all out.

Let me know if you'd like me to push it to this branch or submit it as a separate PR after this lands! (Or feel free to pull it in yourself :))

@rocodes
Contributor Author

rocodes commented Aug 21, 2024

Hey, thank you for all the feedback and testing :) Real quick:

  • These changes don't affect sd-viewer (and don't broaden viewing support - only printing support), so you're right that the .txt file being opened in the wrong application is not related (but we should fix it).
  • I have local changes to address a couple failures including the pathlib thing you noticed, but I ran out of time and didn't want to push anything in haste - sorry if you spent too much time looking into that.
  • Thanks for all your cleanup changes - much better - shall include them too. And I'll take a look at your libreoffice branch. I was on the fence about including that in CI or whether that blurs the boundaries between unit-testing and integration testing, but I'm into it.

@legoktm
Member

legoktm commented Aug 21, 2024

  • so you're right that the .txt file being opened in the wrong application is not related (but we should fix it).

Ack, filed separately as #2171.

  • I have local changes to address a couple failures including the pathlib thing you noticed, but I ran out of time and didn't want to push anything in haste - sorry if you spent too much time looking into that.

OK! I was going to spend some time today figuring that out but I can hold off for now then.

  • And I'll take a look at your libreoffice branch. I was on the fence about including that in CI or whether that blurs the boundaries between unit-testing and integration testing, but I'm into it.

It definitely does blur the boundaries, but I think mixing is fine, if we end up having more we can split them out into separate files.

Also, integration tests are important:
[image attachment]

@rocodes rocodes force-pushed the 918-dont-print-garbage branch from 0588432 to fec5ed2 Compare August 22, 2024 15:55
@legoktm
Member

legoktm commented Aug 22, 2024

Tests are failing because it doesn't find any libreoffice .desktop mime types:

        if len(supported_types) == 0:
>           raise ExportException(sdstatus=Status.ERROR_MIMETYPE_DISCOVERY)

I suspect you (like me) have LibreOffice installed locally so it passes locally but the CI container doesn't have it, so it either needs to be mocked or we just install LibreOffice in CI and it works (which it will once my integration patch is in)
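
One way to keep such a test from failing on machines without LibreOffice is to skip it when the binary is absent. The sketch below is an illustration only, using the standard library's `unittest` rather than this project's test setup; the class name, the test name, the binary names checked, and the `libreoffice*.desktop` glob are all assumptions.

```python
import shutil
import unittest
from pathlib import Path

# True only when a LibreOffice launcher is on PATH (binary names are assumed).
LIBREOFFICE_AVAILABLE = any(
    shutil.which(name) is not None for name in ("soffice", "libreoffice")
)


class TestLibreOfficeIntegration(unittest.TestCase):
    @unittest.skipUnless(LIBREOFFICE_AVAILABLE, "LibreOffice is not installed")
    def test_desktop_entries_exist(self):
        # Runs against the real installation instead of copied sample files.
        entries = list(Path("/usr/share/applications").glob("libreoffice*.desktop"))
        self.assertGreater(len(entries), 0)
```

Installing LibreOffice in the CI container, as proposed above, means the test actually runs there instead of being skipped.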

@rocodes rocodes force-pushed the 918-dont-print-garbage branch 2 times, most recently from 58193af to 706e94a Compare August 22, 2024 16:30
@rocodes
Contributor Author

rocodes commented Aug 22, 2024

:) I did catch that, and I think I addressed all your changes too in 05fe7dd..706e94a (edit - but not quite successfully, apparently - hang on)

@rocodes rocodes force-pushed the 918-dont-print-garbage branch from 706e94a to e9228ce Compare August 22, 2024 16:51
@rocodes
Contributor Author

rocodes commented Aug 22, 2024

(moving out of 'ready for review' until I pull in 292b230)

raise ExportException(sdstatus=Status.ERROR_MIMETYPE_UNKNOWN)
# Don't print "audio/*", "video/*", or archive mimetypes
if (
any(mimetype.startswith(prefix) for prefix in MIMETYPE_UNPRINTABLE)
Member

A cool trick is that startswith takes a tuple of strings, which allows you to do:

>>> 'audio/foo'.startswith(('audio/', 'video/'))
True
>>> 'text/foo'.startswith(('audio/', 'video/'))
False
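
Applied to the check under review, the tuple form might look like the sketch below. The name `MIMETYPE_UNPRINTABLE` comes from the diff, but its contents here are an assumption assembled from the unprintable types listed in the PR description, and `is_unprintable` is a hypothetical helper.

```python
# Assumed contents: audio/video prefixes plus the archive and djvu types
# named in the PR description. The real constant may differ.
MIMETYPE_UNPRINTABLE = (
    "audio/",
    "video/",
    "application/vnd.djvu",
    "application/vnd.rar",
    "application/zip",
    "application/x-7z-compressed",
)


def is_unprintable(mimetype: str) -> bool:
    """True if the mimetype starts with any unprintable prefix."""
    # str.startswith accepts a tuple, so no any(...) generator is needed.
    return mimetype.startswith(MIMETYPE_UNPRINTABLE)
```

Note that `startswith` requires a tuple specifically; passing a list raises `TypeError`.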

rocodes and others added 5 commits August 22, 2024 16:31
Install LibreOffice in the CI container and then we can test against the
actual desktop files instead of copies of them.
libreoffice headless conversion against .odt file if libreoffice is
installed.
@rocodes rocodes force-pushed the 918-dont-print-garbage branch from ec47d11 to 3ac0af4 Compare August 22, 2024 20:32
@rocodes rocodes requested a review from legoktm August 22, 2024 21:11
Member

@legoktm legoktm left a comment

LGTM, in real testing, it all works too!

@legoktm legoktm merged commit 86eac8b into main Aug 22, 2024
58 checks passed
@legoktm legoktm deleted the 918-dont-print-garbage branch August 22, 2024 21:48