From 634985cd4e2fe963031e1e47180ac86dd2535ed5 Mon Sep 17 00:00:00 2001
From: eoinwm-cisa
Date: Thu, 30 Oct 2025 14:49:43 -0500
Subject: [PATCH] Update MANUAL.txt

---
 MANUAL.txt | 50 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/MANUAL.txt b/MANUAL.txt
index 88f0b681cd0b..40739d35fb5a 100644
--- a/MANUAL.txt
+++ b/MANUAL.txt
@@ -1,7 +1,7 @@
 ---
 title: Pandoc User's Guide
 author: John MacFarlane
-date: 2025-10-20
+date: 2025-10-30
 ---
 
 # Synopsis
@@ -1748,6 +1748,9 @@ Nonzero exit codes have the following meanings:
 The `--defaults` option may be used to specify a package
 of options, in the form of a YAML file.
 
+pandoc does not include a defaults file. The tables below are examples
+and do not document default configuration options.
+
 Fields that are omitted will just have their regular
 default values. So a defaults file can be as simple as
 one line:
@@ -7846,7 +7849,7 @@ set at startup. For full documentation, see the [pandoc-lua] man page.
 [lua standalone]: https://www.lua.org/manual/5.4/manual.html#7
 [pandoc-lua]: https://github.com/jgm/pandoc/blob/master/doc/pandoc-lua.md
 
-# A note on security
+# Security notes
 
 1. Although pandoc itself will not create or modify any files other
     than those you explicitly ask it create (with the exception
@@ -7869,15 +7872,29 @@ set at startup. For full documentation, see the [pandoc-lua] man page.
     these formats.)
 
 4. In reading HTML files, pandoc will attempt to include the
-    contents of `iframe` elements by fetching content from the
-    local file or URL specified by `src`. If untrusted HTML is
-    processed on a server, this has the potential to reveal anything
-    readable by the process running the server. Using the `-f html+raw_html`
-    will mitigate this threat by causing the whole `iframe`
-    to be parsed as a raw HTML block. Using `--sandbox` will also
-    protect against the threat.
-
-5. If your application uses pandoc as a Haskell library (rather than
+    contents of `iframe` elements by fetching content from the local
+    file or URL specified by `src`. If untrusted HTML is processed on a
+    server, this has the potential to reveal anything readable by the
+    process running the server or to enable server-side request forgery
+    (SSRF) attacks
+    ([CVE-2025-51591](https://www.cve.org/CVERecord?id=CVE-2025-51591)).
+    To mitigate such attacks, use `--sandbox` or `-f html+raw_html`
+    (which causes the whole `iframe` to be parsed as a raw HTML block).
+    For some `--pdf-engine` options, `--sandbox` and `-f html+raw_html`
+    may not mitigate attacks. For example, using
+    `--pdf-engine=wkhtmltopdf` with `-f html+raw_html` enables an SSRF
+    vulnerability in wkhtmltopdf
+    ([CVE-2022-35583](https://www.cve.org/CVERecord?id=CVE-2022-35583)),
+    as `--sandbox` does not apply to wkhtmltopdf.
+
+5. In reading Markdown files, pandoc will attempt to include the
+    contents of `iframe` elements but enables `raw_html` by default.
+    This mitigates SSRF attacks. Using other `--pdf-engine` options may
+    enable SSRF attacks. For example, using `--pdf-engine=wkhtmltopdf`
+    enables an SSRF vulnerability in wkhtmltopdf
+    ([CVE-2022-35583](https://www.cve.org/CVERecord?id=CVE-2022-35583)).
+
+6. If your application uses pandoc as a Haskell library (rather than
     shelling out to the executable), it is possible to use it in a
     mode that fully isolates pandoc from your file system, by running
     the pandoc operations in the `PandocPure` monad. See the document
@@ -7885,7 +7902,7 @@ set at startup. For full documentation, see the [pandoc-lua] man page.
     for more details. (This corresponds to the use of the `--sandbox`
     option on the command line.)
 
-6. Pandoc's parsers can exhibit pathological performance on some
+7. Pandoc's parsers can exhibit pathological performance on some
     corner cases. It is wise to put any pandoc operations under
     a timeout, to avoid DOS attacks that exploit these issues.
     If you are using the pandoc executable, you can add the
@@ -7895,13 +7912,20 @@ set at startup. For full documentation, see the [pandoc-lua] man page.
     to pathological performance than the `markdown` parser,
     so it is a better choice when processing untrusted input.
 
-7. The HTML generated by pandoc is not guaranteed to be safe.
+8. The HTML generated by pandoc is not guaranteed to be safe.
     If `raw_html` is enabled for the Markdown input, users can
     inject arbitrary HTML. Even if `raw_html` is disabled,
     users can include dangerous content in URLs and attributes.
     To be safe, you should run all HTML generated from untrusted
     user input through an HTML sanitizer.
 
+9. Using `--pdf-engine=wkhtmltopdf` brings risks related to processing
+    untrusted HTML and Markdown input. wkhtmltopdf is no longer
+    maintained and is based on outdated Qt WebKit components.
+    wkhtmltopdf
+    [recommends](https://wkhtmltopdf.org/status.html#recommendations)
+    "Do not use wkhtmltopdf with any untrusted HTML."
+
 # Authors
 
 Copyright 2006--2024 John MacFarlane (jgm@berkeley.edu). Released