Tag generation performance #731

Open
DavidEGx opened this issue Oct 11, 2024 · 8 comments
Labels
area: general tagpdf, ... question Further information is requested

Comments

@DavidEGx

I've noticed generating tags is quite time consuming.

As an example, running:

$ lualatex latexdoc.txt

(See the attached latexdoc.txt. I just asked ChatGPT to generate a demo LaTeX file; I see a similar issue with my real LaTeX files.)

This takes around 1.6s on my machine. If I remove testphase = {phase-III}, it takes ~0.5s.

What is worse, sometimes you have to rerun lualatex.

Is this something that will be addressed in the future? What can we expect?
Or is it just me and I'm doing something wrong?!

Thanks for the work.

@u-fischer
Member

Yes, tagging slows down the compilation. The code has to create and write quite a lot of PDF objects. There are certainly places where the code can be sped up, and this will be done at some point, but currently this is not the first priority; the focus is on getting correct tagging at all.

Unrelated, but you can/should remove \usepackage[utf8]{inputenc}. With lualatex it does nothing at all, and with pdflatex it is unneeded, as utf8 is the default anyway.

@DavidEGx
Author

> Unrelated, but you can/should remove \usepackage[utf8]{inputenc}. With lualatex it does nothing at all, and with pdflatex it is unneeded, as utf8 is the default anyway.

Thanks, I'll do that.


Maybe this is total nonsense, but... couldn't tagging run only once, in a final run?

I mean instead of having to:

  • Run: 1.6s.
  • Rerun: 1.6s.
    Total: 3.2s

Do:

  • Run no tags: 0.5s.
  • Rerun no tags: 0.5s.
  • Rerun only tags: 1.1s.
    Total: 2.1s
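
Written out as arithmetic, the comparison above looks like this. The 1.1s figure for a tags-only run is an assumption of the proposal (the measured 1.6s minus the measured 0.5s), not a measurement, and the symbol names are made up for this sketch:

```latex
\[
  T_{\mathrm{current}} = 2 \times 1.6\,\mathrm{s} = 3.2\,\mathrm{s},
  \qquad
  T_{\mathrm{proposed}} = 2 \times 0.5\,\mathrm{s} + 1.1\,\mathrm{s} = 2.1\,\mathrm{s}
\]
\[
  T_{\mathrm{current}} - T_{\mathrm{proposed}} = 1.1\,\mathrm{s}
  \quad (\text{roughly a third of } T_{\mathrm{current}})
\]
```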

@u-fischer
Member

> Maybe this is total nonsense, but... couldn't tagging run only once, in a final run?

With lualatex, yes, that will probably be possible. Currently a few things use the aux file and so need two compilations, but it should be possible to change that. Naturally, lualatex then needs to know which compilation is the last one, so you would have to switch tagging on and off. pdflatex typically really needs two or three compilations with tagging active.

The time-consuming part is generally the end, when all the objects are written, so you could try this while drafting your document:

\AddToHook{enddocument/end}{\tagpdfsetup{activate/tree=false}}
\DocumentMetadata{testphase={phase-III}}

But that has not been tested much, so report back if there are problems or some PDF viewer complains ...
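
For context, here is a minimal sketch of where those two lines might sit in a complete file. The document class and body text are placeholders, and putting \AddToHook after \documentclass is a choice made for this sketch rather than something stated in the thread. \DocumentMetadata itself has to come before \documentclass.

```latex
% Has to come before \documentclass: activates the tagging code.
\DocumentMetadata{testphase={phase-III}}
\documentclass{article}

% Drafting only: switch off the structure tree at the very end of the run,
% so that the objects described above as the slow part are not written.
% Drop this line for the final compilation; without the tree the PDF
% should not be expected to be properly tagged.
\AddToHook{enddocument/end}{\tagpdfsetup{activate/tree=false}}

\begin{document}
\section{Draft}
Placeholder text.
\end{document}
```

As noted above, this setting is only lightly tested, so it is best treated as a drafting aid.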

@hpvd

hpvd commented Oct 12, 2024

Thanks for raising this performance topic. It is IMHO really important for usability, and with that also for the acceptance and adoption rate of tagging.
Maybe it is reasonable to think about coupling it to something like the draft mode,
or even about extending or restructuring the draft mode (or adding a new mode?) to provide
one customizable switch (a draft setup?)
that selects between a "fast" compile and a "final" compile and loads the appropriate config for

  • every internal performance demanding "advanced/2nd level" task and
  • every external package relying on it

(not only tagging) ...
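
Until something like that exists, a rough user-level approximation of such a switch can be sketched with plain TeX conditionals. The macro name \fastcompile is hypothetical and not part of LaTeX or tagpdf; it is defined on the command line for draft runs and left undefined for the final run. Wrapping \DocumentMetadata in a conditional is an assumption of this sketch and is not suggested anywhere in the thread.

```latex
% Fast draft run (no tagging):
%   lualatex '\def\fastcompile{}\input{latexdoc.txt}'
% Final run (tagging active):
%   lualatex latexdoc.txt
\unless\ifdefined\fastcompile
  % Only reached when \fastcompile is NOT defined, i.e. in the final run.
  \DocumentMetadata{testphase={phase-III}}
\fi
\documentclass{article}

\begin{document}
Placeholder text.
\end{document}
```

Because a few things currently go through the aux file, as mentioned earlier in the thread, the final tagged run may itself have to be repeated after switching from draft runs.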

@DavidEGx
Author

BTW, is there some sort of approximate date for the release of tags?

@u-fischer
Member

> BTW, is there some sort of approximate date for the release of tags?

Sorry I don't understand the question.

@DavidEGx
Author

> > BTW, is there some sort of approximate date for the release of tags?
>
> Sorry I don't understand the question.

Since we have to use the key "testphase", I assume tagging is in a beta-ish state. So the question is when it will no longer be beta.

@FrankMittelbach
Member

FrankMittelbach commented Oct 14, 2024

The project is planned as a multi-year one (approx. 5 years if there is sufficient financial support, otherwise possibly longer). There is a lot of documentation around this at https://www.latex-project.org/publications/indexbytopic/pdf/ , including why it needs that amount of time or more in the schedule. But because of the uncertainties, it is not possible to give a reliable final date.

Regardless of that, we are now in a position where it can already be actively used, even though it is still evolving and needs a lot more work.

@FrankMittelbach FrankMittelbach added the question Further information is requested label Nov 2, 2024
@FrankMittelbach FrankMittelbach added the area: general tagpdf, ... label Nov 14, 2024

4 participants