save: investigate use of base64 encoded urls in headless browser #518

Open
machow opened this issue Nov 22, 2024 · 5 comments


@machow
Collaborator

machow commented Nov 22, 2024

In PR #499 we replaced saving a temporary file with passing the rendered HTML to the headless browser as a base64-encoded URL.

Browser URL length limits are tricky to pin down, but they do exist. Let's investigate potential issues we might hit by passing long URLs to the headless browser.

Alternatively, it might save us time to revert to saving a temporary file, since the gt R package has used that approach for ~6 years with little issue.

I would vote for reverting, just to align with gt R. Modern browsers likely have large URL limits, but that is hard to verify.
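To get a feel for how long these URLs can become, here is a minimal sketch. It assumes the URL has the form `data:text/html;base64,...`; the helper `html_to_data_url` and the stand-in table rows are hypothetical and are not taken from the great_tables code. Base64 emits 4 characters for every 3 input bytes, so the URL is roughly one third longer than the HTML it carries.

```python
import base64


def html_to_data_url(html: str) -> str:
    """Encode an HTML string as a base64 data URL (hypothetical helper)."""
    payload = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return f"data:text/html;base64,{payload}"


# Stand-in for rendered table HTML; real output also carries CSS and attributes.
row = "<tr><td>Rowlabel</td><td>3.2</td><td>5.7</td><td>8.9</td></tr>"

for n_rows in (10, 1_000, 100_000):
    html = f"<table>{row * n_rows}</table>"
    url = html_to_data_url(html)
    print(f"{n_rows:>7} rows: {len(html):>9} HTML chars -> {len(url):>9} URL chars")
```

The printed lengths can then be compared against whatever limit the headless browser in use turns out to enforce.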

@jrycw
Collaborator

jrycw commented Nov 22, 2024

@machow, I wasn’t aware of this issue. Please feel free to revert to the temporary file approach if needed.

@phimad

phimad commented Jan 29, 2025

Related to this, I am having problems with special character encodings, which I believe are due to the base64 encoding.

Below I provide an MRE. I have a table with special Danish characters such as Ø, Æ, and Å. It displays correctly with the GT.show() method, but the special characters are lost when using the save() method.

import pandas as pd
from great_tables import GT

df = pd.DataFrame(
{
"Stubhead with Å": [
"Rowlabel with Ø",
"Rowlabel with Æ",
],
"2023": [
3.2,
4.3,
],
"2024": [
5.7,
7.8,
],
"2025": [
8.9,
5.4,
],
}
)

gt_tbl = GT(df)

gt_tbl.save("test.png")

This outputs the attached image:

Image

which does not show the special characters correctly. It seems prudent for the save() method to use the same approach as the show() method, but I am not an expert on this.

Am hoping this will be fixed soon as I really like this library :)
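One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: base64 itself is lossless, but if the data URL does not declare a charset, the browser may decode the UTF-8 bytes with a legacy encoding such as Windows-1252. The sketch below reproduces that kind of corruption in plain Python and shows a data URL that declares `charset=utf-8`. The helper `html_to_data_url` is hypothetical, and whether great_tables builds its URL this way is an assumption.

```python
import base64

text = "Stubhead with Å, Rowlabel with Ø, Rowlabel with Æ"
utf8_bytes = text.encode("utf-8")

# base64 round-trips the bytes unchanged, so it is not lossy by itself.
assert base64.b64decode(base64.b64encode(utf8_bytes)) == utf8_bytes

# Reading the same UTF-8 bytes as Windows-1252 turns each Danish letter
# into two unrelated characters (classic mojibake).
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)


def html_to_data_url(html: str) -> str:
    """Build a base64 data URL that states its charset (hypothetical helper)."""
    payload = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return f"data:text/html;charset=utf-8;base64,{payload}"


print(html_to_data_url(f"<p>{text}</p>")[:60] + "...")
```

If this hypothesis holds, declaring the charset in the URL or adding `<meta charset="utf-8">` to the HTML might be enough, though that would need to be confirmed on an affected machine.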

@jrycw
Collaborator

jrycw commented Jan 29, 2025

@phimad, thanks so much for reporting this issue. I’ve reformatted your code for better readability:

import pandas as pd
from great_tables import GT

df = pd.DataFrame(
    {
        "Stubhead with Å": [
            "Rowlabel with Ø",
            "Rowlabel with Æ",
        ],
        "2023": [
            3.2,
            4.3,
        ],
        "2024": [
            5.7,
            7.8,
        ],
        "2025": [
            8.9,
            5.4,
        ],
    }
)

gt_tbl = GT(df)

gt_tbl.save("test.png")

@machow and @rich-iannone, it looks like our old friend is making a return. Interestingly, the code runs fine on my Windows 11 machine.

Image

@phimad

phimad commented Jan 30, 2025

To be clear, the output in the Jupyter notebook is correct, but the saved image file is incorrect.

I am on Windows 10. Is there any sort of test I can run on my machine to help you?

EDIT:
It works when I revert to saving a temporary file to feed into the browser.
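For anyone who wants to try the same workaround, here is a minimal sketch of the temporary-file idea. It is not the actual great_tables implementation: the helper `write_temp_html` and the wrapper markup are assumptions made for illustration. The file is written as UTF-8 and declares its charset, and the resulting `file://` URL stays short regardless of table size.

```python
import tempfile
from pathlib import Path


def write_temp_html(html: str) -> Path:
    """Write HTML to a UTF-8 temporary file and return its path (hypothetical helper)."""
    # delete=False keeps the file on disk after closing, so a browser can open it.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", encoding="utf-8", delete=False
    ) as f:
        f.write('<!DOCTYPE html><html><head><meta charset="utf-8"></head><body>')
        f.write(html)
        f.write("</body></html>")
    return Path(f.name)


path = write_temp_html("<table><tr><td>Rowlabel with Ø</td></tr></table>")
print(path.as_uri())  # the URL to hand to the headless browser

# Remove the file once the screenshot has been taken.
path.unlink()
```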

@jrycw
Collaborator

jrycw commented Jan 30, 2025

Woah! Thanks so much for the clarification.

I can confirm that the issue also occurs on my machine:

Image

phimad added a commit to phimad/great-tables that referenced this issue Jan 30, 2025
…ter encoding when using .save() by reverting to earlier approach (prior to commit cccee66) where temporary html file was created and sent to browser rather than doing base64 URL encoding.