viewdf messes up display of frequencies if variable value starts at 0 #945

Open
rangaro opened this issue Jul 26, 2024 · 8 comments

@rangaro

rangaro commented Jul 26, 2024

If a variable's values start at 0, view_df always shows a frequency of "0" for that level and shifts all other frequencies by one position (i.e., the frequency of "0" is displayed next to "1", the frequency of "1" is displayed next to "2", and the frequency of the last value is omitted).

For an example, see this variable, which I recoded to start at 0:
[Screenshot: Wvssolu1TA]

I can replicate this issue by merely recoding a variable so that the starting value is 0.

@strengejacke
Owner

Do you have a reproducible example? This works for me:

set.seed(123)
d <- data.frame(
  x = factor(sample(0:3, 20, TRUE)),
  y = rnorm(20),
  z = factor(sample(letters[1:4], 20, TRUE))
)

sjPlot::view_df(d, show.frq = TRUE, show.prc = TRUE)
Data frame: d
Columns: ID, Name, Label, Values, Value Labels, Freq., %
1  x  0, 1, 2, 3  |  Freq.: 3, 7, 8, 2  |  %: 15.00, 35.00, 40.00, 10.00
2  y  range: -2.0-1.8
3  z  a, b, c, d  |  Freq.: 7, 3, 5, 5  |  %: 35.00, 15.00, 25.00, 25.00

Created on 2024-11-25 with reprex v2.1.1

Can you also try datawizard::data_codebook() |> insight::print_html()?

@rangaro
Author

rangaro commented Nov 27, 2024

I have a data file in XLSX format where this happens, but no example with random data. The code you asked me to run only throws an error: "Error in gt::tab_options(): ! table_font_size must be a single string, not a <tbl_df> object. Run rlang::last_trace() to see where the error occurred."

@strengejacke
Owner

What if you convert the tibble (from readxl, I guess) into a data frame and then try the code again?
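
The conversion suggested here can be sketched in base R with as.data.frame(). This is only an illustration: tbl_like is a hypothetical stand-in that mimics the class vector of a readxl result, not the reporter's actual data.

```r
# Hypothetical stand-in for a tibble returned by readxl: a data frame
# carrying the class vector that tibbles have. (Assumption for
# illustration; the reporter's real XLSX data is not available.)
tbl_like <- data.frame(x = c(0, 1, 2, 3), y = c(0.5, 1.5, 2.5, 3.5))
class(tbl_like) <- c("tbl_df", "tbl", "data.frame")

# as.data.frame() drops the tibble classes and returns a plain data frame.
plain <- as.data.frame(tbl_like)

class(plain)
# [1] "data.frame"
```

The plain data frame can then be passed to the same calls as before to check whether the error persists.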

@rangaro
Author

rangaro commented Nov 27, 2024

If I convert the tibble into a data frame, then the issue remains.

@rangaro
Author

rangaro commented Nov 28, 2024

When I convert the variable from numeric to categorical with as.factor(), the issue vanishes. If I then convert it back to numeric and re-apply the labels, the issue comes back.

@rangaro
Author

rangaro commented Nov 28, 2024

Try:

set.seed(123)
d <- data.frame(
  x = as.numeric(factor(sample(0:3, 20, TRUE))),
  y = rnorm(20),
  z = factor(sample(letters[1:4], 20, TRUE))
)
expss::val_lab(d[1]) = c("0" = 0, "1" = 1, "2" = 2, "3" = 3)
sjPlot::view_df(d, show.frq = TRUE, show.prc = TRUE)

@rangaro
Author

rangaro commented Nov 28, 2024

If you leave out the labeling, you can still see that there is an issue because the range of x is reported by sjPlot to be from 1 to 4 while it is actually from 0 to 3. I think that's where the issue is coming from.
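
One point that may be relevant to the reported range, offered as a possible explanation rather than something confirmed in this thread: in base R, as.numeric() applied to a factor returns the internal level codes (1, 2, ...), not the original values. The sketch below uses a small hand-written vector (raw, an assumption for illustration) instead of the sampled data.

```r
# Small illustrative vector whose values start at 0.
raw <- c(0, 1, 2, 3, 0, 2)

# as.numeric() on a factor yields the level codes, which start at 1.
codes <- as.numeric(factor(raw))
range(codes)
# [1] 1 4

# Going through as.character() first recovers the original values.
restored <- as.numeric(as.character(factor(raw)))
range(restored)
# [1] 0 3
```

If x is built with as.numeric(factor(...)), it would therefore hold 1 to 4 before any labels are attached. Whether the same mechanism is behind the behavior seen with the XLSX data is not established here.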
