Add multi-key sorting of statuses #571

jzohrab · 2025-01-17T00:27:58Z

From discord link

Sorting by status works great when the book has unknown words, but if books have none, or the same number of unknown words, the sort doesn't fall back to the next-lowest status tier. For example, I can't sort this table since the books have no unknowns.

I guess it should sort by unknowns, then by status 1 through 5.

Notes

The book listing is in lute/templates/book/tablelisting.html. The graph column first gets the UnknownPercent from the datatables query in lute/book/datatables.py, so the unknown percent is what gets used for sorting. The graph is then rendered "on top" of that, with ajax_in_book_stats in the html.

The UnknownPercent is pulled from the bookstats table:

CREATE TABLE IF NOT EXISTS "bookstats" (
    "BkID" INTEGER NOT NULL,
    "distinctterms" INTEGER NULL,
    "distinctunknowns" INTEGER NULL,
    "unknownpercent" INTEGER NULL,
    status_distribution VARCHAR(100) NULL,
    PRIMARY KEY ("BkID"),
    FOREIGN KEY("BkID") REFERENCES "books" ("BkID") ON UPDATE NO ACTION ON DELETE CASCADE
);

with values like this:

sqlite> select * from bookstats;
4|378|220|58|{"0": 220, "1": 0, "2": 0, "3": 0, "4": 0, "5": 0, "98": 0, "99": 158}

"58" here is the percent of unknown terms.

To add this feature correctly, the code needs to replace the single value "58" with a composite key (in this case something like {"0": "048", "1": "023", "2": "011", "3": ... etc }). Each element is zero-padded to three digits so that 100 fits in a slot and the keys still compare correctly as strings.
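A minimal sketch of how such a key could be built from the status_distribution JSON shown above. The function name make_sort_key, the choice to include only statuses 0 through 5, and the use of round() are all assumptions for illustration, not existing Lute code.

```python
import json

# Assumption: the key covers unknowns ("0") and then statuses 1 through 5, in that order.
STATUS_ORDER = ["0", "1", "2", "3", "4", "5"]


def make_sort_key(status_distribution: str) -> str:
    """Turn a status_distribution JSON string into a fixed-width composite key."""
    counts = json.loads(status_distribution)
    # Percentages are taken over all terms, including statuses 98 and 99.
    total = sum(counts.values())
    key = {}
    for status in STATUS_ORDER:
        pct = round(100 * counts.get(status, 0) / total) if total else 0
        # Three digits per slot, so "100" fits and string order matches numeric order.
        key[status] = f"{pct:03d}"
    return json.dumps(key)


example = '{"0": 220, "1": 0, "2": 0, "3": 0, "4": 0, "5": 0, "98": 0, "99": 158}'
print(make_sort_key(example))
```

Because every key has the same slots in the same order with the same width, comparing two keys as plain strings compares unknowns first, then status 1, and so on.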

Doing all of this in a DB migration would probably require running a python script to load the status_distribution json string and do the calculations, and the migration tool currently doesn't handle such a thing. Instead, there could be a data clean-up job that runs at startup and checks the content of the sort string; if it doesn't match the required pattern, the job could do some fast processing to fix it.

Updated spec after discord discussion.

Probably the best way to do this is to have a method calc_sort_keys added to lute/book/stats.py which does the calc for a new status_distribution_percent field if it's null and if the status_distribution is not null. Call this on every call to the datatables function so the key is always updated.
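A rough sketch of what calc_sort_keys might look like, under some loud assumptions: it uses the plain sqlite3 module rather than whatever DB layer lute/book/stats.py actually uses, the helper _to_key is hypothetical, and rows with unusable json are simply left null (so they would be retried on the next call).

```python
import json
import sqlite3

STATUSES = ["0", "1", "2", "3", "4", "5"]  # assumption: unknowns, then statuses 1-5


def _to_key(distribution):
    """Hypothetical helper: return the composite key, or None if the json is unusable."""
    try:
        counts = json.loads(distribution)
        total = sum(counts.values())
    except (ValueError, TypeError, AttributeError):
        # Covers empty strings, bad json, and json that isn't a dict of numbers.
        return None
    key = {}
    for status in STATUSES:
        pct = round(100 * counts.get(status, 0) / total) if total else 0
        key[status] = f"{pct:03d}"
    return json.dumps(key)


def calc_sort_keys(conn):
    """Fill status_distribution_percent where it is null and a distribution exists."""
    rows = conn.execute(
        "SELECT BkID, status_distribution FROM bookstats "
        "WHERE status_distribution_percent IS NULL "
        "AND status_distribution IS NOT NULL"
    ).fetchall()
    updates = []
    for bk_id, distribution in rows:
        key = _to_key(distribution)
        if key is not None:
            updates.append((key, bk_id))
    conn.executemany(
        "UPDATE bookstats SET status_distribution_percent = ? WHERE BkID = ?",
        updates,
    )
    conn.commit()
    return len(updates)  # number of rows that received a key
```

Since only null keys are touched, calling this on every datatables request should be cheap once the table is populated.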

todos I can think of

- ~~consider if can be linked somehow to A better way to sort by difficulty #453~~ -- no, don't bother, stick with the current state
- table migration: add new status_distribution_percent column, varchar(100). No need to drop the "unknownpercent" column; it's ok to leave that for later.
- lute/book/stats.py: add calc_sort_keys method to load status_distribution_percent. Unit tests: nulls, bad json in the distribution field, empty string, valid distribution, distribution with 100% and 0%
- add unit test for book with no words (e.g. english book, text = "123")
- call calc_sort_keys in the book datatables python
- change the book stats calculation to call calc_sort_keys
- change the datatables to use the new field, template to use sort_key
- graph uses field, old complicated JS code can be removed
- once launched, maybe create a separate task to delete the unknownpercent column as it's no longer used

jzohrab · 2025-01-17T01:07:26Z

The above is my feeling about what needs to happen for this to work, but I could be wrong.

I don't believe it would be possible to calculate the sort index dynamically when calling the datatables method. Reason: datatables needs to get the sort index to do its server-side sorting, so it would really need all of those values present. Having that data cached in the table is the only way to do it, afaict.
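To illustrate the point, here is a small sketch of server-side sorting with paging. The ORDER BY and LIMIT run inside the database, so every row needs its key stored before the query runs. The fetch_page function, the status_distribution_percent column, and the shortened two-slot keys are assumptions for the example only.

```python
import sqlite3


def fetch_page(conn, start, length):
    """Return one page of book ids, sorted by the cached key inside SQL."""
    rows = conn.execute(
        "SELECT BkID FROM bookstats "
        "ORDER BY status_distribution_percent, BkID "
        "LIMIT ? OFFSET ?",
        (length, start),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookstats (BkID INTEGER PRIMARY KEY, "
    "status_distribution_percent VARCHAR(100))"
)
# Keys shortened to two slots for readability; books 1 and 3 both have no unknowns.
conn.executemany(
    "INSERT INTO bookstats VALUES (?, ?)",
    [
        (1, '{"0": "000", "1": "050"}'),
        (2, '{"0": "058", "1": "000"}'),
        (3, '{"0": "000", "1": "010"}'),
    ],
)
first_page = fetch_page(conn, 0, 2)
second_page = fetch_page(conn, 2, 2)
```

Books 1 and 3 tie on unknowns, so the second slot decides their order. The database picks which rows land on each page, which is why computing keys only for the rows already returned would be too late.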

jzohrab · 2025-01-17T01:09:36Z

While this change is not simple, it's pretty easy, so I think it could be tackled by any motivated dev who wants to give it a shot.

jzohrab added the bug Something isn't working label Jan 17, 2025

jzohrab added this to Lute-v3 Jan 17, 2025

jzohrab added the good first issue Good for newcomers label Jan 17, 2025
