-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: locked link #35976
base: 2u/course-optimizer
Are you sure you want to change the base?
feat: locked link #35976
Conversation
cms/djangoapps/contentstore/tasks.py
Outdated
validated_url_list = asyncio.run(validate_urls_access_in_batches(url_list, course_key, batch_size=100)) | ||
broken_or_locked_urls, retry_list = filter_by_status(validated_url_list) | ||
|
||
# Retry urls that failed due to connection error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Generally it's really nice that you decoupled everything into functions. Everything is extremely well readable until here, where it is still good but a bit less clear structured. A small improvement could be to put the two chunks of code that are following into functions also. The for loop is a nice thing to put into a named function so people can see what it does without needing to read the code, and in the try/except statement, everything inside of try can be a function that's called with one line in the try. That's a pattern you often see where any try/except statement will have the try block just be one function call and the except block stay as it's here. What do you think?
The code already looks really good so this is a pretty minor thing. If you prefer it like this that's okay as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to do! I'll make that change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I found a bug
responses = [] | ||
url_count = len(url_list) | ||
|
||
for i in range(0, url_count, batch_size): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tell me if I read this correctly, there might be a bug.
This is a loop that goes from 0 to batch size, which is 100.
For each iteration, a batch starts at index i and ends at index i + 100.
And then you go over all of the urls.
That would mean:
Iteration 1 checks URLs 0 to 99
Iteration 2 checks URLs 1 to 100
Iteration 3 checks URLs 2 to 101
... and so on
So it looks to me like instead of checking urls 0-99, then 100-199, it re-checks the same URLs over and over and then stops at 199 without finishing if there are more URLs. Did I get that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be a loop that goes from 0 to url_count, with a step of batch_size (which is 100).
Iteration 1: i = 0
Iteration 2: i = 100
Iteration 3: i = 200
The celery task on will now check if a link is locked when scanning the course.