Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: locked link #35976

Open
wants to merge 5 commits into
base: 2u/course-optimizer
Choose a base branch
from
Open

Conversation

rayzhou-bit
Copy link
Contributor

The celery task on will now check if a link is locked when scanning the course.

validated_url_list = asyncio.run(validate_urls_access_in_batches(url_list, course_key, batch_size=100))
broken_or_locked_urls, retry_list = filter_by_status(validated_url_list)

# Retry urls that failed due to connection error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Generally it's really nice that you decoupled everything into functions. Everything is extremely well readable until here, where it is still good but a bit less clear structured. A small improvement could be to put the two chunks of code that are following into functions also. The for loop is a nice thing to put into a named function so people can see what it does without needing to read the code, and in the try/except statement, everything inside of try can be a function that's called with one line in the try. That's a pattern you often see where any try/except statement will have the try block just be one function call and the except block stay as it's here. What do you think?

The code already looks really good so this is a pretty minor thing. If you prefer it like this that's okay as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to do! I'll make that change

Copy link
Member

@jesperhodge jesperhodge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found a bug

responses = []
url_count = len(url_list)

for i in range(0, url_count, batch_size):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tell me if I read this correctly, there might be a bug.
This is a loop that goes from 0 to batch size, which is 100.
For each iteration, a batch starts at index i and ends at index i + 100.
And then you go over all of the urls.

That would mean:
Iteration 1 checks URLs 0 to 99
Iteration 2 checks URLs 1 to 100
Iteration 3 checks URLs 2 to 101
... and so on

So it looks to me like instead of checking urls 0-99, then 100-199, it re-checks the same URLs over and over and then stops at 199 without finishing if there are more URLs. Did I get that right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be a loop that goes from 0 to url_count, with a step of batch_size (which is 100).
Iteration 1: i = 0
Iteration 2: i = 100
Iteration 3: i = 200

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants