Skip to content

Commit

Permalink
Optimize grading/progress page to reduce database queries (cache max …
Browse files Browse the repository at this point in the history
…scores).

The progress page did a number of things that make performance terrible for
courses with large numbers of problems, particularly if those problems are
customresponse CapaModule problems that need to be executed via codejail.

The grading code takes pains to not instantiate student state and execute the
problem code. If a student has answered the question, the max score is stored
in StudentModule. However, if the student hasn't attempted the question yet, we
have to run the problem code just to call .max_score() on it. This is necessary
in grade() if the student has answered other problems in the assignment (so we
can know what to divide by). This is always necessary to know in
progress_summary() because we list out every problem there. Code execution can
be especially slow if the problems need to invoke codejail.

To address this, we create a MaxScoresCache that will cache the max raw score
possible for every problem. We select the cache keys so that it will
automatically become invalidated when a new version of the course is published.

The fundamental assumption here is that a problem cannot have two different
max score values for two unscored students. A problem *can* score two students
differently such that they have different max scores. So Carlos can have 2/3 on
a problem, while Lyla gets 3/4. But if neither Carlos nor Lyla has ever
interacted with the problem (i.e. they're just seeing it on their progress
page), they must both see 0/4 -- it cannot be the case that Carlos sees 0/3 and
Lyla sees 0/4.

We used to load all student state into two separate FieldDataCache instances,
after which we do a bunch of individual queries for scored items. Part of this
split-up was done because of locking problems, but I think we might have gotten
overzealous with our manual transaction hammer.

In this commit, we consolidate all state access in grade() and progress()
to use one shared FieldDataCache. We also use a filter so that we only pull
back StudentModule state for things that might possibly affect the grade --
items that either have scores or have children.

Because some older XModules do work in their __init__() methods (like Video),
instantiating them takes time, particularly on large courses. This commit also
changes the code that fetches the grading_context to filter out children that
can't possibly affect the grade.

Finally, we introduce a ScoresClient that also tries to fetch score
information all at once, instead of in separate queries. Technically, we are
fetching this information redundantly, but that's because the state and score
interfaces are being teased apart as we move forward. Still, this only
amounts to one extra SQL query, and has very little impact on performance
overall.

Much thanks to @adampalay -- his hackathon work in openedx#7168 formed the basis of
this.

https://openedx.atlassian.net/browse/CSM-17
  • Loading branch information
David Ormsbee authored and amir-qayyum-khan committed Feb 16, 2016
1 parent a59aa6f commit 4516c49
Show file tree
Hide file tree
Showing 10 changed files with 270 additions and 102 deletions.
16 changes: 15 additions & 1 deletion common/lib/xmodule/xmodule/course_module.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
from xmodule.mixin import LicenseMixin
import json

from xblock.core import XBlock
from xblock.fields import Scope, List, String, Dict, Boolean, Integer, Float
from .fields import Date
from django.utils.timezone import UTC
Expand Down Expand Up @@ -1315,11 +1316,15 @@ def grading_context(self):
except UndefinedContext:
module = self

def possibly_scored(usage_key):
"""Can this XBlock type can have a score or children?"""
return usage_key.block_type in self.block_types_affecting_grading

all_descriptors = []
graded_sections = {}

def yield_descriptor_descendents(module_descriptor):
for child in module_descriptor.get_children():
for child in module_descriptor.get_children(usage_key_filter=possibly_scored):
yield child
for module_descriptor in yield_descriptor_descendents(child):
yield module_descriptor
Expand All @@ -1345,6 +1350,15 @@ def yield_descriptor_descendents(module_descriptor):
return {'graded_sections': graded_sections,
'all_descriptors': all_descriptors, }

@lazy
def block_types_affecting_grading(self):
"""Return all block types that could impact grading (i.e. scored, or having children)."""
return frozenset(
cat for (cat, xblock_class) in XBlock.load_classes() if (
getattr(xblock_class, 'has_score', False) or getattr(xblock_class, 'has_children', False)
)
)

@staticmethod
def make_id(org, course, url_name):
return '/'.join([org, course, url_name])
Expand Down
54 changes: 29 additions & 25 deletions lms/djangoapps/ccx/tests/test_field_override_performance.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,11 @@

@attr('shard_1')
@mock.patch.dict(
'django.conf.settings.FEATURES', {'ENABLE_XBLOCK_VIEW_ENDPOINT': True}
'django.conf.settings.FEATURES',
{
'ENABLE_XBLOCK_VIEW_ENDPOINT': True,
'ENABLE_MAX_SCORE_CACHE': False,
}
)
@ddt.ddt
class FieldOverridePerformanceTestCase(ProceduralCourseTestMixin,
Expand Down Expand Up @@ -173,18 +177,18 @@ class TestFieldOverrideMongoPerformance(FieldOverridePerformanceTestCase):

TEST_DATA = {
# (providers, course_width, enable_ccx): # of sql queries, # of mongo queries, # of xblocks
('no_overrides', 1, True): (27, 7, 14),
('no_overrides', 2, True): (135, 7, 85),
('no_overrides', 3, True): (595, 7, 336),
('ccx', 1, True): (27, 7, 14),
('ccx', 2, True): (135, 7, 85),
('ccx', 3, True): (595, 7, 336),
('no_overrides', 1, False): (27, 7, 14),
('no_overrides', 2, False): (135, 7, 85),
('no_overrides', 3, False): (595, 7, 336),
('ccx', 1, False): (27, 7, 14),
('ccx', 2, False): (135, 7, 85),
('ccx', 3, False): (595, 7, 336),
('no_overrides', 1, True): (23, 7, 14),
('no_overrides', 2, True): (68, 7, 85),
('no_overrides', 3, True): (263, 7, 336),
('ccx', 1, True): (23, 7, 14),
('ccx', 2, True): (68, 7, 85),
('ccx', 3, True): (263, 7, 336),
('no_overrides', 1, False): (23, 7, 14),
('no_overrides', 2, False): (68, 7, 85),
('no_overrides', 3, False): (263, 7, 336),
('ccx', 1, False): (23, 7, 14),
('ccx', 2, False): (68, 7, 85),
('ccx', 3, False): (263, 7, 336),
}


Expand All @@ -196,16 +200,16 @@ class TestFieldOverrideSplitPerformance(FieldOverridePerformanceTestCase):
__test__ = True

TEST_DATA = {
('no_overrides', 1, True): (27, 4, 9),
('no_overrides', 2, True): (135, 19, 54),
('no_overrides', 3, True): (595, 84, 215),
('ccx', 1, True): (27, 4, 9),
('ccx', 2, True): (135, 19, 54),
('ccx', 3, True): (595, 84, 215),
('no_overrides', 1, False): (27, 4, 9),
('no_overrides', 2, False): (135, 19, 54),
('no_overrides', 3, False): (595, 84, 215),
('ccx', 1, False): (27, 4, 9),
('ccx', 2, False): (135, 19, 54),
('ccx', 3, False): (595, 84, 215),
('no_overrides', 1, True): (23, 4, 9),
('no_overrides', 2, True): (68, 19, 54),
('no_overrides', 3, True): (263, 84, 215),
('ccx', 1, True): (23, 4, 9),
('ccx', 2, True): (68, 19, 54),
('ccx', 3, True): (263, 84, 215),
('no_overrides', 1, False): (23, 4, 9),
('no_overrides', 2, False): (68, 19, 54),
('no_overrides', 3, False): (263, 84, 215),
('ccx', 1, False): (23, 4, 9),
('ccx', 2, False): (68, 19, 54),
('ccx', 3, False): (263, 84, 215),
}
Loading

0 comments on commit 4516c49

Please sign in to comment.