Add Enterprise CommCare Client Version Compliance Report #35468

Open
wants to merge 40 commits into
base: master
Changes from 37 commits
52520df
Refactor: rename function
jingcheng16 Nov 26, 2024
52b62e7
Util to tell if a version is out of date
jingcheng16 Dec 3, 2024
4c2c923
Add Commcare Version Compliance Tile
jingcheng16 Dec 3, 2024
2ee9d14
Add CommCare Version Compliance API
jingcheng16 Dec 3, 2024
5609175
Refactor: move get_report_task to the base class
jingcheng16 Dec 3, 2024
94f54c6
Show -- for the total if the account has no mobile worker yet
jingcheng16 Dec 3, 2024
9cc4bda
Move is_out_of_date above _parse_version
jingcheng16 Dec 4, 2024
7338fcc
change text and fix format
jingcheng16 Dec 4, 2024
36138f7
Remove unnecessary code
jingcheng16 Dec 4, 2024
0512997
Change private function to public since it's imported by other file
jingcheng16 Dec 4, 2024
9dc9a81
delay the translation until rendering
jingcheng16 Dec 4, 2024
344a76c
Rename metric to description
jingcheng16 Dec 4, 2024
1aec926
Percentage in enterprise console will have one digit
jingcheng16 Dec 4, 2024
180eb0c
Use elastic search
jingcheng16 Dec 6, 2024
b8cb113
Having a util function to handle the commcare version the mobile work…
jingcheng16 Dec 6, 2024
b494264
isort
jingcheng16 Dec 6, 2024
f38e1c3
Remove formatting of version in enterprise console so it's comparable
jingcheng16 Dec 7, 2024
fcf3c61
ES returning dates in iso format so add conversion to avoid type error
jingcheng16 Dec 7, 2024
d72d426
Merge branch 'master' into jc/commcare_client_version_compliance_report
mjriley Dec 9, 2024
b1a6545
flake8
jingcheng16 Dec 9, 2024
cd61d41
Change title
jingcheng16 Dec 9, 2024
57c489a
Merge branch 'master' into jc/commcare_client_version_compliance_report
jingcheng16 Dec 10, 2024
c03d909
Rename description to total_description
jingcheng16 Dec 10, 2024
5470ea5
Update label....
jingcheng16 Dec 10, 2024
4bc93c8
adjust tile layout
biyeun Dec 11, 2024
1c1abb6
Merge branch 'master' into jc/commcare_client_version_compliance_report
jingcheng16 Dec 11, 2024
0d85d21
Rename ignored variable
jingcheng16 Dec 12, 2024
34e358b
Refactor: format version only once
jingcheng16 Dec 12, 2024
88252a0
Memoize get_latest_version_at_time and use minute precision for date_…
jingcheng16 Dec 12, 2024
8b75af1
Fix UnboundLocalError
jingcheng16 Dec 12, 2024
c0fd790
Remove microseconds too
jingcheng16 Dec 12, 2024
05f6058
Fetch CommCareBuildConfig only once
jingcheng16 Dec 12, 2024
adb7d81
memoize build time to avoid repetitively fetching build for each version
jingcheng16 Dec 12, 2024
4a81844
update naming convention for elements refrenced in javascript
biyeun Dec 12, 2024
b4d59b4
This tile don't need domain_obj
jingcheng16 Dec 12, 2024
d3da1d7
Move total above rows_for_domain
jingcheng16 Dec 12, 2024
1cc3178
Use gevent pooling for parallel domain processing
jingcheng16 Dec 12, 2024
7599675
Refer domain name as domain and fix a format in template
jingcheng16 Dec 13, 2024
44d9db5
quickcache on globally scoped function
jingcheng16 Dec 13, 2024
bb87820
keep exception handling as close as possible to the operation that ca…
jingcheng16 Dec 13, 2024
28 changes: 28 additions & 0 deletions corehq/apps/builds/tests/test_utils.py
@@ -0,0 +1,28 @@
from django.test import SimpleTestCase
from corehq.apps.builds.utils import is_out_of_date


class TestVersionUtils(SimpleTestCase):

    def test_is_out_of_date(self):
        test_cases = [
            # (version_in_use, latest_version, expected_result)
            ('2.53.0', '2.53.1', True),  # Normal case - out of date
            ('2.53.1', '2.53.1', False),  # Same version - not out of date
            ('2.53.2', '2.53.1', False),  # Higher version - not out of date
            (None, '2.53.1', False),  # None version_in_use
            ('2.53.1', None, False),  # None latest_version
            ('invalid', '2.53.1', False),  # Invalid version string
            ('2.53.1', 'invalid', False),  # Invalid latest version
            ('6', '7', True),  # Normal case - app version is integer
            (None, None, False),  # None version_in_use and latest_version
        ]

        for version_in_use, latest_version, expected in test_cases:
            with self.subTest(version_in_use=version_in_use, latest_version=latest_version):
                result = is_out_of_date(version_in_use, latest_version)
                self.assertEqual(
                    result,
                    expected,
                    f"Expected is_out_of_date('{version_in_use}', '{latest_version}') to be {expected}"
                )
62 changes: 62 additions & 0 deletions corehq/apps/builds/utils.py
@@ -1,4 +1,8 @@
import re
from datetime import datetime
from memoized import memoized

from dimagi.utils.parsing import ISO_DATETIME_FORMAT

from .models import CommCareBuild, CommCareBuildConfig

@@ -40,3 +44,61 @@ def extract_build_info_from_filename(content_disposition):
    else:
        raise ValueError('Could not find filename like {!r} in {!r}'.format(
            pattern, content_disposition))


@memoized
def get_latest_version_at_time(config, target_time):
    """
    Get the latest CommCare version that was available at a given time.
    Excludes superuser-only versions.
    Menu items are already in chronological order (newest last).
    If no target time is provided, return the latest version available now.

    Args:
        config: CommCareBuildConfig instance
        target_time: datetime or string in ISO format, or None for latest version
    """
    if not target_time:
        return config.get_default().version

    if isinstance(target_time, str):
        target_time = datetime.strptime(target_time, ISO_DATETIME_FORMAT)

    # Iterate through menu items in reverse (newest to oldest)
    for item in reversed(config.menu):
        if item.superuser_only:
            continue
        try:
            build_time = get_build_time(item.build.version)
            if build_time and build_time <= target_time:
                return item.build.version
        except KeyError:
            continue

    return None


@memoized
Contributor

The use of memoized makes me nervous, as my understanding is that the cached value will stick around until it is explicitly cleared, or the machine restarts. Say we created a version, this ran for that version, and then we had to delete the version. This function would still return the old build time, despite the version no longer existing, right? Worse still, because the cache is local to the machine, one webworker might return the old value, while a different webworker, that hadn't previously processed the request, would return the current value. Does it seem safer to perform manual, per-request caching? I.e. create a dictionary and insert values as the request generates them, but throw that dictionary away at the end of the request.
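
A minimal sketch of the per-request caching idea described above. The name `make_request_scoped_lookup` and the `fetch_build_time` argument are hypothetical stand-ins, not part of the PR; the point is only that the dictionary is owned by the caller and disappears when the request is done.

```python
def make_request_scoped_lookup(fetch_build_time):
    """Wrap a lookup in a cache that lives only as long as the returned function.

    `fetch_build_time` is a hypothetical stand-in for the real build lookup.
    """
    cache = {}

    def lookup(version):
        # Only hit the underlying lookup the first time a version is seen.
        if version not in cache:
            cache[version] = fetch_build_time(version)
        return cache[version]

    return lookup


# Usage: build one lookup per request and let it go out of scope afterwards.
calls = []


def fake_fetch(version):
    calls.append(version)
    return f"time-for-{version}"


lookup = make_request_scoped_lookup(fake_fetch)
lookup("2.53.0")
lookup("2.53.0")  # served from the dictionary, no second fetch
lookup("2.53.1")
```

Because nothing outlives the request, two web workers cannot disagree about a deleted build for longer than a single request.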

Contributor Author

Thank you for noticing it! I verified in our doc:

Memoized on any other function/callable:
The cache lives on the callable, so if it’s globally scoped and never gets garbage collected, neither does the cache

So I'm changing it to a quickcache, and clear it every 10 minutes: 44d9db5

Contributor

Is the intention 10 minutes? 100 * 60 seems to be 100 minutes, right? I'd still prefer a completely local, request-based caching mechanism, because I think it's more in-line with what you're trying to do, but I'll defer to your judgment here. This code works. The downside is that A) caching is harder to debug, B) quickcache is making redis/network requests, and C) we can still run into that inconsistent state due to a stale cache, but now that stale cache will expire after the timeout.
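
The arithmetic behind that question, as a quick check. This assumes the timeout is expressed in seconds, which the comment implies but the diff for 44d9db5 is not shown here.

```python
SECONDS_PER_MINUTE = 60

ten_minutes = 10 * SECONDS_PER_MINUTE
value_in_question = 100 * 60

print(ten_minutes)                              # 600
print(value_in_question // SECONDS_PER_MINUTE)  # 100
```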

def get_build_time(version):
    build = CommCareBuild.get_build(version, latest=True)
    if build and build.time:
        return build.time
    return None
Contributor

It's generally a good idea to keep exception handling as close as possible to the operation that can raise the exception -- that way an unexpected operation can't throw the exception that was only meant to be thrown by another operation. It looks like we're catching KeyError due to CommCareBuild -- could we catch KeyError within get_build_time instead? If a KeyError is raised there, we can return None, and then the continue in get_latest_version_at_time doesn't seem necessary.

Contributor Author

Thank you! Sorry, this was extracted into its own function and memoized in a rush. Fixed in bb87820

Contributor

Nit: Your version is an improvement, and it hides the exception complexity, but it still does more than is needed in the try/except block. The CommCareBuild call is the statement that can throw, so it's slightly better and more explicit to write:

try:
  build = CommCareBuild.get_build(version, latest=True)
except KeyError:
  return None
return build.time if build and build.time else None

because it scopes the exception block to just the code that can raise the exception. I also prefer to return an explicit value, rather than letting it implicitly return nothing.
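
To illustrate why the narrower block matters, here is a small self-contained sketch. `Build`, `get_build`, `BUILDS`, and `METADATA` are hypothetical stand-ins, not the real `CommCareBuild` API; the point is only that a wide `try` also swallows a `KeyError` raised by unrelated code.

```python
class Build:
    def __init__(self, time):
        self.time = time


BUILDS = {"2.53.1": Build("2024-12-01T00:00:00")}
METADATA = {}  # deliberately empty, to simulate an unrelated bug


def get_build(version):
    # Hypothetical lookup that raises KeyError for unknown versions.
    return BUILDS[version]


def get_build_time_wide(version):
    try:
        build = get_build(version)
        METADATA[version]  # unrelated KeyError, silently swallowed below
        return build.time
    except KeyError:
        return None


def get_build_time_narrow(version):
    try:
        build = get_build(version)
    except KeyError:
        return None
    # Code here runs outside the try, so an unexpected KeyError would surface.
    return build.time if build and build.time else None
```

With these definitions, `get_build_time_wide("2.53.1")` returns `None` even though the build exists, while `get_build_time_narrow("2.53.1")` returns the build time.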



def is_out_of_date(version_in_use, latest_version):
    version_in_use_tuple = _parse_version(version_in_use)
    latest_version_tuple = _parse_version(latest_version)
    if not version_in_use_tuple or not latest_version_tuple:
        return False
    return version_in_use_tuple < latest_version_tuple


def _parse_version(version_str):
    """Convert version string to comparable tuple"""
    if version_str:
        try:
            return tuple(int(n) for n in version_str.split('.'))
        except (ValueError, AttributeError):
            return None
    return None
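
A short illustration of why the version strings are converted to tuples of integers before comparing. This is a standalone sketch: the helper `parse` mirrors the logic in the diff but is not imported from it.

```python
def parse(version_str):
    return tuple(int(n) for n in version_str.split('.'))


# String comparison is character by character, so "2.10.0" sorts before "2.9.0".
print("2.9.0" < "2.10.0")                # False
# Integer tuples compare numerically, component by component.
print(parse("2.9.0") < parse("2.10.0"))  # True
print(parse("6") < parse("7"))           # True
```

One consequence of tuple comparison worth keeping in mind: a shorter tuple that is a prefix of a longer one compares as smaller, so `(2, 53)` is less than `(2, 53, 0)`.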
2 changes: 2 additions & 0 deletions corehq/apps/enterprise/api/api.py
@@ -1,6 +1,7 @@
from tastypie.api import Api

from corehq.apps.enterprise.api.resources import (
    CommCareVersionComplianceResource,
    DomainResource,
    FormSubmissionResource,
    MobileUserResource,
@@ -15,4 +16,5 @@
v1_api.register(MobileUserResource())
v1_api.register(FormSubmissionResource())
v1_api.register(ODataFeedResource())
v1_api.register(CommCareVersionComplianceResource())
v1_api.register(SMSResource())
58 changes: 25 additions & 33 deletions corehq/apps/enterprise/api/resources.py
@@ -194,7 +194,12 @@ def get_object_list(self, request):
response=http.HttpTooManyRequests(headers={'Retry-After': self.RETRY_IN_PROGRESS_DELAY}))

     def get_report_task(self, request):
-        raise NotImplementedError()
+        account = BillingAccount.get_account_by_domain(request.domain)
+        return generate_enterprise_report.s(
+            self.REPORT_SLUG,
+            account.id,
+            request.couch_user.username,
+        )

     def _add_query_id_to_request(self, request, query_id):
         if 'report' not in request.GET:
@@ -214,14 +219,6 @@ class DomainResource(ODataEnterpriseReportResource):

     REPORT_SLUG = EnterpriseReport.DOMAINS

-    def get_report_task(self, request):
-        account = BillingAccount.get_account_by_domain(request.domain)
-        return generate_enterprise_report.s(
-            self.REPORT_SLUG,
-            account.id,
-            request.couch_user.username,
-        )
-
     def dehydrate(self, bundle):
         bundle.data['domain'] = bundle.obj[6]
         bundle.data['created_on'] = self.convert_datetime(bundle.obj[0])
@@ -248,14 +245,6 @@ class WebUserResource(ODataEnterpriseReportResource):

     REPORT_SLUG = EnterpriseReport.WEB_USERS

-    def get_report_task(self, request):
-        account = BillingAccount.get_account_by_domain(request.domain)
-        return generate_enterprise_report.s(
-            self.REPORT_SLUG,
-            account.id,
-            request.couch_user.username,
-        )
-
     def dehydrate(self, bundle):
         bundle.data['email'] = bundle.obj[0]
         bundle.data['name'] = self.convert_not_available(bundle.obj[1])
@@ -289,14 +278,6 @@ class MobileUserResource(ODataEnterpriseReportResource):

     REPORT_SLUG = EnterpriseReport.MOBILE_USERS

-    def get_report_task(self, request):
-        account = BillingAccount.get_account_by_domain(request.domain)
-        return generate_enterprise_report.s(
-            self.REPORT_SLUG,
-            account.id,
-            request.couch_user.username,
-        )
-
     def dehydrate(self, bundle):
         bundle.data['username'] = bundle.obj[0]
         bundle.data['name'] = bundle.obj[1]
@@ -373,14 +354,6 @@ class ODataFeedResource(ODataEnterpriseReportResource):

     REPORT_SLUG = EnterpriseReport.ODATA_FEEDS

-    def get_report_task(self, request):
-        account = BillingAccount.get_account_by_domain(request.domain)
-        return generate_enterprise_report.s(
-            self.REPORT_SLUG,
-            account.id,
-            request.couch_user.username,
-        )
-
     def dehydrate(self, bundle):
         bundle.data['num_feeds_used'] = bundle.obj[0]
         bundle.data['num_feeds_available'] = bundle.obj[1]
@@ -433,3 +406,22 @@ def dehydrate(self, bundle):

     def get_primary_keys(self):
         return ('form_id', 'submitted',)
+
+
+class CommCareVersionComplianceResource(ODataEnterpriseReportResource):
+    mobile_worker = fields.CharField()
+    project_space = fields.CharField()
+    latest_version_available_at_submission = fields.CharField()
+    version_in_use = fields.CharField()
+
+    REPORT_SLUG = EnterpriseReport.COMMCARE_VERSION_COMPLIANCE
+
+    def dehydrate(self, bundle):
+        bundle.data['mobile_worker'] = bundle.obj[0]
+        bundle.data['project_space'] = bundle.obj[1]
+        bundle.data['latest_version_available_at_submission'] = bundle.obj[2]
+        bundle.data['version_in_use'] = bundle.obj[3]
+        return bundle
+
+    def get_primary_keys(self):
+        return ('mobile_worker', 'project_space',)
126 changes: 123 additions & 3 deletions corehq/apps/enterprise/enterprise.py
@@ -2,26 +2,38 @@
import re
from django.db.models import Count
from datetime import datetime, timedelta
from functools import partial
from gevent.pool import Pool

from django.conf import settings
from dimagi.utils.chunked import chunked
from dimagi.utils.parsing import ISO_DATETIME_FORMAT
from django.utils.translation import gettext as _
from django.utils.translation import gettext_lazy
from django.conf import settings

from memoized import memoized

from couchforms.analytics import get_last_form_submission_received
from dimagi.utils.dates import DateSpan

from corehq.apps.enterprise.exceptions import EnterpriseReportError, TooMuchRequestedDataError
from corehq.apps.enterprise.iterators import raise_after_max_elements
from corehq.apps.accounting.models import BillingAccount
from corehq.apps.accounting.utils import get_default_domain_url
from corehq.apps.app_manager.dbaccessors import get_brief_apps_in_domain
from corehq.apps.builds.utils import get_latest_version_at_time, is_out_of_date
from corehq.apps.builds.models import CommCareBuildConfig
from corehq.apps.domain.calculations import sms_in_last
from corehq.apps.domain.models import Domain
from corehq.apps.enterprise.exceptions import (
    EnterpriseReportError,
    TooMuchRequestedDataError,
)
from corehq.apps.enterprise.iterators import raise_after_max_elements
from corehq.apps.es import forms as form_es
from corehq.apps.es.users import UserES
from corehq.apps.export.dbaccessors import ODataExportFetcher
from corehq.apps.reports.util import (
    get_commcare_version_and_date_from_last_usage,
)
from corehq.apps.sms.models import SMS, OUTGOING, INCOMING
from corehq.apps.users.dbaccessors import (
    get_all_user_rows,
@@ -30,13 +42,16 @@
)
from corehq.apps.users.models import CouchUser, Invitation

ES_CLIENT_CONNECTION_POOL_LIMIT = 10 # Matches ES client connection pool limit


class EnterpriseReport(ABC):
    DOMAINS = 'domains'
    WEB_USERS = 'web_users'
    MOBILE_USERS = 'mobile_users'
    FORM_SUBMISSIONS = 'form_submissions'
    ODATA_FEEDS = 'odata_feeds'
    COMMCARE_VERSION_COMPLIANCE = 'commcare_version_compliance'
    SMS = 'sms'

    DATE_ROW_FORMAT = '%Y/%m/%d %H:%M:%S'
@@ -81,6 +96,8 @@ def create(cls, slug, account_id, couch_user, **kwargs):
            report = EnterpriseFormReport(account, couch_user, **kwargs)
        elif slug == cls.ODATA_FEEDS:
            report = EnterpriseODataReport(account, couch_user, **kwargs)
        elif slug == cls.COMMCARE_VERSION_COMPLIANCE:
            report = EnterpriseCommCareVersionReport(account, couch_user, **kwargs)
        elif slug == cls.SMS:
            report = EnterpriseSMSReport(account, couch_user, **kwargs)

@@ -405,6 +422,94 @@ def _get_individual_export_rows(self, exports, export_line_counts):
        return rows


class EnterpriseCommCareVersionReport(EnterpriseReport):
    title = gettext_lazy('CommCare Version Compliance')
    total_description = gettext_lazy('% of Mobile Workers on the Latest CommCare Version')

    @property
    def headers(self):
        return [
            _('Mobile Worker'),
            _('Project Space'),
            _('Latest Version Available at Submission'),
            _('Version in Use'),
        ]

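    @property
    def rows(self):
        config = CommCareBuildConfig.fetch()
        # Bind config by keyword: rows_for_domain takes (domain_name, config),
        # and the pool supplies domain_name positionally.
        partial_func = partial(self.rows_for_domain, config=config)
        return _process_domains_in_parallel(partial_func, self.account.get_domains())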

    @property
    def total(self):
        total_mobile_workers = 0
        total_up_to_date = 0
        config = CommCareBuildConfig.fetch()

        def total_for_domain(domain_name):
            mobile_workers = get_mobile_user_count(domain_name, include_inactive=False)
            outdated_users = len(self.rows_for_domain(domain_name, config))
            return mobile_workers, outdated_users

        results = _process_domains_in_parallel(total_for_domain, self.account.get_domains())

        for domain_mobile_workers, outdated_users in results:
            if domain_mobile_workers:
                total_mobile_workers += domain_mobile_workers
                total_up_to_date += domain_mobile_workers - outdated_users

        return _format_percentage_for_enterprise_tile(total_up_to_date, total_mobile_workers)

    def rows_for_domain(self, domain_name, config):
        rows = []

        user_query = (UserES()
                      .domain(domain_name)
                      .mobile_users()
                      .source([
                          'username',
                          'reporting_metadata.last_submission_for_user.commcare_version',
                          'reporting_metadata.last_submission_for_user.submission_date',
                          'last_device.commcare_version',
                          'last_device.last_used'
                      ]))

        for user in user_query.run().hits:
            last_submission = user.get('reporting_metadata', {}).get('last_submission_for_user', {})
            last_device = user.get('last_device', {})

            version_in_use, date_of_use = get_commcare_version_and_date_from_last_usage(last_submission,
                                                                                        last_device)

            if not version_in_use:
                continue

            # Remove seconds and microseconds to reduce the number of unique timestamps
            # This helps with performance because get_latest_version_at_time is memoized
            if isinstance(date_of_use, str):
                date_of_use = datetime.strptime(date_of_use, ISO_DATETIME_FORMAT)
            date_of_use_minute_precision = date_of_use.replace(second=0, microsecond=0)

            latest_version_at_time_of_use = get_latest_version_at_time(config, date_of_use_minute_precision)
Comment on lines +490 to +494
Contributor

Can you explain why it is safe to remove seconds and microseconds from date_of_use? Admittedly it's an edge case, but it seems possible for us to have multiple builds that differ only by seconds or milliseconds. Removing those would then give inaccurate information.

Contributor Author

Seconds and microseconds are removed from the form submission time, not from our build time. For example, suppose we release Version 3.3 at 13:10:30 and a mobile worker submits a form using Version 3.2 at 13:10:40. After removing the seconds, the submission time becomes 13:10:00, and the latest version at 13:10:00 is still Version 3.2. So this mobile worker won't be counted as using an out-of-date CommCare version, although technically they are. I'm sacrificing a bit of accuracy in order to cache the result of get_latest_version_at_time.
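
The example in this comment can be reproduced with plain `datetime` values. The calendar date below is an arbitrary assumption; only the times of day come from the comment.

```python
from datetime import datetime

release_of_3_3 = datetime(2024, 12, 12, 13, 10, 30)  # release time from the example
submission = datetime(2024, 12, 12, 13, 10, 40)      # form submitted on Version 3.2

# Same truncation as in rows_for_domain.
truncated = submission.replace(second=0, microsecond=0)

print(release_of_3_3 <= submission)  # True: 3.3 was already available
print(release_of_3_3 <= truncated)   # False: after truncation, 3.3 looks unreleased
```

The error window is always less than one minute after a release, which is the accuracy being traded for cache hits.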

Contributor

If we choose to go forward with this approach, I'd want it documented that we're reducing the accuracy of results in order to improve performance. I'm not sure reducing the accuracy of report data for the sake of a quicker summary makes sense. However, can we avoid this entirely now? It looks like the main bottlenecks for this function were fetching the config and build time, but both of those things have now been moved into their own areas. The config is only fetched once, outside of get_latest_version_at_time, and get_build_time is now cached. So I don't think we need to do any caching on get_latest_version_at_time anymore. At least on staging, our config has 65 items. I don't think it should be expensive to iterate backward through those items until we find a build time less than our target.
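
A rough sketch of the uncached scan suggested here, assuming build times are already cheap to obtain. `MENU` and `latest_version_at` are hypothetical; the real code walks `config.menu` and calls `get_build_time`.

```python
from datetime import datetime

# Hypothetical menu, oldest first, as (version, build_time) pairs.
MENU = [
    ("3.1", datetime(2024, 10, 1, 9, 0, 0)),
    ("3.2", datetime(2024, 11, 1, 9, 0, 0)),
    ("3.3", datetime(2024, 12, 12, 13, 10, 30)),
]


def latest_version_at(menu, target_time):
    """Return the newest version whose build time is not after target_time."""
    for version, build_time in reversed(menu):
        if build_time <= target_time:
            return version
    return None
```

With a few dozen menu items, this is a short linear scan per user, and it can take the full-precision timestamp, so no truncation is needed.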


            if is_out_of_date(version_in_use, latest_version_at_time_of_use):
                rows.append([
                    user['username'],
                    domain_name,
                    latest_version_at_time_of_use,
                    version_in_use,
                ])

        return rows


def _format_percentage_for_enterprise_tile(dividend, divisor):
    if not divisor:
        return '--'
    return f"{dividend / divisor * 100:.1f}%"


class EnterpriseSMSReport(EnterpriseReport):
    title = gettext_lazy('SMS Usage')
    total_description = gettext_lazy('# of SMS Sent')
@@ -464,3 +569,18 @@ def rows_for_domain(self, domain_obj):
        num_errors = sum([result['direction_count'] for result in results if result['error']])

        return [(domain_obj.name, num_sent, num_received, num_errors), ]


def _process_domains_in_parallel(process_func, domains):
    """
    Helper method to process domains in parallel using gevent pooling.
    """
    results = []

    for chunk in chunked(domains, ES_CLIENT_CONNECTION_POOL_LIMIT):
        pool = Pool(size=ES_CLIENT_CONNECTION_POOL_LIMIT)
        chunk_results = pool.map(process_func, chunk)
        pool.join()
        results.extend(chunk_results)

    return results
Comment on lines +574 to +586
Contributor

Curious if this is worth doing. I'm assuming that ElasticSearch has a connection limit across machines, and so while we can attempt to use all of them to process a single request, and that request will now be faster, I'd imagine it comes at the expense of any other requests trying to use ElasticSearch. Specifically for computing Enterprise Tile totals, we are going to be running multiple queries in parallel already, as loading that console page triggers multiple tile total requests. If each of them tries to max out their elasticsearch connections, it seems like we're basically back to running operations in sequence, rather than in parallel?
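
To make the concern concrete, here is a toy simulation. It swaps gevent for standard-library threads purely so the sketch runs without extra dependencies, and every name in it (`BACKEND_LIMIT`, `query_domain`, `handle_request`) is hypothetical. Each simulated request fans out with its own pool of `BACKEND_LIMIT` workers, but all of them share one backend that admits only `BACKEND_LIMIT` concurrent connections.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

BACKEND_LIMIT = 10  # hypothetical cap on concurrent backend connections
backend_slots = threading.BoundedSemaphore(BACKEND_LIMIT)

stats = {"in_flight": 0, "peak": 0}
stats_lock = threading.Lock()


def query_domain(domain):
    """Stand-in for one per-domain query; holds a backend slot while it runs."""
    with backend_slots:
        with stats_lock:
            stats["in_flight"] += 1
            stats["peak"] = max(stats["peak"], stats["in_flight"])
        result = domain.upper()  # placeholder for the real query
        with stats_lock:
            stats["in_flight"] -= 1
    return result


def handle_request(domains):
    """One tile-total request with its own pool of BACKEND_LIMIT workers."""
    with ThreadPoolExecutor(max_workers=BACKEND_LIMIT) as pool:
        return list(pool.map(query_domain, domains))


def handle_concurrent_requests(request_count, domains):
    """Run several requests at once, as a page loading several tiles would."""
    with ThreadPoolExecutor(max_workers=request_count) as requests:
        futures = [requests.submit(handle_request, domains) for _ in range(request_count)]
        return [future.result() for future in futures]
```

However many requests run at once, `stats["peak"]` never exceeds `BACKEND_LIMIT`, so the extra per-request workers mostly wait on each other. Whether the real Elasticsearch client behaves this way depends on how its connection pool is shared, which this sketch does not model.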
