
Commit 68dc536

Reduce database queries and connections (#1410)

* config: adjust Celery time limits and concurrency via environment variables
  - CELERY_TASK_TIME_LIMIT: hard limit configurable via env (default 36000s)
  - CELERY_TASK_SOFT_TIME_LIMIT: soft limit configurable via env (default 3600s)
  - Add CELERY_WORKER_MAX_TASKS_PER_CHILD for worker recycling (default 100)
  - Add CELERY_WORKER_CONCURRENCY configurable via env (default 4)
  - Comment out CELERY_RESULT_BACKEND='django-db'; results already persist via Redis

* config: optimize database and session settings for production
  - Simplify CONN_MAX_AGE to a fixed 60s via env (removes the duplicated variable)
  - Remove POOL_OPTIONS (incompatible with the django-prometheus backend)
  - Move SESSION_ENGINE to the Redis cache (SESSION_ENGINE + SESSION_CACHE_ALIAS)
  - Reduce PostgreSQL writes on every request with SESSION_SAVE_EVERY_REQUEST=True

* core: fix _get_user to use the authenticated request.user when available
  - Prefer request.user.is_authenticated before querying the database
  - Fall back to user_id and username when the request has no authenticated user
  - Remove the incorrect access to request.user_id (previously a silent AttributeError)

* core: avoid double queryset evaluation in UserCollectionMiddleware
  - Assign request.user.collection.all() to a local variable before reusing it
  - Eliminates the duplicated second query for set_current_collections and request.user_collection

* core: add select_related('creator') to the LicenseViewSet queryset
  - Avoids N+1 queries when listing licenses in the Wagtail admin

* location: avoid unnecessary saves in create_or_update by comparing values before persisting
  - State.create_or_update: save only if name or acronym changed
  - CountryName.create_or_update: save only if country, language, or text changed
  - Country.create_or_update: save only if name, acronym, or acron3 changed
  - CountryName.get_country: replace the loop with .filter(country__isnull=False).select_related().first()

* location: move the User and user_id lookup out of the loop in bulk_cities
  - User.objects.get(id=user_id) is executed once before iterating over the CSV
  - Eliminates N redundant database queries during bulk import

* location: add select_related to the LocationAdmin queryset
  - Includes country, state, city, and creator to avoid N+1 in the Wagtail admin

* institution: avoid unnecessary saves in create_or_update by comparing values before persisting
  - Institution.create_or_update: save only if institution_type or url changed
  - InstitutionIdentification.create_or_update: save only if is_official or official changed
  - Ensures updated_by is only written when there is an actual change

* organization: refactor update_logo/update_url to return bool and consolidate saves
  - update_logo and update_url return True/False without calling save() internally
  - create_or_update calls save() once if logo or url changed
  - update_institutions returns bool; create_or_update calls save() only if something changed
  - Organization.update_institutions drops its internal save(); responsibility delegated to the caller

* organization: add select_related for institution_identification__official in the migration task
  - Avoids an extra query when accessing official while migrating Institution to Organization

* journal: optimize queries and replace get/except with update_or_create in JournalHistory
  - OfficialJournal.add_old_title/add_new_title: replace the loop with .filter().first()
  - add_old_title: remove self.save() after the M2M .add() (unnecessary)
  - add_new_title: remove self.save(); responsibility delegated to the caller
  - Journal.collection_acrons: replace the loop with values_list plus select_related
  - Journal.get_ids / get_legacy_keys: add select_related('collection') to the filters
  - JournalHistory.load: replace the double get/except+save with update_or_create for ADMITTED and INTERRUPTED

* issue: add select_related for creator and updated_by to the IssueAdminSnippetViewSet queryset
  - Avoids N+1 queries when listing issues in the Wagtail admin

* researcher: replace loops with values_list().first() in the orcid and lattes properties
  - Eliminates unnecessary iteration; returns the first identifier found directly
  - Removes access to the related object's attribute; uses projection via values_list

* researcher: add select_related and prefetch_related to the migration query
  - Adds select_related('affiliation__institution__location')
  - Adds prefetch_related('researcheraka_set__researcher_identifier')
  - Reduces N+1 queries when accessing location and identifiers during migration

* pid_provider: consolidate saves and add select_related to avoid extra queries
  - XMLVersion.get_or_create: document the 3-save flow (pk → file → consolidation)
  - PidProviderXML.public_items: add select_related('current_version')
  - PidProviderXML._add_current_version: remove the internal save(); delegated to the caller
  - PidProviderXML._add_other_pid: remove the internal save(); delegated to the caller
  - PidProviderXML._save: add a consolidated save() after _add_current_version/_add_other_pid
  - best_matches: add select_related('current_version') to the iterator

* pid_provider: remove the local _get_user and import it from core.utils.utils
  - Eliminates the duplicated _get_user in pid_provider/tasks.py
  - Imports the centralized _get_user from core.utils.utils

* tracker: remove the duplicated local _get_user
  - Eliminates the local definition of _get_user in tracker/tasks.py
  - The centralized function is already available in core.utils.utils

* bigbang: remove the local _get_user and import it from core.utils.utils
  - Eliminates the local definition of _get_user
  - Updates calls to the signature _get_user(request=None, user_id=..., username=...)
  - Imports the centralized _get_user from core.utils.utils

* collection: fix the if/elif logic and add a ValueError in task_load_collections
  - Replaces if/if with if/elif to avoid silently reassigning user
  - Raises an explicit ValueError when neither user_id nor username is provided

* editorialboard: extract request.user into a local variable in import_file_ebm
  - Assigns request.user to user before the loop to avoid repeated access to the request object
  - Standardizes the use of user in the create_or_update calls inside the loop

* thematic_areas: add select_related('creator') to the SnippetViewSet querysets
  - GenericThematicAreaAdmin, GenericThematicAreaFileAdmin
  - ThematicAreaAdmin, ThematicAreaFileAdmin
  - Avoids N+1 queries when listing records in the Wagtail admin

* article: add select_related to the iterator in bulk_export_articles_to_articlemeta
  - Includes journal, journal__official, and pp_xml to avoid N+1 during bulk export

* Potential fix for pull request finding

* Potential fix for pull request finding

* Apply suggestions from code review

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
1 parent e46b362 commit 68dc536

File tree

23 files changed: +220 −175 lines changed

article/controller.py

Lines changed: 1 addition & 1 deletion
@@ -345,7 +345,7 @@ def bulk_export_articles_to_articlemeta(
         )
         return False

-    for article in queryset.iterator():
+    for article in queryset.select_related("journal", "journal__official", "pp_xml").iterator():
         try:
             if force_update:
                 article.check_availability(user)

bigbang/tasks.py

Lines changed: 3 additions & 9 deletions
@@ -8,6 +8,7 @@
 from collection.models import Collection
 from config import celery_app
 from core.models import Gender, Language, License
+from core.utils.utils import _get_user
 from editorialboard.models import RoleModel
 from institution.models import Institution, InstitutionType
 from journal.models import (
@@ -26,21 +27,14 @@
 User = get_user_model()


-def _get_user(user_id, username):
-    if user_id:
-        return User.objects.get(pk=user_id)
-    if username:
-        return User.objects.get(username=username)
-
-
 @celery_app.task(bind=True)
 def task_start(
     self,
     user_id=None,
     username=None,
 ):
     try:
-        user = _get_user(user_id, username)
+        user = _get_user(request=None, user_id=user_id, username=username)
         Language.load(user)
         Collection.load(user)
         Vocabulary.load(user)
@@ -73,7 +67,7 @@ def task_start(
 @celery_app.task(bind=True)
 def task_create_tasks(self, user_id=None, username=None, tasks_data=None):
     if not tasks_data:
-        user = _get_user(user_id, username)
+        user = _get_user(request=None, user_id=user_id, username=username)
         return schedule_tasks(user.username)
     for task_data in tasks_data:
         # {

collection/tasks.py

Lines changed: 3 additions & 1 deletion
@@ -11,6 +11,8 @@
 def task_load_collections(self, user_id=None, username=None):
     if user_id:
         user = User.objects.get(pk=user_id)
-    if username:
+    elif username:
         user = User.objects.get(username=username)
+    else:
+        raise ValueError("user_id or username is required")
     Collection.load(user)
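The if/if bug above is subtle: with both arguments given, the second `if` silently overwrote the user found by ID; with neither given, `user` was unbound and crashed later at `Collection.load(user)`. A minimal plain-Python sketch of the fixed control flow (stub dictionary lookups stand in for Django's `User.objects.get`; names here are illustrative):

```python
# Stub user store standing in for the database (hypothetical data).
USERS = {"id:1": "admin", "name:ana": "ana"}

def resolve_user(user_id=None, username=None):
    if user_id:
        user = USERS[f"id:{user_id}"]
    elif username:  # elif: username no longer overwrites a user already found by id
        user = USERS[f"name:{username}"]
    else:
        # Explicit error instead of an UnboundLocalError further down.
        raise ValueError("user_id or username is required")
    return user

assert resolve_user(user_id=1) == "admin"
assert resolve_user(user_id=1, username="ana") == "admin"  # id wins, as intended
```

With the old if/if version, the second assertion would have returned `"ana"` instead.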

config/settings/base.py

Lines changed: 19 additions & 6 deletions
@@ -374,11 +374,21 @@
 # http://docs.celeryproject.org/en/latest/userguide/configuration.html#std:setting-result_serializer
 CELERY_RESULT_SERIALIZER = "json"
 # http://docs.celeryproject.org/en/latest/userguide/configuration.html#task-time-limit
-# TODO: set to whatever value is adequate in your circumstances
-CELERY_TASK_TIME_LIMIT = 5 * 60
+# Hard time limit: the worker is terminated (SIGKILL) after this long.
+# Must be GREATER than the soft time limit.
+CELERY_TASK_TIME_LIMIT = env.int('CELERY_TASK_TIME_LIMIT', default=36000)
 # http://docs.celeryproject.org/en/latest/userguide/configuration.html#task-soft-time-limit
-# TODO: set to whatever value is adequate in your circumstances
-CELERY_TASK_SOFT_TIME_LIMIT = 36000
+# Soft time limit: raises SoftTimeLimitExceeded, allowing cleanup.
+# Must be LESS than the hard time limit.
+CELERY_TASK_SOFT_TIME_LIMIT = env.int('CELERY_TASK_SOFT_TIME_LIMIT', default=3600)
+
+# Recycle a worker after N tasks to release memory and database connections.
+# Each worker child is replaced after processing this many tasks.
+CELERY_WORKER_MAX_TASKS_PER_CHILD = env.int('CELERY_WORKER_MAX_TASKS_PER_CHILD', default=100)
+
+# Limit Celery worker concurrency.
+# Controls how many tasks each worker processes simultaneously.
+CELERY_WORKER_CONCURRENCY = env.int('CELERY_WORKER_CONCURRENCY', default=4)
 # http://docs.celeryproject.org/en/latest/userguide/configuration.html#beat-scheduler
 CELERY_BEAT_SCHEDULER = "django_celery_beat.schedulers:DatabaseScheduler"
 # http://docs.celeryproject.org/en/latest/userguide/configuration.html
@@ -404,8 +414,11 @@
 RUN_ASYNC = env.bool('RUN_ASYNC', default=0)
 # Celery Results
 # ------------------------------------------------------------------------------
-# https: // django-celery-results.readthedocs.io/en/latest/getting_started.html
-CELERY_RESULT_BACKEND = "django-db"
+# https://django-celery-results.readthedocs.io/en/latest/getting_started.html
+# NOTE: do not use "django-db" as the result backend in production.
+# The result backend is already configured via Redis (CELERY_BROKER_URL) above.
+# Keeping "django-db" here would add extra PostgreSQL writes for every completed task.
+# CELERY_RESULT_BACKEND = "django-db"  # REMOVED: Redis is already used above
 CELERY_CACHE_BACKEND = "django-cache"
 CELERY_RESULT_EXTENDED = True

config/settings/production.py

Lines changed: 8 additions & 9 deletions
@@ -14,19 +14,13 @@
 # ------------------------------------------------------------------------------
 DATABASES["default"] = env.db("DATABASE_URL")  # noqa F405
 DATABASES["default"]["ATOMIC_REQUESTS"] = True  # noqa F405
-DATABASES["default"]["CONN_MAX_AGE"] = env.int("CONN_MAX_AGE", default=0) or env.int("DJANGO_CONN_MAX_AGE", default=60)  # noqa F405
+# Reuse connections for 60s by default to reduce reconnection overhead.
+# In production with gunicorn+gevent or Celery, this avoids opening/closing a connection on every request.
+DATABASES["default"]["CONN_MAX_AGE"] = env.int("CONN_MAX_AGE", default=60)  # noqa F405
 DATABASES["default"]["CONN_HEALTH_CHECKS"] = env.bool('DJANGO_CONN_HEALTH_CHECKS', True)
 DATABASES["default"]["ENGINE"] = 'django_prometheus.db.backends.postgresql'
-# Improvement: using environment variables for OPTIONS and POOL_OPTIONS with defaults
 DATABASES["default"]["OPTIONS"] = {
     "connect_timeout": env.int("DB_CONNECT_TIMEOUT", default=10),
-    # Add other connection options here if needed
-}
-DATABASES["default"]["POOL_OPTIONS"] = {
-    'POOL_SIZE': env.int("DB_POOL_SIZE", default=10),
-    'MAX_OVERFLOW': env.int("DB_MAX_OVERFLOW", default=20),
-    'RECYCLE': env.int("DB_RECYCLE", default=300),
-    # Add other pool options here if needed
 }
 # CACHES
 # ------------------------------------------------------------------------------
@@ -82,6 +76,11 @@
 # Ex: export DJANGO_SESSION_SAVE_EVERY_REQUEST=False
 SESSION_SAVE_EVERY_REQUEST = env.bool('DJANGO_SESSION_SAVE_EVERY_REQUEST', True)

+# Sessions stored in Redis instead of PostgreSQL.
+# Reduces database writes on every request when SESSION_SAVE_EVERY_REQUEST=True.
+SESSION_ENGINE = "django.contrib.sessions.backends.cache"
+SESSION_CACHE_ALIAS = "default"
+
 # STATIC
 # ------------------------
 STATICFILES_STORAGE = "whitenoise.storage.CompressedManifestStaticFilesStorage"
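One trade-off worth noting for the session change above: the pure `cache` backend is the fastest, but sessions are lost if Redis is flushed or entries are evicted. Django also ships a write-through variant. A sketch of the two options (settings fragment; assumes `CACHES["default"]` is the Redis cache, as in this project):

```python
# Option used in this commit: sessions live only in the cache (Redis).
# Fastest, zero PostgreSQL writes per request, but sessions vanish on cache eviction.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "default"

# Alternative if session loss is unacceptable: write-through to the database,
# read from the cache. More durable, at the cost of keeping some DB writes.
# SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"
```

With logged-in users whose sessions must survive a Redis restart, `cached_db` is the safer choice; for this codebase the commit opted for the pure cache backend.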

core/middleware.py

Lines changed: 3 additions & 2 deletions
@@ -15,9 +15,10 @@ def __call__(self, request):

         if request.user.is_authenticated:
             set_current_user(request.user)
-            set_current_collections(request.user.collection.all())
+            collections = request.user.collection.all()
+            set_current_collections(collections)

-            request.user_collection = request.user.collection.all()
+            request.user_collection = collections
         else:
             set_current_user(None)
             set_current_collections(None)
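Each call to `request.user.collection.all()` builds a fresh queryset, so the old middleware evaluated (and queried) it twice per request. A toy stand-in (plain Python, not Django) makes the saving countable:

```python
# Toy manager whose .all() runs one "query" per call, mimicking how each
# request.user.collection.all() builds and evaluates a fresh queryset.
class CollectionManager:
    def __init__(self):
        self.queries = 0

    def all(self):
        self.queries += 1
        return ["scl", "arg"]  # illustrative collection acronyms

# Before: two separate .all() calls -> two evaluations, two queries.
before = CollectionManager()
set_current = before.all()
user_collection = before.all()
assert before.queries == 2

# After: evaluate once, reuse the local variable for both consumers.
after = CollectionManager()
collections = after.all()
set_current = collections
user_collection = collections
assert after.queries == 1
```

In real Django the same holds because the two `.all()` calls produce two independent querysets, each with its own result cache; reusing one queryset object lets the second consumer hit the populated cache.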

core/utils/utils.py

Lines changed: 7 additions & 5 deletions
@@ -91,12 +91,14 @@ def fetch_data(url, headers=None, json=False, timeout=FETCH_DATA_TIMEOUT, verify

 def _get_user(request, username=None, user_id=None):
     try:
-        return User.objects.get(pk=request.user_id)
+        if request.user.is_authenticated:
+            return request.user
     except AttributeError:
-        if user_id:
-            return User.objects.get(pk=user_id)
-        if username:
-            return User.objects.get(username=username)
+        pass
+    if user_id:
+        return User.objects.get(pk=user_id)
+    if username:
+        return User.objects.get(username=username)


 def formated_date_api_params(query_params):

core/wagtail_hooks.py

Lines changed: 3 additions & 0 deletions
@@ -175,3 +175,6 @@ class LicenseViewSet(SnippetViewSet):
     search_fields = ("license_type", "version")
     list_export = ("license_type", "version")
     inspect_view_enabled = True
+
+    def get_queryset(self, request):
+        return super().get_queryset(request).select_related("creator")
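This `select_related("creator")` is the standard cure for the N+1 pattern: without it, rendering the admin list runs one query for the licenses plus one per row to fetch each creator. A toy model of the two strategies (plain Python, not Django; names are illustrative) makes the difference countable:

```python
# Toy "database" counting queries, to contrast per-row fetches with a joined fetch.
class FakeDB:
    def __init__(self):
        self.queries = 0
        self.creators = {1: "ana", 2: "bia"}

    def fetch_creator(self, creator_id):
        self.queries += 1          # one query per row: the N+1 pattern
        return self.creators[creator_id]

def list_licenses_naive(db, rows):
    """Lazy per-row access, like iterating a queryset without select_related."""
    return [(name, db.fetch_creator(cid)) for name, cid in rows]

def list_licenses_joined(db, rows):
    """Single joined fetch, like select_related('creator'): creators arrive with the rows."""
    db.queries += 1
    return [(name, db.creators[cid]) for name, cid in rows]

rows = [("CC-BY", 1), ("CC-BY-NC", 2), ("CC0", 1)]
db_naive, db_joined = FakeDB(), FakeDB()
list_licenses_naive(db_naive, rows)    # 3 queries: one per row
list_licenses_joined(db_joined, rows)  # 1 query total
assert db_naive.queries == 3
assert db_joined.queries == 1
```

The same reasoning motivates every other `select_related` added in this commit (LocationAdmin, IssueAdminSnippetViewSet, thematic_areas, the article export iterator): the cost of the naive version grows linearly with the row count.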

editorialboard/views.py

Lines changed: 5 additions & 4 deletions
@@ -71,15 +71,16 @@ def import_file_ebm(request):
     file_path = file_upload.attachment.file.path

     try:
+        user = request.user
         with open(file_path, "r") as csvfile:
             data = csv.DictReader(csvfile, delimiter=";")
             for line, row in enumerate(data):
                 given_names = row.get("Nome do membro")
                 last_name = row.get("Sobrenome")
                 journal = Journal.objects.get(title__icontains=row.get("Periódico"))
-                gender = Gender.create_or_update(user=request.user, code=row.get("Gender"), gender="F")
+                gender = Gender.create_or_update(user=user, code=row.get("Gender"), gender="F")
                 location = Location.create_or_update(
-                    user=request.user,
+                    user=user,
                     city_name=row.get("institution_city_name"),
                     state_text=row.get("institution_state_text"),
                     state_acronym=row.get("institution_state_acronym"),
@@ -89,7 +90,7 @@ def import_file_ebm(request):
                     country_name=row.get("institution_country_name"),
                 )
                 researcher = Researcher.create_or_update(
-                    user=request.user,
+                    user=user,
                     given_names=given_names,
                     last_name=last_name,
                     suffix=row.get("Suffix"),
@@ -104,7 +105,7 @@ def import_file_ebm(request):
                     aff_name=row.get("Instituição"),
                 )
                 EditorialBoardMember.create_or_update(
-                    user=user,
+                    user=user,
                     researcher=researcher,
                     journal=journal,
                     declared_role=row["Cargo / instância do membro"],

institution/models.py

Lines changed: 20 additions & 8 deletions
@@ -201,10 +201,16 @@ def create_or_update(
                 level_3=level_3,
                 location=location,
             )
-            institution.updated_by = user
-            institution.institution_type = institution_type or institution.institution_type
-            institution.url = url or institution.url
-            institution.save()
+            changed = False
+            if institution_type and institution_type != institution.institution_type:
+                institution.institution_type = institution_type
+                changed = True
+            if url and url != institution.url:
+                institution.url = url
+                changed = True
+            if changed:
+                institution.updated_by = user
+                institution.save()
             return institution
         except cls.DoesNotExist:
             return cls._create(
@@ -968,10 +974,16 @@ def create_or_update(

         try:
             obj = cls._get(name=name, acronym=acronym)
-            obj.updated_by = user
-            obj.is_official = is_official or obj.is_official
-            obj.official = official or obj.official
-            obj.save()
+            changed = False
+            if is_official is not None and is_official != obj.is_official:
+                obj.is_official = is_official
+                changed = True
+            if official and official != obj.official:
+                obj.official = official
+                changed = True
+            if changed:
+                obj.updated_by = user
+                obj.save()
             return obj
         except cls.DoesNotExist:
             return cls._create(
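The compare-before-save pattern used here (and in the location models) is simple enough to demonstrate standalone. In this plain-Python sketch (not the Django models; `saves` stands in for actual database writes), an update with identical values performs zero writes, and a real change performs exactly one consolidated write:

```python
# Minimal sketch of the compare-before-save pattern from this commit.
class FakeInstitution:
    def __init__(self, institution_type=None, url=None):
        self.institution_type = institution_type
        self.url = url
        self.updated_by = None
        self.saves = 0  # counts database writes

    def save(self):
        self.saves += 1

def create_or_update(inst, user, institution_type=None, url=None):
    changed = False
    if institution_type and institution_type != inst.institution_type:
        inst.institution_type = institution_type
        changed = True
    if url and url != inst.url:
        inst.url = url
        changed = True
    if changed:
        inst.updated_by = user  # updated_by only written on an actual change
        inst.save()             # one consolidated save, not one per field
    return inst

inst = FakeInstitution(institution_type="university", url="https://example.org")
create_or_update(inst, "editor", institution_type="university", url="https://example.org")
assert inst.saves == 0  # identical values: no write at all
create_or_update(inst, "editor", url="https://example.net")
assert inst.saves == 1  # one change: exactly one write
```

For repeated bulk imports, where most rows already match the stored values, this turns the common case from one UPDATE per row into zero.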
