Refactor: collection summary api #199
Pull Request Overview
This PR refactors collection metadata retrieval from a static dictionary to dynamic API calls. The hardcoded collections dictionary (containing collection names and domains) is removed and replaced with functions that fetch collection metadata from the ArticleMeta API at runtime.
Key changes:
- Removed static `collections` dictionary from `choices.py`
- Added three new functions to dynamically fetch collection data via API: `fetch_collection_metadata()`, `get_collection_name_and_url()`, and `get_collection_summary()`
- Updated `scielo_domain` and `collection_name` properties across Issue, Journal, and Article classes to use the new API-based approach
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| xylose/choices.py | Removed static collections dictionary and added API-based collection metadata fetching with retry logic |
| xylose/scielodocument.py | Updated collection name and domain retrieval methods to use new API-based functions instead of static dictionary lookups |
        [u'Undefined: %s' % self.collection_acronym, None]
    )[1] or None
    summary = choices.get_collection_summary(self.collection_acronym)
    return summary[0] if len(summary) > 0 and summary[0] else f'Undefined: {self.collection_acronym}'
This scielo_domain method should return summary[1] (the URL) instead of summary[0] (the name). According to the method's docstring and similar implementations at lines 256 and 1074, it should retrieve the collection domain, not the name.
Suggested change:

    - return summary[0] if len(summary) > 0 and summary[0] else f'Undefined: {self.collection_acronym}'
    + return summary[1] if len(summary) > 1 and summary[1] else f'Undefined: {self.collection_acronym}'
    )[0]
    if self.collection_acronym:
        summary = choices.get_collection_summary(self.collection_acronym)
        return summary[0] if len(summary) > 0 and summary[0] else f'Undefined: {self.collection_acronym}'
Missing return statement when self.collection_acronym is None or falsy. The original code always returned a value, but this implementation returns None implicitly when the condition fails. Add an explicit return None or handle the else case.
Suggested change:

    -     return summary[0] if len(summary) > 0 and summary[0] else f'Undefined: {self.collection_acronym}'
    +     return summary[0] if len(summary) > 0 and summary[0] else f'Undefined: {self.collection_acronym}'
    + return None
    session.mount(base_url, adapter)

    try:
        response = session.get(full_url, timeout=10)
The function creates a new session with retry configuration on every call. Consider implementing a module-level session or caching mechanism to avoid the overhead of creating a new session for each collection lookup, especially if multiple collections are queried in succession.
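One way to act on this comment is to build the session once and reuse it. The sketch below is a minimal illustration, not the PR's code: the names `_SESSION`, `get_session`, and `BASE_URL`, the placeholder URL, and the retry settings are all assumptions chosen for the example.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Hypothetical placeholder; the real ArticleMeta base URL is not shown in this review.
BASE_URL = "https://articlemeta.example"

_SESSION = None  # module-level session, created lazily on first use


def get_session(base_url=BASE_URL, retries=3, backoff_factor=0.5):
    """Return a shared requests.Session with retry configured (assumed settings)."""
    global _SESSION
    if _SESSION is None:
        retry = Retry(
            total=retries,
            backoff_factor=backoff_factor,
            status_forcelist=(500, 502, 503, 504),
        )
        session = requests.Session()
        # Only URLs starting with base_url use the retrying adapter.
        session.mount(base_url, HTTPAdapter(max_retries=retry))
        _SESSION = session
    return _SESSION
```

With this shape, `fetch_collection_metadata()` could call `get_session().get(full_url, timeout=10)` instead of constructing a session per lookup. The lazy initialization here is not guarded by a lock, so a multi-threaded caller might want to create the session at import time instead.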
    metadata = fetch_collection_metadata(collection_code)
    return get_collection_name_and_url(metadata) if metadata else []
Consider adding caching for collection metadata to avoid repeated API calls for the same collection code. The static dictionary was effectively a cache; removing it without replacement could significantly impact performance if the same collection is queried multiple times.
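A minimal sketch of such a cache, using `functools.lru_cache` from the standard library. The `fetch_collection_metadata` below is a stand-in that simulates the API with made-up sample data; the `CALLS` list, the `"name"` and `"domain"` keys, and the sample values are assumptions for illustration only.

```python
from functools import lru_cache

CALLS = []  # records each simulated API hit, for demonstration only


def fetch_collection_metadata(collection_code):
    """Hypothetical stand-in for the PR's network call; returns a dict or None."""
    CALLS.append(collection_code)
    fake_api = {"abc": {"name": "Example Collection", "domain": "www.example.org"}}
    return fake_api.get(collection_code)


@lru_cache(maxsize=128)
def get_collection_summary(collection_code):
    """Return (name, domain) for a collection, or an empty tuple if unknown."""
    metadata = fetch_collection_metadata(collection_code)
    # A tuple is returned so callers cannot mutate the cached value.
    return (metadata["name"], metadata["domain"]) if metadata else ()


first = get_collection_summary("abc")
second = get_collection_summary("abc")  # served from the cache, no second fetch
```

Note that `lru_cache` also remembers the empty result, so a transient API failure would stay cached until `get_collection_summary.cache_clear()` is called. Caching only successful lookups, or adding an expiry, may be preferable in a long-running process.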
    except (requests.RequestException, ValueError):
        return None
Error handling silently suppresses all exceptions without logging. Consider logging failures to help diagnose API connectivity issues or data parsing problems in production environments.
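The logging this comment asks for could look roughly like the sketch below. To keep the example free of network calls, the fetch step is passed in as a callable; `fetch_with_logging` and `fetch_json` are hypothetical names, and catching `OSError` stands in for `requests.RequestException`, which is a subclass of it.

```python
import logging

logger = logging.getLogger(__name__)


def fetch_with_logging(collection_code, fetch_json):
    """Call fetch_json(collection_code); log and return None on failure.

    fetch_json is a hypothetical callable that performs the HTTP request
    and JSON decoding, and may raise OSError or ValueError.
    """
    try:
        return fetch_json(collection_code)
    except (OSError, ValueError):
        # exc_info=True attaches the traceback so the cause is visible in logs.
        logger.warning(
            "Could not fetch metadata for collection %r",
            collection_code,
            exc_info=True,
        )
        return None
```

The return value is unchanged from the reviewed code (`None` on failure), so callers are unaffected; the only difference is that the failure leaves a trace.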
What does this PR do?
Refactors the use of the local `choices.collections` dictionary, replacing it with calls to the `get_collection_summary` function, which queries the ArticleMeta API directly to obtain the collection's name and URL.
Where could the review start?
The review can start with the places where `choices.collections.get(...)` was used, typically in methods that return the collection name or URL based on `self.collection_acronym`. The changed call sites now use the `get_collection_summary` function, defined in the utility module `xylose/choices.py`.
How could this be tested manually?
Is there any context you would like to provide?
NA.
Screenshots
NA.
What are the relevant tickets?
TK #198
References
NA.