Update the list of document types with @article-type values, DOCTOPIC, and new attributes #203
… Schema Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
xylose/choices.py
Outdated
| 'sc': 'rapid-communication', | ||
| 'tr': 'research-article', | ||
| 'up': 'undefined' | ||
| 'up': 'undefined', |
@copilot also consider this table when checking the values
ab | abstracts
an | announcements
ax | annex
co | comments
cr | case report
ct | clinical trial
ed | editorial
er | correction
in | interview
le | letter
mt | methodology
oa | original article
pr | press release
pv | point-of-view
ra | review article
rc | recount
rn | research note
sc | brief communication
tr | technical report
up | update
Cross-referenced the legacy codes table with the issue's @article-type mapping table and made the following corrections in fb7eda0:
Legacy code mapping fixes:
- in (interview): editorial → other
- mt (methodology): research-article → review-article
- sc (brief communication): rapid-communication → brief-report
Added missing @article-type identity mappings:
clinical-instruction, discussion, expression-of-concern, obituary, oration, reviewer-report
…iewer feedback Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
xylose/choices.py
Outdated
| 'ed': 'editorial', | ||
| 'er': 'correction', | ||
| 'in': 'editorial', | ||
| 'in': 'other', |
@copilot in corresponds to interview, not other; mt is methodology.
Fixed in 776740c: 'in': 'interview' and 'mt': 'methodology'.
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
xylose/choices.py
Outdated
| 'sc': 'brief-report', | ||
| 'tr': 'research-article', | ||
| 'up': 'undefined' | ||
| 'up': 'undefined', |
Fixed in f556ec4: 'up': 'rapid-communication'.
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
| 'ed': 'editorial', | ||
| 'er': 'correction', | ||
| 'in': 'editorial', | ||
| 'in': 'interview', |
@copilot 'in' should be other, since interview is not identified as a relevant type; and mt is equivalent to review-article
Fixed in 3135883: 'in': 'other' and 'mt': 'review-article'.
Updated in 0d3fba8 based on the DOCTOPIC dict from the updated issue: 'in': 'interview' (confirmed by 'interview': 'in' in DOCTOPIC) and 'mt': 'review-article'.
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Pull request overview
This PR extends the article_types dictionary to accept JATS @article-type values directly (mapping each value to itself) and corrects some legacy-code mappings, so that document_type no longer returns 'undefined' when v71 already contains a valid JATS value.
Changes:
- Corrects legacy-code mappings (in, mt, sc, up) in the article_types dictionary.
- Adds identity mappings for JATS @article-type values from the SciELO Publishing Schema.
- Adds a test to ensure document_type supports JATS values stored directly in v71.
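A minimal sketch of the identity-mapping idea described in this overview. It uses a small stand-in dictionary rather than the real article_types from xylose/choices.py; the names ARTICLE_TYPES_SAMPLE and resolve_document_type are assumptions made for illustration, and only the 'undefined' fallback is taken from the description above.

```python
# Stand-in for a small slice of article_types; the real dictionary is larger.
ARTICLE_TYPES_SAMPLE = {
    # Legacy v71 codes mapped to JATS @article-type values.
    'ed': 'editorial',
    'er': 'correction',
    # Identity mappings: a JATS value stored in v71 resolves to itself.
    'editorial': 'editorial',
    'correction': 'correction',
    'retraction': 'retraction',
}


def resolve_document_type(v71_value):
    """Hypothetical helper: look up a v71 value, falling back to 'undefined'."""
    return ARTICLE_TYPES_SAMPLE.get(v71_value, 'undefined')


print(resolve_document_type('er'))          # legacy code
print(resolve_document_type('retraction'))  # JATS value stored directly
print(resolve_document_type('invalid'))     # unknown value
```

With identity entries in place, a single dictionary lookup can serve both legacy codes and JATS values, which is presumably why the PR did not need a second code path.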
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| xylose/choices.py | Adjusts legacy mappings and adds identity entries for JATS @article-type values. |
| tests/test_document.py | Adds a test covering document_type resolution when v71 contains JATS values directly. |
| ]: | ||
| article.data['article']['v71'] = [{u'_': article_type}] | ||
| self.assertEqual(article.document_type, article_type) | ||
|
|
This test only covers the new identity mappings (e.g. v71='retraction' -> 'retraction') but it doesn’t assert the corrected legacy-code mappings introduced in this PR (in/mt/sc/up). Add assertions for v71 values 'in', 'mt', 'sc', and 'up' mapping to 'other', 'review-article', 'brief-report', and 'rapid-communication' respectively, so the behavior change is protected by unit tests.
| def test_document_type_from_legacy_v71_values(self): | |
| article = self.article | |
| legacy_mappings = { | |
| u'in': u'other', | |
| u'mt': u'review-article', | |
| u'sc': u'brief-report', | |
| u'up': u'rapid-communication', | |
| } | |
| for legacy_value, expected_type in legacy_mappings.items(): | |
| article.data['article']['v71'] = [{u'_': legacy_value}] | |
| self.assertEqual(article.document_type, expected_type) |
| article.data['article']['v71'] = [{u'_': u'invalid'}] | ||
| self.assertEqual(article.document_type, u'undefined') | ||
|
|
||
| def test_document_type_from_article_type_attribute(self): |
The test name mentions an "article_type attribute", but the test is setting the legacy field v71 directly. Renaming it to reflect what it actually validates (e.g., that document_type accepts JATS @article-type values stored in v71) would avoid confusion for future maintainers.
| def test_document_type_from_article_type_attribute(self): | |
| def test_document_type_accepts_jats_article_type_values_from_v71(self): |
|
@copilot read the issue description again, since it has new details, and refactor the changes
…doctype properties Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Refactored in
|
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| 'addendum': 'addendum', | ||
| 'book-review': 'rc', | ||
| 'books-received': 'books-received', | ||
| 'brief-report': 'rn', | ||
| 'calendar': 'calendar', | ||
| 'clinical-trial': 'oa', | ||
| 'collection': 'zz', | ||
| 'correction': 'er', | ||
| 'discussion': 'discussion', | ||
| 'dissertation': 'dissertation', | ||
| 'editorial-material': 'ed', | ||
| 'in-brief': 'pr', | ||
| 'introduction': 'ed', | ||
| 'meeting-report': 'meeting-report', | ||
| 'news': 'news', | ||
| 'obituary': 'obituary', | ||
| 'oration': 'oration', | ||
| 'partial-retraction': 'partial-retraction', | ||
| 'product-review': 'product-review', | ||
| 'reply': 'reply', | ||
| 'reprint': 'reprint', | ||
| 'retraction': 're', | ||
| 'translation': 'translation', | ||
| 'technical-report': 'oa', | ||
| 'other': 'zz', | ||
| 'guideline': 'guideline', | ||
| 'interview': 'in', | ||
| 'data-article': 'data-article', |
In DOCTOPIC, values are meant to be legacy v71 codes (e.g. ax, rc, rn). However, several entries map to non-legacy strings (e.g. addendum: 'addendum' even though the legacy code already exists as ax). This makes legacy_doctype return values that are not actually legacy codes. Consider mapping only to real legacy codes (e.g. addendum -> ax) and omitting entries that have no legacy equivalent (so callers can get None).
| 'addendum': 'addendum', | |
| 'book-review': 'rc', | |
| 'books-received': 'books-received', | |
| 'brief-report': 'rn', | |
| 'calendar': 'calendar', | |
| 'clinical-trial': 'oa', | |
| 'collection': 'zz', | |
| 'correction': 'er', | |
| 'discussion': 'discussion', | |
| 'dissertation': 'dissertation', | |
| 'editorial-material': 'ed', | |
| 'in-brief': 'pr', | |
| 'introduction': 'ed', | |
| 'meeting-report': 'meeting-report', | |
| 'news': 'news', | |
| 'obituary': 'obituary', | |
| 'oration': 'oration', | |
| 'partial-retraction': 'partial-retraction', | |
| 'product-review': 'product-review', | |
| 'reply': 'reply', | |
| 'reprint': 'reprint', | |
| 'retraction': 're', | |
| 'translation': 'translation', | |
| 'technical-report': 'oa', | |
| 'other': 'zz', | |
| 'guideline': 'guideline', | |
| 'interview': 'in', | |
| 'data-article': 'data-article', | |
| 'addendum': 'ax', | |
| 'book-review': 'rc', | |
| 'brief-report': 'rn', | |
| 'clinical-trial': 'oa', | |
| 'collection': 'zz', | |
| 'correction': 'er', | |
| 'editorial-material': 'ed', | |
| 'in-brief': 'pr', | |
| 'introduction': 'ed', | |
| 'retraction': 're', | |
| 'technical-report': 'oa', | |
| 'other': 'zz', | |
| 'interview': 'in', |
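The suggestion above can be sketched as follows, assuming callers look up DOCTOPIC with dict.get. DOCTOPIC_SAMPLE and legacy_code_for are hypothetical names, and the sample holds only a few entries copied from the suggested block. Entries without a legacy equivalent are simply absent, so the lookup yields None.

```python
# A few entries from the suggested DOCTOPIC: every value is a real legacy code.
DOCTOPIC_SAMPLE = {
    'addendum': 'ax',
    'book-review': 'rc',
    'brief-report': 'rn',
    'correction': 'er',
    'retraction': 're',
    'interview': 'in',
}


def legacy_code_for(article_type):
    """Hypothetical helper: return the legacy code, or None if there is none."""
    return DOCTOPIC_SAMPLE.get(article_type)


print(legacy_code_for('addendum'))      # has a legacy equivalent
print(legacy_code_for('data-article'))  # no legacy equivalent, so None
```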
| # Otherwise return the value as-is (it may be a legacy code) | ||
| return article_type_code |
legacy_doctype currently returns article_type_code as-is when it is not found in choices.DOCTOPIC. That means an invalid/unrecognized v71 value (e.g. 'invalid') will be returned instead of None, and some non-legacy @article-type strings can also be returned depending on DOCTOPIC contents. Consider returning None when v71 is neither a known legacy code nor mappable via DOCTOPIC (and, if needed, explicitly validate legacy codes against a whitelist such as the 2-letter keys in choices.article_types).
| # Otherwise return the value as-is (it may be a legacy code) | |
| return article_type_code | |
| # If the v71 value is already a known legacy code, return it | |
| if article_type_code in choices.article_types: | |
| return article_type_code | |
| # Unknown or invalid v71 value | |
| return None |
| def test_legacy_doctype_article_type_self_mapped(self): | ||
| article = self.article | ||
| article.data['article']['v71'] = [{u'_': u'data-article'}] | ||
| self.assertEqual(article.legacy_doctype, u'data-article') |
This test locks in the behavior that legacy_doctype returns the input @article-type ('data-article') when there is no legacy mapping. That conflicts with the intent implied by the property name and docstring (returning a legacy v71 code). If the desired behavior is “legacy code or None when not available”, update this test accordingly and adjust legacy_doctype/DOCTOPIC so unmapped @article-type values yield None.
| self.assertEqual(article.legacy_doctype, u'data-article') | |
| self.assertIsNone(article.legacy_doctype) |
| def test_legacy_doctype(self): | ||
| article = self.article | ||
| self.assertEqual(article.legacy_doctype, u'oa') | ||
|
|
||
| def test_legacy_doctype_from_article_type(self): | ||
| article = self.article | ||
| article.data['article']['v71'] = [{u'_': u'retraction'}] | ||
| self.assertEqual(article.legacy_doctype, u're') | ||
|
|
||
| def test_legacy_doctype_from_legacy_code(self): | ||
| article = self.article | ||
| article.data['article']['v71'] = [{u'_': u'er'}] | ||
| self.assertEqual(article.legacy_doctype, u'er') | ||
|
|
There’s no test covering legacy_doctype when v71 contains an invalid/unrecognized value (similar to the existing test_invalid_document_type / test_sps_doctype_invalid). Adding an explicit test for this case will help lock in the intended behavior (likely None).
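One possible shape for that missing test, written against a self-contained stand-in so it runs on its own. FakeArticle, LEGACY_CODES, and DOCTOPIC_SAMPLE are assumptions for illustration, not the real xylose API; the stand-in property follows the "legacy code or None" behavior suggested earlier in this review.

```python
import unittest

# Hypothetical stand-ins for choices.DOCTOPIC and the set of known legacy codes.
DOCTOPIC_SAMPLE = {'retraction': 're', 'correction': 'er'}
LEGACY_CODES = {'oa', 'er', 're'}


class FakeArticle(object):
    """Stand-in exposing only the v71 field and a legacy_doctype property."""

    def __init__(self, v71):
        self.data = {'article': {'v71': [{u'_': v71}]}}

    @property
    def legacy_doctype(self):
        value = self.data['article']['v71'][0][u'_']
        # A JATS @article-type value is translated to its legacy code.
        if value in DOCTOPIC_SAMPLE:
            return DOCTOPIC_SAMPLE[value]
        # A known legacy code is returned unchanged.
        if value in LEGACY_CODES:
            return value
        # Unknown or invalid v71 value.
        return None


class LegacyDoctypeInvalidTest(unittest.TestCase):

    def test_legacy_doctype_invalid(self):
        article = FakeArticle(u'invalid')
        self.assertIsNone(article.legacy_doctype)
```

In the real test module the same assertion would presumably reuse self.article and set v71 the way the neighboring tests do.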
What does this PR do?
Adds the @article-type values from the SciELO Publishing Schema and from the DOCTOPIC dictionary (SPS 1.10) as identity mappings in the article_types dictionary. This allows the v71 field to contain either legacy codes (e.g. 'oa', 'ct') or JATS @article-type values directly (e.g. 'research-article', 'retraction') without falling back to 'undefined'.
The legacy-code mappings were corrected based on the DOCTOPIC correspondence (new → old):
- an (announcements): news → announcement
- in (interview): editorial → interview (confirmed by 'interview': 'in' in DOCTOPIC)
- mt (methodology): research-article → review-article
- pr (press release): press-release → in-brief (confirmed by 'in-brief': 'pr' in DOCTOPIC)
- sc (brief communication): rapid-communication → rapid-communication (confirmed by 'rapid-communication': 'sc' in DOCTOPIC; reverted to the original mapping)
- up (update): undefined → rapid-communication
Legacy codes added:
- re → retraction
- zz → other
@article-type values added as identity mappings (41 values): abstract, addendum, announcement, article-commentary, book-review, books-received, brief-report, calendar, case-report, clinical-instruction, clinical-trial, collection, correction, data-article, discussion, dissertation, editorial, editorial-material, expression-of-concern, guideline, in-brief, interview, introduction, letter, meeting-report, news, obituary, oration, other, partial-retraction, product-review, rapid-communication, referee-report, reply, reprint, research-article, retraction, review-article, reviewer-report, technical-report, translation.
In addition, the DOCTOPIC dictionary was added to choices.py with the new-to-old correspondence (SPS 1.10), along with two new attributes on the Article class:
- sps_doctype: returns the SPS @article-type value corresponding to the v71 field, or None if unavailable.
- legacy_doctype: returns the legacy code corresponding to the v71 field using the reverse DOCTOPIC mapping, or None if unavailable.
Where could the review start?
- xylose/choices.py: updated article_types dictionary and new DOCTOPIC dictionary.
- xylose/scielodocument.py: new sps_doctype and legacy_doctype attributes on the Article class.
How could this be tested manually?
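A minimal sketch of a manual check, using a stand-in class rather than the real Article from xylose/scielodocument.py. ArticleStandIn and the two sample dictionaries are assumptions for illustration, and resolving legacy_doctype through sps_doctype is one possible design, not necessarily the one in this PR.

```python
# Hypothetical slices of article_types and DOCTOPIC.
ARTICLE_TYPES_SAMPLE = {
    're': 'retraction',
    'er': 'correction',
    'retraction': 'retraction',
    'correction': 'correction',
}
DOCTOPIC_SAMPLE = {'retraction': 're', 'correction': 'er'}


class ArticleStandIn(object):
    """Stand-in that holds a single v71 value."""

    def __init__(self, v71):
        self.v71 = v71

    @property
    def sps_doctype(self):
        # SPS @article-type for the v71 value, or None if unknown.
        return ARTICLE_TYPES_SAMPLE.get(self.v71)

    @property
    def legacy_doctype(self):
        # Legacy code for the resolved SPS type, or None if unavailable.
        return DOCTOPIC_SAMPLE.get(self.sps_doctype)


for value in ('er', 'retraction', 'invalid'):
    doc = ArticleStandIn(value)
    print(value, doc.sps_doctype, doc.legacy_doctype)
```

A legacy code and its JATS counterpart should resolve to the same pair of values, while an unknown value should give None for both attributes.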
Any context scenario you want to give?
Documents in JATS/XML format use the @article-type attribute directly (e.g. article-type="retraction"). When that value was stored in the v71 field, the document_type property returned 'undefined' for types such as retraction, data-article, reply, etc., which were not in the dictionary.
The issue was updated to include the DOCTOPIC dictionary from SPS 1.10, which defines the bidirectional correspondence between @article-type values and legacy codes. The legacy-code mappings were corrected to align with that correspondence (e.g. sc → rapid-communication, pr → in-brief).
The new sps_doctype and legacy_doctype attributes make it possible to obtain the document type in both formats (SPS and legacy) from the v71 field, easing compatibility between systems.
Screenshots
N/A
What are the relevant tickets?
References
- <article>: table of allowed values for @article-type
- DOCTOPIC from SPS 1.10: new-to-old correspondence to maintain compatibility
Original prompt
This section details on the original issue you should resolve
<issue_title>Update the list of document types</issue_title>
<issue_description>Complete the document types with the values from the @article-type column