Feat/workers by gabrielgz0 · Pull Request #1 · gabrielgz0/pypncp

gabrielgz0 · 2026-05-13T19:00:02Z

This pull request introduces a new, configurable prefetch and concurrency mechanism for paginated resource iteration, enabling significant performance improvements when retrieving large datasets. The API is extended to support a prefetch parameter, allowing users to select between sequential, simple prefetch, or multi-worker concurrent fetching. Documentation and docstrings are updated accordingly, and resource classes are streamlined for clarity.

Prefetch and Concurrency for Pagination

Added a prefetch parameter to all list_all* methods in AtasResource, ContratosResource, and ContratacoesResource, allowing users to control the level of concurrency: sequential (0), simple prefetch (1, default), or N workers (N≥2).
Implemented three pagination strategies in BaseResource._list_all: sequential, prefetch (background fetch of next page), and concurrent workers (multiple pages fetched in parallel with ordered delivery).
Updated the README with detailed explanations and usage examples for the new prefetch and concurrency options, including diagrams and recommendations.
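The mapping from the prefetch value to a strategy, as described above, can be sketched as a small dispatch function. This is an illustration only: `choose_strategy` is a hypothetical name, not part of pypncp, and rejecting negative values is an assumption the PR does not state.

```python
def choose_strategy(prefetch: int = 1) -> str:
    """Return the name of the pagination strategy for a prefetch value.

    Hypothetical helper for illustration; the real routing happens
    inside BaseResource._list_all.
    """
    if prefetch < 0:
        # Assumption: negative values are invalid (not stated in the PR).
        raise ValueError("prefetch must be >= 0")
    if prefetch == 0:
        return "sequential"  # fetch one page, consume it, then fetch the next
    if prefetch == 1:
        return "prefetch"    # next page downloads while the current one is consumed
    return "workers"         # prefetch >= 2: N concurrent workers
```

Because 1 is the default, callers that never pass prefetch get the simple background prefetch.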

Documentation and Code Cleanup

Simplified and clarified resource class docstrings, removing redundant endpoint lists and harmonizing argument descriptions for consistency.
Improved docstrings for public methods to reflect the new prefetch parameter and its behavior.

Internal Refactoring

Added an internal _STOP sentinel for worker coordination and refactored the code to use asyncio more robustly for background and concurrent fetching.

These changes speed up bulk data collection and scraping workloads while keeping a simple, backward-compatible API for end users.

BaseResource._list_all accepts prefetch=N (default 1). Each newly received page immediately triggers the download of the next one in the background via asyncio.ensure_future, so API latency overlaps with the consumer's processing.
- _list_all with prefetch in BaseResource
- Parameter exposed on all list_all* methods (contratos, contratacoes, atas)
- README updated in the Paginacao (pagination) section with a diagram and an example of prefetch=0
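A minimal sketch of the single-page prefetch described in this commit, assuming a hypothetical `fetch_page(page)` coroutine that returns an empty list once the pages run out. The actual BaseResource._list_all presumably detects the last page differently; `iter_with_prefetch` and `fake_fetch` are illustrative names only.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List

# Hypothetical fetcher: returns the items of a 1-based page, [] past the end.
FetchPage = Callable[[int], Awaitable[List[int]]]


async def iter_with_prefetch(fetch_page: FetchPage) -> AsyncIterator[int]:
    page = 1
    pending = asyncio.ensure_future(fetch_page(page))
    try:
        while True:
            items = await pending
            if not items:
                return
            # Schedule the next download before handing items to the
            # consumer, so network latency overlaps with its processing.
            page += 1
            pending = asyncio.ensure_future(fetch_page(page))
            for item in items:
                yield item
    finally:
        # If the consumer stops early, drop the in-flight request.
        pending.cancel()  # no effect when the task has already finished


async def _demo() -> List[int]:
    pages = {1: [1, 2], 2: [3, 4], 3: [5]}

    async def fake_fetch(page: int) -> List[int]:
        await asyncio.sleep(0.01)  # simulated API latency
        return pages.get(page, [])

    return [item async for item in iter_with_prefetch(fake_fetch)]


prefetch_result = asyncio.run(_demo())
```

Only one request is ever in flight ahead of the consumer, which keeps memory use bounded to roughly two pages.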

The bug: remaining was decremented in the worker's finally block BEFORE _STOP was put on the queue. The consumer saw remaining=0 and exited the while loop, ignoring items still in the queue.

The fix replaces remaining with a count of received _STOP sentinels: each worker enqueues _STOP when it finishes, and the consumer counts up to num_workers.

prefetch now means:
- 0: sequential
- 1: simple preload (1 page fetched ahead)
- >=2: N concurrent workers with stride and an ordered buffer
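The sentinel-counting fix can be sketched roughly as follows. This is not the code from base.py: it assumes the total page count is known up front, and `fetch_page`, `iter_with_workers`, and `fake_fetch` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Dict, List

_STOP = object()  # sentinel each worker enqueues as its very last action


async def iter_with_workers(
    fetch_page: Callable[[int], Awaitable[List[int]]],
    total_pages: int,   # assumption: known before iteration starts
    num_workers: int,
) -> AsyncIterator[int]:
    queue: asyncio.Queue = asyncio.Queue()

    async def worker(start: int) -> None:
        try:
            # Stride: worker k fetches pages k, k+N, k+2N, ...
            for page in range(start, total_pages + 1, num_workers):
                queue.put_nowait((page, await fetch_page(page)))
        finally:
            # _STOP is queued after all of this worker's pages, so the
            # consumer cannot observe "finished" while pages are still queued.
            queue.put_nowait(_STOP)

    tasks = [asyncio.ensure_future(worker(k)) for k in range(1, num_workers + 1)]
    buffer: Dict[int, List[int]] = {}  # pages that arrived out of order
    next_page = 1
    stops = 0
    try:
        while stops < num_workers:
            msg = await queue.get()
            if msg is _STOP:
                stops += 1
                continue
            page, items = msg
            buffer[page] = items
            # Deliver every page that is now contiguous with what was sent.
            while next_page in buffer:
                for item in buffer.pop(next_page):
                    yield item
                next_page += 1
        await asyncio.gather(*tasks)  # re-raise any worker error
    finally:
        for task in tasks:
            task.cancel()


async def _demo() -> List[int]:
    async def fake_fetch(page: int) -> List[int]:
        # Odd pages are slower, so pages reach the queue out of order.
        await asyncio.sleep(0.03 if page % 2 else 0.01)
        return [page * 10, page * 10 + 1]

    return [x async for x in iter_with_workers(fake_fetch, 5, 2)]


workers_result = asyncio.run(_demo())
```

Because the consumer exits only after seeing one _STOP per worker, and each _STOP sits behind that worker's pages in the queue, no queued page can be skipped.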

Adds test_list_all_with_workers (prefetch=2) and test_list_all_sequential (prefetch=0) to cover the 3 strategies of the _list_all router.
Coverage of base.py: 46% → 90%
Total coverage: 90.49%

The README now explains the 3 strategies:
- prefetch=0: sequential
- prefetch=1: background preload (default)
- prefetch=N: N concurrent workers with stride, plus a diagram

Resource docstrings updated to read: "Nivel de concorrencia: 0=seq, 1=prefetch, N=N workers" (concurrency level: 0=sequential, 1=prefetch, N=N workers)

gabrielgz0 added 4 commits May 13, 2026 15:15

test: coverage for prefetch=0 and prefetch>=2

39f2010


gabrielgz0 merged commit e5d706d into main May 13, 2026
3 checks passed

gabrielgz0 merged 4 commits into main from feat/workers
