Commit bc6acb6

remove scrapy-poet registry in lieu of web-poet's registry
1 parent 2611199 commit bc6acb6

14 files changed: 105 additions and 283 deletions

CHANGELOG.rst

Lines changed: 44 additions & 38 deletions
@@ -61,61 +61,67 @@ in page objects and spider callbacks. The following is now possible:

 In line with this, the following changes were made:

-* Added a new ``scrapy_poet.page_input_providers.ItemProvider`` which makes
-  the usage above possible.
-* Multiple changes to the ``scrapy_poet.PageObjectInputProvider`` base class
-  which are backward incompatible:
-
-    * It now accepts an instance of ``scrapy_poet.injection.Injector`` in its
-      constructor instead of ``scrapy.crawler.Crawler``. Although you can
-      still access the ``scrapy.crawler.Crawler`` via the ``Injector.crawler``
-      attribute.
-    * ``is_provided()`` is now an instance method instead of a class
-      method.
-
-* The ``scrapy_poet.injection.Injector``'s attribute and constructor parameter
-  called ``overrides_registry`` is now simply called ``registry``.
+* Added a new :class:`scrapy_poet.page_input_providers.ItemProvider` which
+  makes the usage above possible.
+* Multiple changes to the
+  :class:`scrapy_poet.page_input_providers.PageObjectInputProvider` base
+  class which are backward incompatible:
+
+    * It now accepts an instance of :class:`scrapy_poet.injection.Injector`
+      in its constructor instead of :class:`scrapy.crawler.Crawler`. Although
+      you can still access the :class:`scrapy.crawler.Crawler` via the
+      ``Injector.crawler`` attribute.
+    * :meth:`scrapy_poet.page_input_providers.PageObjectInputProvider.is_provided`
+      is now an instance method instead of a class method.
+
+* The :class:`scrapy_poet.injection.Injector`'s attribute and constructor
+  parameter called ``overrides_registry`` is now simply called ``registry``.
   This is backwards incompatible.
-* An item class is now supported by ``scrapy_poet.callback_for`` alongside
-  the usual page objects. This means that it won't raise a ``TypeError``
-  anymore when not passing a subclass of ``web_poet.ItemPage``.
-* ``scrapy_poet.overrides.OverridesRegistry`` has been deprecated and
-  overhauled into ``scrapy_poet.registry.OverridesAndItemRegistry``:
-
-    * It is now subclassed from ``web_poet.RulesRegistry`` which allows
-      outright access to its registry methods.
-    * It now allows retrieval of rules based on the returned item class.
-    * The registry doesn't accept tuples as rules anymore. Only
-      ``web_poet.ApplyRule`` instances are allowed. The same goes for
-      ``SCRAPY_POET_RULES`` (and the deprecated ``SCRAPY_POET_OVERRIDES``).
-
-        * As a result, the following type aliases have been removed:
-          ``scrapy_poet.overrides.RuleAsTuple`` and
-          ``scrapy_poet.overrides.RuleFromUser``
-    * These changes are backward incompatible.
-
-* New exception: ``scrapy_poet.injector_error.ProviderDependencyDeadlockError``.
+* An item class is now supported by :func:`scrapy_poet.callback_for`
+  alongside the usual page objects. This means that it won't raise a
+  :class:`TypeError` anymore when not passing a subclass of
+  :class:`web_poet.pages.ItemPage`.
+* New exception: :class:`scrapy_poet.injection_errors.ProviderDependencyDeadlockError`.
   This is raised when it's not possible to create the dependencies due to
   a deadlock in their sub-dependencies, e.g. due to a circular dependency
   between page objects.

 Other changes:

+* Now requires ``web-poet >= 0.7.0``.
+* In line with web-poet's new features, the ``scrapy_poet.overrides`` module
+  which contained ``OverridesRegistryBase`` and ``OverridesRegistry`` has now
+  been removed. Instead, scrapy-poet directly uses
+  :class:`web_poet.rules.RulesRegistry`.
+
+  Everything should pretty much the same except for
+  :meth:`web_poet.rules.RulesRegistry.overrides_for` now accepts :class:`str`,
+  :class:`web_poet.page_inputs.http.RequestUrl`, or
+  :class:`web_poet.page_inputs.http.ResponseUrl` instead of
+  :class:`scrapy.http.Request`.
+
+* This also means that the registry doesn't accept tuples as rules anymore.
+  Only :class:`web_poet.rules.ApplyRule` instances are allowed. The same goes
+  for ``SCRAPY_POET_RULES`` (and the deprecated ``SCRAPY_POET_OVERRIDES``).
+  As a result, the following type aliases have been removed:
+
+    * ``scrapy_poet.overrides.RuleAsTuple``
+    * ``scrapy_poet.overrides.RuleFromUser``
+
+  These changes are backward incompatible.
+
 * Moved some of the utility functions from the test module into
   ``scrapy_poet.utils.testing``.
 * Documentation improvements.
+* Official support for Python 3.11

 Deprecations:

-* The ``scrapy_poet.overrides`` module has been replaced by
-  ``scrapy_poet.registry``.
-* The ``scrapy_poet.overrides.OverridesRegistry`` class is now replaced by
-  ``scrapy_poet.registry.OverridesAndItemRegistry``.
 * The ``SCRAPY_POET_OVERRIDES_REGISTRY`` setting has been replaced by
   ``SCRAPY_POET_REGISTRY``.
 * The ``SCRAPY_POET_OVERRIDES`` setting has been replaced by
   ``SCRAPY_POET_RULES``.
-* Official support for Python 3.11
+

 0.6.0 (2022-11-24)
 ------------------

docs/api_reference.rst

Lines changed: 0 additions & 7 deletions
@@ -43,10 +43,3 @@ Injection errors

 .. automodule:: scrapy_poet.injection_errors
    :members:
-
-Registry
-========
-
-.. automodule:: scrapy_poet.registry
-   :members:
-   :show-inheritance:

docs/rules-from-web-poet.rst

Lines changed: 6 additions & 5 deletions
@@ -5,7 +5,7 @@ Rules from web-poet
 ===================

 scrapy-poet fully supports the functionalities of :class:`web_poet.rules.ApplyRule`.
-It has its own registry called :class:`scrapy_poet.registry.OverridesAndItemRegistry`
+It uses the registry from web_poet called :class:`web_poet.rules.RulesRegistry`
 which provides functionalties for:

 * Returning the page object override if it exists for a given URL.
@@ -296,9 +296,10 @@ regarding :ref:`rules-item-class-example`.
 Registry
 ========

-As mentioned above, scrapy-poet has its own registry called
-:class:`scrapy_poet.registry.OverridesAndItemRegistry`.
+As mentioned above, scrapy-poet uses the registry from web-poet called
+:class:`web_poet.rules.RulesRegistry`.
+
 This registry implementation can be changed if needed. A different registry can
 be configured by passing its class path to the ``SCRAPY_POET_REGISTRY`` setting.
-Such registries must be a subclass of :class:`scrapy_poet.registry.OverridesRegistryBase`
-and must implement the :meth:`scrapy_poet.registry.OverridesRegistryBase.overrides_for` method.
+Such registries must be a subclass of :class:`web_poet.rules.RulesRegistry`
+to ensure the expected methods and its types are properly accounted for.

docs/settings.rst

Lines changed: 6 additions & 6 deletions
@@ -29,9 +29,9 @@ SCRAPY_POET_RULES
 Default: ``None``

 Mapping of overrides for each domain. The format of the such ``dict`` mapping
-depends on the currently set Registry. The default is currently
-:class:`~.OverridesAndItemRegistry`. This can be overriden by the setting below:
-``SCRAPY_POET_OVERRIDES_REGISTRY``.
+depends on the currently set registry. The default is currently
+:class:`web_poet.rules.RulesRegistry`. This can be overriden by the setting below:
+``SCRAPY_POET_REGISTRY``.

 There are sections dedicated for this at :ref:`intro-tutorial` and
 :ref:`rules-from-web-poet`.
@@ -46,9 +46,9 @@ SCRAPY_POET_REGISTRY

 Defaut: ``None``

-Sets an alternative Registry to replace the default :class:`~.OverridesAndItemRegistry`.
-To use this, set a ``str`` which denotes the absolute object path of the new
-Registry.
+Sets an alternative Registry to replace the default
+:class:`web_poet.rules.RulesRegistry`. To use this, set a ``str`` which denotes
+the absolute object path of the new registry.

 More info at :ref:`rules-from-web-poet`.

scrapy_poet/downloadermiddlewares.py

Lines changed: 5 additions & 6 deletions
@@ -9,8 +9,9 @@
 from scrapy import Spider, signals
 from scrapy.crawler import Crawler
 from scrapy.http import Request, Response
-from scrapy.utils.misc import create_instance, load_object
+from scrapy.utils.misc import load_object
 from twisted.internet.defer import Deferred, inlineCallbacks
+from web_poet import RulesRegistry

 from .api import DummyResponse
 from .injection import Injector
@@ -22,7 +23,7 @@
     RequestUrlProvider,
     ResponseUrlProvider,
 )
-from .registry import OverridesAndItemRegistry
+from .utils import create_registry_instance

 logger = logging.getLogger(__name__)

@@ -60,12 +61,10 @@ def __init__(self, crawler: Crawler) -> None:
         registry_cls = load_object(
             settings.get(
                 "SCRAPY_POET_REGISTRY",
-                settings.get(
-                    "SCRAPY_POET_OVERRIDES_REGISTRY", OverridesAndItemRegistry
-                ),
+                settings.get("SCRAPY_POET_OVERRIDES_REGISTRY", RulesRegistry),
             )
         )
-        self.registry = create_instance(registry_cls, settings, crawler)
+        self.registry = create_registry_instance(registry_cls, crawler)
         self.injector = Injector(
             crawler,
             default_providers=DEFAULT_PROVIDERS,
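
The nested ``settings.get()`` call in the hunk above gives ``SCRAPY_POET_REGISTRY`` priority over the deprecated ``SCRAPY_POET_OVERRIDES_REGISTRY`` and falls back to ``RulesRegistry`` only when neither is set. The lookup order can be sketched with a plain ``dict`` standing in for Scrapy's settings object. The helper name and the string paths below are hypothetical; the middleware itself passes the ``RulesRegistry`` class as the default and hands the result to ``load_object``.

```python
DEFAULT_REGISTRY = "web_poet.RulesRegistry"  # stand-in for the class default


def resolve_registry_path(settings: dict) -> str:
    """Mimic the nested settings.get() fallback using a plain dict."""
    return settings.get(
        "SCRAPY_POET_REGISTRY",
        settings.get("SCRAPY_POET_OVERRIDES_REGISTRY", DEFAULT_REGISTRY),
    )


# Neither setting is present: the default is used.
nothing_set = resolve_registry_path({})

# Only the deprecated setting is present: it is still honoured.
only_deprecated = resolve_registry_path(
    {"SCRAPY_POET_OVERRIDES_REGISTRY": "myproject.OldRegistry"}
)

# Both settings are present: the new one wins.
both_set = resolve_registry_path(
    {
        "SCRAPY_POET_REGISTRY": "myproject.NewRegistry",
        "SCRAPY_POET_OVERRIDES_REGISTRY": "myproject.OldRegistry",
    }
)
```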

scrapy_poet/injection.py

Lines changed: 12 additions & 8 deletions
@@ -12,8 +12,9 @@
 from scrapy.statscollectors import StatsCollector
 from scrapy.utils.conf import build_component_list
 from scrapy.utils.defer import maybeDeferred_coro
-from scrapy.utils.misc import create_instance, load_object
+from scrapy.utils.misc import load_object
 from twisted.internet.defer import inlineCallbacks
+from web_poet import RulesRegistry
 from web_poet.pages import is_injectable

 from scrapy_poet.api import _CALLBACK_FOR_MARKER, DummyResponse
@@ -24,9 +25,8 @@
     UndeclaredProvidedTypeError,
 )
 from scrapy_poet.page_input_providers import PageObjectInputProvider
-from scrapy_poet.registry import OverridesAndItemRegistry, OverridesRegistryBase

-from .utils import get_scrapy_data_path
+from .utils import create_registry_instance, get_scrapy_data_path

 logger = logging.getLogger(__name__)

@@ -42,11 +42,11 @@ def __init__(
         crawler: Crawler,
         *,
         default_providers: Optional[Mapping] = None,
-        registry: Optional[OverridesRegistryBase] = None,
+        registry: Optional[RulesRegistry] = None,
     ):
         self.crawler = crawler
         self.spider = crawler.spider
-        self.registry = registry or OverridesAndItemRegistry()
+        self.registry = registry or RulesRegistry()
         self.load_providers(default_providers)
         self.init_cache()

@@ -138,7 +138,11 @@ def build_plan(self, request: Request) -> andi.Plan:
             callback,
             is_injectable=is_injectable,
             externally_provided=self.is_class_provided_by_any_provider,
-            overrides=self.registry.overrides_for(request).get,
+            # Ignore the type since andi.plan expects overrides to be
+            # Callable[[Callable], Optional[Callable]] but the registry
+            # returns a more accurate typing for this scenario:
+            # Mapping[Type[ItemPage], Type[ItemPage]]
+            overrides=self.registry.overrides_for(request.url).get,  # type: ignore[arg-type]
         )

     @inlineCallbacks
@@ -360,7 +364,7 @@ def is_provider_requiring_scrapy_response(provider):
 def get_injector_for_testing(
     providers: Mapping,
     additional_settings: Optional[Dict] = None,
-    registry: Optional[OverridesRegistryBase] = None,
+    registry: Optional[RulesRegistry] = None,
 ) -> Injector:
     """
     Return an :class:`Injector` using a fake crawler.
@@ -379,7 +383,7 @@ class MySpider(Spider):
     spider.settings = settings
     crawler.spider = spider
     if not registry:
-        registry = create_registry_instance(RulesRegistry, crawler)
+        registry = create_registry_instance(RulesRegistry, crawler)
     return Injector(crawler, registry=registry)

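
The comment added in ``build_plan`` notes that the registry returns a mapping while ``andi.plan`` expects a callable, which is why the bound ``.get`` method of the mapping is passed. A small standard-library sketch of that pattern follows. The three classes are hypothetical stand-ins for page objects, and the ``resolve`` helper only illustrates how a consumer might treat ``None``; it is not taken from andi.

```python
from typing import Callable, Dict, Optional


class BookPage:
    """Hypothetical page object requested by a callback."""


class CustomBookPage(BookPage):
    """Hypothetical override registered for some URL."""


class AuthorPage:
    """Hypothetical page object with no override."""


# What a registry lookup for one URL might return: replaced class -> override.
overrides_mapping: Dict[type, type] = {BookPage: CustomBookPage}

# dict.get already has the call shape "class in, class or None out",
# so the bound method can be handed over wherever a callable is expected.
overrides: Callable[[type], Optional[type]] = overrides_mapping.get


def resolve(cls: type) -> type:
    """Illustrative consumer: use the override if there is one, else cls."""
    return overrides(cls) or cls
```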

scrapy_poet/overrides.py

Lines changed: 0 additions & 6 deletions
This file was deleted.
