Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the experimental name space from cookie parameters #144

Open
wants to merge 21 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
2736224
WIP
Gallaecio Nov 2, 2023
a46bb14
Update tests/test_api_requests.py::test_automap_cookies expectations
Gallaecio Nov 3, 2023
bd680e7
Update remaining test expectations
Gallaecio Nov 3, 2023
7abcc28
Address mypy issues
Gallaecio Nov 3, 2023
8586ef6
Warn about the use of deprecated experimental fields
Gallaecio Nov 3, 2023
318d382
Test that cookies do not affect request fingerprinting
Gallaecio Nov 3, 2023
a01a93a
Test that experimental.foo does not get a warning
Gallaecio Nov 3, 2023
6ad1bde
Test response parsing of cookies
Gallaecio Nov 3, 2023
519f880
Do not warn about experimental.responseCookies on response parsing
Gallaecio Nov 3, 2023
add4aa9
Merge remote-tracking branch 'scrapy-plugins/main' into no-more-exper…
Gallaecio Nov 3, 2023
a7e53bf
Merge remote-tracking branch 'scrapy-plugins/main' into no-more-exper…
Gallaecio Nov 21, 2023
2f60444
Fix merge leftovers
Gallaecio Nov 21, 2023
31bda6c
Merge remote-tracking branch 'scrapy-plugins/main' into no-more-exper…
Gallaecio Jan 22, 2024
76e7fd9
Improve cookie handling and messaging
Gallaecio Jan 23, 2024
30cf859
Keep mypy happy
Gallaecio Jan 23, 2024
c5f7002
Improve tests
Gallaecio Jan 23, 2024
e21655a
Fix a merge issue
Gallaecio Jan 23, 2024
e7125bf
Update the fingerprint docs now that responseCookies affects fingerpr…
Gallaecio Jan 23, 2024
888ac92
Remove an unused function
Gallaecio Jan 23, 2024
597f822
Merge remote-tracking branch 'scrapy-plugins/main' into no-more-exper…
Gallaecio Jan 27, 2025
d49d269
Fix tests
Gallaecio Jan 27, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 22 additions & 3 deletions CHANGES.rst
Original file line number Diff line number Diff line change
@@ -1,12 +1,31 @@
Changes
=======

TBR
---
0.13.0 (unreleased)
-------------------

* Cookie support is no longer experimental:

* If the ``COOKIES_ENABLED`` setting is ``True`` (default), automatic request
parameter mapping now sets ``responseCookies`` to ``True`` and maps request
cookies to ``requestCookies``.
Comment on lines +9 to +11
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that, by default, all Zyte API requests with automap will start including "responseCookies": True among their parameters. Similar to how we do with httpResponseHeaders, but in this case the behavior also affects browser rendering and automatic extraction scenarios. The test expectation updates are a great way to get an idea of the impact.

This is a big one, the reason I went for 0.13.0 instead of 0.12.3, and I wonder whether or not this is the right call. I wonder if we should implement an opt-in setting for responseCookies, or make it so that it is only added to requests if requestCookies is also added (either manually by the user our automatically mapped).


* The ``ZYTE_API_EXPERIMENTAL_COOKIES_ENABLED`` setting is now deprecated.
When enabled, however, the ``experimental`` name space is still used for
automatic request parameter mapping.

* If you use ``requestCookies``, ``responseCookies``, or ``cookieManagement``
within the ``experimental`` name space in request parameters, a deprecation
warning is now logged.

* The ``responseCookies`` response parameter is now handled the same as
``experimental.responseCookies``; the latter still works but is deprecated.

* Zyte API Request IDs are now included in the error logs.

* Bump the zyte-api dependency: 0.4.7 → 0.4.8.


0.12.2 (2023-10-19)
-------------------

Expand Down Expand Up @@ -42,7 +61,7 @@ TBR
Experimental is treated as a namespace, and its parameters are the ones
counted, i.e. there is no ``scrapy-zyte-api/request_args/experimental`` stat,
but there are stats like
``scrapy-zyte-api/request_args/experimental.responseCookies``.
``scrapy-zyte-api/request_args/experimental.foo``.


0.11.1 (2023-08-25)
Expand Down
46 changes: 20 additions & 26 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -323,10 +323,10 @@ possible:

- ``statusCode`` becomes ``response.status``.

- ``httpResponseHeaders`` and ``experimental.responseCookies`` become
- ``httpResponseHeaders`` and ``responseCookies`` become
``response.headers``.

- ``experimental.responseCookies`` is also mapped into the request cookiejar.
- ``responseCookies`` is also mapped into the request cookiejar.

- ``browserHtml`` and ``httpResponseBody`` are mapped into both
``response.text`` (``str``) and ``response.body`` (``bytes``).
Expand Down Expand Up @@ -412,17 +412,15 @@ parameters are chosen as follows by default:

- ``Request.body`` becomes ``httpRequestBody``.

- If the ``ZYTE_API_EXPERIMENTAL_COOKIES_ENABLED`` Scrapy setting is
``True``, the COOKIES_ENABLED_ Scrapy setting is ``True`` (default), and
provided request metadata does not set dont_merge_cookies_ to ``True``:
- If the COOKIES_ENABLED_ Scrapy setting is ``True`` (default), and provided
request metadata does not set dont_merge_cookies_ to ``True``:

.. _COOKIES_ENABLED: https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#std-setting-COOKIES_ENABLED
.. _dont_merge_cookies: https://docs.scrapy.org/en/latest/topics/request-response.html#std-reqmeta-dont_merge_cookies

- ``experimental.responseCookies`` is set to ``True``.
- ``responseCookies`` is set to ``True``.

- Cookies from the request `cookie jar`_ become
``experimental.requestCookies``.
- Cookies from the request `cookie jar`_ become ``requestCookies``.

.. _cookie jar: https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#std-reqmeta-cookiejar

Expand All @@ -434,10 +432,10 @@ parameters are chosen as follows by default:
If the cookies to be set exceed the limit defined in the
``ZYTE_API_MAX_COOKIES`` setting (100 by default), a warning is logged,
and only as many cookies as the limit allows are set for the target
request. To silence this warning, set ``experimental.requestCookies``
manually, e.g. to an empty dict. Alternatively, if Zyte API starts
supporting more than 100 request cookies, update the
``ZYTE_API_MAX_COOKIES`` setting accordingly.
request. To silence this warning, set ``requestCookies`` manually, e.g.
to an empty dict. Alternatively, if Zyte API starts supporting more
than 100 request cookies, update the ``ZYTE_API_MAX_COOKIES`` setting
accordingly.

If you are using a custom downloader middleware to handle request
cookiejars, you can point the ``ZYTE_API_COOKIE_MIDDLEWARE`` setting to
Expand Down Expand Up @@ -511,20 +509,18 @@ following parameters:
"value": "application/json"
}
],
"experimental": {
"requestCookies": [
{
"name": "a",
"value": "b",
"domain": ""
}
],
"responseCookies": true
},
"httpResponseBody": true,
"httpResponseHeaders": true,
"httpRequestBody": "eyJmb28iOiAiYmFyIn0=",
"httpRequestMethod": "POST",
"requestCookies": [
{
"name": "a",
"value": "b",
"domain": ""
}
],
"responseCookies": true,
"url": "https://httpbin.org/anything"
}

Expand Down Expand Up @@ -553,10 +549,8 @@ following parameters:

{
"browserHtml": true,
"experimental": {
"responseCookies": true
},
"requestHeaders": {"referer": "https://example.com/"},
"responseCookies": true,
"url": "https://quotes.toscrape.com"
}

Expand Down Expand Up @@ -815,7 +809,7 @@ The following Zyte API parameters are *not* taken into account for request
fingerprinting:

- Request header parameters (``customHttpRequestHeaders``,
``requestHeaders``)
``requestHeaders``, ``requestCookies``)

- Metadata parameters (``echoData``, ``jobId``)

Expand Down
3 changes: 2 additions & 1 deletion scrapy_zyte_api/_cookies.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,8 @@ def _process_cookies(
):
if not cookie_jars:
return
response_cookies = api_response.get("experimental", {}).get("responseCookies")
old_response_cookies = api_response.get("experimental", {}).get("responseCookies")
response_cookies = api_response.get("responseCookies", old_response_cookies)
if not response_cookies:
return
cookie_jar = _get_cookie_jar(request, cookie_jars)
Expand Down
Loading