Commit 8b44c5e

PEP 749: Updates (#4316)
* PEP 749: Add conditional annotations and partially executed modules
* Update peps/pep-0749.rst (Co-authored-by: Adam Turner)
* Update peps/pep-0749.rst (Co-authored-by: Adam Turner)
* More updates
* More changes
* Update peps/pep-0749.rst

Co-authored-by: Adam Turner
1 parent 424872d commit 8b44c5e

1 file changed: +197 -46 lines changed


peps/pep-0749.rst

Lines changed: 197 additions & 46 deletions
@@ -35,6 +35,9 @@ specification:
3535
(which were added by :pep:`695` and :pep:`696`) using PEP 649-like semantics.
3636
* The ``SOURCE`` format is renamed to ``STRING`` to improve clarity and reduce the risk of
3737
user confusion.
38+
* Conditionally defined class and module annotations are handled correctly.
39+
* If annotations are accessed on a partially executed module, the annotations executed so far
40+
are returned, but not cached.
3841

3942
Motivation
4043
==========
@@ -195,10 +198,11 @@ The module will contain the following functionality:
195198
module, or class. This will replace :py:func:`inspect.get_annotations`. The latter
196199
will delegate to the new function. It may eventually be deprecated, but to
197200
minimize disruption, we do not propose an immediate deprecation.
198-
* ``get_annotate_function()``: A function that returns the ``__annotate__`` function
199-
of an object, if it has one, or ``None`` if it does not. This is usually equivalent
200-
to accessing the ``.__annotate__`` attribute, except in the presence of metaclasses
201-
(see :ref:`below <pep749-metaclasses>`).
201+
* ``get_annotate_from_class_namespace(namespace: Mapping[str, Any])``: A function that
202+
returns the ``__annotate__`` function from a class namespace dictionary, or ``None``
203+
if there is none. This is useful in metaclasses during class construction. It is
204+
a separate function to avoid exposing implementation details about the internal storage
205+
for the ``__annotate__`` function (see :ref:`below <pep749-metaclasses>`).
202206
* ``Format``: an enum that contains the possible formats of annotations. This will
203207
replace the ``VALUE``, ``FORWARDREF``, and ``SOURCE`` formats in :pep:`649`.
204208
PEP 649 proposed to make these values global members of the :py:mod:`inspect`
@@ -235,7 +239,7 @@ The module will contain the following functionality:
235239
This is useful for
236240
implementing the ``SOURCE`` format in cases where the original source is not available,
237241
such as in the functional syntax for :py:class:`typing.TypedDict`.
238-
* ``value_to_string(value: object) -> str``: a function that converts a single value to a
242+
* ``type_repr(value: object) -> str``: a function that converts a single value to a
239243
string representation. This is used by ``annotations_to_string``.
240244
It uses ``repr()`` for most values, but for types it returns the fully qualified name.
241245
It is also useful as a helper for the ``repr()`` of a number of objects in the
@@ -498,45 +502,64 @@ attribute lookup is used, this approach breaks down in the presence of
498502
metaclasses, because entries in the metaclass's own class dictionary can render
499503
the descriptors invisible.
500504

501-
While we considered several approaches that would allow ``cls.__annotations__``
502-
and ``cls.__annotate__`` to work reliably when ``cls`` is a type with a custom
503-
metaclass, any such approach would expose significant complexity to advanced users.
504-
Instead, we recommend a simpler approach that confines the complexity to the
505-
``annotationlib`` module: in ``annotationlib.get_annotations``, we bypass normal
506-
attribute lookup by using the ``type.__annotations__`` descriptor directly.
505+
We considered several solutions but landed on one where we store the ``__annotate__``
506+
and ``__annotations__`` objects in the class dictionary, but under a different,
507+
internal-only name. This means that the class dictionary entries will not interfere
508+
with the descriptors defined on :py:class:`type`.
509+
510+
This approach means that the ``.__annotate__`` and ``.__annotations__`` objects in class
511+
objects will behave mostly intuitively, but there are a few downsides.
512+
513+
One concerns the interaction with classes defined under ``from __future__ import annotations``.
514+
Those will continue to have the ``__annotations__`` entry in the class dictionary, meaning
515+
that they will continue to display some buggy behavior. For example, if a metaclass is defined
516+
with the ``__future__`` import enabled and has annotations, and a class using that metaclass is
517+
defined without the ``__future__`` import, accessing ``.__annotations__`` on that class will yield
518+
the wrong results. However, this bug already exists in previous versions of Python. It could be
519+
fixed by setting the annotations at a different key in the class dict in this case too, but that
520+
would break users who directly access the class dictionary (e.g., during class construction).
521+
We prefer to keep the behavior under the ``__future__`` import unchanged as much as possible.
522+
523+
Second, in previous versions of Python it was possible to access the ``__annotations__`` attribute
524+
on instances of user-defined classes with annotations. However, this behavior was undocumented
525+
and not supported by :func:`inspect.get_annotations`, and it cannot be preserved under the
526+
:pep:`649` framework without bigger changes, such as a new ``object.__annotations__`` descriptor.
527+
This behavior change should be called out in porting guides.
507528

508529
Specification
509530
-------------
510531

511-
Users should always use ``annotationlib.get_annotations`` to access the
512-
annotations of a class object, and ``annotationlib.get_annotate_function``
513-
to access the ``__annotate__`` function. These functions will return only
514-
the class's own annotations, even when metaclasses are involved.
532+
The ``.__annotate__`` and ``.__annotations__`` attributes on class objects
533+
should reliably return the annotate function and the annotations dictionary,
534+
respectively, even in the presence of custom metaclasses.
515535

516-
The behavior of accessing the ``__annotations__`` and ``__annotate__``
517-
attributes on classes with a metaclass other than ``builtins.type`` is
518-
unspecified. The documentation should warn against direct use of these
519-
attributes and recommend using the ``annotationlib`` module instead.
520-
521-
Similarly, the presence of ``__annotations__`` and ``__annotate__`` keys
522-
in the class dictionary is an implementation detail and should not be relied
523-
upon.
536+
Users should not access the class dictionary directly to retrieve annotations
537+
or the annotate function; the data stored in the class dictionary is an implementation
538+
detail and its format may change in the future. If only the class namespace
539+
dictionary is available (e.g., while the class is being constructed),
540+
``annotationlib.get_annotate_from_class_namespace`` may be used to retrieve the annotate function
541+
from the class dictionary.
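
As a rough illustration of the class-construction case, the sketch below shows a metaclass that records the names of annotated attributes. ``FieldRecordingMeta`` and ``_field_names`` are hypothetical names chosen for this example. The ``annotationlib`` branch assumes Python 3.14 or later; the fallback branch assumes an older interpreter, where the evaluated ``__annotations__`` dict is still placed in the namespace.

```python
import sys


class FieldRecordingMeta(type):
    """Hypothetical metaclass that records annotated names at class creation."""

    def __new__(mcls, name, bases, namespace):
        annotate = None
        if sys.version_info >= (3, 14):
            import annotationlib

            # Ask annotationlib for the annotate function rather than reading
            # the internal storage key out of the namespace ourselves.
            annotate = annotationlib.get_annotate_from_class_namespace(namespace)

        cls = super().__new__(mcls, name, bases, namespace)

        if annotate is not None:
            # FORWARDREF tolerates names that are not defined yet.
            import annotationlib

            annotations = annotationlib.call_annotate_function(
                annotate, annotationlib.Format.FORWARDREF, owner=cls
            )
        else:
            # Older versions (or classes without annotations): fall back to
            # whatever dict, if any, the namespace carries.
            annotations = namespace.get("__annotations__", {})

        cls._field_names = list(annotations)
        return cls


class Point(metaclass=FieldRecordingMeta):
    x: int
    y: str


print(Point._field_names)
```

Once the class exists, ``cls.__annotate__`` and ``cls.__annotations__`` can be used directly; the namespace helper is only needed while construction is still in progress.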
524542

525543
Rejected alternatives
526544
---------------------
527545

528-
We considered two broad approaches for dealing with the behavior
546+
We considered three broad approaches for dealing with the behavior
529547
of the ``__annotations__`` and ``__annotate__`` entries in classes:
530548

531549
* Ensure that the entry is *always* present in the class dictionary, even if it
532550
is empty or has not yet been evaluated. This means we do not have to rely on
533551
the descriptors defined on :py:class:`type` to fill in the field, and
534552
therefore the metaclass's attributes will not interfere. (Prototype
535553
in `gh-120719 <https://github.com/python/cpython/pull/120719>`__.)
554+
* Warn users against using the ``__annotations__`` and ``__annotate__`` attributes
555+
directly. Instead, users should call functions in ``annotationlib`` that
556+
invoke the :class:`type` descriptors directly. (Implemented in
557+
`gh-122074 <https://github.com/python/cpython/pull/122074>`__.)
536558
* Ensure that the entry is *never* present in the class dictionary, or at least
537559
never added by logic in the language core. This means that the descriptors
538560
on :py:class:`type` will always be used, without interference from the metaclass.
539-
(Prototype in `gh-120816 <https://github.com/python/cpython/pull/120816>`__.)
561+
(Initial prototype in `gh-120816 <https://github.com/python/cpython/pull/120816>`__;
562+
later implemented in `gh-132345 <https://github.com/python/cpython/pull/132345>`__.)
540563

541564
Alex Waygood suggested an implementation using the first approach. When a
542565
heap type (such as a class created through the ``class`` statement) is created,
@@ -558,19 +581,8 @@ While this approach would fix the known edge cases with metaclasses, it
558581
introduces significant complexity to all classes, including a new built-in type
559582
(for the annotations descriptor) with unusual behavior.
560583

561-
The alternative approach would be to never set ``__dict__["__annotations__"]``
562-
and use some other storage to store the cached annotations. This behavior
563-
change would have to apply even to classes defined under
564-
``from __future__ import annotations``, because otherwise there could be buggy
565-
behavior if a class is defined without ``from __future__ import annotations``
566-
but its metaclass does have the future enabled. As :pep:`649` previously noted,
567-
removing ``__annotations__`` from class dictionaries also has backwards compatibility
568-
implications: ``cls.__dict__.get("__annotations__")`` is a common idiom to
569-
retrieve annotations.
570-
571-
This approach would also mean that accessing ``.__annotations__`` on an instance
572-
of an annotated class no longer works. While this behavior is not documented,
573-
it is a long-standing feature of Python and is relied upon by some users.
584+
The second approach is simple to implement, but has the downside that direct
585+
access to ``cls.__annotations__`` remains prone to erratic behavior.
574586

575587
Adding the ``VALUE_WITH_FAKE_GLOBALS`` format
576588
=============================================
@@ -608,10 +620,17 @@ the ``VALUE_WITH_FAKE_GLOBALS`` format is requested, so the standard
608620
library will not call the manually written annotate function with
609621
"fake globals", which could have unpredictable results.
610622

623+
The names of annotation formats indicate what kind of objects an
624+
``__annotate__`` function should return: with the ``STRING`` format, it
625+
should return strings; with the ``FORWARDREF`` format, it should return
626+
forward references; and with the ``VALUE`` format, it should return values.
627+
The name ``VALUE_WITH_FAKE_GLOBALS`` indicates that the function should
628+
still return values, but is being executed in an unusual "fake globals" environment.
629+
611630
Specification
612631
-------------
613632

614-
An additional format, ``FAKE_GLOBALS_VALUE``, is added to the ``Format`` enum in the
633+
An additional format, ``VALUE_WITH_FAKE_GLOBALS``, is added to the ``Format`` enum in the
615634
``annotationlib`` module, with value equal to 2. (As a result, the values of the
616635
other formats will shift relative to PEP 649: ``FORWARDREF`` will be 3 and ``SOURCE``
617636
will be 4.)
@@ -622,10 +641,10 @@ they would return for the ``VALUE`` format. The standard library will pass
622641
this format to the ``__annotate__`` function when it is called in a "fake globals"
623642
environment, as used to implement the ``FORWARDREF`` and ``SOURCE`` formats.
624643
All public functions in the ``annotationlib`` module that accept a format
625-
argument will raise :py:exc:`NotImplementedError` if the format is ``FAKE_GLOBALS_VALUE``.
644+
argument will raise :py:exc:`NotImplementedError` if the format is ``VALUE_WITH_FAKE_GLOBALS``.
626645

627646
Third-party code that implements ``__annotate__`` functions should raise
628-
:py:exc:`NotImplementedError` if the ``FAKE_GLOBALS_VALUE`` format is passed
647+
:py:exc:`NotImplementedError` if the ``VALUE_WITH_FAKE_GLOBALS`` format is passed
629648
and the function is not prepared to be run in a "fake globals" environment.
630649
This should be mentioned in the data model documentation for ``__annotate__``.
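
A minimal sketch of what this could look like for a hand-written annotate function follows. The local ``Format`` enum is a stand-in that mirrors the values given above, so that the example does not depend on ``annotationlib`` being importable; the function name ``annotate`` and the annotations it returns are purely illustrative.

```python
import enum


class Format(enum.IntEnum):
    """Stand-in for annotationlib.Format, using the values from this section."""

    VALUE = 1
    VALUE_WITH_FAKE_GLOBALS = 2
    FORWARDREF = 3
    STRING = 4  # called SOURCE in PEP 649


def annotate(format):
    """Hand-written annotate function that only supports real values."""
    if format == Format.VALUE:
        return {"x": int, "return": str}
    # This function was not written to run under "fake globals", so it
    # rejects VALUE_WITH_FAKE_GLOBALS along with every other format.
    raise NotImplementedError(format)


print(annotate(Format.VALUE))
```

Because the function raises :py:exc:`NotImplementedError` for everything except ``VALUE``, callers can detect that it is unsuitable for a "fake globals" environment and fall back to another strategy.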
631650

@@ -752,6 +771,126 @@ PEP, the four supported formats are now:
752771
- ``FORWARDREF``: replaces undefined names with ``ForwardRef`` objects.
753772
- ``STRING``: returns strings, attempts to recreate code close to the original source.
754773

774+
Conditionally defined annotations
775+
=================================
776+
777+
:pep:`649` does not support annotations that are conditionally defined
778+
in the body of a class or module:
779+
780+
It's currently possible to set module and class attributes with
781+
annotations inside an ``if`` or ``try`` statement, and it works
782+
as one would expect. It's untenable to support this behavior
783+
when this PEP is active.
784+
785+
However, the maintainer of the widely used SQLAlchemy library
786+
`reported <https://github.com/python/cpython/issues/130881>`__
787+
that this pattern is actually common and important:
788+
789+
.. code:: python
790+
791+
from typing import TYPE_CHECKING
792+
793+
if TYPE_CHECKING:
794+
    from some_module import SpecialType
795+
796+
class MyClass:
797+
    somevalue: str
798+
    if TYPE_CHECKING:
799+
        someothervalue: SpecialType
800+
801+
Under the behavior envisioned in :pep:`649`, the ``__annotations__`` for
802+
``MyClass`` would contain keys for both ``somevalue`` and ``someothervalue``.
803+
804+
Fortunately, there is a tractable implementation strategy for making
805+
this code behave as expected again. This strategy relies on a few fortuitous
806+
circumstances:
807+
808+
* This behavior change is only relevant to module and class annotations,
809+
because annotations in local scopes are ignored.
810+
* Module and class bodies are only executed once.
811+
* The annotations of a class are not externally visible until execution of the
812+
class body is complete. For modules, this is not quite true, because a partially
813+
executed module can be visible to other imported modules, but this case is
814+
problematic for other reasons (see the next section).
815+
816+
This allows the following implementation strategy:
817+
818+
* Each annotated assignment is assigned a unique identifier (e.g., an integer).
819+
* During execution of a class or module body, a set, initially empty, is created
820+
to hold the identifiers of the annotations that have been defined.
821+
* When an annotated assignment is executed, its identifier is added to the set.
822+
* The generated ``__annotate__`` function uses the set to determine
823+
which annotations were defined in the class or module body, and return only those.
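
The strategy in the list above can be imitated in plain Python. This is only a sketch of the idea, not the compiler's generated code: ``make_annotate``, ``executed_ids`` and the tuple layout are hypothetical, and the class body is simulated by ordinary statements.

```python
def make_annotate(executed_ids, all_annotations):
    """Build an annotate-like function that filters by executed identifiers.

    all_annotations is a list of (identifier, name, thunk) tuples, where the
    thunk lazily evaluates the annotation expression.
    """

    def annotate(format):
        if format != 1:  # 1 is the VALUE format
            raise NotImplementedError(format)
        return {
            name: thunk()
            for ident, name, thunk in all_annotations
            if ident in executed_ids
        }

    return annotate


TYPE_CHECKING = False

# Simulated class body: each annotated assignment adds its identifier.
executed_ids = set()
executed_ids.add(0)      # somevalue: str
if TYPE_CHECKING:
    executed_ids.add(1)  # someothervalue: SpecialType

annotate = make_annotate(
    executed_ids,
    [
        (0, "somevalue", lambda: str),
        # SpecialType is never looked up, because identifier 1 was not recorded.
        (1, "someothervalue", lambda: SpecialType),  # noqa: F821
    ],
)

result = annotate(1)
print(result)
```

Since the skipped annotation's expression is never evaluated, names that only exist under ``TYPE_CHECKING`` do not cause an error at runtime.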
824+
825+
This was implemented in `python/cpython#130935
826+
<https://github.com/python/cpython/pull/130935>`__.
827+
828+
Specification
829+
-------------
830+
831+
For classes and modules, the ``__annotate__`` function will return only
832+
annotations for those assignments that were executed when the class or module body
833+
was executed.
834+
835+
Caching of annotations on partially executed modules
836+
====================================================
837+
838+
:pep:`649` specifies that the value of the ``__annotations__`` attribute
839+
on classes and modules is determined on first access by calling the
840+
``__annotate__`` function, and then it is cached for later access.
841+
This is correct in most cases and preserves compatibility, but there is
842+
one edge case where it can lead to surprising behavior: partially executed
843+
modules.
844+
845+
Consider this example:
846+
847+
.. code:: python
848+
849+
# recmod/__main__.py
850+
from . import a
851+
print("in __main__:", a.__annotations__)
852+
853+
# recmod/a.py
854+
v1: int
855+
from . import b
856+
v2: int
857+
858+
# recmod/b.py
859+
from . import a
860+
print("in b:", a.__annotations__)
861+
862+
Note that while ``recmod/b.py`` executes, the ``recmod.a`` module is defined,
863+
but has not yet finished execution.
864+
865+
On 3.13, this produces:
866+
867+
.. code:: shell
868+
869+
$ python3.13 -m recmod
870+
in b: {'v1': <class 'int'>}
871+
in __main__: {'v1': <class 'int'>, 'v2': <class 'int'>}
872+
873+
But with :pep:`649` implemented as originally proposed, this would
874+
print an empty dictionary twice, because the ``__annotate__`` function
875+
is set only when module execution is complete. This is obviously
876+
unintuitive.
877+
878+
See `python/cpython#130907`__ for implementation.
879+
880+
__ https://github.com/python/cpython/issues/130907
881+
882+
Specification
883+
-------------
884+
885+
Accessing ``__annotations__`` on a partially executed module will
886+
continue to return the annotations that have been executed so far,
887+
similar to the behavior in earlier versions of Python. However, in this
888+
case the ``__annotations__`` dictionary will not be cached, so later
889+
accesses to the ``__annotations__`` attribute will return a fresh dictionary.
890+
This is necessary because ``__annotate__`` must be called again in order to
891+
incorporate additional annotations.
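
The caching rule can be modelled with a small stand-in object. Everything here is hypothetical (``PartialModuleSketch``, ``initializing``, ``run_annotation``); it only mimics the specified behavior and makes no claim about how CPython detects a partially executed module.

```python
class PartialModuleSketch:
    """Hypothetical stand-in for a module object; not CPython's implementation."""

    def __init__(self):
        self.initializing = True  # True while the module body is still running
        self._executed = {}       # annotations whose statements have run so far
        self._cache = None        # filled only once execution has finished

    def run_annotation(self, name, value):
        """Simulate executing one annotated assignment in the module body."""
        self._executed[name] = value

    def annotate(self):
        """Play the role of __annotate__: build a fresh dict on every call."""
        return dict(self._executed)

    @property
    def annotations(self):
        if self._cache is not None:
            return self._cache
        result = self.annotate()
        if not self.initializing:
            # Cache only when no further annotations can appear.
            self._cache = result
        return result


mod = PartialModuleSketch()
mod.run_annotation("v1", int)
seen_by_b = mod.annotations       # accessed while the module is still executing
mod.run_annotation("v2", int)
mod.initializing = False
seen_by_main = mod.annotations    # accessed after execution has completed

print(seen_by_b)
print(seen_by_main)
```

The early access sees only ``v1`` and is not cached, so the later access can pick up ``v2`` and is then reused for subsequent lookups.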
892+
893+
755894
Miscellaneous implementation details
756895
====================================
757896

@@ -840,16 +979,28 @@ to be supported by third-party libraries. Nevertheless, it is a serious issue fo
840979
that perform introspection, and it is important that we make it as easy as possible for
841980
libraries to support the new semantics in a straightforward, user-friendly way.
842981

843-
We will update those parts of the standard library that are affected by this problem,
844-
and we propose to add commonly useful functionality to the new ``annotationlib`` module,
845-
so third-party tools can use the same set of tools.
982+
Several pieces of functionality in the standard library are affected by this issue,
983+
including :mod:`dataclasses`, :class:`typing.TypedDict` and :class:`typing.NamedTuple`.
984+
These have been updated to support this pattern using the functionality in the new
985+
``annotationlib`` module.
846986

847987

848988
Security Implications
849989
=====================
850990

851-
None.
852-
991+
One consequence of :pep:`649` is that accessing annotations on an object, even if
992+
the object is a function or a module, may now execute arbitrary code. This is true
993+
even if the ``STRING`` format is used, because the stringifier mechanism only overrides
994+
the global namespace, and that is not enough to sandbox Python code completely.
995+
996+
In previous Python versions, accessing the annotations of functions or modules
997+
could not execute arbitrary code, but classes and other objects could already
998+
execute arbitrary code on access of the ``__annotations__`` attribute.
999+
Similarly, almost any further introspection on the annotations (e.g.,
1000+
using ``isinstance()``, calling functions like ``typing.get_origin``, or even
1001+
displaying the annotations with ``repr()``) could already execute arbitrary code.
1002+
And of course, accessing annotations from untrusted code implies that the untrusted
1003+
code has already been imported.
8531004

8541005
How to Teach This
8551006
=================
