Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dict() overload insufficient #10013

Closed
imba-tjd opened this issue Apr 5, 2023 · 7 comments
Closed

dict() overload insufficient #10013

imba-tjd opened this issue Apr 5, 2023 · 7 comments

Comments

@imba-tjd
Copy link
Contributor

imba-tjd commented Apr 5, 2023

    # Next overload is for dict(string.split(sep) for string in iterable)
    # Cannot be Iterable[Sequence[_T]] or otherwise dict(["foo", "bar", "baz"]) is not an error
    @overload
    def __init__(self: dict[str, str], __iterable: Iterable[list[str]]) -> None: ...

This typing works with dict([['a', '1'], ['b', '2']]), but not dict([['a', 1], ['b', 2]])

Or

    @overload
    def __init__(self, __iterable: Iterable[tuple[_KT, _VT]]) -> None: ...
    @overload
    def __init__(self: dict[str, _VT], __iterable: Iterable[tuple[str, _VT]], **kwargs: _VT) -> None: ...

This works with dict([('a', 1), ('b', 2)]), but not dict([['a', 1], ['b', 2]])

@srittau
Copy link
Collaborator

srittau commented Apr 5, 2023

Currently typing does not support multi-type lists. See python/typing#592. Unfortunately, there is no way for us to fix/work around this in typeshed.

@srittau srittau closed this as not planned Won't fix, can't repro, duplicate, stale Apr 5, 2023
@ods
Copy link

ods commented Apr 18, 2023

It's strange to see it closed. According to docs

[…] the positional argument must be an iterable object. Each item in the iterable must itself be an iterable with exactly two objects.

It's OK if typing tools don't catch all the cases that are not possible to describe with current typing system. But explicitly disallowing valid cases is certainly a bug. The problem is not only with multi-type list, and even not lists only. The following case is valid too, but is not acceptable according to current annotations:

dict([iter([1, 2]), iter([2, 3])])

@Akuli
Copy link
Collaborator

Akuli commented Apr 18, 2023

Do you actually have a use case for dict([[1, 2], [3, 4]])? As the comment says, we currently support strings because passing an iterable of foo.split() to dict() is common (e.g. when parsing a text file that consists of key=value lines). If you want us to support something more general, please explain why you need it :)

As the comment says, we can't just accept an iterable of iterables, because then type checkers would not complain about dict(iterable_of_strings). We really want a type checker error for that. It's a somewhat common mistake, because iterables of strings are very common. For example, you might forget to .split() your strings when attempting dict(foo.split("=") for foo in blah).

@JelleZijlstra
Copy link
Member

Yes, the trouble here is that it's hard to write the types so that they accept everything the runtime accepts, while still rejecting a reasonable set of common mistakes.

@ods
Copy link

ods commented Apr 19, 2023

Do you actually have a use case for dict([[1, 2], [3, 4]])?

Not exactly, it was just a simplest example of valid code with false positive complaint due to bug in typeshed. Here is real life example with SQLAlchemy:

result: CursorResult[tuple[str, some_type]] = await connection.execute(
    select(some_table.c.string_field, some_table.c.some_type_field)
)
dict(result.all())

Here CursorResult.all() returns a Sequence[Row[…]], and Row is an Iterable[Any].

If you want us to support something more general, please explain why you need it :)

Because for typing false negative is unfortunate, but ok, while false positive is a certain bug.

As the comment says, we can't just accept an iterable of iterables, because then type checkers would not complain about dict(iterable_of_strings).

But it MUST NOT complaint about dict(iterable_of_strings), because it's pretty valid both according to docs and implementation:

>>> dict(['ab', 'cd'])
{'a': 'b', 'c': 'd'}

Do you mean we can potentially use strings of different length? But typing is not expected to cover all possible runtime errors. So dict(['abc', 'cde']) is pretty valid from typing point of view, even if it causes an error when run.

@imba-tjd
Copy link
Contributor Author

imba-tjd commented Apr 22, 2023

In my case, I was parsing HTTP Headers into dict

header_raw = b'''\
Host: example.com\r
User-Agent: curl/8.0.1\r
Accept: */*'''

result = dict(line.split(b': ') for line in header_raw.split(b'\r\n'))

print(result)

When using str rather than bytes, it can be inferred.

Not fixing this is not a big issue for me. I understand there are limitations.

@Akuli
Copy link
Collaborator

Akuli commented Apr 22, 2023

I added a similar overload with bytes and checked that the HTTP headers example works with the latest typeshed.

So dict(['abc', 'cde']) is pretty valid from typing point of view, even if it causes an error when run.

I like "practicality beats purity": it just isn't practical for type checkers to be happy with that. It's more likely a mistake than not, and IMO it's exactly the kind of mistake that people expect type checkers to point out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants