The Wayback Machine - https://web.archive.org/web/20211029174234/https://github.com/python/cpython/pull/29170
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-45250: fix docs regarding __iter__ and iterators #29170

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

@brettcannon
Copy link
Member

@brettcannon brettcannon commented Oct 22, 2021

https://bugs.python.org/issue45250

@brettcannon brettcannon self-assigned this Oct 22, 2021
@srittau
Copy link
Contributor

@srittau srittau commented Oct 25, 2021

I'm strongly opposed to changing the meaning of a term that has had a clear definition for decades (and is implemented as collections.abc.Iterator) by an ambiguous definition. It's clear that some methods accept a super type, instead of requiring a full iterator, but that means the description of those methods should be changed, not the definition.

Addendum: As a typeshed maintainer I see the problem of ill-defined terms and protocols in Python everyday ("file-like objects" are my personal bane). Making existing clear definitions less clear is not the right way forward.

@brettcannon
Copy link
Member Author

@brettcannon brettcannon commented Oct 25, 2021

The issue is the definition as written down is wrong/inaccurate from the perspective of Python itself. So it isn't that this redefines a term as much as actually makes the glossary reflect the real world as to how Python itself uses and considered what iterators are (i.e. this properly reflects what PyIter_Check() actually checks for). For instance, if a pass an iterable to a for loop and it returns something that only defines __next__(), it will work. If you were to ask someone what that object returned by the iterable is, do you think people would argue that it isn't an iterator simply because it doesn't define __iter__(), or that it is an iterator because a for loop accepts it?

And I don't quite understand how the definition is ambiguous? An iterator defines __next__(). You probably should define __iter__(), but it isn't strictly necessary to meet the iterator protocol, it just so happens the commonly used ABC for showing iterator compliance provides an __iter__() definition for free. If the ABC wasn't used for type hints then I don't think anyone would care, but because the class predates protocols then people have accidentally thought __iter__() was a required part of being an iterator (along with the inaccurate glossary definitions).

What's your proposal otherwise? To create a brand new term of what e.g. a for loop expects an iterable to return that isn't an iterator? To my knowledge everyone considers that an iterator, so trying to redefine that is also a major ask and something that I personally think is less helpful than simply getting the definition to be accurate.

Do note that Guido and other folks agree with this plan in https://mail.python.org/archives/list/[email protected]/thread/3W7TDX5KNVQVGT5CUHBK33M7VNTP25DZ/#3W7TDX5KNVQVGT5CUHBK33M7VNTP25DZ, so this isn't entirely without discussion and some agreement. We can ask the SC to make a final call on this if you really want to (I will abstain from voting on it if it comes to that).

@srittau
Copy link
Contributor

@srittau srittau commented Oct 26, 2021

(I'll use the terms "partial iterator" for objects just defining __next__() and "full iterator" for objects also defining __iter__().)

I'm not looking at this from the perspective how this is implemented in Python, but from a user's perspective. I don't think it's correct to say Python considers an iterator to have only __next__(). That may be true for the CPython core, but it's clearly not the case for the standard library (which I consider to be part of Python), where collections.abc.Iterator and typing.Iterator require the __iter__() method.

Practically, the status quo (some function, methods, statements don't require __iter__()) is a non-issue in practice. But saying "__iter__() is optional (but recommended)" turns this into a practical communication issue. Now, if I have one API that returns an iterator and a second API that accepts an iterator, I can't be sure that they are compatible. I can only hope that either the API returning the iterator follows the recommendation or the API consuming it doesn't access __iter__(). In the future, APIs need to document whether they include __iter__() or not. It also means that all existing documentation that mentions "iterators" needs to be updated to clarify whether they mean partial or full iterators.

I see two solutions to this problem:

  1. Introduce a new term as you suggest (e.g. "partial iterator") and define a link between the "iterator" and "partial iterator" glossary entries. This has the advantage that both concepts are clearly named.
  2. My preferred choice: Keep the definition of "iterator" as is and update the documentation where applicable, either with a note that technically __iter__() is not required, or by replacing the term "iterator" with a sentence like "an object implementing the __next__() protocol, e.g. an iterator".

@brettcannon
Copy link
Member Author

@brettcannon brettcannon commented Oct 26, 2021

collections.abc.Iterator and typing.Iterator require the iter() method.

I think this is where our views are differing. It's an unfortunate side-effect/bug, from my perspective, that because collections.abc.Iterator includes a default implementation for __iter__() that typing considers that method as required (tangent: this is the same reason I want to create protocols for importlib as importlib.abc has classes that are too broad for typing purposes due to the helper methods provided).

In other words I don't view collections.abc.Iterator as the official definition/type of what an iterator is, but instead an ABC that helps you implement an iterator.

Maybe this is suggesting it's time to have a lower-level protocols module or something which gets all of these interfaces correctly based on the language definition, then collections.abc can inherit from that new module? Or perhaps typing.Iterator comes back as a protocol or there's a new typing.SupportsIteration that does the right thing while collections.abc.Iterator inherits from it?

@brettcannon brettcannon reopened this Oct 28, 2021
@ambv
Copy link
Contributor

@ambv ambv commented Oct 28, 2021

The relevant python-dev thread doesn't seem to have reached consensus and died out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment