StopIteration from first() caught by a generator #465
Comments
As a matter of record,
```python
from toolz import first
x = list(map(first, [[1, 2, 3], [], [1, 2, 3], [1, 2, 3]]))  # x == [1]
```
The question is should `first` and friends handle `StopIteration`? Raising a different exception would break backwards compatibility and potentially break existing code.

Faced with a situation like this, I would probably suggest something like:
```python
from toolz import pluck
x = list(pluck(0, [[1, 2, 3], [], [1, 2, 3], [1, 2, 3]]))  # Raises IndexError
# or
from toolz.curried import get
x = list(map(get(0), [[1, 2, 3], [], [1, 2, 3], [1, 2, 3]]))  # Raises IndexError
```
Note both of these will only work on indexed sequences.
Thank you for your answer.

> The question is should `first` and friends handle `StopIteration`?

Which are the other functions apart from `first` and `second` that raise
`StopIteration`?

> Raising a different exception would break backwards compatibility and
> potentially break existing code.

Yes, this is something I thought about. Generally, I believe that
backward compatibility should be protected and I consider it a priority.
However, in this case, it might be more costly for the users to keep
raising `StopIteration` than to introduce a different exception. Why?
Because `StopIteration` is very likely to introduce silent logical errors
in the code (it actually happened to me, before I discovered this
behavior). It's especially problematic in functional style with
function composition and many `map`s. "The Zen of Python" says "Errors
should never pass silently." ;-)

I agree that it will break existing code, but it will be explicit. My
feeling is that this code might already be broken, because it's just so
unexpected that `first` on `[]` can stop `map`, e.g.,
`calculate_average(map(first, seq_of_seqs))`. It would be good to know how
many people find this behavior surprising, or how many use this as a
feature. Could you comment on that?

> Faced with a situation like this, I would probably suggest something like:
> ```python
> from toolz import pluck
> x = list(pluck(0, [[1, 2, 3], [], [1, 2, 3], [1, 2, 3]]))  # Raises IndexError
> # or
> from toolz.curried import get
> x = list(map(get(0), [[1, 2, 3], [], [1, 2, 3], [1, 2, 3]]))  # Raises IndexError
> ```
> Note both of these will only work on indexed sequences.

I see, but this was a simplistic example to illustrate the issue, and
generally there could be a generator instead of a list.

Thanks again!
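
For readers following along, here is a minimal sketch of the silent truncation described above. It does not import toolz: the local `first` is a stand-in that is assumed to behave like the function under discussion (it raises `StopIteration` on empty input), and the body of `calculate_average` is hypothetical, since only its name appears in the thread.

```python
def first(seq):
    # Stand-in (assumption): raises StopIteration when seq is empty.
    return next(iter(seq))


def calculate_average(values):
    # Hypothetical helper; only the name comes from the discussion.
    values = list(values)
    return sum(values) / len(values)


seq_of_seqs = [[1, 2, 3], [], [5, 6], [9]]

# map() treats the StopIteration leaking out of first([]) as its own
# end-of-iteration signal, so everything after the empty list is dropped.
firsts = list(map(first, seq_of_seqs))

# The average is computed over the truncated data without any error.
average = calculate_average(map(first, seq_of_seqs))

# The same happens when the inner sequences are generators rather than lists.
gens = ((x for x in s) for s in seq_of_seqs)
firsts_from_gens = list(map(first, gens))
```

Nothing in this snippet raises, which is the point of the complaint: the caller gets a plausible-looking but wrong result.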
|
@mrkrd, any function that manipulates the iterator directly could potentially raise a StopIteration. Such a functions would be You make a very good point and I would like to add that there is precedent in itertoolz already for catching and handling StopIteration. I agree that the behavior is subtle. @eriknw what are your thoughts on this? |
This might be useful to the issue: https://www.python.org/dev/peps/pep-0479/ @mrkrd are you using Python 3? If so, does adding EDIT: Just remembered that |
I agree this is non-ideal behavior, so thanks for the thoughtful discussion already. My initial reaction is we could perhaps follow in the spirit of https://www.python.org/dev/peps/pep-0479/ and raise `RuntimeError`. In this case, `first` and `second` could be implemented as:
```python
import itertools

def first(seq):
    for rv in seq:
        return rv
    raise RuntimeError()

def second(seq):
    for rv in itertools.islice(seq, 1, None):
        return rv
    raise RuntimeError()
```
I'll want to give careful consideration to other functions. Returning |
@eriknw, I think returning an Perhaps raising a custom BTW: @eriknw, your for-loop implementation of |
> I agree this is non-ideal behavior, so thanks for the thoughtful discussion already. My initial reaction is we could perhaps follow in the spirit of https://www.python.org/dev/peps/pep-0479/ and raise `RuntimeError`. In this case, `first` and `second` could be implemented as:
> ```python
> import itertools
>
> def first(seq):
>     for rv in seq:
>         return rv
>     raise RuntimeError()
>
> def second(seq):
>     for rv in itertools.islice(seq, 1, None):
>         return rv
>     raise RuntimeError()
> ```

I like this idea. It solves the problem at hand (short term) and is aligned with the mentioned PEP (long term).
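
As a rough illustration of the PEP 479 behavior referred to here, the sketch below assumes Python 3.7 or later, where a `StopIteration` that escapes a generator body is converted to `RuntimeError`. The local `first` is again a stand-in that raises `StopIteration` on empty input, and `firsts_gen` is a hypothetical name used only for this example.

```python
def first(seq):
    # Stand-in (assumption): raises StopIteration when seq is empty.
    return next(iter(seq))


def firsts_gen(seqs):
    # A generator doing the same job as map(first, seqs).
    for seq in seqs:
        yield first(seq)


data = [[1, 2, 3], [], [4]]

# Inside a generator, PEP 479 turns the leaked StopIteration into RuntimeError.
try:
    list(firsts_gen(data))
    generator_outcome = "no error"
except RuntimeError:
    generator_outcome = "RuntimeError"

# map() is not a generator function, so PEP 479 does not apply and the
# result is silently truncated instead.
map_result = list(map(first, data))
```

So the generator form already fails loudly on modern Python, while the `map` form does not; raising `RuntimeError` from `first` itself would make the two consistent.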
|
Thanks!
|
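Unfortunately, this bug shows that Toolz API shall not be locked yet.

IMHO, the best would be introduction a `Maybe` monad. `first` should return `Nothing` on empty input sequence and as consequence `map(first, [[1,2,3], [], [1,2,3], [1,2,3]])` shall return `Nothing` as well (using design from the Wikipedia example here). The Toolz documentation emphasizes a lot of functional programming concepts and it seems it needs one more to not fall short of its promises.

If that's not possible, the `IterationError` looks like the next best alternative. The documentation shall be improved of `first` function to suggest its use together with `excepts` to avoid side effects in form of an exception (you basically implement your own `Maybe` monad every time).

The documentation update can be done before #473 is merged and reference `StopIteration` for now. Would such documentation change have a chance to be merged any time soon? Asking as the project seems to be stalled a bit (this bug, #470, #466, #477, etc).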
> IMHO, the best would be introduction a `Maybe` monad. `first` should
> return `Nothing` on empty input sequence and as consequence
> `map(first, [[1,2,3], [], [1,2,3], [1,2,3]])` shall return `Nothing`
> as well (using design from the Wikipedia example here). The Toolz
> documentation emphasizes a lot of functional programming concepts and
> it seems it needs one more to not fall short of its promises.

No one promised anything, right? ;)

> If that's not possible, the `IterationError` looks like the next best
> alternative. The documentation shall be improved of `first` function
> to suggest its use together with `excepts` to avoid side effects in
> form of an exception (you basically implement your own `Maybe` monad
> every time).

As a user of toolz, I have a feeling that it tries to balance
"functional style" and "Pythonic style." In some cases it's easier, in
others more difficult.

If I understand correctly, toolz has its roots in the Python modules
itertools and functools. Generally, it pushes the limits of "functional
style" while remaining quite Pythonic. I.e., it doesn't try to be
purely functional at any cost.

In the case of `first()`, raising an exception seems more Pythonic to me
than a `Maybe` monad and returning `Nothing`. Imagine I had written a
library based on toolz, and this library had been used by some Python
programmers: I guess they would be more familiar with an `IterationError`
exception than with `Nothing`.

There are also those two lines from The Zen of Python:

[…] practicality beats purity.
Errors should never pass silently.

PS I understand that those are not strong technical arguments by any
means, and it's easy to find counterarguments. However, I think that
the key is to focus on the bigger picture (right balance between "purely
functional" and "Pythonic").

> The documentation update can be done before #473 is merged and
> reference `StopIteration` for now. Would such documentation change
> have a chance to be merged any time soon? Asking as the project seems
> to be stalled a bit (this bug, #470, #466, #477, etc).

I agree that it would be nice to have those fixed.
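
The "do it yourself `Maybe`" idea mentioned above can be sketched roughly as follows. The thread suggests toolz's `excepts` for this; to stay self-contained, the sketch below deliberately avoids toolz and relies only on the built-in two-argument form `next(iterator, default)`. The name `first_or` is hypothetical and not part of any library.

```python
from functools import partial


def first_or(seq, default=None):
    # Hypothetical helper: return the first element, or `default` when
    # seq is empty, instead of letting StopIteration escape.
    return next(iter(seq), default)


data = [[1, 2, 3], [], [4, 5]]

# Empty inputs become None, so map() visits every element.
with_none = list(map(first_or, data))

# A different placeholder can be bound with functools.partial.
with_zero = list(map(partial(first_or, default=0), data))
```

The trade-off is the one debated in this thread: a default keeps the pipeline running, but the caller then has to distinguish a real `None` (or `0`) from "the sequence was empty".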
|
@wrobell that is a good suggestion. I'm curious to know what it would look like for the toolz API to support monads. I'm open to experimenting with that in the future. @mrkrd is correct in that the aim of toolz is to extend parts of the standard library and bring useful utility functions to Python. It seeks to complement Pythonic paradigms. From the documentation, "The toolz API is also strongly affected by the principles of the Python language itself, and often makes deviations in order to be more approachable to that community." (https://toolz.readthedocs.io/en/latest/heritage.html). Depending on toolz should be a very easy thing and should not require substantial changes to code that may already be in production. I think the right course of action right now is the iteration error route. I'd like to see this issue resolved as well.
If more Pythonic, then what about raising the exception and allow setting a default value like |
@wrobell one can certainly pass a default value to Taking a step back, do you have a use case where setting a default is required? |
Please consider
vs.
|
Coming from Clojure, it is very counterintuitive that I’ve been bitten by this behavior right now. I wanted to open an issue, then found this one. |
Hello, I've noticed a surprising behavior of `first()` which can be well illustrated by:

I would expect this statement to raise an exception, because `first([])` normally raises an exception on an empty sequence. However, the exception is `StopIteration`, which seems to interfere with `map` and stops its iteration.

What do you think about catching `StopIteration` in `first()` and re-raising a different exception, so it can propagate until explicitly caught? Which exception would be appropriate?
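
The proposal in this post, catching `StopIteration` inside `first()` and re-raising something else, could look roughly like the sketch below. The exception name `IterationError` is the one floated elsewhere in this thread; its definition here is hypothetical, and this is not the toolz implementation.

```python
class IterationError(Exception):
    """Hypothetical exception for 'no element available' (not a toolz class)."""


def first(seq):
    # Sketch only: translate StopIteration into an exception that map()
    # and other iterator consumers will not mistake for end-of-iteration.
    try:
        return next(iter(seq))
    except StopIteration:
        raise IterationError("first() called on an empty sequence") from None


try:
    list(map(first, [[1, 2, 3], [], [4]]))
    raised = False
except IterationError:
    raised = True
```

With this variant the `map` call fails at the empty list instead of returning a truncated result, at the cost of changing which exception existing callers need to catch.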