__hash__ method in str child class causing unintended side effects #100313
Comments
I suspect this is undefined behaviour. I quote from the documentation of
In other words, Another fix (probably the correct fix) to this issue is to make sure
class MyClass:
    def __init__(self, data):
        self.data = data
    def __str__(self):
        return str(self.data)
...
This is not really guaranteed, since your code has undefined behaviour.
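As a rough illustration of the fix suggested above, the sketch below makes a wrapper's __str__ always hand back a plain str, even when the wrapped value is a str subclass whose own __str__ returns itself. The names Tagged, Holder and exact_str are hypothetical and exist only for this example; it is one possible way to do it, not necessarily what the commenter had in mind.

```python
class Tagged(str):
    """Hypothetical str subclass whose __str__ returns the subclass instance."""

    def __str__(self):
        return self


def exact_str(value):
    """Convert value to text and make sure the result is a plain str."""
    text = str(value)
    if type(text) is not str:
        # str.__str__ copies a str subclass instance into an exact str.
        text = str.__str__(text)
    return text


class Holder:
    """Hypothetical wrapper, playing the role of MyClass in the thread."""

    def __init__(self, data):
        self.data = data

    def __str__(self):
        # Never leak the caller's str subclass instance out of __str__.
        return exact_str(self.data)


result = str(Holder(Tagged("teststring")))
print(type(result) is str)
```

Because Holder.__str__ returns a fresh plain str, str() never gets hold of the caller's own object, so there is nothing of the caller's for it to re-initialise.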
The strange part here is how Here is a somewhat bandaid:
class MyStr(str):
    def __init__(self, value, *args, **kwargs):
        self.value = str(value)
        super().__init__(*args, **kwargs)
    def __hash__(self) -> int:
        return hash(str(self))
    def __str__(self) -> str:
        return str.__str__(self.value)

def dummy_func(x):
    class MyClass:
        def __init__(self, data: str):
            self.data = data
        def __str__(self):
            return self.data
    str(MyClass(x))

a = MyStr("teststring")
print(a in {})  # False
dummy_func(a)
print(a in {})  # False
I was able to recreate what was happening above a bit more concisely,
class DerivedStr(str):
    def __init__(self, v) -> None:
        # type(v)=<class 'str'>
        # type(v)=<class '__main__.Repro'>
        print(f"{type(v)=}")
        self.v = v
    def __str__(self) -> str:
        return self.v

class Repro:
    def __str__(self) -> str:
        return DerivedStr("Subclass test")

r = Repro()
x = str(r)  # Returns DerivedStr instance
            # Instantiates DerivedStr with Repro
y = str(x)  # TypeError: __str__ returned non-string (type Repro)
Apparently instantiating
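To make the reproducer above easier to inspect, here is a minimal sketch that records every __init__ call instead of overwriting state. TracedStr and Wrapper are hypothetical names used only for this illustration. The part that can be relied on is that str() returns the very object produced by __str__ when that object is a str subclass instance. Whether __init__ then runs a second time on that object, as the comments in the reproducer report, may depend on the interpreter version, so no expected output is given for that part.

```python
class TracedStr(str):
    """Hypothetical str subclass that logs the argument type of each __init__ call."""

    def __init__(self, value):
        # Append rather than assign, so repeated calls stay visible.
        self.__dict__.setdefault("init_arg_types", []).append(type(value).__name__)


class Wrapper:
    """Hypothetical class whose __str__ returns a str subclass instance unchanged."""

    def __init__(self, payload):
        self.payload = payload

    def __str__(self):
        return self.payload


s = TracedStr("Subclass test")
result = str(Wrapper(s))

# str() passes the subclass instance through instead of copying it.
print(result is s)

# The first entry comes from TracedStr("Subclass test"). A second entry,
# if present, means str() invoked __init__ again with the Wrapper instance.
print(s.init_arg_types)
```

If a second entry shows up, it corresponds to the "Instantiates DerivedStr with Repro" step in the reproducer: the existing object is re-initialised with the argument that was passed to str().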
An issue with this reproducer is the use of
class DerivedStr(str):
    def __new__(cls, v) -> None:
        self = super().__new__(cls, v)
        print(f"__init__ {v=} {type(v)=}")
        self.v = v
        return self
    def __str__(self) -> str:
        print(f"__str__ {self.v=} {type(self.v)=}")
        return (self.v)
Leaving the issue open because I'm not sure if this fully explains what's going on here.
I think the main idea here is that
This mutation causes
The heart of this is that I agree with the suggestions that
One possibility to prevent this would be to deprecate returning anything but an exact
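The proposal above hinges on the difference between "a str instance" and "an exact str". As a small sketch, assuming the proposal refers to the value returned by __str__, the hypothetical helper below reports whether an object's __str__ returns an exact str. It calls __str__ directly rather than going through str(), so it does not trigger any of the behaviour discussed in this issue.

```python
def str_result_is_exact(obj) -> bool:
    """Return True if obj's __str__ yields a plain str, not a subclass instance."""
    result = type(obj).__str__(obj)
    # isinstance(result, str) would also accept subclasses; type(...) is str does not.
    return type(result) is str


class Sub(str):
    """Hypothetical str subclass for the demonstration."""


class ReturnsSubclass:
    def __str__(self):
        return Sub("text")


class ReturnsExact:
    def __str__(self):
        return "text"


print(str_result_is_exact(ReturnsSubclass()))  # False
print(str_result_is_exact(ReturnsExact()))     # True
```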
@sweeneyde Thanks, that's a super clear view of what's happening here. I think we may either throw a TypeError if
That would break existing code though. I'm using a subclass of str() in PyObjC, including in the implementation of
Marlin-Na commented Dec 17, 2022
Bug report
Consider the following example:
dummy_func is a function that should have absolutely no side effects. However, check out:
In this case, dummy_func mutates the input, and creates a weird circular dependency (i.e. id(a.value.data) == id(a)).
The issue can be fixed by removing the __hash__ method of MyStr.
Please confirm the behavior is unintended.
Environment