The Wayback Machine - https://web.archive.org/web/20220221170423/https://github.com/pandas-dev/pandas/issues/42875
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: wrong/confusing error message in DataFrame.take with a scalar indexer #42875

Closed
jorisvandenbossche opened this issue Aug 4, 2021 · 20 comments · Fixed by #44763
Closed

BUG: wrong/confusing error message in DataFrame.take with a scalar indexer #42875

jorisvandenbossche opened this issue Aug 4, 2021 · 20 comments · Fixed by #44763

Comments

@jorisvandenbossche
Copy link
Member

@jorisvandenbossche jorisvandenbossche commented Aug 4, 2021

In [4]: df = pd.DataFrame({'a': [1, 2, 3]})

In [5]: df.take(1)
...

~/scipy/pandas/pandas/core/indexes/range.py in take(self, indices, axis, allow_fill, fill_value, **kwargs)
    433     ) -> Int64Index:
    434         with rewrite_exception("Int64Index", type(self).__name__):
--> 435             return self._int64index.take(
    436                 indices,
    437                 axis=axis,

~/scipy/pandas/pandas/core/indexes/base.py in take(self, indices, axis, allow_fill, fill_value, **kwargs)
    962             self._values, indices, allow_fill=allow_fill, fill_value=self._na_value
    963         )
--> 964         return type(self)._simple_new(taken, name=self.name)
    965 
    966     @final

~/scipy/pandas/pandas/core/indexes/base.py in _simple_new(cls, values, name)
    611         Must be careful not to recurse.
    612         """
--> 613         assert isinstance(values, np.ndarray), type(values)
    614 
    615         result = object.__new__(cls)

AssertionError: <class 'numpy.int64'>

take requires an array-like for the indexer, but when passing a scalar you get a not very useful error message as above. We should raise a ValueError (or rather TypeError) instead with an informative message.

It actually comes from the Index take implementation:

In [9]: pd.Index([1, 2, 3]).take(1)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-9-ce9f87ba8bd2> in <module>
----> 1 pd.Index([1, 2, 3]).take(1)

~/scipy/pandas/pandas/core/indexes/base.py in take(self, indices, axis, allow_fill, fill_value, **kwargs)
    962             self._values, indices, allow_fill=allow_fill, fill_value=self._na_value
    963         )
--> 964         return type(self)._simple_new(taken, name=self.name)
    965 
    966     @final

~/scipy/pandas/pandas/core/indexes/base.py in _simple_new(cls, values, name)
    611         Must be careful not to recurse.
    612         """
--> 613         assert isinstance(values, np.ndarray), type(values)
    614 
    615         result = object.__new__(cls)

AssertionError: <class 'numpy.int64'>
@radioactive11
Copy link

@radioactive11 radioactive11 commented Aug 4, 2021

Can I take up this issue?

@jorisvandenbossche
Copy link
Member Author

@jorisvandenbossche jorisvandenbossche commented Aug 4, 2021

Yes, certainly!

@Arnab1181412
Copy link

@Arnab1181412 Arnab1181412 commented Aug 4, 2021

Sir can I take this issue? Im new to the contrib community.

@jorisvandenbossche
Copy link
Member Author

@jorisvandenbossche jorisvandenbossche commented Aug 4, 2021

@Arnab1181412 as you can read in the message above, somebody else already asked that an hour ago. So let's first give some time to @radioactive11 to try.

@radioactive11
Copy link

@radioactive11 radioactive11 commented Aug 4, 2021

@jorisvandenbossche I am working on this issue. Just wanted to approve my approach. Currently, an assertion is being raised if values is not of type np.ndarray.

Instead, we want to raise a TypeError with a helpful message stating values should be of type np.ndarray.

@jorisvandenbossche
Copy link
Member Author

@jorisvandenbossche jorisvandenbossche commented Aug 4, 2021

Currently, an assertion is being raised if values is not of type np.ndarray.

This is raised inside the Index._simple_new. But for fixing this PR, I would add a check higher up in Index.take to avoid that we pass a non-ndarray to Index._simple_new

@Arnab1181412
Copy link

@Arnab1181412 Arnab1181412 commented Aug 4, 2021

@jorisvandenbossche Sure sir and I had one query that how will I know that the issue has been merged successfully .

@jreback jreback removed this from the Contributions Welcome milestone Aug 4, 2021
@jreback jreback added this to the 1.4 milestone Aug 4, 2021
@jorisvandenbossche
Copy link
Member Author

@jorisvandenbossche jorisvandenbossche commented Aug 5, 2021

@Arnab1181412 you will get a notification when this issue would get closed.

@sinking8
Copy link

@sinking8 sinking8 commented Aug 17, 2021

@jorisvandenbossche Sir can I take up this issue?

@a-n-622
Copy link

@a-n-622 a-n-622 commented Aug 17, 2021

Sir, instead of scalar indexer, we should ideally pass a scalar array of indices. This is what I gathered from the documentation, and the error message hints towards the same thing, that it is expecting an array (I had similar issue in some other project I was doing). I hope this is resolved.

@radioactive11
Copy link

@radioactive11 radioactive11 commented Aug 18, 2021

@a-n-622 We should pass an array-like indexer in .take(). The goal of this issue is to improve the error message when a scalar indexer is passed.

@a-n-622
Copy link

@a-n-622 a-n-622 commented Aug 18, 2021

@radioactive11 got it. I am viewing the discussion on #42886 now. thanks.

@S-T-A-R-L-O-R-D
Copy link

@S-T-A-R-L-O-R-D S-T-A-R-L-O-R-D commented Aug 24, 2021

Hello everyone, I am working on this issue. I have understood the problem and hopefully will solve it by tomorrow.

@radioactive11
Copy link

@radioactive11 radioactive11 commented Aug 24, 2021

@S-T-A-R-L-O-R-D thanks for showing interest but I already have a PR open and have almost solved the issue.

@S-T-A-R-L-O-R-D
Copy link

@S-T-A-R-L-O-R-D S-T-A-R-L-O-R-D commented Aug 24, 2021

ok

@Anupam-USP
Copy link
Contributor

@Anupam-USP Anupam-USP commented Sep 28, 2021

Is the issue solved?

@radioactive11
Copy link

@radioactive11 radioactive11 commented Oct 9, 2021

@Anupam-USP I am currently working on the issue. I have added an appropriate error message and handled the error as well. However, I am unable to pass all the tests and am currently investigating the tests which are failing

@ADing4818
Copy link

@ADing4818 ADing4818 commented Nov 26, 2021

I see that this issue is still open with no comments in the past 1.5 months. I would like to take this issue.

@ADing4818
Copy link

@ADing4818 ADing4818 commented Nov 26, 2021

take

@gfkang
Copy link
Contributor

@gfkang gfkang commented Dec 5, 2021

FYI, I'm @ADing4818 's teammate for a project, so I'll be creating the PR for this issue on behalf of him.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

10 participants