The Wayback Machine - https://web.archive.org/web/20200917122934/https://github.com/pandas-dev/pandas/issues/36308

BUG: DataFrameGroupBy.transform with axis=1 fails #36308

Open
rhshadrach opened this issue Sep 12, 2020 · 4 comments · May be fixed by #36350

rhshadrach (Member) commented Sep 12, 2020

The following seems to happen with all groupby transform kernels:

from pandas import DataFrame
df = DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df.groupby([1, 1], axis=1).transform("shift")

This results in ValueError: Length mismatch: Expected axis has 2 elements, new values have 3 elements.

I think the issue is in pandas.core.groupby.generic._wrap_transformed_output; this method does not take self.axis into account when wrapping the output. All that should be needed is to transpose the result (result = result.T) and to use the index labels rather than the column labels for the columns there.
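The transpose idea described above can also be tried at the user level. The following is only a sketch of that idea, not the patch proposed for pandas: it transposes the frame so the columns become rows, groups along the rows, and transposes the result back. The names keys and result are illustrative, and wrapping the keys in a NumPy array is an assumption made here so that pandas treats them as group labels rather than as column names of the transposed frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# One group key per original column, as in the report.
# A NumPy array is used so the keys are read as labels, not as
# names of columns in the transposed frame (whose columns are 0, 1, 2).
keys = np.array([1, 1])

# Transpose so columns become rows, shift within each group along
# the rows, then transpose back to the original orientation.
result = df.T.groupby(keys).transform("shift").T

print(result)
```

Because both columns fall into a single group here, the result holds NaN in column A and the original values of A in column B.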

Ethanator commented Sep 12, 2020

Could I work on this issue?

xxsacxx commented Sep 12, 2020

I tried the same example with:
df.groupby(['val'] * df.shape[1], axis=1).transform('shift')
which failed with
Shape of passed values is (2, 3), indices imply (3, 3)

while
df.groupby(['val'] * df.shape[1], axis=1).transform(lambda x: x.shift())

gives:
     A    B
0  NaN  1.0
1  NaN  2.0
2  NaN  3.0

What output are we expecting here?

rhshadrach (Member, Author) commented Sep 12, 2020

@Ethanator Absolutely! Simply comment "take" here and GitHub will assign the issue to you. If you encounter any difficulties, feel free to reach out here.

@xxsacxx Your second result looks correct to me: the values are shifted one column to the right, and since there are no values to the left of the first column, you get NaN there. The presence of NaN then coerces the dtype to float.

Ethanator commented Sep 12, 2020

take

arw2019 mentioned this issue Sep 13, 2020
Ethanator pushed a commit to Ethanator/pandas that referenced this issue Sep 14, 2020
Ethanator linked a pull request that will close this issue Sep 14, 2020
jreback modified the milestones: Contributions Welcome, 1.2 Sep 15, 2020