The Wayback Machine - https://web.archive.org/web/20211130071103/https://github.com/pandas-dev/pandas/issues/43793

ENH: Drop Index Combinations from Multiindex Dataframe #43793

Closed
juliandwain opened this issue Sep 29, 2021 · 6 comments


@juliandwain juliandwain commented Sep 29, 2021

Is your feature request related to a problem?

I wish the pandas drop function would let me drop a specific combination of index labels for rows (or columns) in MultiIndex DataFrames.
E.g., I have a MultiIndex DataFrame like this:

                 Column 1  Column 2
Index 1 Index 2
A       a               1         2
        b               3         4
B       a               5         6
        b               7         8

And I want to drop only the Index combination (B, a).
With the current implementation of the drop function, this is not possible.
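For anyone who wants to reproduce the example, here is a minimal sketch that builds the frame shown above. The level names "Index 1" and "Index 2" and the cell values are read off the table; the variable names are only illustrative.

```python
import pandas as pd

# Row labels and level names taken from the table in this issue.
index = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")],
    names=["Index 1", "Index 2"],
)

# Cell values taken from the same table.
df = pd.DataFrame(
    {"Column 1": [1, 3, 5, 7], "Column 2": [2, 4, 6, 8]},
    index=index,
)

print(df)
```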

Describe the solution you'd like

Maybe adding an additional parameter called combination, which is a boolean value, could solve the problem.
When setting combination=True, the list-like label parameter could be recognized as the combination to drop, which would then result in

df = df.drop(labels=["B", "a"], axis=0, combination=True)
df
                 Column 1  Column 2
Index 1 Index 2
A       a               1         2
        b               3         4
B       b               7         8

API breaking implications

I think adding the parameter would not break the API.

Describe alternatives you've considered

I have not considered alternatives

Additional context

For my problem, I implemented a function similar to the one below

from typing import Optional

import pandas as pd


def drop(df: pd.DataFrame, index1: Optional[str]=None, index2: Optional[int]=None) -> pd.DataFrame:
    if index1 is None:
        if index2 is None:
            # neither label given: return the original dataframe
            return df
        else:
            # only index2 given: drop it wherever it occurs in level "Index2"
            return df.drop(index2, level="Index2").sort_index()
    else:
        if index2 is None:
            # only index1 given: drop the whole index1 group
            return df.drop(index1, level="Index1").sort_index()
        else:
            # both given: drop only the (index1, index2) combination
            a = df.drop(index1, level="Index1")
            # list selection keeps both index levels, so drop(..., level=...) still works
            b = df.loc[[index1]].drop(index2, level="Index2")
            return pd.concat([a, b]).sort_index()

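The combined case of the helper can also be sketched with a boolean mask instead of splitting and concatenating. This is only an illustration under assumptions: the level names "Index1" and "Index2" follow the helper above, and the data and variable names are made up to match the example table.

```python
import pandas as pd

# Example frame; level names follow the helper function ("Index1", "Index2").
index = pd.MultiIndex.from_product(
    [["A", "B"], ["a", "b"]], names=["Index1", "Index2"]
)
df = pd.DataFrame(
    {"Column 1": [1, 3, 5, 7], "Column 2": [2, 4, 6, 8]},
    index=index,
)

# True only for rows where both levels match the combination to remove.
is_target = (
    (df.index.get_level_values("Index1") == "B")
    & (df.index.get_level_values("Index2") == "a")
)

# Keep every row that is not the (B, a) combination.
result = df[~is_target]
print(result)
```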
@debnathshoham debnathshoham commented Sep 29, 2021

Thanks for the report @juliandwain !
You can do that by selecting the tuple you want to drop (see example below).
I think we can take a PR to include the example in the doc.

In [2]: import pandas as pd

In [3]: midx = pd.MultiIndex(levels=[['lama', 'cow', 'falcon'],
   ...:                              ['speed', 'weight', 'length']],
   ...:                      codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
   ...:                             [0, 1, 2, 0, 1, 2, 0, 1, 2]])
   ...: df = pd.DataFrame(index=midx, columns=['big', 'small'],
   ...:                   data=[[45, 30], [200, 100], [1.5, 1], [30, 20],
   ...:                         [250, 150], [1.5, 0.8], [320, 250],
   ...:                         [1, 0.8], [0.3, 0.2]])

In [4]: df
Out[4]: 
                 big  small
lama   speed    45.0   30.0
       weight  200.0  100.0
       length    1.5    1.0
cow    speed    30.0   20.0
       weight  250.0  150.0
       length    1.5    0.8
falcon speed   320.0  250.0
       weight    1.0    0.8
       length    0.3    0.2

In [5]: df.index
Out[5]: 
MultiIndex([(  'lama',  'speed'),
            (  'lama', 'weight'),
            (  'lama', 'length'),
            (   'cow',  'speed'),
            (   'cow', 'weight'),
            (   'cow', 'length'),
            ('falcon',  'speed'),
            ('falcon', 'weight'),
            ('falcon', 'length')],
           )

In [6]: df.drop(labels=('cow','speed'))
Out[6]: 
                 big  small
lama   speed    45.0   30.0
       weight  200.0  100.0
       length    1.5    1.0
cow    weight  250.0  150.0
       length    1.5    0.8
falcon speed   320.0  250.0
       weight    1.0    0.8
       length    0.3    0.2
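Applied to the frame from the original question, the same tuple-based approach might look like the sketch below. The level names and values are assumed from the table in the issue; passing a list of tuples and using the columns axis are illustrations that go beyond the example above.

```python
import pandas as pd

# Frame from the original question (level names assumed from its table).
index = pd.MultiIndex.from_product(
    [["A", "B"], ["a", "b"]], names=["Index 1", "Index 2"]
)
df = pd.DataFrame(
    {"Column 1": [1, 3, 5, 7], "Column 2": [2, 4, 6, 8]},
    index=index,
)

# A single tuple drops exactly one index combination.
dropped_one = df.drop(index=("B", "a"))

# A list of tuples drops several combinations at once.
dropped_two = df.drop(index=[("B", "a"), ("A", "b")])

# Tuples work the same way on a MultiIndex in the columns.
wide = df.T.drop(columns=("B", "a"))

print(dropped_one)
print(dropped_two)
print(wide)
```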


@juliandwain juliandwain commented Sep 29, 2021

Oh, it was as easy as that. 😄
Thank you for clearing that up. Maybe adding this to the docs would help other people who run into the same issue.


@debnathshoham debnathshoham commented Sep 29, 2021

Right!
If you are interested, a PR would be welcome to include this in the docs.


@juliandwain juliandwain commented Sep 29, 2021

Yes I can do that!


@acse-srm3018 acse-srm3018 commented Sep 29, 2021

If you need any help, I'll be happy to help. I'd like to start contributing.


@juliandwain juliandwain commented Sep 30, 2021

take
