PYTHON PANDAS Cheat Sheet

This document provides a cheat sheet for using the Python Pandas library. It covers topics such as installing Pandas, reading and describing data, manipulating dataframes by selecting, sorting, filtering rows and columns, cleaning data by dropping null values, and joining dataframes. Functions covered include read_csv(), drop(), sort_values(), iloc, loc, ix for selection, and merge() for joining data.

by sanjeev95 via cheatography.com/111141/cs/21621/

install and import

installing pandas
    pip install pandas
    import pandas as pd

Reading and describing

pd -> pandas
df -> dataframe
to read a file into a dataframe
    df = pd.read_csv('filename')
look at the first 5 lines
    df.head()
to describe df
    df.describe()
    df.info()
to print all the column names
    df.columns
to get the dimension of df
    df.shape

Rows and columns

to delete a row - [axis=0 means rows]
    new_df = df.drop([2,3], axis=0)    #this drops the rows with index 2 and 3
to delete a column - [axis=1 means columns]
    new_df = df.drop(['col1','col2'], axis=1)    #this drops the columns named col1 and col2

Df manipulation

create or edit a column
    df['new_colname'] = 5    #this creates a new column with all values as 5
create a new column
    df['new_colname'] = [list of values]    #this creates a new column with the list of values assigned to each corresponding row
NOTE : df['new_colname'] = [list of values] throws an error if the no of items in [list of values] doesn't match the no of rows
create or edit a row
    df.loc[index_of_row] = [list of items]
NOTE : df.loc[index_of_row] = [list of items] throws an error if the no of items doesn't match the no of columns

Selection

df[col]             Returns column with label col as a Series
df[[col1, col2]]    Returns multiple columns as a new DataFrame

       Country    Capital      Population
    0  Belgium    Brussels       11190846
    1  India      New Delhi    1303171035
    2  Brazil     Brasilia      207847528

df.iloc[0, 0] --> 'Belgium'    Selection by position (0th position on row and column); for a Series, s.iloc[0]
df.loc[0, 'Country'] --> 'Belgium'    Selection by label
df.loc[2] -->
    Country          Brazil
    Capital        Brasilia
    Population    207847528
df.loc[1, 'Capital'] --> 'New Delhi'
df.iloc[0,:]    Select the first row
NOTE : df.ix was removed in pandas 1.0; use df.loc for labels and df.iloc for positions

Sorting and filtering

sort
sorting can be done column wise - default is ascending
    df.sort_values(by='Total day charge')
df.sort_values(col1)    Sort values by col1 in ascending order (use ascending=False for descending sort)
df.sort_values([col1,col2], ascending=[True,False])    Sort values by col1 in ascending order, then col2 in descending order

Filtering

df[condition]    #eg: df[df['col']>5]
df[df['col'] > 0.5]    Rows where the column col is greater than 0.5
df[(df[col] > 0.5) & (df[col] < 0.7)]    Rows where 0.7 > col > 0.5

Data Cleaning

df.set_index('column_one')    Change the index with a new column
df.columns = ['new_col_name1','new_col_name2','new_col_name3']    Rename columns
pd.isnull(df)    Checks for null values, returns a Boolean array
pd.notnull(df)    Opposite of pd.isnull()
df.dropna()    Drop all rows that contain null values
df.dropna(axis=1)    Drop all columns that contain null values
df.dropna(axis=1, thresh=n)    Drop all columns that have fewer than n non-null values
df.fillna(x)    Replace all null values with x

JOIN/COMBINE

pd.concat([df1, df2])    Adds the rows in df2 to the end of df1 (columns should be identical). This replaces df1.append(df2), which was removed in pandas 2.0
pd.concat([df1, df2], axis=1)    Adds the columns in df2 to the end of df1 (rows should be identical)
df1.join(df2, on=col1, how='inner')    Joins the columns in df1 with the columns of df2 where the values of col1 in df1 match the index of df2. how can be one of 'left', 'right', 'outer', 'inner'
    left = takes the index of the left df
    right = takes the index of the right df
    outer = union of both keys
    inner = intersection of both keys

Inplace

NOTE
df.merge(df2) gives you a copy of df merged with df2. You may save it to a new variable, ex df3 = df.merge(df2)
merge() has no inplace option; to keep the merged result under the same name, assign it back: df = df.merge(df2)
Methods such as drop(), dropna(), fillna(), sort_values() and set_index() do accept inplace=True
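The selection, filtering, sorting and column-drop entries can be tried together on the small Country table. This is a minimal sketch, assuming the table carries the default 0-based integer index and that `.ix` is unavailable (it is gone in current pandas), so `.loc` and `.iloc` are used instead. The variable names are illustrative only.

```python
import pandas as pd

# Country table from the Selection section, default 0-based index (assumption).
df = pd.DataFrame({
    "Country": ["Belgium", "India", "Brazil"],
    "Capital": ["Brussels", "New Delhi", "Brasilia"],
    "Population": [11190846, 1303171035, 207847528],
})

# Selection by position and by label.
first_cell = df.iloc[0, 0]          # row 0, column 0
same_cell = df.loc[0, "Country"]    # row label 0, column label 'Country'
brazil_row = df.loc[2]              # whole row as a Series
capital = df.loc[1, "Capital"]
first_row = df.iloc[0, :]

# Filtering with a single condition and with two conditions joined by &.
big = df[df["Population"] > 200_000_000]
mid = df[(df["Population"] > 100_000_000) & (df["Population"] < 1_000_000_000)]

# Sorting in descending order.
by_pop_desc = df.sort_values("Population", ascending=False)

# Dropping a column needs axis=1; the original df is left unchanged.
no_capital = df.drop(["Capital"], axis=1)

# Assigning a list of the wrong length to a new column raises ValueError.
try:
    df["rank"] = [1, 2]             # 2 items for 3 rows
    length_error = False
except ValueError:
    length_error = True
```

Every call here returns a new object, so `df` itself still has its three original columns at the end.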

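The Data Cleaning and JOIN/COMBINE entries can be sketched the same way. The frames below are made-up illustration data, not from the cheat sheet, and the sketch assumes a recent pandas where rows are stacked with `pd.concat` and a merge result is kept by assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with missing values: a has 2 non-null, b has 1, c has 3.
raw = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 6.0],
    "c": [7.0, 8.0, 9.0],
})

rows_no_nulls = raw.dropna()                 # rows without any null
cols_no_nulls = raw.dropna(axis=1)           # columns without any null
cols_thresh = raw.dropna(axis=1, thresh=2)   # columns with at least 2 non-null values
filled = raw.fillna(0)                       # nulls replaced by 0

# Stack rows of two frames that share the same columns.
left = pd.DataFrame({"key": [1, 2], "x": ["p", "q"]})
more = pd.DataFrame({"key": [3], "x": ["r"]})
stacked = pd.concat([left, more], ignore_index=True)

# Merge on a shared column; keep the result by assigning it to a name.
right = pd.DataFrame({"key": [2, 3], "y": ["u", "v"]})
inner = stacked.merge(right, on="key", how="inner")   # keys present in both
outer = stacked.merge(right, on="key", how="outer")   # keys present in either
```

With `how="outer"`, the row for key 1 has no match in `right`, so its `y` value is null.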
By sanjeev95 (cheatography.com/sanjeev95/)
Published 20th January, 2020. Last updated 22nd January, 2020.
