Python for Basic Data Analysis: PD.7 Removing and adding data

Get started on your learning journey towards data science using Python. Equip yourself with practical skills in Python programming for the purpose of basic data manipulation and analysis.

Removing data

In pandas, some of the data that needs to be removed is not a NaN entry, so the built-in NaN-removal functions such as DataFrame.dropna() do not apply. To remove such entries, we can use DataFrame.drop().

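Drop a column. Use axis=1 when dropping columns and axis=0 when dropping rows. Note that drop() returns a new DataFrame and leaves df unchanged unless the result is assigned back.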

df.drop('net_sales', axis=1)
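As a minimal sketch of the column drop above, the following uses a small made-up DataFrame (the values are hypothetical; only the column names echo this guide). It shows that drop() returns a new DataFrame, and that the columns keyword is an equivalent way to write axis=1.

```python
import pandas as pd

# Hypothetical sample data; column names echo those used in this guide.
df = pd.DataFrame({
    "net_sales": [120.0, 0.0, 75.5],
    "revenue": [150.0, 10.0, 90.0],
    "order_fufilled": ["TRUE", "FALSE", "TRUE"],
})

# drop() returns a new DataFrame; df itself is left unchanged.
without_sales = df.drop("net_sales", axis=1)

# Equivalent spelling using the columns keyword instead of axis=1.
without_sales_kw = df.drop(columns="net_sales")

print(list(df.columns))             # original still has net_sales
print(list(without_sales.columns))  # new frame does not
```

To keep the change, assign the result back, for example df = df.drop('net_sales', axis=1).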

Drop rows that have a certain value (in this case, 'TRUE') by keeping only the rows that do not match it

df[df.order_fufilled != 'TRUE']
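The filtering step can be sketched on a small made-up DataFrame (order_id and the values are hypothetical). The comparison builds a boolean mask, and indexing with that mask keeps only the rows where it is True. The sketch also covers the case where the column holds real booleans rather than the string 'TRUE', which may happen depending on how the file was loaded; that is an assumption worth checking with df.dtypes.

```python
import pandas as pd

# Hypothetical sample data; the column name follows the example above.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "order_fufilled": ["TRUE", "FALSE", "TRUE", "FALSE"],
})

# Boolean mask: True for every row we want to keep.
mask = df["order_fufilled"] != "TRUE"
kept = df[mask]

# Same result with drop(): pass the index labels of the unwanted rows.
kept_via_drop = df.drop(df[df["order_fufilled"] == "TRUE"].index)

# If the column holds real booleans instead of strings, invert it with ~.
df_bool = df.assign(order_fufilled=df["order_fufilled"] == "TRUE")
kept_bool = df_bool[~df_bool["order_fufilled"]]

print(kept)
```

Bracket access such as df["order_fufilled"] behaves the same as df.order_fufilled here and also works for column names that contain spaces.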

drop() can also remove rows by index label, for example the first row:

df.drop(df.index[0])
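drop() works on index labels, not positions, which is why df.index[0] is used to look up the label of the first row. A minimal sketch with a hypothetical string-labelled index makes the difference visible:

```python
import pandas as pd

# Hypothetical frame with string labels as the index.
df = pd.DataFrame({"net_sales": [10, 20, 30]}, index=["a", "b", "c"])

by_label = df.drop("b")             # drop the row labelled "b"
by_position = df.drop(df.index[0])  # df.index[0] is "a", the first row's label
several = df.drop(["a", "c"])       # a list of labels drops several rows at once

print(by_position)
```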

Rows can also be kept or dropped relative to the start or end of the data frame.

df[:5]  # keep the first five rows
df.drop(df.index[-5:])  # drop the last five rows
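Negative slices are easy to get backwards, so it can help to check them on a tiny frame. In this sketch (the ten-row frame is hypothetical), df.index[-5:] selects the last five labels, while df.index[:-5] selects everything except the last five.

```python
import pandas as pd

# Hypothetical ten-row frame so the slices are easy to check by eye.
df = pd.DataFrame({"value": range(10)})

first_five = df[:5]                      # keep rows at positions 0 to 4
no_last_five = df.drop(df.index[-5:])    # drop the last five rows
no_first_five = df.drop(df.index[:5])    # drop the first five rows
only_last_five = df.drop(df.index[:-5])  # drops all rows except the last five

print(no_last_five["value"].tolist())
print(only_last_five["value"].tolist())
```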

Activity: Removing data

Test what you have learned!

1. Drop the revenue column in the data frame

2. Drop the rows where net_sales is 0

3. Drop the first 10 rows of the data frame

Answers for Activity: Removing data

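import pandas as pd
df = pd.read_csv("Retail dataset.csv")

# 1. Drop the revenue column (assign the result; drop() does not modify df in place)
df = df.drop('revenue', axis=1)

# 2. Keep only the rows where net_sales is not 0
df = df[df.net_sales != 0]

# 3. Drop the first 10 rows
df = df.drop(df.index[0:10])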
