
Python for Basic Data Analysis

Start your data science journey with Python. Learn practical Python programming skills for basic data manipulation and analysis.

Removing data

In pandas, some data needs to be removed even though it is not a NaN entry, so the built-in NaN-removal functions such as dropna() cannot be used. To remove such entries, use DataFrame.drop(). It returns a new data frame and leaves the original unchanged unless the result is assigned back.

Drop a column. Use axis=0 when dropping rows and axis=1 when dropping columns.

df.drop('net_sales', axis=1)
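
As a minimal sketch of the column drop, the example below uses a small made-up data frame (the values and the order_id and revenue columns are assumptions for illustration, not taken from the course dataset). It shows that drop() hands back a new data frame and that the original keeps its columns.

```python
import pandas as pd

# Hypothetical sample data; the real dataset will differ.
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "net_sales": [100.0, 0.0, 250.0],
    "revenue": [120.0, 0.0, 300.0],
})

# drop() returns a new data frame; df itself is unchanged.
without_sales = df.drop("net_sales", axis=1)

# columns= is an equivalent, more explicit spelling of axis=1.
same_result = df.drop(columns="net_sales")

print(list(df.columns))             # original still has net_sales
print(list(without_sales.columns))  # net_sales is gone here
```

To make the change stick, assign the result back, for example df = df.drop('net_sales', axis=1).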

Drop rows that contain a certain value (in this case, 'TRUE') by keeping only the rows that do not match it

df[df.order_fufilled != 'TRUE']
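
The filter above works through a boolean mask. The sketch below spells that out on a made-up data frame; it assumes the order_fufilled column stores the text 'TRUE' and 'FALSE', and the order_id column is hypothetical. If the column held real booleans, the comparison would be against True instead.

```python
import pandas as pd

# Hypothetical sample data; order_fufilled is stored as text here,
# matching the 'TRUE' string comparison used in the example.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "order_fufilled": ["TRUE", "FALSE", "TRUE", "FALSE"],
})

# Build a boolean mask: True for the rows we want to keep.
mask = df["order_fufilled"] != "TRUE"

# Indexing with the mask keeps only the rows where it is True.
unfulfilled = df[mask]

# The same rows can be removed with drop() by passing the index
# labels of the rows that match the unwanted value.
also_unfulfilled = df.drop(df[df["order_fufilled"] == "TRUE"].index)

print(unfulfilled)
```

Both approaches leave the original df untouched and keep the original index labels on the remaining rows.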

It can also be extended to index positions, for example to drop the first row

df.drop(df.index[0])

or to drop rows relative to the end of the data frame.

df[:5]  # keep the first five rows
df.drop(df.index[-5:])  # drop the last five rows
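
The slices are easy to mix up, so here is a small sketch on a made-up ten-row data frame (the value column is an assumption for illustration). Note that df.index[-5:] selects the last five labels, while df.index[:-5] selects everything except the last five.

```python
import pandas as pd

# Hypothetical data frame with ten rows labelled 0 to 9.
df = pd.DataFrame({"value": range(10)})

# Drop only the first row (label 0).
first_removed = df.drop(df.index[0])

# Drop the last five rows (labels 5 to 9).
last_five_removed = df.drop(df.index[-5:])

# Slicing keeps rows instead of dropping them: rows 0 to 4 remain.
first_five_kept = df[:5]

# index[:-5] is every label except the last five, so dropping it
# removes rows 0 to 4 and keeps only the last five rows.
only_last_five = df.drop(df.index[:-5])

print(list(last_five_removed.index))
print(list(only_last_five.index))
```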


Activity: Removing data

Come and test your learning out!

1. Drop the revenue column in the data frame

2. Drop the rows where net_sales has a value of 0

3. Drop the first 10 rows of the data frame

Answers for Activity: Removing data

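import pandas as pd
df = pd.read_csv("Retail dataset.csv")

# 1. Drop the revenue column (assign the result; drop() does not modify df in place)
df = df.drop('revenue', axis=1)

# 2. Keep only the rows where net_sales is not 0
df = df[df.net_sales != 0]

# 3. Drop the first 10 rows
df = df.drop(df.index[0:10])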

Further Readings