In pandas, some of the data you need to remove will not be NaN entries, so the built-in NaN-removal functions such as dropna() cannot be used. To remove such entries, use DataFrame.drop() or filter the rows directly.
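Drop a column. Use axis=0 (the default) when dropping rows and axis=1 when dropping columns. drop() returns a new data frame, so assign the result to a variable if you want to keep the change.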
df.drop('net_sales', axis=1)
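As a minimal sketch of the column drop above, the example below uses a small made-up data frame (the `region` column and all values are assumptions for illustration, not the guide's retail dataset). It shows that the original data frame still has its column after the call, and that the `columns=` keyword is an equivalent way to write `axis=1`.

```python
import pandas as pd

# Hypothetical sample data, not the guide's retail dataset
df = pd.DataFrame({
    "region": ["north", "south", "east"],
    "net_sales": [120, 0, 75],
    "revenue": [150, 10, 90],
})

# drop() returns a new DataFrame; df itself still contains net_sales
without_sales = df.drop("net_sales", axis=1)

# columns= is an equivalent, more explicit spelling of axis=1
same_result = df.drop(columns="net_sales")

print(list(df.columns))
print(list(without_sales.columns))
```

To keep the change under the original name, assign the result back, for example `df = df.drop("net_sales", axis=1)`.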
Drop rows that contain a certain value (in this case, 'TRUE') by keeping only the rows that do not match it
df[df.order_fufilled != 'TRUE']
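The filter above works by building a boolean mask and keeping the rows where the mask is True. The sketch below illustrates this on made-up data (the `order_id` column and the values are assumptions). It also covers the case where the column holds real booleans rather than the text 'TRUE', which may happen because `read_csv` can parse TRUE/FALSE columns as booleans; in that case a comparison against the string 'TRUE' would not remove anything.

```python
import pandas as pd

# Hypothetical sample data; order_fufilled is stored as text here
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "order_fufilled": ["TRUE", "FALSE", "TRUE", "FALSE"],
})

# Boolean mask: True for the rows we want to keep
mask = df["order_fufilled"] != "TRUE"
kept = df[mask]
print(kept)

# If the column holds real booleans instead, invert it with ~
df_bool = df.assign(order_fufilled=df["order_fufilled"] == "TRUE")
kept_bool = df_bool[~df_bool["order_fufilled"]]
print(kept_bool)
```

Checking `df.dtypes` first is a quick way to see whether the column is text (`object`) or `bool` before choosing which comparison to use.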
drop() can also be used with index positions
df.drop(df.index[0])
or with positions counted from the end of the data frame.
df[:5]  # keep the first five rows
df.drop(df.index[-5:])  # drop the last five rows
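To make the slicing concrete, here is a small sketch on a made-up ten-row data frame (the `value` column is an assumption for illustration). The key distinction is that `df.index[-5:]` selects the last five labels, while `df.index[:-5]` selects everything except the last five.

```python
import pandas as pd

# Hypothetical data with index labels 0..9
df = pd.DataFrame({"value": range(10)})

first_five = df[:5]                      # keeps rows 0-4
no_first_row = df.drop(df.index[0])      # keeps rows 1-9
no_last_five = df.drop(df.index[-5:])    # keeps rows 0-4
only_last_five = df.drop(df.index[:-5])  # keeps rows 5-9

print(no_last_five.index.tolist())
print(only_last_five.index.tolist())
```

For purely positional selection, `df.iloc[:-5]` gives the same rows as dropping the last five.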
Test what you have learned!
1. Drop the revenue column in the data frame
2. Drop the rows where net_sales is 0
3. Drop the first 10 rows of the data frame
Answers for Activity: Removing data
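import pandas as pd
df = pd.read_csv("Retail dataset.csv")
df = df.drop('revenue', axis=1)  # 1. drop the revenue column
df = df[df.net_sales != 0]  # 2. keep only the rows where net_sales is not 0
df = df.drop(df.index[0:10])  # 3. drop the first 10 rows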
Unless otherwise stated, all guide content is licensed by CC BY-NC 4.0.