Python for Basic Data Analysis: PD.11 Data classification and summary

Get started on your learning journey towards data science using Python. Equip yourself with practical skills in Python programming for the purpose of basic data manipulation and analysis.

Data Classification and Summary

Classifying data gives a more refined and accurate understanding of it, which makes classification an important facet of data analysis. Pandas supports this by letting us organize and summarize our data.

Groupby Function

One way to classify data is with the groupby method, which splits the rows into groups and applies a function to each group:

df.groupby(by=grouping_columns)[columns_to_show].function()
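As a minimal sketch of this pattern, the example below builds a small made-up DataFrame (the values are invented for illustration, and the column names are only assumed to resemble the retail dataset) and averages net_sales within each date:

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
df = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"],
    "net_sales": [100.0, 300.0, 50.0, 150.0],
    "net_quantity": [1, 3, 1, 2],
})

# Split the rows by date, select net_sales, then take the mean of each group
mean_sales_by_date = df.groupby(by=["date"])["net_sales"].mean()
print(mean_sales_by_date)
```

The result is a Series indexed by date, with one averaged value per group. Swapping mean() for another function such as sum() or describe() changes the summary without changing the grouping.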

Pivot_table Function

To summarize data, we can use pivot_table, which works much like pivot tables in Excel and condenses a large dataset into a compact summary table:

df.pivot_table(values=[columns_to_summarize],
               index=[columns_to_group_by], aggfunc=function)
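In pandas, the first argument of pivot_table is values (the columns to aggregate) and the second is index (the keys that form the rows of the summary). The sketch below illustrates this on a small made-up DataFrame; the data and the status column are hypothetical and not part of the retail dataset:

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
df = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"],
    "status": ["ok", "failed", "ok", "ok"],
    "net_sales": [100.0, 0.0, 50.0, 150.0],
})

# Average net_sales for every (date, status) combination present in the data
summary = df.pivot_table(values=["net_sales"],
                         index=["date", "status"], aggfunc="mean")
print(summary)
```

Passing the arguments by keyword makes it explicit which columns are averaged and which ones define the rows. Only combinations that actually occur in the data appear in the result, so this example yields three rows rather than four.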

 

Activity Data: Classification

Using the newly learnt functions, can you perform these actions on our retail dataset?

1. Using the groupby and describe functions, can we obtain summary statistics for net_sales in the retail dataset, grouped by date?

2. Can we create a pivot table showing the average net sales and net quantity for failed and successful orders by specific dates?

Answers for Activity: Classification

import pandas as pd

# Load the retail dataset
df = pd.read_csv("Retail dataset.csv")

# 1. Summary statistics of net_sales for each date, using groupby and describe
print(df.groupby(['date'])['net_sales'].describe())

# 2. Pivot table of average net sales and net quantity for failed and successful orders on each date
#    ('order_fufilled' is kept as spelled in this guide; check it against your CSV header)
print(df.pivot_table(values=['net_sales', 'net_quantity'],
                     index=['date', 'order_fufilled'], aggfunc='mean'))
