Python for Basic Data Analysis: PD.12 Data visualisation

Get started on your learning journey towards data science using Python. Equip yourself with practical skills in Python programming for the purpose of basic data manipulation and analysis.

Data Visualisation

Pandas handles the structuring of our data, so we pair it with other modules to visualise that data. Two common choices are seaborn and matplotlib. With them we can create visualisations quickly using the following functions, where the capitalised names are placeholders for your own column names, DataFrame and settings:

1. Line plots

sns.lineplot(x=X_FIELD,y=Y_FIELD,data=DATAFRAME)

2. Regression plots

ax=sns.regplot(x=X_FIELD,y=Y_FIELD,data=DATAFRAME)

3. Histogram plots

plt.hist(DATAFRAME[X_FIELD], bins=NUMBER_OF_BINS)

4. Pair plots

sns.pairplot(DATAFRAME, hue=DATA_CLASSIFIER,height=HEIGHT)

Activity: Data visualisation

Go ahead and try to plot these graphs using the retail dataset.

1. Draw a line plot of net_sales against date. Is there any trend over time?

2. Draw a regression plot of avg_margins against average_selling_price. Is there any correlation?

3. Plot a histogram of revenue across 50 bins.

4. Draw a pair plot, classifying the data by order_fufilled status.
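For the line plot in item 1, the x-axis is only ordered chronologically if the date column holds real dates rather than text. The sketch below shows one way to convert and sort such a column with `pd.to_datetime`. The rows are invented, and the date format used in the actual retail dataset is an assumption, so the conversion may need adjusting.

```python
import pandas as pd

# Hypothetical rows; the real file's date format may differ.
toy = pd.DataFrame({
    "date": ["2021-03-01", "2021-01-01", "2021-02-01"],
    "net_sales": [300.0, 100.0, 200.0],
})

# Convert the text column to datetime values, then sort chronologically
toy["date"] = pd.to_datetime(toy["date"])
toy = toy.sort_values("date").reset_index(drop=True)

print(toy)
```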

Answers for Activity: Data visualisation

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

# Load the retail dataset (the CSV file must be in the working directory)
df = pd.read_csv("Retail dataset.csv")

# 1. Line plot of net_sales against date
sns.lineplot(x='date', y='net_sales', data=df)
plt.show()

# 2. Regression plot of avg_margins against average_selling_price
sns.regplot(x='average_selling_price', y='avg_margins', data=df)
plt.show()

# 3. Histogram of revenue across 50 bins
plt.hist(df['revenue'], bins=50)
plt.show()

# 4. Pair plot, with the data classified by order_fufilled status
sns.pairplot(df, hue='order_fufilled', height=3)
plt.show()
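The activity asks whether there is any correlation, and a plot alone only suggests an answer. As a rough numerical check, the Pearson correlation coefficient can be computed with the pandas `Series.corr` method. The sketch below uses made-up numbers in place of the retail dataset, so the column values and the resulting coefficient are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the retail dataset
rng = np.random.default_rng(1)
price = rng.uniform(5, 50, 200)
toy = pd.DataFrame({
    "average_selling_price": price,
    "avg_margins": 0.3 * price + rng.normal(0, 2, 200),
})

# Pearson correlation: close to +1 or -1 means a strong linear relationship,
# close to 0 means little or no linear relationship
r = toy["average_selling_price"].corr(toy["avg_margins"])
print(f"Pearson r = {r:.2f}")

# Correlation between every pair of numeric columns
corr_matrix = toy.corr()
print(corr_matrix)
```

With the real data, the same call would be `df['average_selling_price'].corr(df['avg_margins'])`.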
