
# Python for Basic Data Analysis

Start your data science journey with Python. Learn practical Python programming skills for basic data manipulation and analysis.

## Basic Syntax and Functions of pandas

Now that our data is imported, we can use basic pandas syntax and functions to get a quick statistical overview of the dataset.

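1. `df.head()` - Shows a preview of the first few rows of your data (five by default), so you can see the column headers and the type of data in the file
2. `df.shape` - Returns the number of rows and columns in your dataset as a tuple (no. of rows, no. of columns). It is an attribute, so it is written without parentheses
3. `df.columns` - Returns the column headers as an `Index` object, along with the datatype of that index (not of each column's values)
4. `df.info()` - Prints a summary of your dataset: the number of entries, each column's name, non-null count and datatype, and the memory usage
5. `df.describe()` - Returns summary statistics (count, mean, standard deviation, minimum, quartiles and maximum), by default for the numeric columns of your dataset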
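The five calls listed above can be tried on a small table before running them on a real file. The sketch below is only an illustration: the column names and values are made up, and the DataFrame is built in memory rather than imported, so it does not reflect the contents of the actual dataset.

```python
import pandas as pd

# Hypothetical sample data, made up purely for illustration
df = pd.DataFrame(
    {
        "product": ["pen", "book", "bag", "lamp"],
        "quantity": [10, 3, 5, 2],
        "price": [1.5, 12.0, 25.0, 18.5],
    }
)

print(df.head(2))        # first 2 rows; head() shows 5 rows if no number is given
print(df.shape)          # (4, 3)
print(list(df.columns))  # ['product', 'quantity', 'price']
df.info()                # prints its summary directly and returns None
print(df.describe())     # summary statistics for the numeric columns
```

Note that `df.shape` and `df.columns` are attributes, so they take no parentheses, while `head()`, `info()` and `describe()` are methods.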

Go ahead and try these!

Note: We call these functions on `df` because that is the variable we assigned our dataset to. Any other variable name would work just as well.
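To illustrate the note above, here is a minimal sketch in which the dataset is assigned to a different variable. The name `sales` and the sample values are hypothetical; the same functions are simply called on whichever name holds the DataFrame.

```python
import pandas as pd

# "sales" is a hypothetical variable name; any valid Python name works
sales = pd.DataFrame({"item": ["pen", "book"], "units": [10, 3]})

print(sales.shape)   # (2, 2)
print(sales.head())  # same function, called on the new variable name
```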

## Activity: Basic Functions

Using the functions you have just learnt, can you perform these actions in Python using pandas?

1. Output the first few rows of data

2. Output data dimensions (i.e. the number of rows and columns of the dataset)

3. Output the column header names

4. Obtain detailed information about the dataset

5. Output statistical information about the dataset

```python
import pandas as pd

# Load the dataset into a DataFrame
df = pd.read_csv("Retail dataset.csv")

# 1. Output the first few rows of data
print(df.head())

# 2. Output data dimensions (i.e. the number of rows and columns of the dataset)
print(df.shape)

# 3. Output the column header names
print(df.columns)

# 4. Obtain detailed information about the dataset
# info() prints its summary itself and returns None, so no print() is needed
df.info()

# 5. Output statistical information about the dataset
print(df.describe())
```