Communicating Research: Visualising Data

Introduction

Before starting your data visualisation process, you should first consider the following: Who are your audience? How familiar are they with your research topic? What story are you trying to tell?

Key Steps in Data Visualisation

At the start, you will need to acquire some data. 

Where to acquire data?

  1. Surveys
  2. Lab experiments
  3. Interviews
  4. Social Media sources
  5. Downloads from the Internet (e.g. Data.gov.sg, Kaggle, Tableau, World Bank, etc.)

Keep copyright and privacy in mind when using data. You may sometimes need to anonymise your data to protect the privacy of the participants. 
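
If you work with tabular data in Python, one way to do this is to load the downloaded file and then hash or drop identifying fields before analysis. The sketch below is illustrative only: the file name "responses.csv" and all column names are hypothetical placeholders for your own data.

    import hashlib

    import pandas as pd

    # Load a downloaded dataset (hypothetical file name).
    df = pd.read_csv("responses.csv")

    # Replace a direct identifier with a one-way hash so individual
    # participants cannot be re-identified from the working file.
    df["participant_id"] = df["participant_id"].astype(str).apply(
        lambda value: hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    )

    # Drop columns that are not needed for the analysis and could identify people.
    df = df.drop(columns=["name", "email"])

    df.to_csv("responses_anonymised.csv", index=False)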


4Cs of Data Quality

  1. Consistency
    Have a clear picture of data consistency. Is the data collected valid? Is it coherent? Are there extreme values, outliers or anomalies?
     
  2. Conformity
    Does the data comply with acceptable standards and patterns? This is of particular importance as some data (e.g. in Healthcare) need to conform to certain standards in order to be actionable. 
     
  3. Completeness
    Check if the data collected is complete, and is not missing important values.
     
  4. Currency
    Is your data up to date? Is it refreshed regularly enough?
    This also depends on the nature of your research. For some research, up-to-date data is important. For others, such as historical research, currency may not be a concern. 
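
If you prepare your data in Python, quick checks along these four dimensions can be scripted. The sketch below is a rough illustration using the pandas library; the file name "survey.csv" and the column names (score, country_code, collected_on) are hypothetical, and the checks are examples rather than a complete validation suite.

    import pandas as pd

    df = pd.read_csv("survey.csv")  # hypothetical file name

    # Consistency: flag extreme values or outliers in a numeric column.
    score = df["score"]
    outliers = score[(score - score.mean()).abs() > 3 * score.std()]
    print(f"Possible outliers in 'score': {len(outliers)}")

    # Conformity: check that values follow an expected pattern,
    # e.g. two-letter country codes.
    nonconforming = df[~df["country_code"].astype(str).str.match(r"^[A-Z]{2}$")]
    print(f"Rows with unexpected country codes: {len(nonconforming)}")

    # Completeness: count missing values per column.
    print(df.isna().sum())

    # Currency: find the most recent collection date in the dataset.
    df["collected_on"] = pd.to_datetime(df["collected_on"], errors="coerce")
    print(f"Most recent record: {df['collected_on'].max()}")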

Surveys show that researchers spend around 60 to 80% of their time preparing and managing their data for analysis. When doing research that involves a lot of data, start early, because much of that time will be spent ensuring that you have a good collection of data to analyse. Common data preparation steps include:

  1. Cleansing
    Clean up values represented in different forms and remap them to ensure consistency. E.g. "Female", "F" and "Fem" can be turned into a single format "Female" to make the data more consistent. 
     
  2. Aggregating
    Sort data and express them in summary form. 
     
  3. Merging
    Your data may be scattered in multiple datasets. Merge relevant parts from the different datasets to create a new file to work with. 
     
  4. Appending
    Stack datasets containing the same or similar fields to create a larger dataset. 
     
  5. Deduping
    Remove duplicates from a dataset. 
     
  6. Transforming
    Perform an operation on one or more columns to produce a new result (e.g. a new variable, or a combination of two columns such as "First Name" and "Last Name" merged into a single "Full Name" column). 
     
  7. Filtering
    Sometimes only a subset of a dataset is needed to address your research questions. Filter your data to extract the relevant records. 


For a visual representation of these steps, refer to https://www.rapidinsight.com/blog/7-data-cleanup-terms-explained-visually/.
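
As a rough illustration of how these steps can look in practice, here is a sketch using Python's pandas library. All file names and column names (gender, site, year, score, first_name, last_name) are hypothetical, and the aim is simply to show one operation per step rather than a finished pipeline.

    import pandas as pd

    # Two hypothetical survey files with the same fields, plus a lookup table of site details.
    survey_a = pd.read_csv("survey_site_a.csv")
    survey_b = pd.read_csv("survey_site_b.csv")
    sites = pd.read_csv("site_details.csv")

    # Cleansing: remap values recorded in different forms to a single format.
    survey_a["gender"] = survey_a["gender"].replace({"F": "Female", "Fem": "Female", "M": "Male"})

    # Appending: stack datasets that share the same fields into a larger dataset.
    combined = pd.concat([survey_a, survey_b], ignore_index=True)

    # Merging: bring in relevant columns from another dataset.
    combined = combined.merge(sites, on="site", how="left")

    # Deduping: remove duplicate rows.
    combined = combined.drop_duplicates()

    # Transforming: derive a new column from existing ones.
    combined["full_name"] = combined["first_name"] + " " + combined["last_name"]

    # Filtering: keep only the subset relevant to the research question.
    recent = combined[combined["year"] >= 2020]

    # Aggregating: summarise the data, e.g. the mean score per site.
    summary = recent.groupby("site")["score"].mean().reset_index()
    print(summary)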

There are many tools that you can use to conduct your analysis. Besides choosing the software, you also have to consider the size of the data you are crunching, as each tool comes with its own limitations.

Popular, easily accessible

  • Excel (Can handle a maximum of 1,048,576 rows of data)

Commercial software, visual drag-and-drop interface, analyse data without much coding

  • Power BI
  • Qlik
  • Tableau (Offers free one-year Tableau licenses to students at accredited academic institutions)

Open-source, free, requires basic programming skills, useful if you know how to apply readily available libraries
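
As a rough illustration of this open-source route, the sketch below uses Python with the freely available pandas and matplotlib libraries to turn a small summary table into a bar chart. The data shown are made up purely for illustration.

    import matplotlib.pyplot as plt
    import pandas as pd

    # Made-up summary data for illustration only.
    summary = pd.DataFrame(
        {"year": [2020, 2021, 2022, 2023], "responses": [120, 180, 240, 310]}
    )

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(summary["year"].astype(str), summary["responses"], color="steelblue")
    ax.set_xlabel("Year")
    ax.set_ylabel("Number of responses")
    ax.set_title("Survey responses per year (illustrative data)")
    fig.tight_layout()
    fig.savefig("responses_per_year.png", dpi=150)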

Online platforms, create interesting and colourful visualisations