Data Anonymisation

This guide aims to raise awareness of basic data anonymisation concepts.

Explanation

What is it?

The removal of an entire attribute (also referred to as a "column" in databases and spreadsheets) from a dataset. This is the strongest type of anonymisation technique, because no information can be recovered from the removed attribute once it is gone from the anonymised dataset.

When to use it?

When an attribute is not required in the anonymised dataset.

Example

The dataset consists of each student's name, their tutor's name, and the student's test score. The researcher only needs to analyse test scores with respect to the tutors, without any analysis of the students themselves. Hence, the "Student" attribute was removed.

Before anonymisation:

Student  Tutor  Test Score
John     Teddy  87
Stella   Teddy  56
Ming     Teddy  92
Poh      Song   83
Jake     Song   67
Yong     Song   45

After anonymisation:

Tutor  Test Score
Teddy  87
Teddy  56
Teddy  92
Song   83
Song   67
Song   45
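
The removal shown in the example can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: it assumes the dataset is held in memory as a list of dictionaries, and the function name `suppress_attributes` is hypothetical, introduced here only for the example.

```python
# Records mirroring the example table above (held in memory as an assumption).
records = [
    {"Student": "John", "Tutor": "Teddy", "Test Score": 87},
    {"Student": "Stella", "Tutor": "Teddy", "Test Score": 56},
    {"Student": "Ming", "Tutor": "Teddy", "Test Score": 92},
    {"Student": "Poh", "Tutor": "Song", "Test Score": 83},
    {"Student": "Jake", "Tutor": "Song", "Test Score": 67},
    {"Student": "Yong", "Tutor": "Song", "Test Score": 45},
]


def suppress_attributes(rows, attributes):
    """Return new rows with the named attributes (columns) removed.

    Hypothetical helper: the original rows are left unchanged.
    """
    to_drop = set(attributes)
    return [
        {key: value for key, value in row.items() if key not in to_drop}
        for row in rows
    ]


# Remove the "Student" attribute, keeping only tutor and test score.
anonymised = suppress_attributes(records, ["Student"])

for row in anonymised:
    print(row["Tutor"], row["Test Score"])
```

The helper builds new rows instead of deleting keys in place, so the original records stay intact. In practice, the anonymised copy is the only version that should be shared, since anyone holding the original dataset still has the removed attribute.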