
Data Anonymisation

This guide introduces basic data anonymisation concepts.

Explanation

What is it?

Generalisation is the reduction in the precision of data. For example, an age can be converted into an age range, or a precise location into a less precise one.

How the generalisation is designed (e.g. the width of each data range, the level of precision retained) depends on the intended purpose of the data.
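To illustrate how the chosen precision changes the result, here is a minimal Python sketch that generalises a location by rounding its coordinates. The function name `generalise_coordinate`, the `decimals` parameter, and the sample coordinates are all hypothetical and are not part of this guide; rounding is only one possible way to make a location less precise.

```python
def generalise_coordinate(lat: float, lon: float, decimals: int) -> tuple[float, float]:
    """Reduce the precision of a latitude/longitude pair by rounding.

    Fewer decimals means a coarser (more generalised) location.
    This is an illustrative assumption, not a method prescribed by the guide.
    """
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical sample point; the same input generalised for two different purposes.
precise = (1.33215, 103.84756)
coarse = generalise_coordinate(*precise, decimals=2)   # less precise
finer = generalise_coordinate(*precise, decimals=3)    # retains more detail

print(coarse)
print(finer)
```

Which value of `decimals` is appropriate depends on how much detail the intended purpose actually needs.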

When to use it?

Use it when values can be generalised and still remain useful for the intended purpose.

Example

This dataset contains each participant's name (already pseudonymised), age, and residential address. "Age" can be generalised into an "Age Range" (e.g. 21-30, 31-40, 41-50, 51-60, >60). For "Address", one possible approach is to remove the block/house number and retain only the road name.

Before anonymisation:

Participant  Age  Address
A1           24   700 Toa Payoh Lorong 5
A2           31   800 Ang Mo Kio Ave 12
A3           44   900 Jurong East St 70
A4           29   750 Toa Payoh Lorong 5
A5           23   10 Tampines St 90
B1           75   50 Jurong East St 70
B2           28   720 Toa Payoh Lorong 5
B3           50   810 Ang Mo Kio Ave 12
B4           30   15 Tampines St 90
B5           37   18 Tampines St 90

After anonymisation:

Participant  Age Range  Address
A1           21-30      Toa Payoh Lorong 5
A2           31-40      Ang Mo Kio Ave 12
A3           41-50      Jurong East St 70
A4           21-30      Toa Payoh Lorong 5
A5           21-30      Tampines St 90
B1           >60        Jurong East St 70
B2           21-30      Toa Payoh Lorong 5
B3           41-50      Ang Mo Kio Ave 12
B4           21-30      Tampines St 90
B5           31-40      Tampines St 90