LibGuides: Data Anonymisation: Pseudonymisation

Explanation

What is it?

The replacement of identifying data with made-up values (pseudonyms).
Pseudonymisation is also known as coding.
It can be irreversible, where the original values are disposed of.
It can be reversible, where the identity database is securely kept and not shared.
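
As a rough illustration of the two variants above, the following Python sketch replaces each distinct value with a random pseudonym. The function name pseudonymise, the reversible flag and the 7-digit pseudonym format are assumptions made for this example only; they are not taken from the PDPC guide.

```python
import secrets


def pseudonymise(values, reversible=True):
    """Replace each distinct value with a random, unique 7-digit pseudonym.

    Hypothetical helper for illustration. Returns the coded values and,
    when reversible is True, an identity database (pseudonym -> original).
    """
    mapping = {}  # original value -> pseudonym
    used = set()  # pseudonyms already handed out
    for value in values:
        if value not in mapping:
            pseudonym = str(secrets.randbelow(9_000_000) + 1_000_000)
            while pseudonym in used:  # retry on the rare collision
                pseudonym = str(secrets.randbelow(9_000_000) + 1_000_000)
            used.add(pseudonym)
            mapping[value] = pseudonym
    coded = [mapping[v] for v in values]
    # Irreversible: the link back to the original values is not returned.
    identity_db = {p: v for v, p in mapping.items()} if reversible else None
    return coded, identity_db


names = ["John Rohit", "Stella Campbell", "John Rohit"]
coded, identity_db = pseudonymise(names, reversible=True)
```

In the reversible case, identity_db would need to be stored securely and kept separate from the shared dataset. In the irreversible case nothing is returned that links a pseudonym back to a person, although repeated values still receive the same pseudonym.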

When to use it?

When each data value must remain uniquely distinguishable after the original identifying values have been replaced.

Source

PDPC Guide to basic data anonymisation techniques

Example

The dataset contains the names of persons who obtained their driving licences. Instead of suppressing the "Person" attribute, the organisation replaced its values with pseudonyms, because it wanted to be able to reverse the pseudonymisation if necessary.

Before anonymisation:

Person	Pre-Assessment Result	Hours of Lessons Taken
John Rohit	B	25
Stella Campbell	D	26
Ming Siew Lee	A	30
Poh Boon	B	32
Siva Vasanth	C	29
Siti Raudhah	A	25

After anonymisation:

Person	Pre-Assessment Result	Hours of Lessons Taken
4135891	B	25
3229873	D	26
4398642	A	30
783127	B	32
583419	C	29
983429	A	25

For reversible pseudonymisation, the identity database is securely kept in case there is a future legitimate need to identify individuals.

Identity database:

Pseudonym	Person
4135891	John Rohit
3229873	Stella Campbell
4398642	Ming Siew Lee
783127	Poh Boon
583419	Siva Vasanth
983429	Siti Raudhah
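
Reversing the pseudonymisation amounts to looking up each pseudonym in the identity database. The sketch below uses the values from the tables above; the function name reidentify and the tuple layout of the records are assumptions made for illustration.

```python
# Identity database from the example: pseudonym -> person.
identity_db = {
    "4135891": "John Rohit",
    "3229873": "Stella Campbell",
    "4398642": "Ming Siew Lee",
    "783127": "Poh Boon",
    "583419": "Siva Vasanth",
    "983429": "Siti Raudhah",
}

# Pseudonymised dataset: (pseudonym, pre-assessment result, hours of lessons).
records = [
    ("4135891", "B", 25),
    ("3229873", "D", 26),
    ("4398642", "A", 30),
    ("783127", "B", 32),
    ("583419", "C", 29),
    ("983429", "A", 25),
]


def reidentify(records, identity_db):
    """Swap each pseudonym back to the original name (hypothetical helper)."""
    return [(identity_db[p], result, hours) for p, result, hours in records]


original = reidentify(records, identity_db)
```

Anyone holding only the pseudonymised records cannot perform this lookup, which is why the identity database is kept securely and not shared.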