De-identification as a Privacy-Enhancing Tool: How, when and why to use it
De-identification is a powerful solution to many of the privacy concerns that data sharing and use raise. It can reduce an individual’s privacy risk while still allowing organisations to access the significant benefits of data sharing and use, for purposes such as data analytics.
What is de-identification in privacy?
De-identification is a term whose meaning differs between jurisdictions and between the regulations being considered.
According to the Office of the Australian Information Commissioner (OAIC), de-identification involves the removal of direct identifiers from personal information, together with one or both of the following (if necessary): the removal or alteration of other information that could be used to re-identify a person; and the use of controls and safeguards to prevent re-identification.
What’s the difference between de-identification and anonymisation?
The difference between de-identified data and anonymised data can be difficult to determine and can depend on the jurisdiction or specific regulation being considered. Some jurisdictions use anonymised data to signify that the data is almost impossible to re-identify, while de-identified data has been treated to a lower standard (for example, it is not reasonably likely that the data could be used to re-identify an individual). However, in other jurisdictions, ‘anonymisation’ is sometimes used as a synonym for de-identification, alongside confidentialisation.
The definition of the term ‘de-identification’ is one of the issues considered as part of the recent Privacy Act Review Discussion Paper. (For more information on the Privacy 108 response to that Discussion Paper, see our previous blog post.)
In the Discussion Paper, it is proposed that the Australian Privacy Act should be amended to provide that information must be anonymous (rather than de-identified) before it is no longer protected by the Act. This would make the Act consistent with the GDPR and other jurisdictions that use the term ‘anonymous’ rather than ‘de-identified’.
How do you de-identify data?
There are several well-known cases of companies releasing data that they believed had been anonymised, only to find that it had not. One example is the 2014 New York taxi FOI request, which allowed the names, addresses, and income of certain taxi drivers to be identified from a dataset. Similarly, the Doe v Netflix case in the US arose after Netflix released a de-identified dataset that allowed a woman’s sexual orientation to be revealed.
Steps to de-identify data
The steps you must take to de-identify data vary depending on the type of personal information you possess, as well as other risk factors.
Consider this scenario: you’re a supermarket chain in Brisbane. You collected the names, phone numbers, visit times, and age range of customers who visited your multiple stores across the city to enter them into a prize pool. The goal of collecting the data was to gather information about the time visitors from certain age ranges frequent your stores. The phone number was necessary to alert the winner of the prize.
De-identifying this data may be as simple as removing surnames and phone numbers. In a large city, there will likely be many people with the same first name within any age range. Consequently, this method of de-identification may be sufficient. (Many other factors need to be considered, however.)
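As a rough illustration of the simple case described above, the sketch below removes surnames and phone numbers from a handful of made-up prize-draw records. The field names (`first_name`, `surname`, `phone`, `visit_time`, `age_range`) and the helper `drop_direct_identifiers` are assumptions for this example only, not part of any OAIC guidance, and removing these fields alone will not always be enough.

```python
# Made-up records for the supermarket scenario; every field name is an assumption.
records = [
    {"first_name": "Alex", "surname": "Nguyen", "phone": "0400 000 001",
     "visit_time": "09:15", "age_range": "25-34"},
    {"first_name": "Jo", "surname": "Smith", "phone": "0400 000 002",
     "visit_time": "12:05", "age_range": "45-54"},
]

# Fields treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"surname", "phone"}


def drop_direct_identifiers(rows, fields=DIRECT_IDENTIFIERS):
    """Return copies of the rows without the listed identifier fields."""
    return [
        {key: value for key, value in row.items() if key not in fields}
        for row in rows
    ]


deidentified = drop_direct_identifiers(records)
```

The function returns new rows rather than editing the originals, so the source data can still be used to contact the prize winner before it is securely deleted.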
Context counts when de-identifying data
However, if we conduct this same exercise in a small town, or even at a single store location in a larger town, it becomes more likely that individuals giving their first name and age range could be identified. In this case, a better practice might be to use numbers in the place of names.
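One way to use numbers in place of names is sketched below. It assumes the same hypothetical field names as the supermarket scenario and a made-up helper, `replace_names_with_numbers`. Each distinct customer is given a random reference number, so repeat visits can still be linked without keeping the name in the dataset.

```python
import secrets

# Fields that together identify a customer in this example (assumed names).
IDENTITY_FIELDS = ("first_name", "surname", "phone")


def replace_names_with_numbers(rows, identity_fields=IDENTITY_FIELDS):
    """Replace identity fields with a random per-customer reference number."""
    refs = {}      # identity tuple -> reference number
    used = set()   # reference numbers already handed out
    result = []
    for row in rows:
        identity = tuple(row[field] for field in identity_fields)
        if identity not in refs:
            ref = secrets.randbelow(1_000_000)
            while ref in used:  # avoid giving two customers the same number
                ref = secrets.randbelow(1_000_000)
            used.add(ref)
            refs[identity] = ref
        cleaned = {k: v for k, v in row.items() if k not in identity_fields}
        cleaned["customer_ref"] = refs[identity]
        result.append(cleaned)
    return result


visits = [
    {"first_name": "Alex", "surname": "Nguyen", "phone": "0400 000 001",
     "visit_time": "09:15", "age_range": "25-34"},
    {"first_name": "Alex", "surname": "Nguyen", "phone": "0400 000 001",
     "visit_time": "17:40", "age_range": "25-34"},
    {"first_name": "Jo", "surname": "Smith", "phone": "0400 000 002",
     "visit_time": "12:05", "age_range": "45-54"},
]
numbered = replace_names_with_numbers(visits)
```

The numbers are random rather than sequential so that they do not reveal the order in which customers entered. The name-to-number mapping is deliberately not returned here. If an organisation keeps such a mapping, it can be used to re-identify people and needs the same protection as the original data.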
“The fundamental premise underpinning this guidance is that re-identification risk must be assessed contextually. To de-identify effectively, entities must consider not only the data itself but also the environment the data will be released into. Both factors must be considered in order to effectively determine which techniques and controls are necessary to de-identify the data, while ensuring it remains appropriate for its intended use.” – OAIC Guidance
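To make the contextual point concrete, the sketch below counts how many records share each combination of first name and age range, first across all stores and then within a single store. The data, the store labels, and the helper `smallest_group` are invented for illustration. This kind of group-size count (sometimes called a k-anonymity check) is only one input to a risk assessment and is not a substitute for the assessment the guidance describes.

```python
from collections import Counter

# Invented records that have already had surnames and phone numbers removed.
rows = [
    {"first_name": "Sam", "age_range": "25-34", "store": "A"},
    {"first_name": "Sam", "age_range": "25-34", "store": "B"},
    {"first_name": "Sam", "age_range": "25-34", "store": "A"},
    {"first_name": "Priya", "age_range": "35-44", "store": "A"},
    {"first_name": "Priya", "age_range": "35-44", "store": "B"},
]


def smallest_group(records, fields):
    """Return the size of the smallest group sharing the same values for `fields`."""
    counts = Counter(tuple(record[f] for f in fields) for record in records)
    return min(counts.values(), default=0)


fields = ("first_name", "age_range")

# Across the whole city, every name and age-range pair appears at least twice.
city_wide = smallest_group(rows, fields)

# Within store B alone, each pair appears only once, so each record is unique.
store_b_only = smallest_group([r for r in rows if r["store"] == "B"], fields)
```

A smallest group of one means at least one record is unique on those fields in that release context, which is the situation where replacing or removing names becomes more important.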
In reality, de-identification assessments are complex, requiring a balance of risk, control, and data management. The Office of the Australian Information Commissioner and CSIRO partnered to create ‘The De-Identification Decision-Making Framework’. This framework provides details (over 93 pages) about de-identification ethics, the Five Safes, and how to undertake a data situation audit, amongst other things.
Consent and De-identified Data: Why De-Identification is a Valuable Tool for Australian Organisations
Information that has undergone effective de-identification is not considered to be personal information under the Australian Privacy Act. As a result, de-identified personal information may be used and shared by covered organisations in ways that are not otherwise permitted under the Privacy Act.
Since the information is not considered personal information, organisations do not need consent to share the data with third parties.
Other Reasons to De-Identify Data
Making de-identification a common practice within your organisation comes with a range of other benefits, including:
- Better risk management;
- Increased customer and community trust;
- Achieving compliance with the Australian Privacy Principles (APPs); and
- Improved access to data sharing.
If your organisation would benefit from assistance navigating the de-identification of personal information, reach out. Our experienced team of privacy lawyers would love to help.