Pseudonymisation and anonymisation are often confused, although both techniques are relevant under the GDPR. The confusion arises from the gap between the legal definition of personal data and the practical difficulty of recognising that data which does not directly identify someone can still be personal data.1 In contrast to the binary legal understanding of personal data, the data processed today is better placed on a spectrum running from clearly personal data to anonymous data. The issue is that significant economic value can be derived from data that appears non-personal (anonymous) at first glance but that, with enough effort, can be turned back into personal information.
This is precisely where the confusion lies. Pseudonymisation makes personal data unattributable to a specific person unless additional information is available, whereas anonymisation processes personal data in such a way that re-identification is irreversibly prevented.2
Understanding the difference between these two techniques is very important because it essentially determines whether the GDPR applies to the data processing or not.
Confusing pseudonymisation with anonymisation can create a false sense of security and put individuals' personal data at risk. If data is pseudonymised but not properly protected, it can still be re-identified by linking it with other information, and it is therefore exposed to the same privacy risks as personal data that has not been pseudonymised. Moreover, if pseudonymised data is subject to a data breach, the individuals' personal data may still be exposed and identifiable. On the other hand, if data that is in fact anonymised is incorrectly treated as merely pseudonymised, it may be subject to unnecessary restrictions on its use or retention, which could reduce its value for research or other purposes.3
Therefore, it is important to distinguish between pseudonymisation and anonymisation, and to ensure that appropriate safeguards are in place for each type of data processing, as required by the GDPR.
A. Definition of pseudonymisation
In simple terms, pseudonymisation means processing personal data in such a way that it can no longer be attributed to a specific person without the use of additional information.4 This is usually done by replacing directly identifying information, such as a name, social security number, or date of birth, with a random code.5 However, the data can still be indirectly linked to the person, so it is still considered personal data. Furthermore, any additional information that could be used to identify the person, such as an encryption key, must be kept separate from the data and protected by technical and organisational measures.6 This separation can occur within the same organisation, but it must ensure that the data cannot be linked to a person accidentally or without authorisation.
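The replacement step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a compliance-ready implementation: the function name `replace_identifiers`, the field names, and the sample records are all hypothetical, and in practice the lookup table would be stored in a separate, access-controlled system rather than returned alongside the coded data.

```python
import secrets

def replace_identifiers(records, id_fields=("name", "date_of_birth")):
    """Replace direct identifiers with a random code (hypothetical helper).

    Returns the coded records and a separate lookup table that maps each
    code back to the removed identifiers.
    """
    coded, lookup_table = [], {}
    for record in records:
        # Random code: it is not derived from the person's data.
        code = secrets.token_hex(8)
        # The lookup table is the "additional information" that must be
        # kept apart from the coded data.
        lookup_table[code] = {field: record[field] for field in id_fields}
        remaining = {k: v for k, v in record.items() if k not in id_fields}
        coded.append({"code": code, **remaining})
    return coded, lookup_table

# Hypothetical sample data.
patients = [
    {"name": "Alice Example", "date_of_birth": "1980-01-01", "diagnosis": "asthma"},
    {"name": "Bob Example", "date_of_birth": "1975-06-30", "diagnosis": "diabetes"},
]
coded, lookup_table = replace_identifiers(patients)
```

Anyone holding only `coded` sees a code and a diagnosis; anyone who also holds `lookup_table` can re-identify every record, which is why the result remains personal data.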
A common example of pseudonymisation is key-coding, which is often used in medical research. Key-coded data is information about a person that has been labelled with a code, while the key that links the code to the person is kept separately. There are many other ways to pseudonymise data, including encryption with a secret key, using a hash function, or using a keyed-hash function with a stored key.7
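To illustrate two of the techniques just listed, the sketch below contrasts a plain hash with a keyed hash, using Python's standard `hashlib` and `hmac` modules. The function names and the sample e-mail address are hypothetical. The point it tries to show is that a plain hash of a guessable identifier can be recomputed by anyone who guesses the input, whereas a keyed hash can only be recomputed by someone who also holds the separately stored key.

```python
import hashlib
import hmac
import secrets

def plain_hash(identifier: str) -> str:
    """Unkeyed SHA-256: anyone can recompute it from a guessed identifier."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()

def keyed_hash(identifier: str, key: bytes) -> str:
    """HMAC-SHA256: recomputing the pseudonym requires the secret key."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# The key is the "additional information"; it is assumed here to be stored
# separately from the pseudonymised data set.
key = secrets.token_bytes(32)

pseudonym = keyed_hash("alice@example.com", key)

# The same identifier and key always yield the same pseudonym, so records
# about one person can still be linked to each other.
same_person = keyed_hash("alice@example.com", key) == pseudonym

# Without the key, a guessed identifier cannot be checked against the pseudonym.
wrong_key_attempt = keyed_hash("alice@example.com", secrets.token_bytes(32))
```

In both cases the output still relates to one individual, so neither technique by itself turns the data into anonymous data.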
B. Advantages and disadvantages of pseudonymisation
Advantages of pseudonymisation under the GDPR:
Disadvantages of pseudonymisation under the GDPR:
A. Definition of anonymisation
According to Recital 26 of the GDPR, data is anonymous if it does not relate to an identified or identifiable natural person, or if personal data has been rendered anonymous in such a way that the data subject is not or no longer identifiable.8 In data protection and privacy terms, data that are not personal are typically referred to as ‘anonymous data’, and the process of rendering personal data non-personal is typically termed ‘anonymisation’. Non-personal data do not fall within the scope of application of the GDPR.
B. How anonymisation works
Data typically become anonymous or non-personal when information is combined (aggregated) to the point where specific events (such as a person's travel patterns) can no longer be linked to a specific individual. This type of data is commonly used in statistics or sales reports to analyse things like product popularity and features. Other examples of anonymous data include information on high-frequency trading in the finance industry and data on precision farming, which helps to monitor and improve the use of things like pesticides, nutrients, and water.9
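The aggregation described above can be sketched as follows. This is an illustration only: the function name `aggregate`, the sample sales records, and in particular the `min_count` threshold of 5 are assumptions chosen for the example and are not drawn from the GDPR or from the guidance cited here. Whether an aggregated output is anonymous in the legal sense still depends on the Recital 26 assessment.

```python
from collections import Counter

def aggregate(records, group_field, min_count=5):
    """Count records per group and drop groups smaller than min_count.

    min_count is an illustrative threshold (an assumption): very small
    groups are suppressed because they may still point to individuals.
    """
    counts = Counter(record[group_field] for record in records)
    return {group: n for group, n in counts.items() if n >= min_count}

# Hypothetical sales records: one row per purchase, tied to a customer.
sales = [{"customer_id": i, "product": "kettle"} for i in range(7)]
sales.append({"customer_id": 100, "product": "samovar"})

# The report keeps only product-level counts; customer identifiers are gone,
# and the single samovar purchase is suppressed.
report = aggregate(sales, "product")
```

The report answers the question of product popularity without containing any row that corresponds to one customer.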
It must be noted that, when deciding whether a person can be identified from their data, all the means reasonably likely to be used to identify the person should be considered, whether by the data controller or by someone else. To determine whether these means are reasonably likely to be used, various factors should be taken into account, such as the cost of and time required for identification and the technology available at the time of the processing.10
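As a rough, non-legal aid to this assessment, one can check how many records in a data set are unique on a combination of indirect identifiers, since unique records are the easiest to link to outside information. The sketch below is an assumption-laden illustration: the function name `unique_share`, the choice of postcode and birth year as indirect identifiers, and the sample rows are all hypothetical, and a low share does not by itself establish anonymity under Recital 26.

```python
from collections import Counter

def unique_share(records, indirect_identifiers):
    """Return the fraction of records that are unique on the given fields."""
    if not records:
        return 0.0
    keys = [tuple(r[field] for field in indirect_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)

# Hypothetical rows with direct identifiers already removed.
rows = [
    {"postcode": "1011", "birth_year": 1980, "diagnosis": "asthma"},
    {"postcode": "1011", "birth_year": 1980, "diagnosis": "diabetes"},
    {"postcode": "1012", "birth_year": 1975, "diagnosis": "asthma"},
    {"postcode": "1013", "birth_year": 1990, "diagnosis": "migraine"},
]

# Two of the four rows are the only ones with their postcode and birth year.
share = unique_share(rows, ["postcode", "birth_year"])
```

Here half of the rows stand alone on postcode and birth year, so someone who knows those two facts about a person could single that person out with little effort.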
C. When to use pseudonymisation and when to use anonymisation
Pseudonymisation is not the same as anonymisation, and pseudonymised data is not anonymous data. Data is considered anonymous only when the person the data concerns can no longer be identified.11 While it may never be possible to completely exclude the possibility of re-identification, EU law sets a very high standard for what counts as anonymous.
Pseudonymization and anonymization are both techniques used to protect the privacy of individuals in the processing of personal data. The choice between the two depends on the specific situation and the level of privacy protection that is required. Nevertheless, it is important to mention that while the GDPR does not require pseudonymisation by default (per Article 25(2)), certain national data protection laws, such as the German GDPR Implementation Law, do impose strict pseudonymisation requirements.12 This German law, for instance, states that personal data must be anonymized or pseudonymized as early as possible and in accordance with the purpose of processing.13
There is a view that the GDPR regards pseudonymisation mainly as a data security measure, but this is inaccurate. Although Article 32(1)(a) acknowledges pseudonymisation as a way of achieving appropriate security, pseudonymisation is also associated with the broader obligation of 'data protection by design' (as per Article 25(1)), as well as with data minimisation measures applicable to processing for archival, scientific, historical, or statistical purposes (per Article 89(1)).
The following sets out specific examples of when pseudonymisation and anonymisation are used.
Pseudonymization is best used when:
Anonymization, on the other hand, is best used when:
In short, pseudonymisation is a good compromise between privacy protection and the need to retain personal information, while anonymisation offers the highest level of privacy protection but means that the data no longer qualifies as personal data, so the GDPR ceases to apply.
In conclusion, there is a significant difference between pseudonymisation and anonymisation. This is essential to understand because it determines whether the GDPR applies. Pseudonymisation is the process of replacing identifying information with random codes that can be linked back to the original person with additional information, whereas anonymisation is the irreversible process of rendering personal data non-personal, so that it is no longer subject to the GDPR. The advantages of pseudonymisation include improved privacy, increased data sharing, and better security, whereas its disadvantages include a risk of re-identification, reduced data utility, and complexity. Understanding the differences between these two techniques is crucial: confusing them can either leave individuals' personal data insufficiently protected or lead to unnecessary restrictions on data use or retention, reducing its value for research or other purposes.
1 Nadezhda Purtova, ‘The Law of Everything. Broad Concept of Personal Data and Future of EU Data Protection Law’ (2018) 10 Law, Innovation and Technology 40.
2 Recital 26, GDPR.
3 Michèle Finck and Frank Pallas, ‘They who must not be identified—distinguishing personal from non-personal data under the GDPR’ (2020) 10(1) International Data Privacy Law 11–36.
4 Article 4(5), GDPR.
5 Article 29 Working Party on Anonymisation Techniques, p. 20.
6 Luca Tosoni, ‘Article 4(5). Pseudonymisation’ in Christopher Kuner and others (eds), The EU General Data Protection Regulation (GDPR): A Commentary (New York, 2020; online edn, Oxford Academic).
7 EC Guidance 2019, p. 5 (noting that ‘[f]or instance, a research study on the effects of a new medicine would qualify as pseudonymisation [within the meaning of Art. 4(5) GDPR], if the personal data of study participants would be replaced by unique attributes (e.g. number or code) in the research documentation and their personal data would be kept separately with the assigned unique attributes in a secured document (e.g. in a password protected database)’).
8 Recital 26, GDPR.
9 EC Guidance 2019, pp. 6–7.
10 Recital 26, GDPR.
11 Luca Tosoni, ‘Article 4(5). Pseudonymisation’ in Christopher Kuner and others (eds), The EU General Data Protection Regulation (GDPR): A Commentary (New York, 2020; online edn, Oxford Academic).
12 German GDPR Implementation Law.
13 Luca Tosoni, ‘Article 4(5). Pseudonymisation’ in Christopher Kuner and others (eds), The EU General Data Protection Regulation (GDPR): A Commentary (New York, 2020; online edn, Oxford Academic).