AI tools are now commonplace in every major company. Whether the company develops the technology itself or implements one of the commercial solutions, it should pay attention – the development and use of artificial intelligence are inextricably linked to the issue of data protection. In this post, we look at the aspects of data protection that require particular attention in the context of AI, and why it is useful to revisit the basic principles of data processing.
If you are looking for ways to streamline your corporate processes, reduce operational costs, innovate your products or better meet customer needs, you are probably considering implementing an AI tool. Even if (or especially when) you do not deal with AI intensively, your employees are likely already using such tools in their work, whether generative AI-powered assistants, translation or text-editing tools, or graphic design tools.
The use of AI naturally raises data protection issues, alongside other aspects such as cybersecurity, regulation in the form of the AI Act,[1] and copyright. However, given the pace of technological development, compliance in this area is often overlooked as companies strive to gain, or at least not lose, a competitive advantage. Unsurprisingly, AI systems are in the crosshairs of the European supervisory authorities, as shown by the opinion[2] of the European Data Protection Board (“EDPB”), which commented on certain aspects of AI at the turn of this year.
Is an anonymous model truly anonymous?
AI developers often proclaim that AI is developed on anonymous data that cannot be linked to a specific individual. For this to be true, however, the EDPB holds that the likelihood of both (a) direct extraction of personal data used to develop the AI model and (b) obtaining such personal data, intentionally or not, from user prompts must be insignificant.
According to the EDPB, this can be achieved by implementing appropriate technical and organisational measures, such as:
- careful selection of the data and sources used to train the AI model, including measures to limit the collection of personal data and to exclude inappropriate sources;
- preparation of datasets and removal of personal data;
- filtering of datasets before they are used for training; and
- measures relating to outputs, in particular output filters preventing the extraction of personal data from the outputs.
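To illustrate what such measures might look like at the simplest level, below is a minimal Python sketch of a pre-training dataset filter and an output filter. The regex patterns and function names are our own hypothetical examples; production-grade systems typically combine such rules with named-entity recognition, curated deny-lists and human review.

```python
import re

# Illustrative patterns only: real PII detection typically combines
# regular expressions with NER models and curated deny-lists.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def redact_pii(text: str) -> str:
    """Replace matched personal data with category placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def filter_training_records(records: list[str]) -> list[str]:
    """Pre-training filter: redact personal data from each training record."""
    return [redact_pii(record) for record in records]

def output_filter(model_output: str) -> str:
    """Output filter: redact personal data before a model response is returned."""
    return redact_pii(model_output)

if __name__ == "__main__":
    sample = ["Contact Jane at jane.doe@example.com or +420 123 456 789."]
    print(filter_training_records(sample))
    # ['Contact Jane at [EMAIL] or [PHONE].']
```

A rule-based filter of this kind would not, for instance, catch names or addresses on its own – which is why the measures listed above are meant to work in combination rather than in isolation.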
However, if anonymisation cannot be fully ensured during development, or if personal data is entered into the AI tool by the users themselves, all conditions for the processing of personal data must be met. The processing principles set out in Article 5 of the GDPR[3] serve very well as a basic checklist of the regulation's requirements – after all, non-compliance with the processing principles is one of the most common infringements for which supervisory authorities impose sanctions.
Lawfulness and purpose limitation
Personal data must be processed lawfully, i.e. for and within the limits of one or more of the legal bases listed in Article 6 of the GDPR. Today's accelerated pace, with new tools appearing virtually every day, encourages controllers to test and implement new tools as soon as possible. For the processing to be lawful, however, the controller must first determine the appropriate legal basis. In this context, the EDPB generally accepts the use of legitimate interest as a legal basis for the development and deployment of AI models (e.g. for the development or use of a chatbot, or the development of an AI system for detecting fraudulent content or behaviour). According to the EDPB, the essential condition is that a three-step legitimate interest assessment (“LIA”) is carried out to demonstrate that the controller's legitimate interest is not overridden by the interests and rights of data subjects. If the processing cannot be based on legitimate interest, another legal basis must be found before personal data can be used with the AI tool.
It may be tempting for some controllers to skip this step and subsume the use of AI under one of the existing purposes for which they already have a legal basis. However, this may not be compatible with the purpose limitation principle, according to which personal data must be collected for specified, explicit and legitimate purposes and must not be further processed in a manner that is incompatible with those purposes. In addition to the legal basis, it is therefore necessary to also define the purpose for which the tool in question is to be used, and to assess its compatibility with the existing processing purposes.
Transparency
The principle of transparency is one of the most problematic areas in the context of AI – not only because developers (let alone their customers) often do not know exactly how an AI model works and how it handles data, but also because controllers may anticipate a negative response from data subjects: once informed that their personal data will be processed by AI, data subjects may be reluctant to provide their data to the controller.
However, it is transparency that the EDPB identifies as a key area, and the measures it proposes go beyond the standard information required under Articles 13 and 14 of the GDPR – particularly as regards data used for AI development. The EDPB suggests that controllers highlight the processing of personal data using AI, e.g. through an information campaign via email, the use of graphic visualisation, information in a Q&A section, etc. In the future, we may well see transparency requirements similar to those currently imposed on CCTV, with the important difference that CCTV is subject to natural geographical limits, unlike the online environment.
Data minimisation, storage limitation and exercise of rights
Personal data must be adequate, relevant and limited to what is necessary in relation to the purpose for which it is processed. This principle, too, can cause problems, as users tend to feed AI as much data as possible and leave it to the AI to decide what is relevant to its output. From the perspective of data protection regulation, however, AI is a tool like any other, and so it is necessary to consider what data is being fed into it – and in particular whether anonymised or pseudonymised data would suffice to obtain a relevant output.
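By way of illustration, the following is a minimal Python sketch of a simple pseudonymisation step that a controller might apply before sending a prompt to an external AI tool, assuming the direct identifiers can be enumerated in advance (the function names are our own hypothetical examples). It is worth remembering that pseudonymised data still qualifies as personal data under the GDPR, since the controller retains the key – but the identifiers themselves never leave the organisation.

```python
import uuid

def pseudonymise(text: str, identifiers: list[str]) -> tuple[str, dict[str, str]]:
    """Replace known direct identifiers with random tokens before the text
    leaves the organisation; the token-to-identifier mapping stays with
    the controller."""
    mapping: dict[str, str] = {}
    for identifier in identifiers:
        token = f"PERSON_{uuid.uuid4().hex[:8]}"
        mapping[token] = identifier
        text = text.replace(identifier, token)
    return text, mapping

def re_identify(text: str, mapping: dict[str, str]) -> str:
    """Restore the original identifiers in the AI tool's response."""
    for token, identifier in mapping.items():
        text = text.replace(token, identifier)
    return text

if __name__ == "__main__":
    prompt, mapping = pseudonymise(
        "Draft a reply to the complaint filed by Jan Novak.", ["Jan Novak"]
    )
    print(prompt)  # e.g. "Draft a reply to the complaint filed by PERSON_3fa29c1d."
    # ...the pseudonymised prompt is sent to the external AI tool...
    print(re_identify(prompt, mapping))  # the original name is restored locally
```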
At the same time, personal data may not be stored in a form which permits identification of data subjects for longer than is necessary for the particular purpose of the processing. Again, this principle can be problematic in practice – processing does not usually end once the AI provides a response. All prompts entered by the user are stored in the solution, and each solution retains personal data for a different period of time – in the case of certain solutions, for up to several years. Data retention and the ability to delete data at the user's request are thus further parameters that controllers should consider when selecting an AI solution.
Integrity and confidentiality
Personal data must be secured against unauthorised or unlawful processing and against accidental loss, destruction or damage. AI solutions vary considerably in the strength of the safeguards they provide for data supplied by the user; as a rule, freely available (free of charge) solutions offer no or only very limited safeguards. According to the EDPB, controllers should conduct audits to assess the security measures taken and regularly test solutions against different types of attacks. This will of course be difficult in practice, but when choosing a solution we recommend checking how its operator approaches these activities.
Accountability
The controller is responsible for compliance with the basic principles and must be able to demonstrate that compliance. In this context, the EDPB also extends potential liability for unlawful processing to the customer where the unlawful processing occurred, or is occurring, during the development phase and the customer did not sufficiently verify compliance with the GDPR in advance. However difficult it may be to imagine how this conclusion will be applied in practice, it is clear that the EDPB places increased demands on controllers to carefully select an AI solution and conduct due diligence before deploying it.
At the same time, controllers should properly document all processing operations involving artificial intelligence. According to the EDPB, the documentation should include, in particular, a data protection impact assessment (or a justification of why it was not necessary), information on technical and organisational measures and documentation demonstrating the AI model’s resilience to external threats.
How to approach this in practice?
When introducing a new AI tool or system, it is always necessary to evaluate its impact on existing data processing operations (what personal data will be processed through AI, for how long, to whom it will be transferred, etc.) and then reflect this in the existing GDPR documentation – in particular in the records of processing activities and privacy policies. A data processing agreement will usually need to be executed with the provider of the tool, and any transfers of personal data to third countries will need to be adequately addressed – for example, through standard contractual clauses. For more extensive processing, it is also appropriate to evaluate the need for a data protection impact assessment (DPIA), which is generally recommended in the context of the use of AI, and to consider appointing a data protection officer where one has not yet been appointed. It is also advisable to regulate the use of AI within the organisation through an internal policy.
And last but not least – do not forget the basic principles of processing!
Conclusion
Artificial intelligence, like any innovative technology, presents a number of opportunities for organisations to gain a competitive advantage. However, it also brings a number of legal challenges.
If you are using AI tools in your business and are not sure whether you have everything covered, we would be happy to discuss it with you. Our expertise in technology and privacy allows us to navigate the legal aspects with ease and help you find effective solutions to all AI-related issues.
- [1] – Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act).
- [2] – Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models, available at: https://www.edpb.europa.eu/our-work-tools/our-documents/opinion-board-art-64/opinion-282024-certain-data-protection-aspects_en.
- [3] – Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).