ICQ Log - Data Protection & Information Security:

AI's Evolution and the GDPR 

 

Last Updated: 12 June 2020

Author: Alan Moore, Member of the Data Protection & Information Security Working Group. Senior Advisor: Trilateral Research

Since the 1990s, when Artificial Intelligence (AI) first appeared in commercially available products, the digital world has changed rapidly, and
the promised era of AI is only now really emerging. From credit scoring to fraud investigation, from marketing to autonomous vehicles, and from cancer diagnosis to climate change, AI is now presented as the promise of a safer and more prosperous future. From a data protection point of view, this technology and its rollout raise particular challenges, especially around transparency and accountability in relation to how our personal data will be processed. Where an organisation wishes to use this increasingly powerful technology, it needs to take into account the distinctive challenges AI can pose.

What is AI?

Fundamentally, artificial intelligence is a technology that uses algorithms and statistics to identify patterns within data sets, with a variable but increasing degree of autonomy. The technology can be purely software, or combined with hardware such as sensors and cameras, as
seen in self-driving cars and CCTV systems. Originally, the technology was held back by two fundamental problems: the lack of sufficiently large data sets from which to learn, and the limitations of available processing power. In our online, cloud-enabled world, these challenges are dissipating.
There are enormous databases of digital records available for interrogation and a theoretically unlimited level of processing power via
distributed processing in the cloud. Individual processors have also kept pace, with transistor counts (and, broadly, performance) doubling every two years or so in line with Moore's
Law. Just look at your mobile device, with its cutting-edge cameras, fingerprint scanners and enough processing power for all your entertainment and social media needs. From the 1990s, when character recognition was first rolled out in mailrooms, to today's networked environment
of social media and online interaction, an enabling ecosystem for AI has only recently emerged. Thus, AI's time may have finally arrived.

Early systems, such as those developed for character recognition, were pre-programmed with their pattern-searching algorithms and were able only to follow the coded rule book provided. At each iteration, the designers would attempt to improve the code to reduce
the difference between the predicted and actual classes of data. These systems remained fully under the control of the designers, and the algorithms could be explained, though with some difficulty as complexity increased, prior to the processing of any data.
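To make this concrete, the short sketch below (in Python, with entirely hypothetical features and thresholds) shows the kind of hand-coded rule book such early systems followed, together with the designer's feedback loop of comparing predicted and actual classes:

```python
# A minimal sketch of a hand-coded, rule-based classifier of the kind
# described above. The features and thresholds are hypothetical: the
# designer encodes the pattern directly, measures the gap between
# predicted and actual classes, and refines the rules by hand.

def classify_character(height_ratio: float, loop_count: int) -> str:
    """Hand-written rule book: the system can only follow these rules."""
    if loop_count == 2:
        return "8"
    if loop_count == 1 and height_ratio > 0.8:
        return "0"
    return "1"

# The designer's feedback loop: compare predictions with the truth,
# then go back and adjust the rules for the next iteration.
samples = [((0.9, 1), "0"), ((0.4, 0), "1"), ((0.95, 2), "8")]
errors = sum(classify_character(*feats) != label for feats, label in samples)
print(f"misclassified: {errors}/{len(samples)}")
```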

With the inclusion of insights from other fields, such as neural networks, more recent AI systems have been empowered to adapt their own pattern searches. These AI systems have more autonomy, and how a pattern is identified can now only be explained after the fact, using
algorithm transparency tools and processing logs. In other words, we can ask the system what it has done, but it can't necessarily tell us beforehand. Even with self-learning systems, there are different levels of control when implementing search capabilities. The first is where the designers of the system provide the pattern to be found and, after each attempt, tell the system which results are correct. The system then
refines its own search algorithms accordingly. This supervised learning approach still leaves a degree of responsibility and accountability with
the designers, who point and release their statistical engines with defined rules, giving feedback along the way. It has had remarkable success in areas such as cancer detection and fraud investigation. The second is where the AI system is given access to data sets with broad
parameters and varying degrees of guidance, and then left unsupervised to detect and act upon whatever patterns it can. While boundaries can be set for the AI, the designers have far less control, as they have given the technology the responsibility for developing much of its own rule engine. This autonomous learning has been used to teach systems to operate self-driving vehicles and to develop adaptive IT security systems.
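The two levels of control can be illustrated with a minimal sketch using scikit-learn and synthetic data; the models and data here are illustrative stand-ins, not any particular production system:

```python
# A minimal sketch of the two levels of control described above. In the
# supervised case the designers supply both the pattern (labels) and the
# feedback; in the unsupervised case the system is pointed at the data
# with only broad parameters (here, the number of clusters to find).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=300, centers=2, random_state=0)

# Supervised: the designers provide the correct answers (y) to learn from.
clf = LogisticRegression().fit(X, y)
print("supervised accuracy:", clf.score(X, y))

# Unsupervised: no labels; the system detects whatever structure it can.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster sizes:", np.bincount(km.labels_))
```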

Challenges Inherent within AI

Despite the remarkable development of the technology, especially in self-learning systems, certain challenges remain. Though AI can access and process vast amounts of data beyond the capability of any human workforce, and identify patterns within that data, it is still not good at dealing with uncertainty. In any analysis of a data set, it will find those records that match the pattern it is looking for and those that definitely
do not. Records that 'kind of' match the pattern and require interpretation in a wider context are left to the slow, inefficient humans to deal with. This is because AI does not yet handle ambiguity well, and accuracy rates can be lower than expected as a result.
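As an illustration of this triage, the following sketch (assuming a scikit-learn-style classifier with a predict_proba method, and a purely hypothetical confidence threshold) decides confident cases automatically and refers the 'kind of' matches to human review:

```python
# A minimal sketch of the triage behaviour described above: records the
# model is confident about are decided automatically, while ambiguous
# matches below a (hypothetical) confidence threshold are routed to a
# human reviewer for interpretation in a wider context.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression().fit(X, y)

def triage(record, threshold=0.9):
    probs = model.predict_proba([record])[0]
    if probs.max() >= threshold:
        return "auto-decision", int(probs.argmax())
    return "refer to human review", None  # ambiguous: needs wider context

for record in X[:5]:
    print(triage(record))
```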

A more serious issue is bias within AI systems. Human bias within algorithms has long been recognised. In supervised learning systems, programmers, limited by their assumptions and heuristically driven brains, can unintentionally transfer these biases into the millions of lines of code needed for most software systems. Even test scenarios are coloured by social and personal contexts that subtly shape the way software operates.

In unsupervised systems, there was a belief that letting an AI learn by itself might result in bias-free algorithms, where pure logic would drive
the pattern identification and rule-making. The reality, however, is that the data sets from which the AI learns can themselves transfer bias into the AI's learning. This is because it is humans who originally decided which data points to collect, and humans who choose which version of our reality to share through, for example, social media accounts.
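A simple synthetic demonstration of this transfer: in the sketch below, the clustering algorithm itself contains no hand-coded rules and never sees the (hypothetical) postcode attribute, yet its output reproduces the postcode split because the skew lives in the data it was given:

```python
# A minimal, synthetic demonstration of bias transfer. The algorithm is
# 'pure logic' and never sees the postcode, but because the (invented)
# data was collected in a way that differs sharply between the two
# areas, the clusters it finds align with postcode almost exactly.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n = 1000
# Hypothetical data set: incomes recorded very differently for the two areas.
postcode = rng.integers(0, 2, n)
income = np.where(postcode == 0, rng.normal(20, 5, n), rng.normal(60, 5, n))
X = income.reshape(-1, 1)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# The bias lives in the data, not in the code.
agreement = max((labels == postcode).mean(), (labels != postcode).mean())
print(f"clusters align with postcode {agreement:.0%} of the time")
```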

GDPR Challenges 

From a compliance point of view, AI poses specific problems for data controllers and for us as data subjects. One key issue is the centrality of transparency to data protection. Every controller must be able to set out, clearly and in plain language, how personal data will be used and for what purpose, at the point at which the personal data is gathered. This is sometimes referred to as the 'no surprises' rule.


However, if a controller is using AI to make an automated decision and that AI is self-learning, can the controller ever really explain how an individual’s personal data will be processed? Certainly, this is unlikely before the processing takes place, and it is unclear whether a controller will have the capability to explain how the AI arrived at the decision afterwards.
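One family of after-the-fact transparency tools mentioned earlier works by probing a fitted model from the outside. The sketch below uses scikit-learn's permutation importance on a hypothetical model and synthetic data to ask which inputs most influenced the model's behaviour; it is an illustration of the technique, not a complete explanation of any decision:

```python
# A minimal sketch of one after-the-fact transparency technique:
# permutation importance asks, for each input feature, how much a
# fitted model's quality degrades when that feature is shuffled. The
# model and data here are hypothetical stand-ins.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature {i}: importance {score:.3f}")
```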


Another major issue revolves around AI and its increasing use for profiling of nearly every aspect of our daily lives. We see this in supermarkets, banking and internet usage, simply because personal data gathered through many separate systems can now be integrated. This raises issues around fairness and proportionality.


Some organisations have taken a creative compliance approach, holding that statistically inferred data is not really personal data and that any
data they use is publicly available anyway. Others have relied on the argument that it was not the individual who was profiled but rather their property or the professional role they carry out. The GDPR is quite clear, however, and defines profiling as any automated use of data that can identify
a living person, i.e. personal data, to evaluate, analyse or predict aspects about them. It particularly references an
individual's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements. In other words, all the really interesting information! The €18 million fine issued against Österreichische Post AG reinforced this point, making it clear that the use of personal or non-personal data, even when public, to assign attributes to data subjects is subject to the GDPR.


Where AI is processing data that includes biometric data, the challenges increase still further. A particular case is the use of AI to create a pattern
from an image and match that pattern against different reference databases. Recent examples have seen the London Metropolitan Police using AI for real-time mass surveillance on the streets of London, despite fears about the proportionality and accuracy of the system. Another is the controversial Clearview AI facial recognition technology, which has scraped an unknown number of photos of millions
of individuals from the web. The tool purports to be able to identify almost anyone from these sources. The photos were likely never taken, or made available, with the expectation of this after-the-fact biometric processing, in some cases several decades after they were first published. Another example, closer to home, is the biometric capability built into the cameras of the new Children's Hospital.
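For readers unfamiliar with the mechanics, the sketch below illustrates the matching step in the abstract: a biometric template (a hypothetical embedding vector derived from an image) is compared against a reference database, with any similarity score over a chosen threshold treated as a match. The vectors and threshold are invented purely for illustration:

```python
# A minimal sketch of the matching step described above: a probe
# template is compared against a reference database by cosine
# similarity. All vectors and the threshold are hypothetical.
import numpy as np

def best_match(probe, reference_db, threshold=0.9):
    """Return the closest identity, or None if nothing clears the threshold."""
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {name: cosine(probe, vec) for name, vec in reference_db.items()}
    name, score = max(scores.items(), key=lambda kv: kv[1])
    return (name, score) if score >= threshold else (None, score)

db = {"person_a": np.array([0.1, 0.9, 0.3]),
      "person_b": np.array([0.8, 0.2, 0.5])}
print(best_match(np.array([0.12, 0.88, 0.31]), db))
```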

Where a controller uses AI to access and integrate large amounts of personal data the main concerns are:


• How accurate is the underlying data it is integrating (given its age, source and structure) and, therefore, the end decision or profile that is the output?
• Where did the underlying data actually come from, and was it obtained lawfully?
• How are the necessity and proportionality of the processing going to be demonstrated along the chain of data sharing and processing?
• How does the processing of data by AI remain effectively under the control of the data controller?
• How much certainty is there about how decisions are made?
• How transparent is the use and decision making undertaken by the AI?
• Is the use of personal data fair and how will it be perceived by customers?

Handling AI Systems 

Controllers using AI need to ensure that they meet their obligations to protect data subjects' rights and freedoms:

 

 • Acknowledge that AI is different from other more traditional technologies.
• Undertake due diligence to know where the personal data being processed has come from and how it was obtained.
• Be satisfied they know and can explain how any AI system implemented will use or has used any personal data.
• Be transparent and ensure data subjects know their personal data is being used for profiling or automated decision-making, and that they can contest any decision the system may make even where they have permitted its use. This includes where AI is used to interact with living persons.
• Manage the risks to data subjects by ensuring a detailed Data Protection Impact Assessment is undertaken before any AI is rolled out and especially as part of any procurement process where the system is provided by a third party.
• Even where such systems are used for the prevention or detection of crime, the processing must undergo necessity and proportionality tests which must be documented.
• Avoid using AI to process personal data of children, special category data or in any situation where significant risks are likely to arise.
• Demonstrate the accuracy and reliability of the systems being used by means of algorithmic auditing and logging of processing (see the sketch after this list).
• Check for bias and proportionality.
• Implement adequate fail-safes and boundaries in terms of technical and organisational measures to demonstrate how any risks to data subjects are being mitigated.
• Remember, as controller you remain accountable for the autonomous processing of personal data by any AI systems you implement directly or through engaging processors under contract.
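On the auditing and logging point above, the following sketch shows one possible shape for a structured decision log that would support later audit and contestation; the record schema is a hypothetical illustration rather than a prescribed standard:

```python
# A minimal sketch of decision logging, so each automated decision can
# later be audited and contested. The field names are illustrative only.
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("ai_decisions")
logging.basicConfig(level=logging.INFO)

def log_decision(subject_id, model_version, inputs, decision, confidence):
    """Append a structured, timestamped record of one automated decision."""
    audit_log.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "subject_id": subject_id,        # pseudonymised in practice
        "model_version": model_version,  # which rule engine decided
        "inputs": inputs,
        "decision": decision,
        "confidence": confidence,
    }))

log_decision("subj-001", "credit-model-v2", {"income_band": "B"}, "refer", 0.62)
```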

Conclusion

AI is a powerful technology that will have implications for nearly every part of our lives, offering the possibility of augmenting our natural capabilities and overcoming our human limitations. In terms of processing speed alone, this will guarantee AI a major role, and, the EU
hopes, a lucrative one, in data processing going forward. The main concern for data controllers is the potential loss of control over how personal data is processed, since maintaining that control is a primary obligation under the GDPR. The economic value of AI is estimated at some $13 trillion by 2030 (McKinsey Global Institute). This will place significant pressure on Europe to balance the promised economic advantage with the possible limitations arising from its commitment to ethical data processing and its proposed 'ecosystem of trust'. The optimistic EU white paper 'On Artificial Intelligence - A European approach to excellence and trust' reads as if AI may well be a panacea for many of the major issues our societies face. An underlying assumption within the white paper is that AI can be controlled and held to account by its human masters. More widely available expertise and EU regulation are put forward as the keys to keeping the technology in check.
With the novel nature of AI and its rapid development, a key question, already raised in relation to the GDPR, is the ability of regulation at all levels to deal with rapid innovation. Going by Ireland's inability to regulate even the humble electric scooter on our streets, can our traditional reliance on regulation be our only defence against the deliberate or accidental misuse of AI? Whatever regulatory environment emerges, our
approach must ensure that AI's expanding capabilities, and the resulting commercial advantage, do not automatically outweigh our rights and freedoms as individuals.


Author: Alan Moore 

Data Protection Manager and Deputy DPO (EMEA) at MetLife

ICQ Summer Edition 2020

This article was taken from the ACOI's ICQ Summer Edition 2020