Why financial services struggle with identity-based data retention

By Lee Biggenden, COO & Co-Founder of Nephos Technologies

With an increase in recent years in compliance laws governing how financial organisations must manage, retain and eventually delete customer data, it is vital that policies and processes are put in place so that obligations can be met and individual rights respected. One common challenge is what must happen to data when a customer leaves: the old-school “delete everything after X years” approach is no longer fit for purpose when data plays such an important role in contemporary business and is woven into so many datasets.

Today, financial services organisations must create and implement data retention strategies that are more nuanced and focused – balancing the value that data retention can add in areas such as business insight, intelligence and personalisation with the requirement to delete when retention is no longer justified.

Data storage limitation guidelines

In the UK, the Information Commissioner’s Office (ICO) sets out data storage limitation guidelines which stipulate that organisations must not keep data longer than they need it. An example cited by the ICO illustrates the issues faced by banks in relation to retaining customer data: “A bank holds personal data about its customers. This includes details of each customer’s address, date of birth and mother’s maiden name. The bank uses this information as part of its security procedures. It is appropriate for the bank to retain this data for as long as the customer has an account with the bank. Even after the account has been closed, the bank may need to continue holding some of this information for legal or operational reasons for a further set time.”

In practice, banks must be able to justify how long they retain data and put in place policies that set out their retention periods. The data they hold must be regularly reviewed so that it can be erased or anonymised as necessary. Personal data can only be retained for longer periods if it is for reasons of “public interest archiving, scientific or historical research, or statistical purposes.”

In 2014, the ‘right to erasure’ or ‘right to be forgotten’ was established as a human right by the European Court of Justice. This means people can ask organisations that hold data about them to remove it. These individual rights are also set out in Article 17 of the GDPR, enforced in the UK by the ICO. Article 17 lists a range of criteria under which data must be deleted, including “the personal data are no longer necessary in relation to the purposes for which they were collected or otherwise processed.”

Identity-based data retention

Alongside other grounds for deletion, including lack of consent or unlawful processing, compliance presents a broad range of scenarios where data must be identified and removed. Given how data-hungry today’s organisations are, and the sheer number of systems that most companies now run, it is becoming a mammoth task to comply with these rules and work out where that data resides, let alone fully delete it.

Failure to comply with these rules can result in heavy financial penalties. In September 2022, for example, Instagram was fined €405 million by the Irish Data Protection Commission for violating GDPR. It was found to have allowed users aged 13 to 17 to set up business accounts that displayed their phone numbers and email addresses.

In recent years, compliance has become increasingly challenging, with governance and privacy teams tasked with delivering processes and technologies to fully identify specific customer datasets.

As a result, building the technology infrastructure to map stored data across all systems to an “entity” (in this case, a customer) and correlate all data to that entity is becoming critical to meeting compliance obligations. There is no way to prove beyond doubt that customer data has been deleted from all systems other than demonstrating that the “entity” data no longer exists in any of them. In many organisations, this has the potential to create compliance issues, especially when their systems, tools and processes are not designed to provide these increasingly crucial capabilities.
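To make the idea of demonstrating that “entity” data no longer exists more concrete, here is a minimal Python sketch. It assumes each system can be exported as a list of records; the system names, field names and the functions `residual_references` and `deletion_verified` are hypothetical and used purely for illustration, not taken from any particular product.

```python
def residual_references(identifiers, systems):
    """Map each system name to the record indexes that still contain an identifier.

    `identifiers` are the known values for one customer (email, account number...).
    `systems` maps a system name to a list of dict records (a hypothetical export).
    """
    needles = [value.lower() for value in identifiers]
    residual = {}
    for system_name, rows in systems.items():
        found = [
            index
            for index, row in enumerate(rows)
            if any(n in str(value).lower() for value in row.values() for n in needles)
        ]
        if found:
            residual[system_name] = found
    return residual


def deletion_verified(identifiers, systems):
    """True only when no system holds any trace of the customer's identifiers."""
    return not residual_references(identifiers, systems)


# Illustrative data: the customer was removed from the CRM and ledger,
# but a forgotten backup note still mentions their email address.
customer = ["ada@example.com", "12345678"]
after_deletion = {
    "crm": [],
    "ledger": [{"account": "87654321"}],
    "backups": [{"note": "export for ada@example.com"}],
}
print(residual_references(customer, after_deletion))
```

In this illustration the check fails because of the backup record, which is exactly the kind of overlooked copy that a per-system deletion routine would miss.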

Automation holds the key

The answer to these challenges lies in the implementation of a correlated “entity”-based retention solution capable of working across multiple systems and datasets. By automating discovery and classification across disparate data silos, it becomes possible for organisations to implement effective retention and removal policies based on that “entity”.

Ideally, financial organisations should be able to create a virtual identity of the customer, selecting attributes that reflect what a customer looks like in their systems. For example, the organisation could select columns from multiple databases to create a gold-image/master data set of customer data. This would include information such as name, email, address, phone number and account number, and the solution would then correlate these data points across all systems.
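As a rough illustration of this idea, the Python sketch below builds a master record from selected columns in two sources and then looks for those values in every system. The in-memory lists stand in for real databases, and all system names, column names and functions (`build_entity`, `find_references`) are assumptions made for the example.

```python
# Hypothetical stand-ins for separate systems; real sources would be databases.
CRM = [{"cust_id": "C1", "full_name": "Ada Smith", "email": "ada@example.com"}]
CORE_BANKING = [
    {"account_no": "12345678", "holder_email": "ada@example.com", "phone": "07700 900123"}
]
SUPPORT_TICKETS = [
    {"ticket": 1, "body": "Call me on 07700 900123", "contact": "ada@example.com"},
    {"ticket": 2, "body": "Card query", "contact": "bob@example.com"},
]


def build_entity(crm_row, banking_rows):
    """Merge selected columns into one master record for a single customer."""
    entity = {"name": crm_row["full_name"], "email": crm_row["email"]}
    for row in banking_rows:
        # Email is used as the join key here purely for simplicity.
        if row["holder_email"] == entity["email"]:
            entity["account_number"] = row["account_no"]
            entity["phone"] = row["phone"]
    return entity


def find_references(entity, systems):
    """Return (system, record index) pairs where any entity value appears."""
    values = {value.lower() for value in entity.values()}
    hits = []
    for system_name, rows in systems.items():
        for index, row in enumerate(rows):
            text = " ".join(str(v) for v in row.values()).lower()
            if any(value in text for value in values):
                hits.append((system_name, index))
    return hits


entity = build_entity(CRM[0], CORE_BANKING)
systems = {"crm": CRM, "core_banking": CORE_BANKING, "support": SUPPORT_TICKETS}
print(find_references(entity, systems))
```

The support ticket is found through the phone number and email even though that system holds no customer ID, which is the benefit of correlating on several attributes rather than a single key.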

Instead of just looking for specific patterns that inevitably return a narrow set of results, the most effective and innovative emerging systems take those additional data points and then apply machine learning to automatically find every instance, whether it fits within a pre-defined pattern or not. Using those reference points to create, in effect, a virtual database of all the data points that relate to each customer gives organisations a much more granular view. Where legacy systems would have delivered perhaps 60% accuracy, these newer systems are now effective to 95% in correlating the relevant data sets for removal.
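The gap between narrow pattern matching and broader correlation can be shown with a deliberately simple Python sketch. It does not use machine learning: plain normalisation of UK phone numbers stands in for the more sophisticated matching described above, and the function names and sample values are hypothetical.

```python
import re


def normalise_phone(raw):
    """Reduce a UK phone number to digits only, mapping a 44 country code to a leading 0."""
    digits = re.sub(r"\D", "", raw)
    if digits.startswith("44"):
        digits = "0" + digits[2:]
    return digits


def exact_match(reference, candidates):
    """Narrow approach: only values written exactly like the reference are found."""
    return [c for c in candidates if c == reference]


def normalised_match(reference, candidates):
    """Broader approach: values are compared after normalisation."""
    target = normalise_phone(reference)
    return [c for c in candidates if normalise_phone(c) == target]


# Illustrative values as they might appear in different systems.
stored = ["07700 900123", "+44 7700 900123", "07700-900123", "07700 900999"]
print(exact_match("07700 900123", stored))
print(normalised_match("07700 900123", stored))
```

The exact comparison finds one of the three records that belong to the customer, while the normalised comparison finds all three without picking up the unrelated number.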

This also creates synergy between technology and policy, using the right tool sets to ensure that policy can fit both the business purpose and the regulatory need. At the same time, the enforcement of retention policies can be automated so the right customer data can be moved beyond use at the right time.
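A minimal Python sketch of automated enforcement might look like the following. The data categories and retention periods are invented for illustration only, since real periods depend on each organisation’s legal and regulatory obligations, and `due_for_removal` is a hypothetical helper rather than part of any specific tool.

```python
from datetime import date, timedelta

# Hypothetical retention periods, in days after account closure.
# These figures are illustrative assumptions, not regulatory guidance.
RETENTION_DAYS = {
    "transactions": 6 * 365,
    "marketing_preferences": 30,
}


def due_for_removal(category, closed_on, today):
    """Return True when a record has passed its retention period.

    `closed_on` is the account closure date, or None while the account is open.
    """
    if closed_on is None:
        return False  # data is still needed while the customer has an account
    return today >= closed_on + timedelta(days=RETENTION_DAYS[category])


closed = date(2024, 1, 1)
for category in RETENTION_DAYS:
    print(category, due_for_removal(category, closed, date(2024, 6, 1)))
```

Keeping the periods in one policy table means a change agreed by the governance team is applied consistently wherever the check runs.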

The result is a transformational capability that empowers data-centric financial organisations to join the dots across their datasets to meet audit and compliance requirements.

This enables organisations to reduce the risk surface of the data they store and process, while also minimising the volumes of data they are dealing with. In doing so, it becomes possible to optimise the cost and complexity of technology infrastructure to deliver a strategy that meets governance needs.

As data volumes and complexity increase, so do the risks for financial services organisations that do not establish the processes or technology to effectively manage the entire data lifecycle, including the stage at which data must be deleted. Without a greater emphasis on closing the loop, there are certain to be many more breaches of governance rules in the years ahead. Putting the right tools in place, however, puts organisations in a strong position to balance the value of the data they hold with the rights of individuals across the digital economy.