BCBS 239 is summed up by the Basel Committee on Banking Supervision as: “the right information needs to be presented to the right people at the right time”. In the global financial crisis of 2007/8 (remember that one?), regulators found some “globally systemically important” banks did not have a firm grip on their risk reporting and could not react quickly enough.
The Principles for effective risk data aggregation and risk reporting were developed to try to ensure this did not happen again.
2 steps forward, 1 step back
According to the latest progress report (April 2020), “In general, banks require more time to ensure that the Principles are effectively implemented.” So why is it taking so long? Why are banks with (supposedly) limitless resources not succeeding - and what can be learnt from their mistakes?
What is taking most of the time?
Perhaps surprisingly, over half the resources of a typical BCBS 239 project are taken up by determining and documenting data flows as data lineage. This is a critical part of the process – after all, if a bank can’t describe what data flows where, or document all of the sources of data used in a regulatory report, a regulator will not be satisfied. Seeing data in context is essential for a full high-level view. The worst scenario for a bank is to spend vast quantities of money on a manual data lineage exercise, only for results to be discarded because they do not link together with other metadata and have gone stale. This is ‘dead’ data lineage. The ideal scenario is for system owners to publish updated lineage information when it changes, and for that to be merged in and change-managed with all other metadata (like data catalog, data quality metrics, etc). This is ‘live’ data lineage.
Most banks sit between these two extremes: they can extract lineage and metadata programmatically in some cases, and need to merge this with data curated by system owners and subject matter experts. It is imperative to make this process as efficient as possible.
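As a rough illustration of that merge, the sketch below combines lineage edges extracted automatically with edges curated by hand, and flags any edge that has not been confirmed recently as a candidate for 'dead' lineage. The system names, the dates, the 90-day threshold and the merge_lineage function are all assumptions made for the example, not part of BCBS 239 or of any particular product.

```python
from datetime import date, timedelta


def merge_lineage(extracted, curated, today, max_age_days=90):
    """Merge two lineage edge maps and list the edges that have gone stale.

    Each map is keyed by (source, target) and holds the date the edge was
    last confirmed. When both maps contain an edge, the later date wins.
    """
    merged = dict(curated)
    for edge, confirmed in extracted.items():
        if edge not in merged or confirmed > merged[edge]:
            merged[edge] = confirmed
    cutoff = today - timedelta(days=max_age_days)
    stale = sorted(edge for edge, confirmed in merged.items() if confirmed < cutoff)
    return merged, stale


# Hypothetical edges: two found by an automated scan, two documented by hand.
extracted = {
    ("ledger", "risk_mart"): date(2020, 4, 1),
    ("risk_mart", "liquidity_report"): date(2020, 4, 1),
}
curated = {
    ("ledger", "risk_mart"): date(2019, 6, 1),
    ("spreadsheet_adj", "liquidity_report"): date(2019, 6, 1),
}

merged, stale = merge_lineage(extracted, curated, today=date(2020, 4, 30))
print("edges:", len(merged))
print("needs re-confirmation:", stale)
```

Here the manually documented spreadsheet adjustment is the only edge nobody has re-confirmed, so it is the one that would be sent back to its owner for review.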
What takes up the rest of the time?
Implementing data quality controls is the next largest resource demand. To fully benefit from data quality controls, issues in data need to be reflected in the quality metrics of datasets, and consumers of affected data (all downstream consumers, not just the immediate ones) need to be made aware. Many quality assessment implementations are possible for different architectures, so the main aim for BCBS 239 should be to capture quality metrics for datasets, visualise data quality in context, and maintain it along with other metadata.
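Finding every downstream consumer of an affected dataset is a graph traversal over the lineage. The sketch below is one minimal way to do it, assuming lineage is held as a simple mapping from each dataset to its direct consumers; the dataset names and the downstream_consumers function are hypothetical.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets that consume it directly.
LINEAGE = {
    "trades_raw": ["trades_clean"],
    "trades_clean": ["exposure_calc", "pnl_report"],
    "ref_data": ["exposure_calc"],
    "exposure_calc": ["credit_risk_report"],
    "pnl_report": [],
    "credit_risk_report": [],
}


def downstream_consumers(lineage, source):
    """Return every dataset reachable from `source`, nearest first (breadth-first)."""
    seen = {source}
    order = []
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for consumer in lineage.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


# A quality issue in trades_clean affects its direct and indirect consumers.
affected = downstream_consumers(LINEAGE, "trades_clean")
print(affected)
```

The same traversal can answer 'what-if' questions, such as which reports depend on a system that is about to be migrated.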
Data dictionaries and catalogs
Data dictionaries, glossaries and catalogs, which provide business meaning for data and identify critical data elements, are the next largest demand on resources. Materiality (the impact of getting it wrong) of data can be used to prioritise governance. Data glossaries play a crucial role in completeness of reporting by classifying all relevant data for a report across multiple systems. Integrated with lineage flows and quality metrics, they help to ensure all data for a report is present and of the required quality.
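To make the completeness idea concrete, the sketch below checks a report against a glossary term: which elements classified under the term do not feed the report, and which do feed it but fall below a quality threshold. The glossary, the field names, the quality scores and the 0.95 threshold are all invented for the example.

```python
# Hypothetical glossary: business term -> data elements classified under it.
GLOSSARY = {
    "creditor": ["loans.counterparty_id", "bonds.issuer_id", "derivs.cpty_code"],
}

# Hypothetical lineage summary: report -> data elements that flow into it.
REPORT_SOURCES = {
    "large_exposures_report": {"loans.counterparty_id", "bonds.issuer_id"},
}

# Hypothetical quality metric per element (share of records passing checks).
QUALITY = {
    "loans.counterparty_id": 0.99,
    "bonds.issuer_id": 0.91,
    "derivs.cpty_code": 0.97,
}


def completeness_gaps(term, report, min_quality=0.95):
    """Return (missing, low_quality) elements of `term` with respect to `report`."""
    elements = GLOSSARY[term]
    sources = REPORT_SOURCES[report]
    missing = [e for e in elements if e not in sources]
    low_quality = [
        e for e in elements if e in sources and QUALITY.get(e, 0.0) < min_quality
    ]
    return missing, low_quality


missing, low_quality = completeness_gaps("creditor", "large_exposures_report")
print("not reported:", missing)
print("reported but below threshold:", low_quality)
```

In this made-up case the derivatives counterparty code is classified as creditor data but never reaches the report, and the bond issuer field reaches it with a quality score under the threshold.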
Harsh lessons in sustainable governance
The main mistake is to have a short-term view: ramp up a big push, hire some consultants to collect data and ‘get across the line’, without thinking of the bigger picture and a sustainable process. This approach ends up with wasted effort and the need to do it all again – properly this time. To focus on the longer term, think about what you want to be able to do when you have all this regulation-required information. You want to see quality issues along the flows for data to your regulatory reports, and anywhere else. You want to find all the creditor data in your systems and make sure it is being reported. You want to help your team find the data they need to run new initiatives. You want to do ‘what-if’ experiments to see what would change if systems move to the cloud.
There are 2 key aspects to get right:
• A sustainable process to collect up-to-date metadata
• An integrated metadata platform
A sustainable process requires federating the work of subject matter experts and system owners across the organisation, and automating where possible. Consider how business as usual will look. Users need to be able to apply their subject matter expertise easily and carry out manual tasks quickly and consistently, with intelligent suggestions for lineage and relationships from machine learning. Connectors and open APIs, and integrations through expert partners, should be used to extract and load metadata and keep it up to date, including properties like dataset quality metrics.
An integrated platform for metadata treats metadata as a first-class asset, with all the change control and ownership that implies. The underlying architecture needs to be simple and adaptable, without artificial constraints. It needs to include lineage, quality metrics, glossaries, high-level views, search, queries and reports, all seamlessly. Without integration, organisations face a synchronisation and maintenance nightmare and an inability to see the end-to-end view.
Solidatus is designed by people who have been there and done it, and see the full picture. Why not use the regulatory imperative to deliver an asset of growing value to the whole organisation, where business leaders and regulators can see the full data picture?