Data governance efforts have been fragmented into specific areas of focus, such as data quality, master data management, or data warehousing. However, organizations today need enterprise-wide data governance that often relies on all three. Thanks to dynamic data catalogs, enabled by data virtualization, enterprise-wide data governance is not only possible but practically seamless. Originally designed for data analysts, data catalogs are now instrumental in ensuring data governance and stewardship and, more importantly, in helping an organization understand its data assets so that it can improve analytics and business processes.
While no one will debate the benefits of data governance, the challenge is that any technology that deals with data has some form of data governance already built into it. Master data management employs master data governance rules, specifying what constitutes master data, who owns that definition, who is allowed to make changes to it, and so on. Similarly, data warehouses are set up with rules governing which data can be stored for analysis, which BI applications and users can run which types of analysis and reports, and so on. These rules, however, are imposed at the application or departmental level, never at the enterprise level.
This creates many issues: First, business users who need to understand and use data across systems and applications are at a loss because they need to search for the required data in each of these systems separately and then manually relate them. Second, these systems have different levels of access security, so business users end up gaining only a partial view of the data they need. Finally, the documentation of the same data entities, such as customer, could be different in each of the systems, creating silos of similar information and ultimately conflicts.
Business users always need to understand the ways that different data elements connect. A retailer might want to know which customers own which products, whether they bought them in the store or online, and if they have warranties for any of them. The relevant customer data could come from a CRM system, the product information from an ERP system, and the warranty information from a warranty registration system.
Only with unified knowledge about customers and their activities can a company be effective in pursuing revenue-yielding business initiatives such as cross-selling and up-selling. Business users, such as customer care representatives, would like to search for specific customers and see them in a 360-degree view that includes their purchased products, warranties, transactions, and so on, to provide better customer service.
For such views, data governance would have to transcend local systems and rise to the enterprise level. Representatives would then be able to see all of the data across the relevant systems, the relationships among the data, and any associated notes. Data security would be enforced so that data can only be seen by people authorized to view it. For example, one global mobile insurance service provider was subject to regulations protecting personally identifiable information (PII) in the countries in which it operated. As a result, the company enforced access controls ensuring that representatives providing services in certain countries could only see information relevant to those countries.
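The country-based restriction described above amounts to filtering rows by the viewer's entitlements. The following is a minimal sketch of that idea, not the insurer's actual implementation; the `Representative` class, the `visible_customers` function, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representative:
    """Hypothetical service representative and the countries they may serve."""
    name: str
    countries: frozenset


# Hypothetical customer records; a real system would read these from source systems.
CUSTOMERS = [
    {"id": 1, "name": "Ana", "country": "ES"},
    {"id": 2, "name": "Ben", "country": "US"},
    {"id": 3, "name": "Chloe", "country": "FR"},
]


def visible_customers(rep, records):
    """Return only the records from countries the representative is authorized for."""
    return [r for r in records if r["country"] in rep.countries]


rep = Representative("eu_agent", frozenset({"ES", "FR"}))
visible = visible_customers(rep, CUSTOMERS)
```

Here `visible` contains only the customers in Spain and France; the record from the United States is never returned to this representative.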
However, if each system (CRM, ERP, etc.) has its own product-specific data governance, how can an organization establish enterprise-wide data governance? One technology that is gaining attention for this broad use case is the dynamic data catalog made possible by data virtualization.
Data virtualization is a modern data integration and management technology that integrates data in real time. It does so by creating logical views of the source data without replicating it into yet another repository. The advantages of this logical approach are speed and cost savings, because the data virtualization layer does not physically store any data, only the critical metadata needed to access each source system.
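To make the idea of a logical view concrete, here is a small sketch under simplifying assumptions: the two source systems are modeled as in-memory lists, and `VirtualView` is a hypothetical class, not the API of any data virtualization product. The view holds only references to its sources and a join key; rows are read from the sources each time it is queried.

```python
# Hypothetical source systems, modeled as in-memory lists for illustration.
CRM = [{"customer_id": 1, "name": "Ana"}, {"customer_id": 2, "name": "Ben"}]
ERP = [{"customer_id": 1, "product": "Laptop"}, {"customer_id": 2, "product": "Phone"}]


class VirtualView:
    """A logical join of two sources. Stores how to reach the data, not the data."""

    def __init__(self, left, right, key):
        self.left = left      # callable returning the left source's current rows
        self.right = right    # callable returning the right source's current rows
        self.key = key        # column used to relate the two sources

    def query(self):
        # Sources are read at query time, so results reflect their current state.
        right_by_key = {}
        for row in self.right():
            right_by_key.setdefault(row[self.key], []).append(row)
        return [
            {**left_row, **right_row}
            for left_row in self.left()
            for right_row in right_by_key.get(left_row[self.key], [])
        ]


customer_products = VirtualView(lambda: CRM, lambda: ERP, "customer_id")
rows = customer_products.query()
```

Because nothing is copied into the view, a row added to `ERP` after the view is defined appears in the next call to `query()` without any reload step.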
The data catalog builds upon the base views within the data virtualization layer by augmenting them with information about who owns the data, the history of the data, its lineage, the associations and relationships among the data, business definitions, secure access privileges, and much more. Because data virtualization integrates and delivers data in real time, the data catalog is also updated in real time, hence the term dynamic data catalog.
So, how does a dynamic data catalog enable enterprise-wide data governance? First, it enables business users to perform data discovery using Google-like search features to easily find data entities such as customers and products across the enterprise. They can then see the lineage of how this data has been combined with data from other systems. They can also see the relationships between this data entity and others, such as which customers own which products.
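As a rough illustration of discovery, lineage, and relationships, the sketch below models catalog entries as plain records and searches them by keyword. The `CatalogEntry` fields, the sample entries, and the source names such as `crm.contacts` are hypothetical; a real catalog would derive them from the virtualization layer's metadata.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record: a view plus its governance metadata."""
    name: str
    owner: str
    definition: str
    lineage: list = field(default_factory=list)   # upstream sources of the data
    related: list = field(default_factory=list)   # associated data entities


CATALOG = [
    CatalogEntry("customer", "sales_ops",
                 "A person or company that has purchased a product",
                 lineage=["crm.contacts"], related=["product", "warranty"]),
    CatalogEntry("product", "merchandising",
                 "An item offered for sale",
                 lineage=["erp.items"], related=["customer"]),
    CatalogEntry("warranty", "support",
                 "Coverage registered by a customer for a product",
                 lineage=["warranty_db.registrations"], related=["customer", "product"]),
]


def search(catalog, term):
    """Case-insensitive keyword match on entry names and business definitions."""
    term = term.lower()
    return [e for e in catalog
            if term in e.name.lower() or term in e.definition.lower()]


hits = search(CATALOG, "customer")
```

Each hit carries its own `lineage` and `related` lists, so a user who finds the customer entity can see at once where the data comes from and which other entities it connects to.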
Data owners can be empowered to document the business definitions for each of the entities, across all applicable data sources, simultaneously. They can package up their searches and resulting views into a query that can be invoked by other authorized users. Fine-grained security can be used to protect any sensitive data by dynamically masking the fields for which certain users do not have access privileges. In short, enterprise data governance is baked into the very architecture of the dynamic data catalog.
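Dynamic masking can be pictured as a filter applied to each row at read time, based on the requesting user's privileges. This is a minimal sketch under assumed names: the roles in `PRIVILEGES`, the `mask_row` function, and the sample record are illustrative only.

```python
# Hypothetical mapping of roles to the fields each role may see in clear text.
PRIVILEGES = {
    "care_rep": {"name", "product"},
    "compliance": {"name", "product", "national_id"},
}


def mask_row(row, allowed_fields, mask="***"):
    """Return a copy of the row with every non-permitted field replaced by a mask."""
    return {k: (v if k in allowed_fields else mask) for k, v in row.items()}


record = {"name": "Ana", "product": "Laptop", "national_id": "X1234567"}
masked = mask_row(record, PRIVILEGES["care_rep"])
```

The underlying record is unchanged; only the copy returned to the customer care role has `national_id` masked, while a compliance user querying the same view would see the full row.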
With machine learning and data lakes driving the proliferation of data, organizations struggle to access and understand it, and IT struggles to manage it properly. Enterprise-wide data availability has its merits, but it also needs to be secured with the right level of access. Dynamic data catalogs, built on data virtualization, enable enterprise-wide data governance and ensure that all enterprise data is made available to business users under strict access controls. They augment real-time data views with business definitions, associations, lineage, and security. Because of these critical data governance capabilities, it is hard to imagine a data management future without a dynamic data catalog.
Ravi Shankar is senior vice president and chief marketing officer at Denodo, a provider of data virtualization software. For more information, visit https://www.denodo.com or follow @denodo.