For industrial companies, the path to ultimate value from data liberation requires three crucial steps. Many organizations have already achieved Step One: liberating data from siloed source systems and aggregating it in a traditional data warehouse (DWH).
The bad news: Step Two is far more difficult to achieve. The good news: The rewards for successfully taking it are correspondingly higher.
In today’s mature data warehouse market, progressive data-driven organizations are using data fabric solutions to complement their existing DWH strategies. With data fabric, organizations can liberate their data once again, lifting it from the pool of aggregation and turning it into contextualized knowledge that delivers on their ambitions for advanced analytics.
The two main pillars of data fabric are Context and Discovery. These define data fabric and make it both distinctly different from and complementary to existing DWH.
Data context is the sum of the meaningful relationships, within and across different data types and data artifacts, that support a given use case. It is the result of mining and curating data relationships in a so-called contextualization pipeline. The process of adding context to data is often referred to as data contextualization or data fusion.
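To make the idea of relationship mining more concrete, here is a minimal sketch of a single contextualization step: proposing links between raw source names (for example, time series identifiers) and asset tags. Everything in it is an illustrative assumption rather than a description of any particular product: the `Relationship` type, the `normalize` and `contextualize` helpers, the sample names, and the simple substring heuristic. A real pipeline would typically combine several matching techniques and a human curation step.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A proposed link between a source item and an asset (hypothetical type)."""
    source_id: str      # e.g. a time series or document identifier
    target_id: str      # e.g. an equipment tag
    confidence: float   # crude score in (0, 1], for later curation


def normalize(name: str) -> str:
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def contextualize(source_names, asset_tags):
    """Propose source -> asset links where the normalized asset tag
    appears inside the normalized source name (assumed heuristic)."""
    links = []
    for src in source_names:
        src_norm = normalize(src)
        for tag in asset_tags:
            tag_norm = normalize(tag)
            if tag_norm and tag_norm in src_norm:
                # Tags that cover more of the source name score higher.
                confidence = round(len(tag_norm) / len(src_norm), 2)
                links.append(Relationship(src, tag, confidence))
    return links


# Made-up sample data: two sensor names and two equipment tags.
links = contextualize(
    ["PLANT1.23-PT-1001.PV", "plant1_45_TT_2002_temp"],
    ["23-PT-1001", "45-TT-2002"],
)
for link in links:
    print(link)
```

Each proposed link carries a confidence score so that low-scoring matches can be routed to a person for review, which is where the curation part of the pipeline comes in.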
Prior to contextualization, data is often integrated from a multitude of source systems and co-located in a common data repository, similar to traditional DWH. Alternatively, data integration is virtualized through data federation, avoiding the need for data duplication and transfer. More recently, a hybrid approach has become common, especially for latency-sensitive IoT data applications, where data aggregation and data synthesis must be performed close to the data.
In Oil & Gas, digitalization efforts have long been limited to pilot projects, proofs of concept and case studies, with no large-scale operationalized projects. This is mainly due to outdated IT infrastructure that relies on legacy systems and only enables point-to-point integrations for application providers. These one-off solutions, which sometimes include limited digital twins, can actually complicate digitalization goals because the resulting projects are as siloed as the original data, making them impossible to scale and, therefore, costly to the point of wastefulness.
Complementing existing DWH solutions with data fabric has dramatically reduced costs, while simultaneously enabling scalability, speed of development, and data openness throughout our many complex customer organizations.
The exponential increase in data volume, velocity, and business value, coupled with the meteoric rise of low-code and citizen data science programs, is making data discovery more important than ever before.
In the context of enterprise data management, making the right data easily discoverable relies on a familiar recipe: the right metadata, labeling, linkages to other data, and data cataloging to make it readable by both machines and humans. Outdated manual metadata management is gradually being replaced by active, machine learning-supported metadata practices, which discover and infer additional metadata from relationships and clustering.
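As a rough illustration of how linkages can feed discovery, the sketch below builds a tiny in-memory catalog, suggests labels for an entry based on the entries it is linked to, and shows how that changes a keyword search. It is deliberately simplified and rule-based, standing in for the machine learning-supported inference described above. `CatalogEntry`, `infer_labels`, `search`, and the sample entries are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One catalog record (hypothetical structure)."""
    entry_id: str
    description: str
    labels: set = field(default_factory=set)
    linked_to: set = field(default_factory=set)  # ids of related entries


def infer_labels(catalog):
    """For each entry, suggest labels that its directly linked entries
    carry but the entry itself lacks."""
    suggestions = {}
    for entry in catalog.values():
        inherited = set()
        for other_id in entry.linked_to:
            other = catalog.get(other_id)
            if other is not None:
                inherited |= other.labels
        suggestions[entry.entry_id] = inherited - entry.labels
    return suggestions


def search(catalog, term):
    """Return ids of entries whose description contains the term
    or that carry the term as an exact label."""
    term = term.lower()
    return sorted(
        e.entry_id
        for e in catalog.values()
        if term in e.description.lower()
        or term in {label.lower() for label in e.labels}
    )


# Made-up sample data: a pump and a pressure time series linked to it.
catalog = {
    "pump-101": CatalogEntry(
        "pump-101", "Seawater lift pump", {"pump", "rotating-equipment"}
    ),
    "ts-778": CatalogEntry(
        "ts-778", "Discharge pressure", {"pressure"}, {"pump-101"}
    ),
}

before = search(catalog, "pump")

# Apply the suggestions, then search again.
for entry_id, new_labels in infer_labels(catalog).items():
    catalog[entry_id].labels |= new_labels

after = search(catalog, "pump")
print(before, after)
```

Before the inferred labels are applied, a search for "pump" finds only the pump itself. Afterwards it also surfaces the linked pressure time series, which is the kind of discoverability gain that richer metadata is meant to deliver.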
This is why progressive organizations are seeking out data fabric solutions to complement their DWH strategies. Data fabric adds critical context and discovery to existing DWH data assets. Complementing DWH with data fabric is the only way to push through all three steps to true data liberation.