The Rise of Metadata

My Impressions and Perspectives on the Gartner® Data & Analytics Summit 2022

 Contents of the blog series:

In Part 1 of the blog, I discussed the Data Fabric: what it is, who it is for, and the benefits and timelines for its adoption.

In Part 2 of the blog, I will cover Active Metadata and its crucial position at the intersection of business and data.

The Rise of Metadata

Based on my impressions, Metadata was the most frequently used word throughout the conference. During Gareth Herschel and Debra Logan’s keynote, “Unleash Innovation, Transform Uncertainty” [1], I understood Metadata to be defined as a communication mechanism: not only does it tell us which data we have, but it also describes what that data means.

I further realized that Active Metadata enables implementation of the Data Fabric framework (among other benefits), thus allowing it to identify data drifts, to find new categories of data and to discover interrelated decisions.

Mark Beyer developed this topic further in his session “Metadata is the Key to Self-Learning, Augmented Data Governance” [2].

He defined four sources of metadata: technical, operational, business and social. Looking at these different metadata sources, I realized that the first two types are “passive” – byproducts of data practice – while the last two types are “active” – purposeful creation of metadata and learnings about it.

passive metadata
(Diagram: Metadata is the Key to Self Learning, Augmented Data Governance, Mark Beyer, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

For me, a way to understand it is to think about passive metadata as a static runtime report that doesn’t change the way it behaves. Despite this static nature, most customers focus on the passive type. As evidence, illumex has received many requests from prospective customers about data lineage. I think the problem is that lineage doesn’t show whether the data actually works.

For me, a nice metaphor illustrates this: when we think about formulas, we think about math and physics. In data, metadata plays the role of the formulas: it is not just data about data, but also the logic of the data’s application.

every reuse of data provides more metadata
(Diagram: Metadata is the Key to Self Learning, Augmented Data Governance, Mark Beyer, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

My thoughts after Mr. Beyer’s session were:

  • All data exists in the world; it is just not captured correctly.
  • Automation of functions starts with metadata discovery.
  • Metadata is the best method to determine if governance principles are being adhered to.
  • And “Data Fabric can resolve existing Metadata into Automated Governance”.
data fabric can resolve existing metadata into automated governance
(Diagram: Metadata is the Key to Self Learning, Augmented Data Governance, Mark Beyer, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

I further gathered that if the same data is used in 20 different places throughout the organization, it forms a community around a specific topic. Leveraging those communities to communicate between business and data teams improves data literacy.
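To make the idea of a usage "community" concrete for myself, here is a minimal Python sketch. It assumes usage metadata is available as simple (dataset, team) pairs; the function name `find_communities`, the `min_consumers` threshold and the sample dataset and team names are all hypothetical and only serve the illustration.

```python
from collections import defaultdict


def find_communities(usage_events, min_consumers=2):
    """Group consumers by the dataset they use.

    usage_events: iterable of (dataset, team) pairs (an assumed input shape).
    Returns a dict mapping each dataset with at least `min_consumers`
    distinct teams to the sorted list of those teams.
    """
    consumers = defaultdict(set)
    for dataset, team in usage_events:
        # A set ignores repeated use by the same team.
        consumers[dataset].add(team)
    return {
        dataset: sorted(teams)
        for dataset, teams in consumers.items()
        if len(teams) >= min_consumers
    }


# Hypothetical usage metadata, for illustration only.
events = [
    ("customer_orders", "finance"),
    ("customer_orders", "marketing"),
    ("customer_orders", "finance"),
    ("web_logs", "security"),
]
communities = find_communities(events)
```

In this toy example only `customer_orders` is shared by more than one team, so it is the only dataset that forms a community. A real implementation would more likely sit on a graph store than on a dictionary.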

Graphs take on a bigger role as an underlying technology for Active Metadata and other use cases, and they have a decisive role in converting traditional catalogs into augmented catalogs.

graph of quality integration governance usage
(Diagram: Metadata is the Key to Self Learning, Augmented Data Governance, Mark Beyer, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

The session inspired a vision: machines can only improvise, while humans innovate. When metadata is available, we can automate; otherwise, machines need humans.

automated data management
(Diagram: Metadata is the Key to Self Learning, Augmented Data Governance, Mark Beyer, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

Later on, I attended Melody Chien’s session, “Maximize Business Outcome By Adopting Modern Data Catalog with AI-enabled Metadata Capabilities” [3].

Augmented metadata discovery, as I understood it, is an automated process of finding the right dataset for your business question, and then connecting Metadata and Data Catalogs. Data Catalogs address the “three Cs” – curate, collaborate, communicate.

augmented metadata discovery
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)
what is data catalog
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

Data Catalogs, I learned from this session, are not only a collection of metadata, but they also provide critical metadata about your data:

  • Where the data is and where it came from
  • What the data means and how it should be interpreted
  • What the importance of the data is and its quantifiable value
  • Who uses the data and for which business processes
  • Which policies and workflows are defined on top of the data
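As a rough illustration of the five points above, a catalog entry could be modeled as a small Python dataclass. This is only my sketch: the class name `CatalogEntry`, its field names and the `missing_fields` helper are assumptions for the example, not the schema of any particular catalog product.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class CatalogEntry:
    """Hypothetical catalog record mirroring the five questions above."""

    name: str
    location: str = ""        # where the data is
    source: str = ""          # where it came from
    meaning: str = ""         # what it means and how to interpret it
    business_value: str = ""  # its importance and quantifiable value
    # Who uses the data, mapped to the business processes they use it for.
    used_by: Dict[str, List[str]] = field(default_factory=dict)
    # Policies and workflows defined on top of the data.
    policies: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative values only.
entry = CatalogEntry(
    name="customer_orders",
    location="warehouse.sales.orders",
    meaning="One row per confirmed order",
)
gaps = entry.missing_fields()
```

Here `gaps` lists the questions the entry cannot answer yet, which is one simple way to see how far a catalog record is from being complete.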

I also learned about the shortcomings of traditional catalogs: they are static, siloed, and too technical.

common challenges of traditional data catalogs
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

For me, the most surprising part was the predicted evolution from passive, traditional data catalogs to Active Metadata Management driven by AI/ML. As I understand it, this means a feedback loop that runs from collecting metadata, to leveraging it, to learning from its consumption:

evolution from basic data catalog to machine enabled metadata
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)
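The feedback loop, as I picture it, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `ActiveCatalog` class, its `collect`, `search` and `record_use` methods, and the simple usage-count ranking stand in for what an AI/ML-driven catalog would do in a far more sophisticated way.

```python
from collections import Counter


class ActiveCatalog:
    """Toy catalog that ranks search results by observed consumption."""

    def __init__(self):
        self.descriptions = {}  # dataset name -> free-text description
        self.usage = Counter()  # dataset name -> number of recorded uses

    def collect(self, name, description):
        """Step 1: collect metadata about a dataset."""
        self.descriptions[name] = description

    def search(self, keyword):
        """Step 2: leverage metadata to answer a question.

        Matches are ordered by usage (most used first), then by name.
        """
        matches = [
            name
            for name, text in self.descriptions.items()
            if keyword.lower() in text.lower()
        ]
        return sorted(matches, key=lambda name: (-self.usage[name], name))

    def record_use(self, name):
        """Step 3: learn from consumption by counting each use."""
        if name not in self.descriptions:
            raise KeyError(name)
        self.usage[name] += 1


catalog = ActiveCatalog()
catalog.collect("orders_clean", "Cleaned customer orders")
catalog.collect("orders_raw", "Raw customer orders feed")
before = catalog.search("orders")
catalog.record_use("orders_raw")
after = catalog.search("orders")
```

Before any usage is recorded, the two datasets are ordered by name. Once `orders_raw` has been used, it moves to the top of the results, so consumption feeds back into what the catalog recommends.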

Machine Learning and AI are key to modern Data Catalogs – via insights, engagement and collaboration. In my opinion, this will drive user adoption.

Those of us who are used to the lack of user adoption of current data catalogs must be curious how modern data catalogs are going to address that challenge.

The common use cases for catalogs still revolve around analytics, GRC and data valuation, but those three avenues, in my opinion, can serve any industry and any use case, from security to customer success.

common use cases of metadata
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

For me, the most important takeaway from Ms. Chien’s session was that metadata analysis can dramatically accelerate the delivery of new data assets, by as much as 70%!

accelerate new analysis by 70% by adopting metadata
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

Ms. Chien sees how standalone catalogs could live side by side with the use-case-specific ones:

stand alone and embedded data catalogs
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

Ms. Chien opened the session from a broader Metadata perspective and then dove into the Catalogs context. She further elaborated that catalogs are just one part of metadata management capabilities, and that active metadata initiatives should not be limited to catalogs:

data catalog metadata ingestion business glossary data lineage metadata activation impact analysis rule management semantic frameworks
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

To me, Ms. Chien’s session was very constructive and practical, which also showed in the session recommendations:

recommendations for data governance and catalogs
(Diagram: Maximize Business Outcome by Adopting Modern Data Catalog With AI-Enabled Metadata Capabilities, Melody Chien, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022)

[1] Gartner, “Gartner Opening Keynote: Unleash Innovation, Transform Uncertainty”, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022.
[2] Gartner, “Metadata is the Key to Self Learning, Augmented Data Governance”, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022.
[3] Gartner, “Maximize Business Outcome By Adopting Modern Data Catalog with AI-enabled Metadata Capabilities”, Gartner Data & Analytics Summit, Orlando, Florida, 22-24 August, 2022.
