Identify Your Ultimate Data Stewards

Managing an organization’s data has never been more complicated, nor has it been more critical for success. Data always seems to find ways to introduce new complexities that challenge practitioners and technology alike.
Companies rely heavily on data to identify, predict and avoid mistakes and to expose, optimize and execute new opportunities. So it comes as no surprise that companies are investing heavily to streamline the data, engineer it, clean it, analyze it, store it, and so on.
But what about the human element? What about managing the way data is used?
This is how Data Management and Data Governance teams came to be. It started with highly regulated organizations or well-funded companies with mature data approaches, and it is now becoming standard for any organization that aims to be data-driven.
A key role created in this transition is the Data Steward.

What is a Data Steward?

Data Stewardship can have very subjective definitions depending on your perspective. However, the most common one is by DAMA (DMBOK P. 76): a person or a group of persons that “represent the interests of all stakeholders and must take an enterprise perspective to ensure enterprise data is of high quality and can be used effectively.”

For me, a Data Steward is the main executor of the data governance strategy: someone who takes ownership of a data domain and is accountable for its quality, access and security rights, documentation, updates, and more. But most importantly, they are in charge of bridging the gap between the data providers (data teams) and data consumers (business, data-app Product Managers, and ML developers).

More specifically, the main goal of the data steward is to create data literacy that is comprehensive and yet simple enough for the business to understand. At the same time, the steward builds business literacy so that data practitioners understand how to shape the data pipelines to best serve the business needs.

There are many places in our day-to-day routines and in the systematic business processes we follow, where data can be processed or defined differently by different business stakeholders, Data Analysts, or different Business Operation teams.

Do you truly have a single version of your ARR formula? Do all of your business units have the same ARR figure? Do all of them define the business term Active Customer the same way? Does a customer become active when they sign a contract? When does a contract start? After the onboarding? What is Onboarding?
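To make the Active Customer question concrete, here is a minimal sketch of how two reasonable definitions produce different numbers from the same records. The customer names, dates, and both definitions are invented for illustration; they are assumptions, not taken from any real system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Customer:
    name: str
    contract_signed: date
    onboarding_done: Optional[date]  # None means onboarding has not finished


# Hypothetical sample data, invented for illustration only.
CUSTOMERS = [
    Customer("Acme", date(2024, 1, 10), date(2024, 2, 1)),
    Customer("Globex", date(2024, 3, 5), None),
    Customer("Initech", date(2024, 3, 20), date(2024, 4, 15)),
]


def active_by_signature(customers, as_of):
    """One possible definition: active once the contract is signed."""
    return [c.name for c in customers if c.contract_signed <= as_of]


def active_by_onboarding(customers, as_of):
    """Another possible definition: active once onboarding is complete."""
    return [
        c.name
        for c in customers
        if c.onboarding_done is not None and c.onboarding_done <= as_of
    ]


as_of = date(2024, 3, 31)
by_signature = active_by_signature(CUSTOMERS, as_of)
by_onboarding = active_by_onboarding(CUSTOMERS, as_of)
print(len(by_signature), "active by signature:", by_signature)
print(len(by_onboarding), "active by onboarding:", by_onboarding)
```

Neither function is wrong on its own. The problem appears when two teams each report an "Active Customers" figure without saying which definition they used.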

If this resonates with the pains you experience, you can clearly see the value of having someone or something that will help you overcome them.

Data Steward’s main focus areas

Let’s try to understand how data governance practices and data stewardship can contribute to bridging this gap. As mentioned, there are several activities a data steward is responsible for under a specific domain. Let’s dive into the full stack of the potential entities a data steward takes ownership of:

Data Management

Less from the technical side and more from the governance side, the data steward should take part in defining the data rule sets, policies, and processes: what we can and cannot do with our organization’s data, by whom, how, and when.

These definitions should be made by a forum composed of the data management team, business management, and a representative of the data governance team, which is in charge of enforcing them while continuously improving the efficiency of data-related compliance.


Document the intent of the metadata

One of the major responsibilities of a Data Steward is to make sure that the intent of the data is clear across the board: the data teams know how to use it properly when working on the data pipelines, the analysts know how to query it while applying the logic that converts the data into insightful and actionable information, and the business partners know how to read it properly and make the right decisions accordingly.
The Data Steward usually uses a business glossary platform to share this knowledge. But more important than the WHERE is the HOW: the Data Steward must be a domain expert from both the data perspective and the business side. Just as writing a Latin-English/English-Latin dictionary requires excelling in both languages, a Data Steward must be bilingual in a specific domain and translate the intent between the two languages.


Security and Access Control 

Making sure that users are exposed only to the data they are allowed to see. This is more the job of DBAs and application administrators; however, it is the Data Steward’s top priority to make sure that these permissions are enforced without any leaks.
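As a rough sketch of what checking that permissions are enforced could look like, the snippet below compares a list of actual grants against a policy and reports mismatches. The policy, roles, user names, and data set names are all hypothetical, and a real audit would read grants from the database or application rather than from a hard-coded list.

```python
# Hypothetical policy: which roles may read which data sets.
POLICY = {
    "finance_arr": {"finance", "executive"},
    "customer_pii": {"support"},
}

# Hypothetical grants, as they might be exported by a DBA or application admin.
GRANTS = [
    ("alice", "finance", "finance_arr"),
    ("bob", "marketing", "customer_pii"),
    ("carol", "support", "customer_pii"),
]


def find_violations(policy, grants):
    """Return grants whose role is not allowed on the data set by the policy."""
    violations = []
    for user, role, dataset in grants:
        # A data set missing from the policy allows nobody (assumed default).
        allowed_roles = policy.get(dataset, set())
        if role not in allowed_roles:
            violations.append((user, role, dataset))
    return violations


violations = find_violations(POLICY, GRANTS)
for user, role, dataset in violations:
    print(f"{user} ({role}) should not have access to {dataset}")
```

The steward does not need to own the grants themselves; a periodic report like this is enough to raise leaks with the administrators who do.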


Data Quality

Making sure that the data is clean and reliable, and that it contains the right information to produce the expected outcome and help the business make data-based decisions.
This part is executed by data practitioners and engineers, yet because the Data Steward is accountable, their role is to test the data quality on an ongoing basis.
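Ongoing testing can start very small. The following is a minimal sketch of two common checks, missing required values and duplicate keys, over rows held in memory. The field names and sample rows are assumptions made up for the example; in practice these checks would usually run against the warehouse or through a dedicated data quality tool.

```python
def check_quality(rows, required_fields, key_field):
    """Return a dict mapping an issue name to the indexes of offending rows."""
    issues = {"missing_value": [], "duplicate_key": []}
    seen_keys = set()
    for index, row in enumerate(rows):
        # Flag the row if any required field is absent, None, or empty.
        if any(row.get(field) in (None, "") for field in required_fields):
            issues["missing_value"].append(index)
        # Flag the row if its key was already seen on an earlier row.
        key = row.get(key_field)
        if key in seen_keys:
            issues["duplicate_key"].append(index)
        seen_keys.add(key)
    return issues


# Hypothetical rows, invented for illustration.
ROWS = [
    {"customer_id": 1, "arr": 12000},
    {"customer_id": 2, "arr": None},
    {"customer_id": 2, "arr": 8000},
]

report = check_quality(ROWS, required_fields=["customer_id", "arr"], key_field="customer_id")
print(report)
```

Running a check like this on a schedule and tracking the counts over time gives the steward a simple, shareable signal of whether quality is improving or degrading.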


Encourage the use of data

Ultimately, we want to make sure the hard work and high budget are rewarded with a high ROI on data investment.
To do so, in addition to ensuring that the data is trustworthy, available and performant, accessible and secured, and understandable with a single source of truth, we need to make sure the business is actually using it.
The Data Steward should act as a data ambassador to its consumers and should use every tool at their disposal: an active online business glossary, newsletters, roadshows, success stories, training, gamification – you name it.
Like other aspects of the organization, the usage of the data should be tracked and measured, and acted on together with all relevant stakeholders, just as the company does for the products and services it offers to external consumers.
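One simple way to start measuring usage is to count queries and distinct users per data set and to list the data sets nobody touches. The sketch below assumes an access log is available as (user, data set) pairs; the catalog, users, and log entries are hypothetical.

```python
from collections import defaultdict

# Hypothetical catalog and access log, invented for illustration.
CATALOG = ["finance_arr", "customer_health", "legacy_leads"]
ACCESS_LOG = [
    ("alice", "finance_arr"),
    ("bob", "finance_arr"),
    ("alice", "finance_arr"),
    ("carol", "customer_health"),
]


def usage_summary(catalog, access_log):
    """Count queries and distinct users for every data set in the catalog."""
    users = defaultdict(set)
    queries = defaultdict(int)
    for user, dataset in access_log:
        users[dataset].add(user)
        queries[dataset] += 1
    return {
        dataset: {"queries": queries[dataset], "distinct_users": len(users[dataset])}
        for dataset in catalog
    }


summary = usage_summary(CATALOG, ACCESS_LOG)
unused = [name for name, stats in summary.items() if stats["queries"] == 0]
print(summary)
print("Unused data sets:", unused)
```

An unused data set is not automatically a failure, but it is a good prompt for a conversation with its intended consumers.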

As we can learn from the above, all 5 of these items share a single objective: managing the intersection of data and business and ensuring fluidity and clarity.
In order to do so, and do it well, one needs to understand how a business process works, and how it serves a business goal, and at the same time have a deep technical background to understand how the data should serve it all. This individual should also be detail-oriented, grasp everything in a specific domain, and master it from both the business and data aspects. 

In addition, every item above involves a high level of collaboration with data and business teams, from DBAs to business executives through BI and Operations teams. The steward therefore needs strong communication and interpersonal skills and, as mentioned, must be bilingual, with a passion for translating technical items into business logic and vice versa.

So how do you find your ultimate data steward?

Assuming your organization doesn’t have a dedicated, well-funded data governance team (and statistically, it probably doesn’t), and has no person or team focused solely on this set of activities, or even a portion of it, you will need to find creative ways to handle these activities that are so critical to the business.

Here are 4 recommended steps to follow:

First, like many other things in life, don’t try to tackle this big problem as a whole. Break it into smaller pieces, in the shape of separate business processes. Start from the end, from the business side, where the data is being served. From there, select the top 2-3 pain points that make you suffer the most and make them a priority. After addressing them, move to the next one on the list (vertically, to another business process, or horizontally, to a different set of pains).

Second, you should answer some background questions for each business process:

  • Who is the business domain owner? This is the person your data team will turn to with every inquiry about the meaning of a required KPI or a new report.
  • What data is required to serve this process?
  • Who owns the data? This is the person your business users and analysts will go to when it’s not clear what’s wrong with the data, when the data is unavailable, or when they want to factor a new external data set into their model.

Third, identify the functions, roles and individuals whose day-to-day is spent at the intersection between business and data. In many organizations you’ll find those to be the Business Analytics, Business Operations, and BI teams. However, organizations differ in this based on their scale, field, technology, data maturity, and culture.

Finally, from the information you have collected, find out who has the most of the skill set described above and holds most of the information needed to do the job for your specific needs. Most likely it will be a mix of several individuals.
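Since the answer is usually a mix of people, the selection can be framed as covering a list of required skills with as few people as possible. The sketch below uses a simple greedy pass for that; the skill labels loosely follow the traits described earlier, and the candidate roles and their skills are hypothetical. It is an illustration of the reasoning, not a replacement for judgment.

```python
# Skill labels loosely based on the traits described above (assumed labels).
REQUIRED = {"business_process", "technical_data", "detail_oriented", "communication"}

# Hypothetical candidates and the skills each one brings.
CANDIDATES = {
    "BI analyst": {"technical_data", "detail_oriented"},
    "Business operations lead": {"business_process", "communication"},
    "Business analyst": {"business_process", "technical_data"},
}


def pick_stewards(required, candidates):
    """Greedily pick whoever covers the most still-uncovered skills."""
    remaining = set(required)
    chosen = []
    while remaining:
        best = max(candidates, key=lambda name: len(candidates[name] & remaining))
        gained = candidates[best] & remaining
        if not gained:
            break  # nobody covers what is left; report the gap instead
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


chosen, gaps = pick_stewards(REQUIRED, CANDIDATES)
print("Proposed stewardship mix:", chosen)
print("Uncovered skills:", gaps)
```

Any skills left in the gap set point to where training, tooling, or an additional person would be needed.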

To conclude

Data Governance practices are crucial for getting the most out of your data, allowing you to reap the fruits you work so hard to grow.
And even when you don’t have a designated Data Governance team full of Data Stewards to help you manage your data, you definitely should leverage your existing staff with the help of the right tools and processes to ensure that:

  • Your Data meets your quality standard (reliability, availability, and performance).
  • Your Data is understandable to its consumers in a single, consistent way.
  • Your Data meets your security standard.
  • Your Data is being used by its target audience who are gaining meaningful value from it.
