Why Do Organizations Struggle with Data Lineage?

There are a variety of reasons, but in my experience five stand out above the others and contribute to data lineage struggles: Unrealistic Expectations, Lack of a Strategy, Poor Scoping or Phasing, Lack of Resources, and Business Context Missing.

1. Unrealistic Expectations

Organizations often have unrealistic expectations, or a limited understanding, of the complexities of building data lineage. Most companies, particularly those that need good data lineage, have developed their data pipelines over time using various platforms, technologies, and methodologies. Usually this has been done without an overarching data movement strategy or architecture. This complexity is why lineage is more difficult than simply capturing metadata into a catalog, and why it requires dedicated resources and time to complete.

2. Lack of a Strategy for Data Lineage

Companies tend to approach lineage as simply an extension of the Data Governance or Data Management Strategy. While these strategies should include a lineage component, building data lineage is complex enough that it should have a strategy of its own that these other strategies reference.

3. Poor Scoping or Phasing

Many organizations try to build end-to-end lineage in one big effort or project, i.e., boiling the ocean that is your data landscape. While the true value of lineage comes from describing these flows across your enterprise, trying to get it all in one fell swoop is a recipe for the organization losing momentum, confidence, and patience. Instead, use your lineage strategy to identify the quick wins: perhaps governance of your Data Warehouse / Lake, support for a legacy migration use case, or lineage of critical systems to help improve MTTR for data issues. Once these are identified, start small and then build in agile chunks to get to the nirvana that is enterprise end-to-end lineage. This approach opens up additional benefits, such as metadata and lineage analytics, in a sustainable way.

4. Lack of Resources

Many projects lack the technical resources they need. As mentioned above, lineage is complex and not an “easy button” solution. You will need resources to help connect to the various platforms, gather code that can be scanned, and fill in the blanks where you have custom systems that cannot be scanned. In addition, you will want to build a maintenance plan and automation for lineage. Capturing lineage shouldn’t be a one-time thing; lineage ages just like anything else in technology. You will need good technical resources for this.
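As one small illustration of the kind of automation a maintenance plan might include, the sketch below flags systems whose lineage has not been rescanned recently. The system names, dates, the `LAST_SCANNED` log, and the 30-day threshold are all hypothetical; a real lineage tool would supply its own scan history.

```python
from datetime import date, timedelta

# Hypothetical scan log: system name -> date of its last successful lineage scan.
LAST_SCANNED = {
    "data_warehouse": date(2024, 6, 1),
    "legacy_billing": date(2024, 1, 15),
    "crm": date(2024, 5, 20),
}


def stale_systems(last_scanned, today, max_age_days=30):
    """List systems whose lineage has not been refreshed within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, scanned in last_scanned.items() if scanned < cutoff)


print(stale_systems(LAST_SCANNED, today=date(2024, 6, 10)))
# prints ['legacy_billing']
```

A check like this could run on a schedule and feed a rescan queue, so lineage is refreshed as pipelines change rather than captured once.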

5. Business Context Missing

Expecting business users to be the initial users of lineage out of the box is unrealistic in most organizations. Most automated lineage products deliver detailed technical lineage first; business context must then be added to produce “business lineage” at a level that most business users will want to work with. Accounting for this is an important part of the strategy mentioned above, so that it sets a realistic timeline. The good news is that the detailed technical lineage delivers immediate benefits as you scan, such as impact analysis and data traceability.
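To make impact analysis and traceability concrete, here is a minimal sketch that treats technical lineage as a directed graph and walks it in both directions. The dataset names and the `LINEAGE` mapping are invented for illustration, and this is not how any particular lineage product works; a real tool would populate the graph from its scans.

```python
from collections import deque

# Hypothetical technical lineage: each key feeds the datasets listed in its value.
LINEAGE = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["dw.dim_customer"],
    "dw.dim_customer": ["reports.churn", "reports.revenue"],
    "erp.orders": ["dw.fact_orders"],
    "dw.fact_orders": ["reports.revenue"],
}


def downstream_impact(lineage, source):
    """Impact analysis: every dataset reachable downstream of `source`."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


def upstream_sources(lineage, target):
    """Traceability: every dataset that feeds `target`, directly or indirectly."""
    reverse = {}
    for parent, children in lineage.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    # Walking the reversed graph downstream is the same as walking upstream.
    return downstream_impact(reverse, target)


print(downstream_impact(LINEAGE, "crm.customers"))
print(upstream_sources(LINEAGE, "reports.revenue"))
```

The first call answers "what breaks if this source changes?" and the second answers "where did this report's data come from?". Both work on technical lineage alone, before any business context has been added.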

In conclusion, while lineage is hard, with the right expectations, strategy, automation, and resources it can bring huge benefits to the organization in both the near and long term.
