Data Lineage – why is it important to know the Origin of Data?
Dominik Machalica
Reading time: 6 min
If you’re reading this, you’re probably a data specialist.
But imagine, for a moment, that you are an air traffic controller in a control tower: operating radars and systems that measure meteorological conditions, giving pilots navigational guidance, and keeping air and ground traffic safe. Do you see any similarities between the two jobs? Or do they seem to have nothing in common? Could the same skills and tools help you in both? Let us explain.
In both cases – in the position of data specialist and air traffic controller – you have to:
- Handle many requests and demands from many people (departments) at the same time.
- Act promptly and flexibly. After handling one request, another one immediately follows.
- Be resilient to stress, as very often tasks are accompanied by time pressure.
- Prioritise queries, even though in practice every one of them arrives marked “high” priority.
- Give very accurate, factual answers.
- Bear in mind that every wrong or late command (decision) is an additional cost for the company.
For example, an air traffic controller cannot allow two planes to land on the same runway at the same time because it will put passengers’ lives at risk. Similarly, an inaccurate answer from a data specialist will cause people and the company to lose time and/or money. The ability to give quick, clear answers without compromising accuracy is key to success in your business.
How do you cope with this intensity of tasks, time pressure and responsibility? There are 3 skills that are crucial to the success of your company’s data team:
- Speed
- Accuracy
- Ability to explain
Having all three skills is not only helpful, but actually crucial when working with data. If all three competences are not at the highest level then it will be difficult for you and your team to be effective in the area of working with data.
Option #1 If you have Accuracy and the Ability to Explain, but no Speed:
Your business users will be paralysed waiting for you to deliver an answer. They won’t be happy about having to wait and if you don’t provide them with an answer they won’t hesitate to blame their lack of response to a business event on you.
Option #2 If you have Speed and the Ability to Explain, but lack Accuracy:
Quick response times to business inquiries will please users… until it turns out that the answers you gave are wrong and do not reflect reality.
Option #3 If you have Accuracy and Speed, but can’t Explain:
Business users may trust you implicitly and ‘blindly’ accept every answer. Senior management will have less confidence in your numbers, and auditors will have the least of all. If you cannot explain “where this particular figure came from”, you are in a quandary.
As a data specialist, you don’t want to find yourself in any of the situations described above. Only the solid trio (Speed + Accuracy + Ability to Explain) supports you in almost every situation and shows your knowledge and professionalism to all who need to see it.
What is the secret to gaining these 3 key skills?
Data Lineage (the origin of data) refers to the process of understanding and visualising the flow of data from its source to its current location. It means tracking every change made to that data as it is transformed, up to the point where it is used. It is the pedigree of your data. In the past, data teams had to track the origin of data manually; today, automated tools let you do it faster and more accurately. The difference is as huge as using the Word Count function in Word compared with manually counting the words in a 1,000-page document. In addition to the huge amount of time saved, you avoid human error.
With Data Lineage, you will also find out where a particular piece of data came from, when and where it was extracted, what other data it was merged with, and what transformations were applied along the way, from input to use.
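To make this concrete, here is a minimal Python sketch of the kind of metadata a lineage record could hold: the source, the extraction time, and the ordered list of transformations. The `LineageRecord` class, the dataset names and the timestamp are all hypothetical and serve only as illustration; real lineage tools collect this information automatically and in far more detail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageRecord:
    """Hypothetical lineage metadata attached to one dataset."""

    source: str              # where the data came from
    extracted_at: datetime   # when it was extracted
    steps: list[str] = field(default_factory=list)  # transformations, in order

    def add_step(self, description: str) -> "LineageRecord":
        """Append one transformation to the trail and return the record."""
        self.steps.append(description)
        return self

    def describe(self) -> str:
        """Render the trail from input to use as plain text."""
        lines = [
            f"source: {self.source}",
            f"extracted: {self.extracted_at.isoformat()}",
        ]
        lines += [f"step {i}: {s}" for i, s in enumerate(self.steps, start=1)]
        return "\n".join(lines)


# Illustrative values only; none of these names come from a real system.
record = LineageRecord(
    source="erp.sales_orders",
    extracted_at=datetime(2024, 1, 15, 6, 0, tzinfo=timezone.utc),
)
record.add_step("merged with crm.customers on customer_id")
record.add_step("aggregated net_amount by month")
print(record.describe())
```

Even this small record answers the three questions from the paragraph above: where the data came from, when it was extracted, and what was done to it afterwards.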
What is the end result for you as a data professional? Reliable data that you, your team and your business users can use and rely on.
As we mentioned earlier, no one is going to manually count all the words in a 1,000-page Word document. That would not be a productive use of time. When business users ask for an answer, they want it as quickly as possible, preferably immediately (think of the air traffic control tower). Manually analysing how data changes increases the time it takes to respond to the business and consumes a huge amount of your data team’s valuable time. Automating the tracking of where data comes from frees your team from time-consuming, line-by-line reading of code.
It’s all about understanding:
- What happened to the data to give it its current form?
- And what will happen if we make changes to the data?
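The second question is often called impact analysis. A minimal sketch in Python, assuming the lineage is already available as a simple mapping from each dataset to the datasets it feeds: the `DOWNSTREAM` mapping, the dataset names and the `impacted_by` function are hypothetical, and a real tool would derive the mapping from your load scripts rather than have it typed in by hand.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the datasets listed as its value.
DOWNSTREAM = {
    "erp.sales_orders": ["staging.orders_clean"],
    "crm.customers": ["staging.orders_clean"],
    "staging.orders_clean": ["mart.monthly_revenue", "mart.customer_ranking"],
    "mart.monthly_revenue": ["report.revenue_dashboard"],
    "mart.customer_ranking": [],
    "report.revenue_dashboard": [],
}


def impacted_by(node: str, edges: dict[str, list[str]]) -> list[str]:
    """Return everything downstream of `node`, in breadth-first order."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque(edges.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        order.append(current)
        queue.extend(edges.get(current, []))
    return order


# Which datasets and reports are affected if crm.customers changes?
print(impacted_by("crm.customers", DOWNSTREAM))
```

With the sample mapping, a change to `crm.customers` reaches the staging table, both marts and the dashboard, which is exactly the list you would want to check before approving the change.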
What is the end result for you as a data professional? With Data Lineage, all you have to do is define the data you want to track – and look at a comprehensive visualisation of the flow of data through your systems.
ABILITY TO EXPLAIN
As a data specialist, much of your time revolves around answering the following questions:
- How did we get this particular figure?
- How did the reporting error occur?
- How will a change affect our systems?
To answer these, you need to follow each step in the data extraction and transformation. The more source systems there are, the more difficult and confusing it is to follow this trail. Automating the tracking of where data comes from eliminates confusion and creates a clear, visual and comprehensive map of your data. All you have to do is select any data and Data Lineage will show you what source that data came from and what transformations were made along the way. You can also see what happens next with the data up to the point where you use it in your Business Intelligence system, for example.
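The idea of selecting a piece of data and walking back to its sources can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: the `UPSTREAM` mapping, the dataset names and both functions are hypothetical, and the sketch assumes the lineage contains no cycles.

```python
# Hypothetical lineage: each dataset maps to (upstream dataset, transformation) pairs.
UPSTREAM = {
    "report.revenue_dashboard": [("mart.monthly_revenue", "loaded into BI app")],
    "mart.monthly_revenue": [("staging.orders_clean", "summed net_amount by month")],
    "staging.orders_clean": [
        ("erp.sales_orders", "removed cancelled orders"),
        ("crm.customers", "joined on customer_id"),
    ],
}


def trace_origin(node, upstream, depth=0):
    """Walk back from `node` towards its sources, returning indented trail lines."""
    lines = []
    for parent, transformation in upstream.get(node, []):
        lines.append(f"{'  ' * depth}{node} <- {parent} ({transformation})")
        lines.extend(trace_origin(parent, upstream, depth + 1))
    return lines


def sources_of(node, upstream):
    """Return the datasets with no recorded upstream, i.e. the original sources."""
    parents = [parent for parent, _ in upstream.get(node, [])]
    if not parents:
        return {node}
    found = set()
    for parent in parents:
        found |= sources_of(parent, upstream)
    return found


# Answer "how did we get this particular figure?" for one report.
for line in trace_origin("report.revenue_dashboard", UPSTREAM):
    print(line)
print(sorted(sources_of("report.revenue_dashboard", UPSTREAM)))
```

Each printed line names one hop and the transformation applied on it, so the trail can be read out to a business user or an auditor step by step.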
What is the end result for you as a data professional? The ability to quickly and confidently answer any question about your data directed at you by business users, management or auditors.
In summary, by using Data Lineage as a data specialist you will be able to:
- Respond quickly to queries from business users.
- Detect, track and correct data processing anomalies.
- Reduce application maintenance costs and accelerate the development of new solutions.
- Increase confidence in data across your organisation.
Data Lineage, as we mentioned at the outset (comparing it to counting words in Word), can be traced manually (slowly and subject to human error) or automatically (quickly, without worrying about small but crucial errors), using, for example, Qlik Sense in the SaaS version for NPrinting, Qlik Sense On-Premise and QlikView.
Want to see how Data Lineage works live? Get in touch with us!
We will show you the power of knowing the origin of your data in a demo application and, if needed, we will solve your problem on your organisation’s data in 5 working days.