Data Observability for Data Engineering: Proactive strategies for ensuring data accuracy and addressing broken data pipelines
- Length: 228 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2023-12-29
- ISBN-10: 1804616028
- ISBN-13: 9781804616024
Ensure your data pipelines are healthy and promote data observability in your teams with this essential hands-on guide
Key Features
- Learn how to monitor your data pipelines in a scalable way
- Use real-life use cases and projects to practice implementing data observability
- Build trust in your pipelines among data producers and consumers alike
Book Description
In the information age, data is critically important. Every organization needs to manage its data effectively to ensure accuracy and to prevent its data pipelines from breaking. In the fast-moving world of data engineering, how can you keep on top of this?
Data Observability for Data Engineering has the answer. Data observability is a set of techniques and methods that allow you to monitor and validate the health of your data, and this practical guide will show you how to implement it successfully in your organization.
We begin by explaining what data observability is, how it builds on data quality monitoring, and why it is essential from a data engineering perspective. Once you’re familiar with the techniques and elements of data observability, you’ll get hands-on with a practical Python project to reinforce what you’ve learned.
At the end of the book, we provide use cases and projects for you to experiment with. By then, you will be well placed to implement data observability in your organization and give your data engineers peace of mind about the quality of their data pipelines.
What you will learn
- Monitor data pipelines proactively in a scalable way
- Implement a data observability approach in the pipelines
- Collect and analyze key metrics through coding examples
- Apply monkey patching in a Python module
- Manage the costs and risks of your data pipeline
- Understand the main techniques to collect observability metrics
- Implement analytics pipeline monitoring techniques in production
- Build a statistics engine that runs continuously
Who This Book Is For
This book is for data engineers, data architects, data analysts, and data scientists who have experienced broken data pipelines or dashboards. It is also useful for organizations that want to adopt the practice of data observability, and for managers, such as a Head of Data or Head of Data Platforms, who are responsible for data quality and processes and want to increase consumers' confidence in, and producers' awareness of, their data pipelines.