Azure Data Engineering Cookbook: Design and implement batch and streaming analytics using Azure Cloud Services
- Length: 454 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2021-04-05
- ISBN-10: 1800206550
- ISBN-13: 9781800206557
- Sales Rank: #2038701
Over 90 recipes to help data scientists and AI engineers orchestrate modern ETL/ELT workflows and perform analytics using Azure services more easily
Key Features
- Discover how to work with different SQL and NoSQL data stores in Microsoft Azure
- Create and execute real-time processing solutions using Azure Databricks, Azure Stream Analytics, and Azure Data Explorer
- Design and execute batch processing solutions using Azure Data Factory
Book Description
Data engineering is a growing field that focuses on preparing data for analysis. This book uses various Azure services to show you how to implement and maintain the infrastructure needed to extract data from multiple sources and then transform and load it for analysis.
This book takes you through different techniques for performing big data engineering using Microsoft cloud services. It begins by showing you how to use Azure Blob storage to store large amounts of unstructured data and how it fits into a data workflow. You’ll then work with the different Cosmos DB APIs and with Azure SQL Database. Moving on, you’ll discover how to provision an Azure Synapse database and how to ingest and analyze data in Azure Synapse. As you advance, you’ll cover the design and implementation of batch processing solutions using Azure Data Factory, and understand how to manage, maintain, and secure Azure Data Factory pipelines. You’ll also design and implement batch processing solutions using Azure Databricks, and then manage and secure Azure Databricks clusters and jobs. In the concluding chapters, you’ll learn how to process streaming data using Azure Stream Analytics and Azure Data Explorer.
By the end of this Azure book, you’ll have gained the knowledge you need to be able to orchestrate batch and real-time ETL workflows in Microsoft Azure.
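To give a sense of the kind of task the description starts with, here is a minimal Python sketch (not taken from the book) that uploads an unstructured file to Azure Blob storage with the azure-storage-blob SDK; the container name, file path, and environment variable are assumptions used purely for illustration.

```python
import os

from azure.storage.blob import BlobServiceClient

# Connection string for the storage account; assumed to be set in the environment.
connection_string = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
service = BlobServiceClient.from_connection_string(connection_string)

# Get a client for a (hypothetical) container, creating it if it does not exist.
container = service.get_container_client("raw-data")
if not container.exists():
    container.create_container()

# Upload a local file as a block blob under a folder-like prefix.
with open("sales.csv", "rb") as data:
    container.upload_blob(name="landing/sales.csv", data=data, overwrite=True)
```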
What you will learn
- Use Azure Blob storage for storing large amounts of unstructured data
- Perform CRUD operations on the Cosmos DB Table API (see the sketch after this list)
- Implement elastic pools and business continuity with Azure SQL Database
- Ingest and analyze data using Azure Synapse Analytics
- Develop Data Factory data flows to extract data from multiple sources
- Manage, maintain, and secure Azure Data Factory pipelines
- Process streaming data using Azure Stream Analytics and Data Explorer
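As a rough illustration of the CRUD item above, the following hypothetical Python snippet uses the azure-data-tables package against a Cosmos DB Table API account; the connection string variable, table name, and entity values are assumptions, not examples from the book.

```python
import os

from azure.data.tables import TableServiceClient, UpdateMode

# Connection string for the Cosmos DB Table API account; assumed to be set in the environment.
service = TableServiceClient.from_connection_string(
    os.environ["COSMOS_TABLE_CONNECTION_STRING"]
)
table = service.create_table_if_not_exists("Customers")  # hypothetical table name

entity = {"PartitionKey": "retail", "RowKey": "C001", "Name": "Contoso", "Credit": 500}

table.create_entity(entity)                                          # Create
customer = table.get_entity(partition_key="retail", row_key="C001")  # Read
customer["Credit"] = 750
table.update_entity(customer, mode=UpdateMode.MERGE)                 # Update
table.delete_entity(partition_key="retail", row_key="C001")          # Delete
```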
Who this book is for
This book is for database administrators, database developers, and extract, transform, load (ETL) developers looking to build expertise in Azure data engineering using a recipe-based approach. Technical architects and database architects with experience in designing data or ETL applications, either on-premises or on any other cloud vendor, who want to learn Azure data engineering concepts will also find this book useful. Prior knowledge of Azure fundamentals and data engineering concepts is needed.
Table of Contents
- Working with Azure Blob Storage
- Working with Relational Databases in Azure
- Analyzing Data with Azure Synapse Analytics
- Control Flow Activities in Azure Data Factory
- Control Flow Transformation and Copy Data Activity in Azure Data Factory
- Data Flow in Azure Data Factory
- Azure Data Factory Integration Runtime
- Deploying Azure Data Factory Pipelines
- Batch and Streaming Data Processing with Azure Databricks