
Dataproc Cookbook: Running Spark and Hadoop Workloads in Google Cloud
- Length: 436 pages
- Edition: 1st
- Language: English
- Publisher: O'Reilly Media
- Publication Date: 2025/07/08
- ISBN-10: 1098157702
- ISBN-13: 9781098157708
Want to build big data solutions in Google Cloud? Dataproc Cookbook is your hands-on guide to mastering Dataproc and the essential GCP fundamentals (networking, security, monitoring, and cost optimization) that apply across Google Cloud services. Learn practical skills that not only fast-track your Dataproc expertise but also help you succeed with a wide range of GCP technologies.
Written by data experts Narasimha Sadineni and Anu Venkataraman, this cookbook tackles real-world use cases like serverless Spark jobs, Kubernetes-native deployments, and cost-optimized data lake workflows. You’ll learn how to create ephemeral and persistent Dataproc clusters, run secure data science workloads, implement monitoring solutions, and plan effective migration and optimization strategies.
With this book, you'll learn how to:
- Create Dataproc clusters on Compute Engine and Kubernetes Engine
- Run data science workloads on Dataproc
- Execute Spark jobs on Dataproc Serverless
- Optimize Dataproc clusters to be cost effective and performant
- Monitor Spark jobs in various ways
- Orchestrate various workloads and activities
- Use different methods for migrating data and workloads from existing Hadoop clusters to Dataproc