Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning, 2nd Edition
- Length: 462 pages
- Edition: 2
- Language: English
- Publisher: O'Reilly Media
- Publication Date: 2022-04-26
- ISBN-10: 1098118952
- ISBN-13: 9781098118952
- Sales Rank: #3357939 (See Top 100 Books)
Description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP.
Through the course of this updated second edition, you’ll work through a sample business decision by employing a variety of data science approaches. Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.
You’ll learn how to:
- Employ best practices in building highly scalable data and ML pipelines on Google Cloud
- Automate and schedule data ingest using Cloud Run
- Create and populate a dashboard in Data Studio
- Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
- Conduct interactive data exploration with BigQuery
- Create a Bayesian model with Spark on Cloud Dataproc
- Forecast time series and do anomaly detection with BigQuery ML
- Aggregate within time windows with Dataflow
- Train explainable machine learning models with Vertex AI
- Operationalize ML with Vertex AI Pipelines
Free ChaptersTry Audible and Get Two Free Audiobooks »
To access the link, solve the captcha.
Recommended BooksMore Similar Books »