Programming MapReduce with Scalding
- Length: 107 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2014-06-24
- ISBN-10: 1783287012
- ISBN-13: 9781783287017
- Sales Rank: #1855688
A practical guide to designing, testing, and implementing complex MapReduce applications in Scala
Overview
- Develop MapReduce applications using a functional programming language in a lightweight, high-performance, and testable way
- Explore Scalding's capabilities for communicating with external data stores and performing machine learning operations
- Full of illustrations and diagrams, practical examples, and tips for deeper understanding of MapReduce application development
In Detail
Programming MapReduce with Scalding is a practical guide to setting up a development environment and implementing simple and complex MapReduce transformations in Scalding, using a test-driven development methodology and other best practices.
This book will first introduce you to how the Cascading framework lets you reason about MapReduce applications at a higher level of abstraction, and then dive into how Scalding, a Scala DSL, enables us to develop elegant and testable applications. It will then teach you how to test Scalding jobs and how to write specifications using behavior-driven development (BDD). This book will also demonstrate how to monitor and maintain cluster stability and how to efficiently access SQL, NoSQL, and search platforms.
Programming MapReduce with Scalding provides hands-on information, starting from proof-of-concept applications and progressing to production-ready implementations.
What you will learn from this book
- Set up an environment to execute jobs in local and Hadoop mode
- Preview the complete Scalding API through examples and illustrations
- Learn about Scalding capabilities, testing, and pipelining jobs
- Understand common MapReduce patterns and the applications in the MapReduce ecosystem
- Implement logfile analysis and ad-targeting applications using best practices
- Apply a test-driven development (TDD) methodology and structure Scalding applications in a modular and testable way
- Interact with external NoSQL and SQL data stores from Scalding
- Deploy, schedule, monitor, and maintain production systems
Approach
This book is an easy-to-understand, practical guide to designing, testing, and implementing complex MapReduce applications in Scala using the Scalding framework. It is packed with examples featuring log processing, ad targeting, and machine learning.
Table of Contents
Chapter 1. Introduction to MapReduce
Chapter 2. Get Ready for Scalding
Chapter 3. Scalding by Example
Chapter 4. Intermediate Examples
Chapter 5. Scalding Design Patterns
Chapter 6. Testing and TDD
Chapter 7. Running Scalding in Production
Chapter 8. Using External Data Stores
Chapter 9. Matrix Calculations and Machine Learning