Sams Teach Yourself Hadoop in 24 Hours
- Length: 496 pages
- Edition: 1
- Language: English
- Publisher: Sams Publishing
- Publication Date: 2017-04-10
- ASIN: B06XYM3XH4
- ISBN-13: 9780672338526
- Sales Rank: #888159
Apache Hadoop is the technology at the heart of the Big Data revolution, and Hadoop skills are in enormous demand. Now, in just 24 lessons of one hour or less, you can learn all the skills and techniques you’ll need to deploy each key component of a Hadoop platform in your local environment or in the cloud, building a fully functional Hadoop cluster and using it with real programs and datasets. Each short, easy lesson builds on all that’s come before, helping you master all of Hadoop’s essentials, and extend it to meet your unique challenges. Apache Hadoop in 24 Hours, Sams Teach Yourself covers all this, and much more:
- Understanding Hadoop and the Hadoop Distributed File System (HDFS)
- Importing data into Hadoop, and processing it there
- Mastering basic MapReduce Java programming, and using advanced MapReduce API concepts
- Making the most of Apache Pig and Apache Hive
- Implementing and administering YARN
- Taking advantage of the full Hadoop ecosystem
- Managing Hadoop clusters with Apache Ambari
- Working with the Hadoop User Environment (HUE)
- Scaling, securing, and troubleshooting Hadoop environments
- Integrating Hadoop into the enterprise
- Deploying Hadoop in the cloud
- Getting started with Apache Spark
Step-by-step instructions walk you through common questions, issues, and tasks; Q-and-As, Quizzes, and Exercises build and test your knowledge; “Did You Know?” tips offer insider advice and shortcuts; and “Watch Out!” alerts help you avoid pitfalls. By the time you’re finished, you’ll be comfortable using Apache Hadoop to solve a wide spectrum of Big Data problems.
Table of Contents
Part I: Getting Started with Hadoop
HOUR 1 Introducing Hadoop
HOUR 2 Understanding the Hadoop Cluster Architecture
HOUR 3 Deploying Hadoop
HOUR 4 Understanding the Hadoop Distributed File System (HDFS)
HOUR 5 Getting Data into Hadoop
HOUR 6 Understanding Data Processing in Hadoop
Part II: Using Hadoop
HOUR 7 Programming MapReduce Applications
HOUR 8 Analyzing Data in HDFS Using Apache Pig
HOUR 9 Using Advanced Pig
HOUR 10 Analyzing Data Using Apache Hive
HOUR 11 Using Advanced Hive
HOUR 12 Using SQL-on-Hadoop Solutions
HOUR 13 Introducing Apache Spark
HOUR 14 Using the Hadoop User Environment (HUE)
HOUR 15 Introducing NoSQL
Part III: Managing Hadoop
HOUR 16 Managing YARN
HOUR 17 Working with the Hadoop Ecosystem
HOUR 18 Using Cluster Management Utilities
HOUR 19 Scaling Hadoop
HOUR 20 Understanding Cluster Configuration
HOUR 21 Understanding Advanced HDFS
HOUR 22 Securing Hadoop
HOUR 23 Administering, Monitoring and Troubleshooting Hadoop
HOUR 24 Integrating Hadoop into the Enterprise