Data Lake Architecture: Designing the Data Lake and Avoiding the Garbage Dump
- Length: 166 pages
- Edition: First
- Language: English
- Publisher: Technics Publications
- Publication Date: 2016-04-01
- ISBN-10: 1634621174
- ISBN-13: 9781634621175
- Sales Rank: #205461
Organizations invest incredible amounts of time and money obtaining and then storing big data in data stores called data lakes. But how many of these organizations can actually get the data back out in a usable form? Very few can turn the data lake into an information gold mine. Most wind up with garbage dumps.
Data Lake Architecture will explain how to build a useful data lake, where data scientists and data analysts can solve business challenges and identify new business opportunities. Learn how to structure data lakes as well as analog, application, and text-based data ponds to provide maximum business value. Understand the role of the raw data pond and when to use an archival data pond. Leverage the four key ingredients for data lake success: metadata, integration mapping, context, and metaprocess.
Bill Inmon opened our eyes to the architecture and benefits of a data warehouse, and now he takes us to the next level of data lake architecture.
Table of Contents
Chapter 1 Data Lakes
Chapter 2 Transforming the Data Lake
Chapter 3 Inside the Data Lake
Chapter 4 Data Ponds
Chapter 5 Generic Structure of the Data Pond
Chapter 6 Analog Data Pond
Chapter 7 Application Data Pond
Chapter 8 Textual Data Pond
Chapter 9 Comparing the Ponds
Chapter 10 Using the Infrastructure
Chapter 11 Search and Analysis
Chapter 12 Business Value in the Data Ponds
Chapter 13 Additional Topics
Chapter 14 Analytical and Integration Tools
Chapter 15 Archiving Data Ponds