Sharing Big Data Safely: Managing Data Security
- Length: 96 pages
- Edition: 1
- Language: English
- Publisher: O'Reilly Media
- Publication Date: 2016-01-08
- ISBN-10: 1491952121
- ISBN-13: 9781491952122
- Sales Rank: #2657584
Many big data-driven companies today are moving to protect certain types of data against intrusion, leaks, or unauthorized eyes. But how do you lock down data while granting access to people who need to see it? In this practical book, authors Ted Dunning and Ellen Friedman offer two novel and practical solutions that you can implement right away.
Ideal for both technical and non-technical decision makers, group leaders, developers, and data scientists, this book shows you how to:
- Share original data in a controlled way so that different groups within your organization see only part of the whole. You’ll learn how to do this with the new open source SQL query engine Apache Drill.
- Provide synthetic data that emulates the behavior of sensitive data. This approach enables external advisors to work with you on projects involving data that you can’t show them.
If you’re intrigued by the synthetic data solution, explore the log-synth program that Ted Dunning developed as open source code (available on GitHub), along with how-to instructions and tips for best practice. You’ll also get a collection of use cases.
Providing lock-down security while safely sharing data is a significant challenge for a growing number of organizations. With this book, you’ll discover new options to share data safely without sacrificing security.
Table of Contents
Chapter 1. So Secure It’s Lost
Chapter 2. The Challenge: Sharing Data Safely
Chapter 3. Data on a Need-to-Know Basis
Chapter 4. Fake Data Gives Real Answers
Chapter 5. Fixing a Broken Large-Scale Query
Chapter 6. Fraud Detection
Chapter 7. A Detailed Look at log-synth
Chapter 8. Sharing Data Safely: Practical Lessons