Python: End-to-end Data Analysis

  • Length: 1321 pages
  • Edition: 1
  • Publisher:
  • Publication Date: 2017-05-31
  • ASIN: B072M6868D
  • Sales Rank: #2063470
Description

Leverage the power of Python to clean, scrape, analyze, and visualize your data

About This Book

  • Clean, format, and explore your data using the popular Python libraries and get valuable insights from it
  • Analyze big data sets; create attractive visualizations; manipulate and process various data types using NumPy, SciPy, and matplotlib; and more
  • Packed with easy-to-follow examples to develop advanced computational skills for the analysis of complex data

Who This Book Is For

This course is for developers, analysts, and data scientists who want to learn data analysis from scratch. It provides a solid foundation for analyzing data of varying complexity. A working knowledge of Python (and a strong interest in playing with your data) is recommended.

What You Will Learn

  • Understand the importance of data analysis and master its processing steps
  • Get comfortable using Python and its associated data analysis libraries such as pandas, NumPy, and SciPy
  • Clean and transform your data and apply advanced statistical analysis to create attractive visualizations
  • Analyze images and time series data
  • Mine text and analyze social networks
  • Perform web scraping and work with different databases, Hadoop, and Spark
  • Use statistical models to discover patterns in data
  • Detect similarities and differences in data with clustering
  • Work with Jupyter Notebook to produce publication-ready figures to be included in reports

In Detail

Data analysis is the process of applying logical and analytical reasoning to study each component of the data present in a system. Python is a multi-domain, high-level programming language that offers tools and libraries for a wide range of purposes, and it has gradually become one of the primary languages for data science. Have you ever imagined becoming an expert at approaching data analysis problems effectively, solving them, and extracting all of the available information from your data? If so, look no further: this is the course you need!

In this course, we will get you started with Python data analysis by introducing the basics of data analysis and the supporting Python libraries, such as matplotlib, NumPy, and pandas. You’ll create visualizations by choosing color maps, shapes, sizes, and palettes, then delve into statistical data analysis using distribution algorithms and correlations. You’ll then find your way around different data and numerical problems, get to grips with Spark and HDFS, and set up migration scripts for web mining. You’ll be able to perform hands-on sorting, reduction, and subsequent analysis quickly and accurately, and fully appreciate how data analysis methods can support business decision-making. Finally, you will delve into advanced techniques such as performing regression, quantifying cause and effect using Bayesian methods, and using Python’s tools for supervised machine learning.

The course provides you with highly practical content explaining data analysis with Python, from the following Packt books:

  1. Getting Started with Python Data Analysis.
  2. Python Data Analysis Cookbook.
  3. Mastering Python Data Analysis.

By the end of this course, you will have the knowledge you need to analyze data of varying complexity and turn it into actionable insights.

Style and approach

Learn Python data analysis through engaging examples and fun exercises, with a gentle, friendly, yet comprehensive “learn-by-doing” approach. It offers a useful way of analyzing the data specific to this course that can also be applied to any other data. This course is designed to be both a guide and a reference for moving beyond the basics of data analysis.

Table of Contents

1. Module 1
   1. Introducing Data Analysis and Libraries
   2. NumPy Arrays and Vectorized Computation
   3. Data Analysis with Pandas
   4. Data Visualization
   5. Time Series
   6. Interacting with Databases
   7. Data Analysis Application Examples
   8. Machine Learning Models with scikit-learn
2. Module 2
   1. Laying the Foundation for Reproducible Data Analysis
   2. Creating Attractive Data Visualizations
   3. Statistical Data Analysis and Probability
   4. Dealing with Data and Numerical Issues
   5. Web Mining, Databases, and Big Data
   6. Signal Processing and Timeseries
   7. Selecting Stocks with Financial Data Analysis
   8. Text Mining and Social Network Analysis
   9. Ensemble Learning and Dimensionality Reduction
   10. Evaluating Classifiers, Regressors, and Clusters
   11. Analyzing Images
   12. Parallelism and Performance
3. Module 3
   1. Tools of the Trade
   2. Exploring Data
   3. Learning About Models
   4. Regression
   5. Clustering
   6. Bayesian Methods
   7. Supervised and Unsupervised Learning
   8. Time Series Analysis