Python Data Mining Quick Start Guide
- Length: 188 pages
- Edition: 1
- Language: English
- Publisher: Packt Publishing
- Publication Date: 2019-04-25
- ISBN-10: 1789800269
- ISBN-13: 9781789800265
- Sales Rank: #2285218
Explore different data mining techniques using Python libraries and packages
Key Features
- Grasp the basics of data loading, cleaning, analysis, and visualization
- Use popular Python libraries such as NumPy, pandas, Matplotlib, and scikit-learn for data mining
- Your one-stop guide to building efficient data mining pipelines without going into too much theory
Book Description
Data mining involves the use of tools and techniques to identify unique and useful patterns in a dataset. Thanks to its rich ecosystem of libraries used for data analysis, manipulation, and machine learning, Python has emerged as a popular tool for performing data mining. This book is a quick primer on how to get started with using Python for effective data mining.
Starting with a quick introduction to the concept of data mining, this book will help you put it to practical use with the help of popular Python packages and libraries. You’ll get a demonstration of working with different real-world datasets and extracting insights from them using Python libraries such as NumPy, pandas, scikit-learn, and Matplotlib. The book will then take you through the different stages of data mining: loading, cleaning, analysis, and data visualization. You’ll also explore widely used data transformation, clustering, and classification techniques.
By the end of this book, you’ll be able to build an efficient data mining pipeline using Python with ease.
What you will learn
- Explore methods for summarizing datasets and visualizing/plotting data
- Collect and format data for analytical work
- Assign data points into groups and visualize clustering patterns
- Predict continuous and categorical output for your data
- Clean, filter noise from, and reduce the dimensions of data
- Serialize a data processing model using scikit-learn’s pipeline feature
- Deploy your data processing model using Python’s pickle module
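
As a rough illustration of the last two items above, the sketch below chains preprocessing and a classifier with scikit-learn's `Pipeline` and round-trips the fitted object through `pickle`. It is not taken from the book: the iris dataset, the step names `scale` and `classify`, and the choice of `StandardScaler` with `LogisticRegression` are assumptions made only for this example.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset (an assumption, chosen only for illustration).
X, y = load_iris(return_X_y=True)

# Chain a scaling step and a classifier into a single estimator.
# The step names "scale" and "classify" are arbitrary labels.
model = Pipeline([
    ("scale", StandardScaler()),
    ("classify", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)

# Serialize the fitted pipeline to bytes, then restore it,
# as one might do when moving a model to another process.
payload = pickle.dumps(model)
restored = pickle.loads(payload)

# The restored pipeline should reproduce the original predictions.
same_predictions = (restored.predict(X) == model.predict(X)).all()
print(same_predictions)
```

Because the whole pipeline is pickled as one object, the scaler's fitted statistics travel with the classifier. Pickle files should only be loaded from trusted sources, and are generally safest to restore under the same scikit-learn version that produced them.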
Who this book is for
If you’re a Python developer interested in getting started with data mining, this book is for you. Budding data scientists and data analysts with Python programming knowledge will also find this book useful for getting up to speed with practical data mining using Python.
Table of Contents
- Data Mining and Getting Started with Python Tools
- Basic Terminology and Our End-to-End Example
- Collecting, Exploring, and Visualizing Data
- Cleaning and Readying Data for Analysis
- Grouping and Clustering Data
- Prediction with Regression and Classification
- Advanced Topics – Building a Data Processing Pipeline and Deploying It