Hands-On Machine Learning with Scikit-Learn and Scientific Python Toolkits

A practical guide to implementing supervised and unsupervised machine learning algorithms in Python

Do you want to start your machine learning career but feel overwhelmed by the amount of material you need to read? Do you feel lost, not knowing where to begin?

"The book is the perfect read for anyone who wants to transition into machine learning. It broadly covers all the key algorithms with an insightful practitioner's perspective"

I've written this book to help you kickstart your machine learning career. I have more than a decade of experience in machine learning. I did my postgraduate degree in the field and worked at several scale-ups, where I learned how to put the theory into practice. This is my second data-related book, and I hope you will enjoy it.

This book focuses on Scikit-Learn, a versatile library that is popular among machine learning practitioners. Nevertheless, the book goes beyond Scikit-Learn and introduces you to complementary libraries such as NumPy, Pandas, SpaCy, imbalanced-learn, and Scikit-Surprise. The theoretical knowledge in this book should also prepare you to use libraries not covered here, such as TensorFlow and PyTorch.

Start your machine learning journey by visiting this link

Book Reviews

Here are some example reviews

From GoodReads:

Ali Faizan rated it: 5 out of 5 stars.

"For a machine learning noob like me, it was pleasing to see that the book did not dive straight into the nitty-gritty of machine learning algorithms: it first established the raison d’être for machine learning and cohesively captured the whole gamut of developing a machine learning model. This helped me quite a bit to understand the bigger picture later on in the book where it demonstrated the practical use of various machine learning algorithms. I'll happily recommend this book to anyone interested in scikit-learn, and machine learning in general too".

Paul Schmidt rated it: 5 out of 5 stars.

"This book is information rich with practical examples. I whom never read or touched this area was surprised to learn the weight that data analysis had on machine learning. Yes, this book also teaches you about data analysis. Throughout the chapters you learn what not to do when building machine learning and deep learning models. The author teaches you what not to do by analysing the data at hand and improving the models upon that knowledge. The book is very information rich and can easily be reread from chapter to chapter. There are some things to keep in mind, this book is not for Python beginners and I urge you to know some of the basics from the pandas and matplotlib modules. In other words this book is strongly recommended".

From Amazon:

Przemyslaw Chojecki rated it: 5 out of 5 stars.

"If you've already did a couple of data science projects, had a basic understanding of Python, did some visualisation and want to go deeper into some details of what it means to analyse data, then this book is for you. This is a practical guide to both supervised and unsupervised learning with plenty of examples in code. The main focus is on imperfect data and how to make sense of these imperfections through various machine learning algorithms. The author discusses standard data science algorithms using scikit-learn library which gives a coherent overview of the subject. You will learn decision trees, KNN classification, Naive Bayes and much more; applied to classical datasets like Iris dataset, Boston housing prices or Fashion-MNIST. Recommended for beginning data scientists!".

Adam Powell rated it: 5 out of 5 stars.

"The perfect read for an analyst that wants to transition into machine learning. It broadly covers all the key algorithms with an insightful practitioner's perspective. Highly recommended!".

From YouTube:

DigitalSreeni: Book Review - Machine Learning with scikit-learn and scientific python toolkits

Dimitri Bianco: Hands-On Machine Learning with scikit-learn and Scientific Python Toolkit

Book Content

This book is composed of 13 chapters. Here is a brief overview of each chapter:

Chapter 1:

You may be wondering how machines actually learn.

Chapter 2: Making Decisions with Trees

This chapter introduces the first supervised learning algorithm in this book: decision trees. I chose to introduce it early on because it is versatile and easy to understand. You will also see later on that it is the building block for numerous advanced algorithms, such as Random Forest and Gradient Boosted Trees.

In each chapter you will learn about general machine learning and statistical concepts alongside the main topic of the chapter. In this chapter, you will get to know data splitting, model evaluation, and hyper-parameter tuning.
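To give a feel for how these three concepts fit together, here is a minimal sketch using scikit-learn. It is not taken from the book: the Iris dataset, the 25% test split, and the grid of `max_depth` values are assumptions chosen purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset (150 flowers, 4 features, 3 classes).
X, y = load_iris(return_X_y=True)

# Data splitting: hold out 25% of the samples for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Hyper-parameter tuning: try a few tree depths with 5-fold cross-validation.
# The candidate depths are illustrative, not a recommendation.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_depth": [1, 2, 3, 4, 5]},
    cv=5,
)
search.fit(X_train, y_train)

# Model evaluation: score the best tree on the held-out test set.
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print("best max_depth:", search.best_params_["max_depth"])
print("test accuracy:", round(test_accuracy, 3))
```

Tuning on cross-validation folds of the training set, and touching the test set only once at the end, keeps the reported accuracy an honest estimate.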

By the end of this chapter, you will have a very good understanding of the following topics:

Chapter 3: Making Decisions with Linear Equations

Linear models are possibly the most commonly used algorithms in statistics and machine learning. They are used for both regression and classification. In this chapter we will start with the basic least-squares algorithm, then move on to more advanced algorithms as the chapter progresses.

The secondary topics introduced alongside linear models are regularization and regression intervals. Regularization is a very powerful concept that you will meet over and over again throughout your machine learning journey, so I decided to introduce it early on in the book. Regression intervals are also a very useful tool for quantifying the uncertainty of your predictions.
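As a rough illustration of what regularization does to a least-squares fit, the sketch below compares ordinary least squares with ridge regression in scikit-learn. The synthetic data, the true coefficients, and `alpha=10.0` are assumptions made up for this example, not values from the book. Regression intervals are not shown here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data (an assumption for illustration): 50 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_coef + rng.normal(scale=0.5, size=50)

# Ordinary least squares: minimizes the sum of squared residuals.
ols = LinearRegression().fit(X, y)

# Ridge regression: least squares plus an L2 penalty on the coefficients.
ridge = Ridge(alpha=10.0).fit(X, y)

# The penalty pulls the coefficient vector towards zero.
ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
print("OLS coefficients:  ", np.round(ols.coef_, 2))
print("Ridge coefficients:", np.round(ridge.coef_, 2))
print("Ridge shrinks the coefficient norm:", ridge_norm < ols_norm)
```

A larger `alpha` shrinks the coefficients further, trading a little bias for lower variance.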

By the end of this chapter, you will have a very good understanding of the following topics:

Chapter 4: Preparing your data

You have probably heard one version or another of the saying, "Data scientists spend 80% of their time cleaning data". Data cleaning is an essential part of the job. Moreover, even when the data is clean, many algorithms require it to be processed into a form they can operate on. In this chapter we will talk about the following:

Chapter 5: Image Processing with Nearest Neighbors

Image processing is an essential part of machine learning. I find the Nearest Neighbor Algorithm a good way to understand how image classification works before getting into more complex algorithms that may obscure things. In this chapter we will learn about the following topics.
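As a small illustration of nearest-neighbor image classification, here is a sketch that classifies scikit-learn's built-in 8x8 handwritten digits. The dataset, the split, and `n_neighbors=5` are assumptions for this example and may differ from what the chapter uses.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Each image is 8x8 pixels, flattened into a vector of 64 intensities.
X, y = load_digits(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A new image gets the majority label of its 5 closest training images,
# where "closest" means the smallest Euclidean distance between pixels.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

knn_accuracy = knn.score(X_test, y_test)
print("image vectors:", X.shape)
print("test accuracy:", round(knn_accuracy, 3))
```

The classifier learns no parameters: it simply stores the training images and compares pixels at prediction time, which is what makes it easy to reason about.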

Chapter 6: Classifying Text using Naive Bayes

"A word after a word after a word is power" - Margaret Atwood. In this chapter we will learn about Natural Language Processing (NLP) and text classification. Here are the topics covered.
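To make text classification with Naive Bayes concrete, here is a minimal sketch that turns sentences into word counts and fits a multinomial Naive Bayes model. The four-sentence corpus and its labels are invented for illustration, so treat the result as a toy.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# A tiny made-up corpus (hypothetical data, for illustration only).
train_texts = [
    "I loved this film, great acting",
    "what a great and wonderful movie",
    "terrible plot and awful acting",
    "I hated this boring, awful film",
]
train_labels = ["positive", "positive", "negative", "negative"]

# CountVectorizer builds a bag-of-words matrix; MultinomialNB then models
# how often each word appears in each class.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

predictions = model.predict(["a wonderful film", "boring and terrible"])
print(list(predictions))
```

Wrapping both steps in one pipeline ensures that new text is vectorized with the vocabulary learned during training.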

Chapter 7: Neural Networks; Here Comes the Deep Learning

The term deep learning refers to deep Artificial Neural Networks (ANNs). The latter concept comes in different forms and shapes. In this chapter, we are going to cover one subset of feedforward neural networks known as the Multilayer Perceptron (MLP). It is one of the most commonly used types and is implemented by scikit-learn. As its name suggests, it is composed of multiple layers, and it is a feedforward network as there are no cyclic connections between its layers. The more layers there are, the deeper the network is. These deep networks can exist in multiple forms, such as MLP, Convolutional Neural Networks (CNNs), or Long Short-Term Memory (LSTM). The latter two are not implemented by scikit-learn, yet this will not stop us from discussing the main concepts behind CNNs and manually mimicking them using the tools available from the scientific Python ecosystem.
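As a rough sketch of the Multilayer Perceptron described above, the example below trains scikit-learn's `MLPClassifier` on the built-in digits dataset. The two hidden layers of 64 and 32 units, the pixel scaling, and the iteration limit are assumptions picked for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel intensities range from 0 to 16; scale them to [0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two hidden layers: 64 inputs -> 64 units -> 32 units -> 10 output classes.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
mlp.fit(X_train, y_train)

mlp_accuracy = mlp.score(X_test, y_test)
print("weight matrices between layers:", len(mlp.coefs_))
print("test accuracy:", round(mlp_accuracy, 3))
```

Adding entries to `hidden_layer_sizes` adds layers, which is the sense in which the network becomes deeper.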

In this chapter, we are going to cover the following topics:

Chapter 8: Ensembles - When one model is not enough

Chapter 9: The Y is as important as the X

Chapter 10: Imbalanced Learning - Not even 1% win the lottery

Chapter 11: Making Sense of Unlabeled Data

Chapter 12: Anomaly Detection and Finding Outliers in Data

Chapter 13: Recommender System - Getting to know their taste

Recommender systems are probably the first thing that comes to a layperson's mind when they hear about machine learning. These systems are everywhere, from Spotify to Netflix and Amazon. In this chapter we will use a sister library to scikit-learn called Surprise. You will learn the difference between content-based and collaborative filtering algorithms, how to solve the cold-start problem, and how to package your final model and serve it behind a REST API. Here are the main topics of this chapter:

Get your hands on Hands-On Machine Learning

Start your machine learning journey by visiting this link*

Links to Amazon are affiliate links.