A practical guide to implementing supervised and unsupervised machine learning algorithms in Python

Ready to kickstart your machine learning career but feeling lost in a sea of resources?

This guide is your personal compass, empowering you to navigate the complexities and become a machine learning expert.

"The book is the perfect read for anyone who wants to transition into machine learning. It broadly covers all the key algorithms with an insightful practitioner's perspective"

"I've written this book to help you start your journey in machine learning. With over a decade of practical experience and a postgraduate degree in the field, I've gained the expertise to bridge theory and practice. This book, my second in the data-related domain, aims to be an enjoyable resource for you.

The focus of this book is Scikit-Learn, a versatile and popular library among machine learning practitioners. However, it goes beyond Scikit-Learn, introducing complementary libraries like NumPy, Pandas, SpaCy, imbalanced-learn, and Scikit-Surprise. By understanding the theoretical concepts covered in this book, you'll also be well-prepared to explore other libraries such as TensorFlow and PyTorch, expanding your knowledge and skills in the field."

**Start your machine learning journey by visiting this link**

Here are some example reviews:

*Ali Faizan* rated it: **5 out of 5** stars.

"For a machine learning noob like me, it was pleasing to see that the book did not dive straight into the nitty-gritty of machine learning algorithms: it first established the raison d’être for machine learning and cohesively captured the whole gamut of developing a machine learning model. This helped me quite a bit to understand the bigger picture later on in the book where it demonstrated the practical use of various machine learning algorithms. I'll happily recommend this book to anyone interested in scikit-learn, and machine learning in general too".

*Paul Schmidt* rated it: **5 out of 5** stars.

"This book is information rich with practical examples. I, who had never read about or touched this area, was surprised to learn the weight that data analysis carries in machine learning. Yes, this book also teaches you about data analysis. Throughout the chapters you learn what not to do when building machine learning and deep learning models: the author teaches this by analysing the data at hand and improving the models upon that knowledge. The book is very information rich and can easily be reread from chapter to chapter. There are some things to keep in mind: this book is not for Python beginners, and I urge you to know some of the basics of the pandas and matplotlib modules. In other words, this book is strongly recommended."

*Przemyslaw Chojecki* rated it: **5 out of 5** stars.

"If you've already done a couple of data science projects, have a basic understanding of Python, have done some visualisation, and want to go deeper into what it means to analyse data, then this book is for you. This is a practical guide to both supervised and unsupervised learning with plenty of examples in code. The main focus is on imperfect data and how to make sense of these imperfections through various machine learning algorithms. The author discusses standard data science algorithms using the scikit-learn library, which gives a coherent overview of the subject. You will learn decision trees, KNN classification, Naive Bayes, and much more, applied to classic datasets like the Iris dataset, Boston housing prices, or Fashion-MNIST. Recommended for beginning data scientists!"

*Adam Powell* rated it: **5 out of 5** stars.

"The perfect read for an analyst that wants to transition into machine learning. It broadly covers all the key algorithms with an insightful practitioner's perspective. Highly recommended!".

DigitalSreeni: Book Review - Machine Learning with scikit-learn and scientific python toolkits

Dimitri Bianco: Hands-On Machine Learning with scikit-learn and Scientific Python Toolkit

This book is composed of 13 chapters. Here is a brief overview of each chapter:

Embark on an illuminating journey into the realm of machine learning. Curious about how machines acquire knowledge? This chapter unveils the big picture, laying a solid foundation for the captivating algorithms we delve into next.

Introducing our first supervised learning algorithm in this book: decision trees.

We chose this versatile and easily comprehensible algorithm to kickstart your journey. As you progress, you'll discover its vital role as a foundation for advanced algorithms like Random Forest and Gradient Boosted Trees.

Each chapter is designed to expand your knowledge of machine learning and statistical concepts alongside the main topic. Here, we will explore data splitting, model evaluation, and hyper-parameter tuning.

By the chapter's end, you'll have mastered:

- The inner workings of decision trees and how they learn
- Optimal strategies for data splitting
- Harnessing cross-validation for reliable scores
- Unveiling hyper-parameters and their effective tuning
- Visualizing decision boundaries within the tree
- Leveraging decision trees for regression tasks
- Tailoring weights for diverse training samples

Get ready for a transformative learning experience that equips you with the tools to unlock the potential of decision trees and beyond.
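As a minimal sketch of that workflow, here is a decision tree trained with a depth limit and scored by cross-validation. The Iris dataset and the particular hyper-parameter values are illustrative choices, not the book's own example:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Load a small, well-known dataset.
X, y = load_iris(return_X_y=True)

# max_depth is a hyper-parameter: limiting it helps the tree generalise.
clf = DecisionTreeClassifier(max_depth=3, random_state=42)

# 5-fold cross-validation gives a more reliable score than a single split.
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```

Swapping different `max_depth` values into this snippet is a quick way to see hyper-parameter tuning in action.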

Linear models are possibly the most commonly used algorithms in statistics and machine learning. They are used for both regression and classification. In this chapter we will start by looking into the basic least-squares algorithm, then move on to more advanced algorithms as the chapter progresses.

The secondary topics you will be introduced to alongside the linear models are regularization and regression intervals. Regularization is a very powerful concept that you will meet over and over again throughout your machine learning journey, which is why I decided to introduce it early on in the book. Regression intervals are also a very useful tool to quantify your uncertainty about your predictions.

By the end of this chapter, you will have a very good understanding of the following topics:

- Understanding linear models and their history
- Regression model evaluation criteria (MSE, MAE, and the Coefficient of Determination, i.e. R^2)
- Using confidence intervals to get more reliable scores
- Engineering new features and finding their importances (e.g. polynomial features)
- What regularisation is, and what solvers are
- Your first Generalised Linear Model (GLM): logistic regression
- Additional linear models (Stochastic Gradient Descent, Elastic-net, RANSAC, etc.)
- Finding regression intervals
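To make the regularisation idea concrete, here is a hedged sketch contrasting ordinary least squares with Ridge regression on synthetic data; the data and the `alpha` value are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score

# Synthetic data: a known linear signal plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)  # L2 regularisation shrinks the coefficients

print(r2_score(y, ols.predict(X)))
print(np.linalg.norm(ridge.coef_) < np.linalg.norm(ols.coef_))
```

The second printed value illustrates the shrinkage effect: the L2 penalty pulls the Ridge coefficients toward zero relative to the unregularised fit.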

You have probably heard one version or another of the saying, "Data scientists spend 80% of their time cleaning data". Data cleaning is an essential part of the job, and even when the data is clean, many algorithms demand that it be processed in certain ways before they can operate on it. In this chapter we will talk about the following:

- Imputing missing values (e.g. SimpleImputer and IterativeImputer)
- Encoding non-numerical features (e.g. One-hot Encoding, Ordinal Encoding, Target Encoding, Leave-one-out Encoding, etc.)
- Feature Scaling (MinMax Scaler, Standard Scaler, Robust Scaler, etc.)
- Feature Selection (Variance Threshold, Mutual Information, etc.)
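The first three of these steps can be sketched in a few lines; the tiny arrays below are invented purely to show each transformer's effect:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Impute a missing value with the column mean.
X = np.array([[1.0], [np.nan], [3.0]])
imputed = SimpleImputer(strategy="mean").fit_transform(X)

# Scale the column into the [0, 1] range.
scaled = MinMaxScaler().fit_transform(imputed)

# One-hot encode a non-numerical feature into one column per category.
colors = np.array([["red"], ["green"], ["red"]])
onehot = OneHotEncoder().fit_transform(colors).toarray()

print(imputed.ravel(), scaled.ravel(), onehot.shape)
```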

Image processing is an essential part of machine learning. I find the Nearest Neighbor Algorithm a good way to understand how image classification works before getting into more complex algorithms that may obscure things. In this chapter we will learn about the following topics:

- K-Nearest Neighbors Algorithm (KNN)
- Different Distances (e.g. Euclidean, Cosine, Manhattan, Minkowski, etc.)
- Creating a custom distance to use with KNN
- Radius Neighborhood
- Nearest Centroid Algorithm
- Principal Component Analysis (PCA)
- Neighborhood Component Analysis (NCA)
- Bias-Variance Tradeoff
- Hyper-parameter tuning via GridSearchCV
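As a minimal sketch of the last item, here is KNN tuned over its neighbour count and distance metric with `GridSearchCV`; the Iris dataset and the grid values are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Search over the number of neighbours and the distance metric.
param_grid = {"n_neighbors": [1, 3, 5, 7], "metric": ["euclidean", "manhattan"]}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```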

"A word after a word after a word is power" - Margaret Atwood. In this chapter we will learn about Natural Language Processing (NLP) and text classification. Here are the topics covered:

- Tokenization and Vector Space Model
- Bag of Words, TF-IDF and Word Embedding (Word2Vec)
- Bayes Rule and Naive Bayes Classifier
- Multinomial vs Bernoulli Naive Bayes Classifier
- Gaussian Naive Bayes Classifier
- Additive Smoothing (Lidstone and Laplace Smoothing)
- F1-Score for combining Precision and Recall Scores
- Scikit-Learn Pipelines
- Creating a custom Scikit-Learn Transformer
- Using NLTK and SpaCy with Scikit-Learn
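Several of these pieces combine naturally in a single scikit-learn Pipeline. Here is a hedged sketch of TF-IDF feeding a Multinomial Naive Bayes classifier; the four-document corpus is invented purely for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# A toy corpus, invented here purely for illustration.
docs = ["free money now", "win a free prize", "meeting at noon", "project deadline tomorrow"]
labels = ["spam", "spam", "ham", "ham"]

# A Pipeline chains vectorisation and classification into one estimator.
clf = Pipeline([("tfidf", TfidfVectorizer()), ("nb", MultinomialNB())])
clf.fit(docs, labels)

print(clf.predict(["win free money"]))
```

Because the pipeline is a single estimator, the same object can later be dropped into cross-validation or grid search unchanged.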

The term deep learning refers to deep Artificial Neural Networks (ANNs). The latter concept comes in different forms and shapes. In this chapter, we are going to cover one subset of feedforward neural networks known as the Multilayer Perceptron (MLP). It is one of the most commonly used types and is implemented by scikit-learn. As its name suggests, it is composed of multiple layers, and it is a feedforward network as there are no cyclic connections between its layers. The more layers there are, the deeper the network is. These deep networks can exist in multiple forms, such as MLP, Convolutional Neural Networks (CNNs), or Long Short-Term Memory (LSTM). The latter two are not implemented by scikit-learn, yet this will not stop us from discussing the main concepts behind CNNs and manually mimicking them using the tools available from the scientific Python ecosystem.

In this chapter, we are going to cover the following topics:

- Getting to know the Multilayer Perceptron (MLP)
- Monitoring and tuning your neural network's learning rate
- Judging whether you need more training data or more epochs
- Activation functions such as Softmax, ReLU, Leaky ReLU, etc.
- Adding your own activation function to scikit-learn
- Classifying items of clothing
- Learning about convolutions, kernels and max pooling
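As a small taste of scikit-learn's MLP, here is one trained on the built-in handwritten-digits dataset; the layer size and iteration count are illustrative, not the book's own settings:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Neural networks train much better on scaled inputs.
scaler = StandardScaler().fit(X_train)

# A single hidden layer of 64 units; ReLU is the default activation.
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=42)
mlp.fit(scaler.transform(X_train), y_train)

print(round(mlp.score(scaler.transform(X_test), y_test), 3))
```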

- Bagging vs Boosting
- Random Forest
- Bagging Meta Estimator (with KNN)
- AdaBoost
- Gradient Boosting
- Voting and Stacking Ensembles
- Random Tree Embedding
- Learning Deviance
- Quantile Regression and Regression Ranges
- Early Stopping and Adaptive Learning Rate
- The ROC Curve
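The bagging-versus-boosting contrast at the top of this list can be sketched by scoring one ensemble of each kind side by side; the dataset and default settings are illustrative choices:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Bagging-style ensemble: many de-correlated trees, averaged.
rf_score = cross_val_score(RandomForestClassifier(random_state=42), X, y, cv=5).mean()

# Boosting-style ensemble: trees added sequentially to fix earlier errors.
gb_score = cross_val_score(GradientBoostingClassifier(random_state=42), X, y, cv=5).mean()

print(round(rf_score, 3), round(gb_score, 3))
```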

- Regression Target Scaling
- Multi-Class vs Multi-Label (Multi-Output) Classifiers
- OneVsOne vs OneVsRest Classifiers
- Classifier Probability Calibration (Sigmoid / Isotonic)
- Precision at K
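To illustrate the OneVsRest strategy from this list: it decomposes a multi-class problem into one binary classifier per class, which you can verify directly. The base estimator here is an illustrative choice:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)

# OneVsRest fits one binary classifier per class.
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

print(len(ovr.estimators_))  # one estimator for each of the 3 Iris classes
```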

- Predicting Click-through Rate (CTR)
- Reweighting the training samples
- Random Oversampling (ROS)
- Random Undersampling (RUS)
- Combining Sampling with Ensemble Methods (e.g. Balanced Random Forest and Balanced Bagging)
- Area Under the Curve (AUC)
- Fairness in Machine Learning (Equal Opportunity Score)
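The sample-reweighting idea can be sketched without any extra libraries: scikit-learn's `class_weight="balanced"` option reweights training samples inversely to class frequency. The synthetic imbalanced dataset below is invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

# Synthetic imbalanced data: 950 majority vs 50 minority samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (950, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 950 + [1] * 50)

plain = LogisticRegression().fit(X, y)
reweighted = LogisticRegression(class_weight="balanced").fit(X, y)

# Reweighting trades some overall accuracy for better minority-class recall.
print(recall_score(y, plain.predict(X)), recall_score(y, reweighted.predict(X)))
```

The resampling techniques in the list above (ROS, RUS, balanced ensembles) pursue the same goal by changing the data rather than the loss weights.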

- Understanding Clustering
- K-means Clustering Algorithm
- Agglomerative Clustering
- Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- The Silhouette Score
- The Adjusted Rand Index
- Affinity and Linkage
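As a minimal sketch of the first and fourth items together, here is K-means evaluated with the silhouette score on synthetic blobs; the data and cluster count are illustrative:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Three well-separated synthetic blobs.
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# A silhouette score near 1 means tight, well-separated clusters.
print(round(silhouette_score(X, labels), 3))
```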

- EllipticEnvelope (Mahalanobis Distance)
- Local Outlier Factor (LOF)
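A hedged sketch of the Local Outlier Factor in action, on invented data with one obvious outlier planted in a dense cluster:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# A dense cluster around the origin, plus one obvious outlier at (5, 5).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (100, 2)), [[5.0, 5.0]]])

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)  # -1 marks outliers, 1 marks inliers

print(labels[-1])
```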

Recommender systems are probably the first thing that comes to a layperson's mind when they hear about machine learning. These systems are everywhere, from Spotify to Netflix and Amazon. In this chapter we will be using a sister library to scikit-learn called Surprise. You will learn the difference between content-based and collaborative filtering algorithms, how to solve the cold-start problem, and how to package your final model and serve it behind a REST API. Here are the main topics of this chapter:

- How the different recommendation paradigms work
- How the K-Nearest Neighbors (KNN) algorithm helps in recommendation
- What Singular Value Decomposition (SVD) is
- The best options for a baseline recommender
- How to deploy your machine learning models to production
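The collaborative-filtering idea itself needs nothing beyond NumPy to demonstrate. This is not the Surprise library the chapter uses; it is a bare-bones item-similarity sketch on a made-up ratings matrix:

```python
import numpy as np

# Toy user-item ratings matrix (rows: users, columns: items; 0 = unrated).
# Invented here purely for illustration.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Item-based collaborative filtering: cosine similarity between item columns.
norms = np.linalg.norm(R, axis=0)
sim = (R.T @ R) / np.outer(norms, norms)

# Items 0 and 1 attract similar ratings, so they are more alike than 0 and 2.
print(round(sim[0, 1], 3), round(sim[0, 2], 3))
```

Real systems such as those built with Surprise refine this idea with rating normalisation and matrix factorisation (e.g. SVD).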

"Hands-On Machine Learning with Scikit-Learn" is generally well-regarded in the machine learning community. It is known for its practical approach, providing readers with hands-on examples and exercises using the Scikit-Learn library.

The book covers fundamental concepts and techniques in machine learning, making it suitable for beginners and intermediate learners. It is often praised for its clear explanations and code examples that help readers understand and apply machine learning algorithms effectively.

**Start your machine learning journey by visiting this link**

*Links to Amazon are affiliate links.*