Hyperparameter tuning is one of the most important parts of a machine learning pipeline. A wrong choice of the hyperparameters’ values may lead to wrong results and a model with poor performance.

Hyperparameter tuning is one of the most important parts of a machine learning pipeline. A wrong choice of the hyperparameters’ values may lead to wrong results and a model with poor performance.
Neural networks are fascinating and very efficient tools for data scientists, but they have a very huge flaw: they are unexplainable black boxes. In fact, they don’t give us any information about feature importance. Fortunately, there is a powerful approach we can use to interpret every model, even neural networks. It is the SHAP approach.
Google Sheets is a very powerful (and free) tool for creating spreadsheets. I’ve almost replaced LibreOffice Calc with Sheets, because it’s very comfortable to work with. Sometimes, a data scientist has to pull some data from a Google Sheet into a Python notebook. In this article, I’ll show you how to do it using just Pandas.
Neural networks are a fascinating field of machine learning. Let’s see how to find the best number of neurons of a neural network for our dataset.
The first thing I have learned as a data scientist is that feature selection is one of the most important steps of a machine learning pipeline. Fortunately, some models may help us accomplish this goal by giving us their own interpretation of feature importance. One of such models is the Lasso regression.
In the machine learning world, data scientists are often told to train a supervised model on a large training dataset and test it on a smaller amount of data. The reason why training dataset is always chosen larger than the test one is that somebody says that the larger the data used for training, the better the model learns.
Language detection (or identification) is a fascinating branch of Natural Language Processing. Its goal is to create a model that is able to detect the language a text is written in. Data Scientists usually employ neural network models to accomplish such a goal. In this article, I show how to create a simple language detection model in Python using a Naive Bayes model.
I’m glad to introduce my new free online workshop about feature importance in machine learning. In this workshop, feature importance in supervised machine learning is presented both in theory and in practice using Python programming language and its powerful scikit-learn library.
Feature selection is probably the most important part of machine learning, as well as hyperparameter tuning. How can we select the right …
Feature selection has always been a great task in machine learning. According to my experience, I can surely say that feature selection is much more important than model selection itself.