Step-by-step to a Data Scientist

January 24, 2022

Weekly Article News #21


These are the recommended articles the author read this week.
This newsletter is posted every Monday.

The topic this week is the version update of PyCaret. This update is incredible! Many new features are implemented.

PyCaret 2.3.6 is Here! Learn What’s New?

PyCaret 2.3.6 is the biggest release of PyCaret. The new features are a dashboard, EDA, model conversion, fairness, a web API, Dockerfile creation for the web API, a web app, data drift monitoring, threshold optimization for classification, and new documentation.

The author also introduces several new features in the following articles.

January 23, 2022

Dockerfile for PyCaret


In this post, we will learn how to build a docker container for PyCaret.

We create a docker image from a Dockerfile. Then, we will see how to run a docker container from it and start JupyterLab.

Dockerfile

The entire contents of the Dockerfile are as follows.

FROM python:3.8

WORKDIR /opt
RUN pip install --upgrade pip
RUN pip install pycaret \
    jupyterlab
RUN pip install lux-api
RUN jupyter nbextension install --py luxwidget
RUN jupyter nbextension enable --py luxwidget
WORKDIR /work

CMD ["jupyter","lab","--ip=0.0.0.0","--allow-root","--LabApp.token=''"]

We create the docker image based on the python image, version 3.8.

The instruction “WORKDIR /opt” sets the working directory in which the following RUN instructions are executed.

Next, we upgrade pip and then install the external python libraries in order. Here, we install pycaret and jupyterlab, followed by lux-api and its Jupyter widget extension. Of course, you can also install other libraries.

Note that the last instruction before CMD, ‘WORKDIR /work’, sets the current directory to ‘/work/’ when we enter the docker container.

Build a Dockerfile

Let’s create a docker image from the Dockerfile. Execute the following command in the directory where the Dockerfile exists.

$ docker build .

After building the docker image, you can confirm the result with the following command. We will use the ‘IMAGE ID’ shown there later.

$ docker images

Run a docker container

Here, we run the docker container from the above docker image. The command format is as follows.

$ docker run -it -p 8888:8888 -v ~/mounted_to_docker/:/work (IMAGE ID)

'-p 8888:8888': 
-> Publishes port 8888 of the docker container on port 8888 of the host.

'-v ~/mounted_to_docker/:/work': 
-> Mounts the local directory you specified ('~/mounted_to_docker/') on the directory in the container ('/work'), so that files are shared between them.

Once the docker container is running, you can access JupyterLab in your web browser. The URL appears in your terminal.

Your local directory ‘~/mounted_to_docker/’ is mounted to the working directory ‘/work’ in the container.
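As a rough illustration, the same command can be assembled in Python, which makes the role of each option explicit. This is only a sketch: the function name docker_run_args and the IMAGE ID value are made up for this example, and the command is printed rather than executed.

```python
import os
import shlex


def docker_run_args(image_id, host_dir="~/mounted_to_docker/", port=8888):
    """Build the argument list for the `docker run` command shown above."""
    # The shell normally expands "~"; here we expand it ourselves.
    host_path = os.path.expanduser(host_dir)
    return [
        "docker", "run", "-it",
        "-p", f"{port}:{port}",      # host port : container port
        "-v", f"{host_path}:/work",  # host directory : container directory
        image_id,
    ]


# "0123456789ab" is a placeholder; use the IMAGE ID printed by `docker images`.
args = docker_run_args("0123456789ab")
print(shlex.join(args))
# To actually start the container: subprocess.run(args, check=True)
```

Keeping the arguments in a list avoids quoting problems if the local directory name contains spaces.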

Congratulations! You have prepared the environment for PyCaret.

January 22, 2022 (updated July 11, 2022)

PyCaret 2.3.6, incredible update; Convert model and Web App


PyCaret is a useful AutoML python library because we can deploy machine learning models with very little code. We can also perform preprocessing, compare models, and tune hyperparameters, again with very little code.

Recently, PyCaret version 2.3.6 was released. This is big news because several wonderful new functions were implemented in this release! The details are described in this article written by the PyCaret creator.

In this article, we will go over a summary of this release and introduce two of the new features.

New Features

Dashboard: interactive dashboard for a trained model.
EDA: Exploratory Data Analysis
Convert Model: converting a trained model from python into another programming language, such as C, Java, Go, JavaScript, Visual Basic, C#, PowerShell, R, PHP, Dart, Haskell, Ruby, F#.
Fairness
Web API based on FastAPI
Create Dockerfile and requirements file for an API
Web Application based on Gradio
Monitor Data Drift
Optimize Threshold for classification
New Documentation

As the above list of new features shows, PyCaret is evolving dramatically!

In the following, we will introduce some of the new functions using normal regression analysis as an example.

Installation

If you have NOT installed PyCaret yet, you can easily install it with the following command. Note that you should specify the PyCaret version!

$ pip install pycaret==2.3.6

From here on, the sample code in this post is assumed to run in Jupyter Notebook.

Import Libraries

First, we import everything from the regression module of PyCaret.

from pycaret.regression import *

Dataset

We use the diamond dataset for regression analysis.

# load dataset
from pycaret.datasets import get_data

df = get_data('diamond')

Set up the environment by the “setup()” function

PyCaret needs to initialize an environment by the “setup()” function. Conveniently, PyCaret infers the data type of the variables in the dataset.

The arguments of setup() are the dataset as a pandas DataFrame, the target-column name, and the “session_id”. The “session_id” is used as the random seed.

s = setup(df, target='Price', session_id = 20220121)

Create a Model

Because the technique is simple and the model is easy to interpret, we adopt lr (Linear Regression) as the model used below.

We can create the selected model by create_model() with the argument “lr”. The other argument, “fold”, is the number of cross-validation folds. “fold = 4” indicates that we split the dataset into four parts; in each round, the model is trained on three parts and validated on the remaining one, so that every part is used for validation once.

lr = create_model("lr", fold=4)
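To make the meaning of “fold” concrete, here is a minimal pure-Python sketch of how a 4-fold split assigns rows to training and validation. It is not PyCaret's internal code: the helper name k_fold_indices is hypothetical, and the rows are split in order, without shuffling, to keep the output easy to read.

```python
def k_fold_indices(n_rows, n_folds=4):
    """Yield (train_indices, validation_indices) for each of the n_folds rounds."""
    indices = list(range(n_rows))
    # Distribute the remainder so that fold sizes differ by at most one row.
    fold_sizes = [
        n_rows // n_folds + (1 if i < n_rows % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Eight rows and four folds: each round validates on two rows and trains on six.
for round_no, (train, validation) in enumerate(k_fold_indices(8, n_folds=4), start=1):
    print(f"round {round_no}: train on {train}, validate on {validation}")
```

Every row appears in exactly one validation part, so each row is used for validation once and for training in the other three rounds.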

Introduction of New Features of PyCaret 2.3.6

From here, we will introduce two of the new features.

Convert model

With this new function, we can convert a trained model into another language, e.g., from python to C. This is very useful when putting the created model into operation.

Note that we need to install the dependency library.

$ pip install m2cgen

Then, we can convert a model.

# lr is the model trained above with create_model()
model_c = convert_model(lr, language='c')
print(model_c)
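To get a feeling for what such a conversion produces, the sketch below writes out C source for a linear model by hand. It only illustrates the idea: the helper linear_model_to_c is hypothetical, the coefficients are made-up numbers, and the real output of convert_model() may differ in layout.

```python
def linear_model_to_c(intercept, coefficients, function_name="score"):
    """Return C source for y = intercept + sum(coefficients[i] * input[i])."""
    terms = [repr(float(intercept))]
    terms += [f"{float(c)!r} * input[{i}]" for i, c in enumerate(coefficients)]
    body = " + ".join(terms)
    return f"double {function_name}(double * input) {{\n    return {body};\n}}\n"


# Made-up intercept and coefficients, for illustration only.
print(linear_model_to_c(1.5, [2.0, -0.25]))
```

The generated function has no dependency on python, which is why a converted model can be embedded in an application written in another language.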

Create Web App

This new function requires the “gradio” library.

$ pip install gradio

With just one line of code, we can create a web app.

# lr is the model trained above with create_model()
create_app(lr)

Summary

We have seen the new features of PyCaret 2.3.6.

In this article, we saw how to convert the model into another language and how to create a web app.

Just 1 line.

Wouldn’t it be great? If this appeals to you, please give it a try.

The author hopes this blog helps readers a little.

January 22, 2022

PyCaret 2.3.6, incredible update; Dashboard and EDA functions


In this article, we will go over a summary of this release and introduce two of the new features.

New Features

Dashboard: interactive dashboard for a trained model.
EDA: Exploratory Data Analysis
Convert Model: converting a trained model from python into another programming language, such as C, Java, Go, JavaScript, Visual Basic, C#, PowerShell, R, PHP, Dart, Haskell, Ruby, F#.
Fairness
Web API based on FastAPI
Create Dockerfile and requirements file for an API
Web Application based on Gradio
Monitor Data Drift
Optimize Threshold for classification
New Documentation

As the above list of new features shows, PyCaret is evolving dramatically!

In the following, we will introduce some of the new functions using normal regression analysis as an example.

Installation

If you have NOT installed PyCaret yet, you can easily install it with the following command. Note that you should specify the PyCaret version!

$ pip install pycaret==2.3.6

From here on, the sample code in this post is assumed to run in Jupyter Notebook.

Import Libraries

First, we import everything from the regression module of PyCaret.

from pycaret.regression import *

Dataset

We use the diamond dataset for regression analysis.

# load dataset
from pycaret.datasets import get_data

df = get_data('diamond')

Set up the environment by the “setup()” function

PyCaret needs to initialize an environment by the “setup()” function. Conveniently, PyCaret infers the data type of the variables in the dataset.

The arguments of setup() are the dataset as a pandas DataFrame, the target-column name, and the “session_id”. The “session_id” is used as the random seed.

s = setup(df, target='Price', session_id = 20220121)
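The role of a random seed can be seen in a small stand-alone sketch. It does not use PyCaret; the helper split_rows is hypothetical and simply shuffles row numbers with a seeded generator before splitting them into training and test rows. The point is that the same seed always gives the same split, which is what makes an experiment reproducible.

```python
import random


def split_rows(n_rows, n_train, seed=20220121):
    """Shuffle row indices with a seeded generator and split them into train/test."""
    rng = random.Random(seed)  # private generator; the global random state is untouched
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


first = split_rows(10, n_train=7)
second = split_rows(10, n_train=7)
print(first == second)  # True: the same seed reproduces the same split
```

If you change the seed, the rows are generally shuffled in a different order, so the results of the experiment can change slightly.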

Create a Model

Because the technique is simple and the model is easy to interpret, we adopt lr (Linear Regression) as the model used below.

lr = create_model("lr", fold=4)

Introduction of New Features of PyCaret 2.3.6

From here, we will introduce two of the new features.

Dashboard

With this new function, we can create a dashboard for a trained model.

The dashboard function is implemented with ExplainerDashboard, so we need the “explainerdashboard” library. We can install it with the pip command.

$ pip install explainerdashboard

Then, we can create a dashboard.

# lr is the model trained above with create_model()
dashboard(lr)

Parts of the dashboard screen are introduced in the figure below.

EDA(Exploratory Data Analysis)

This new function requires the “autoviz” library.

$ pip install autoviz

With just one line of code, we can perform the EDA.

eda()

Summary

We have seen the new features of PyCaret 2.3.6.

In this article, we saw the Dashboard and EDA function.

Just 1 line.

We can create a dashboard for a trained model and perform the EDA. Wouldn’t it be great? If this appeals to you, please give it a try.

The author hopes this blog helps readers a little.

January 17, 2022

Weekly Article News #20


These are the recommended articles the author read this week.

The topic this week is SHAP. This indicator is useful for evaluating a trained model, even if the model is a black box. The learning materials are listed below.


Official GitHub repository

Introduction to SHAP with Python

Analysing Interactions with SHAP

Weekly Article News #19

Why You Should Start Using Pathlib as an Alternative to the OS Module

Introduction to Regression in Python with PyCaret