Weekly Article News #38

The recommended articles the author has read this week.
This letter is posted every Monday.

Announcing PyCaret 3.0 — An open-source, low-code machine learning library in Python

This is long-awaited news. PyCaret 3.0, a low-code machine learning library, has been released! The main features of this version are as follows:

  • Stable support of the Time Series Forecasting module
  • Object-oriented API for experiments. We can create a separate experiment instance for each call of the setup() function.

TabPFN

TabPFN is an AutoML Python library that builds a neural-network-based model for a tabular dataset. Its most distinctive feature is that the algorithm is based on the Transformer architecture.

Note that TabPFN supports only classification problems.

The author has the impression that AutoML Python libraries keep evolving.

Weekly Article News #37

The recommended articles the author has read this week.
This letter is posted every Monday.

This week, the author introduces an OSS for managing pipelines, along with one related article.

Apache Airflow

Airflow is a platform, created by Airbnb, to programmatically author, schedule, and monitor workflows. We can use it to schedule and monitor data and machine learning pipelines.
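The core idea behind authoring a workflow programmatically is that tasks form a directed acyclic graph, and each task runs only after its upstream tasks have finished. The following is a minimal sketch of that idea using only the Python standard library. It is not Airflow's API, and the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "extract": set(),
    "clean": {"extract"},
    "train": {"clean"},
    "evaluate": {"train"},
    "report": {"clean", "evaluate"},
}


def run_pipeline(dependencies, tasks):
    """Run every task once, only after all of its upstream tasks have run."""
    executed = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()  # call the task's function
        executed.append(name)
    return executed


# Stand-in task bodies that only record that they ran.
log = []
TASKS = {name: (lambda n=name: log.append(f"ran {n}")) for name in PIPELINE}

order = run_pipeline(PIPELINE, TASKS)
print(order)
```

A scheduler such as Airflow adds scheduling, retries, and monitoring on top of this dependency ordering.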

Introduction to Apache Airflow

This article introduces how to use Airflow and compares it with alternatives. We can check the features of each tool and how they differ.

Weekly Article News #36

The recommended articles the author has read this week.
This letter is posted every Monday.

PyTorch 2.0

Big news! The upcoming release of PyTorch 2.0 has been announced. The first stable version is planned for early March 2023.

Surprisingly, PyTorch 2.0 is backward compatible with PyTorch 1.x. This is because the new features in PyTorch 2.0 are purely additive. One of the key features is torch.compile, a function that speeds up PyTorch code, especially GPU computation.

The author can’t wait for it to be released!

Weekly Article News #35

The recommended articles the author has read this week.
This letter is posted every Monday.

This week, the author introduces an OSS for astronomy, SunPy.

SunPy

A Python library for accessing solar physics data. With SunPy, we can also easily visualize planet positions in the solar system.

Recently, the author posted an article on how to use SunPy to visualize planet positions. Take a quick look if you are interested.
Article Link

Weekly Article News #34

The recommended articles the author has read this week.
This letter is posted every Monday.

This week, the author introduces two practical OSS for Bayesian optimization, plus one educational repository.

BoTorch

A Python library for Bayesian optimization accelerated by PyTorch. It is worth paying attention to, although it is currently in beta and under active development.

Optuna

One of the most famous Python libraries for Bayesian optimization, developed by Preferred Networks, Inc. It offers flexibility, fast execution, and easy parallelization.

BayesianOptimization

This GitHub repository (a Python library) is also educational and worth reading. Its algorithm is implemented on top of a Gaussian process.
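To make the Gaussian-process idea concrete, here is a minimal sketch of one common form of Bayesian optimization: fit a Gaussian process to the points evaluated so far, then pick the next point by maximizing expected improvement. It uses only the standard library and is not how BoTorch, Optuna, or BayesianOptimization are implemented. The toy objective, the RBF kernel settings, the candidate grid, and all function names are assumptions for illustration.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean = sum(ki * ai for ki, ai in zip(k, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 1e-12))


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf


def bayes_opt(objective, candidates, init_xs, n_iter=8):
    """Maximize objective over a fixed candidate grid."""
    xs = list(init_xs)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = max(ys)
        remaining = [c for c in candidates if c not in xs]
        scores = [expected_improvement(*gp_posterior(xs, ys, c), best)
                  for c in remaining]
        x_next = remaining[scores.index(max(scores))]
        xs.append(x_next)
        ys.append(objective(x_next))
    i = ys.index(max(ys))
    return xs[i], ys[i], xs


def objective(x):
    """Hypothetical toy objective with its maximum at x = 2."""
    return -(x - 2.0) ** 2


candidates = [i * 0.25 for i in range(21)]  # grid over [0, 5]
best_x, best_y, history = bayes_opt(objective, candidates, init_xs=[0.0, 5.0])
print(best_x, best_y)
```

The libraries above replace each piece of this sketch (surrogate model, acquisition function, search over candidates) with far more capable implementations.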

Weekly Article News #33

The recommended articles the author has read this week.
This letter is posted every Monday.

This week, the author introduces two practical OSS for checking the fairness of a machine learning model, e.g., whether a model is unintentionally biased toward certain information.

Themis ML

A Python library for checking fairness. It is built on top of pandas and scikit-learn, so it should be user-friendly.

AI Fairness 360 (AIF360)

A Python and R library containing methods for checking fairness.
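As a rough illustration of what "checking fairness" can mean, one widely used measure is demographic parity: the rate of positive predictions should be similar across groups. The sketch below computes that gap with plain Python. It is not the API of Themis ML or AIF360, and the predictions, group labels, and function name are hypothetical.

```python
def demographic_parity(predictions, groups):
    """Return per-group positive-prediction rates and the largest gap between them."""
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: sum(p) / len(p) for g, p in by_group.items()}
    gap = max(rates.values()) - min(rates.values())
    return rates, gap


# Hypothetical binary predictions (1 = approved) and a hypothetical protected attribute.
preds = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

rates, gap = demographic_parity(preds, groups)
print(rates, round(gap, 2))
```

A gap near zero means both groups receive positive predictions at similar rates. A large gap, as in this toy data, is a signal worth investigating rather than proof of unfairness on its own.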

Weekly Article News #32

The recommended articles the author has read this week.
This letter is posted every Monday.

This week, the author introduces two practical OSS for validating the sensitivity of model predictions, i.e., how sensitive the output is to small changes in the input.

Foolbox

A Python library to run fast adversarial attacks to benchmark the robustness of machine learning models in PyTorch, TensorFlow, and JAX.

CleverHans

A Python library to implement adversarial attacks against machine learning models.
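The kind of attack these libraries automate can be sketched by hand on a tiny model. The example below applies the fast gradient sign method (FGSM) to a logistic regression classifier in plain Python: every feature is nudged by a small `epsilon` in the direction that increases the loss. It does not use the Foolbox or CleverHans API, and the weights, input, and `epsilon` are hypothetical values chosen for illustration.

```python
import math


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(weights, bias, x, label, epsilon):
    """Shift each feature by epsilon in the direction that increases the log loss."""
    p = predict_proba(weights, bias, x)
    # For logistic regression, d(loss)/d(x_i) = (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    signs = [(g > 0) - (g < 0) for g in grad]
    return [xi + epsilon * s for xi, s in zip(x, signs)]


weights, bias = [2.0, -1.5], 0.1  # hypothetical trained model
x, label = [0.4, 0.2], 1          # hypothetical input with true class 1

x_adv = fgsm(weights, bias, x, label, epsilon=0.3)
p_clean = predict_proba(weights, bias, x)
p_adv = predict_proba(weights, bias, x_adv)
print(p_clean, p_adv)
```

In this toy setup the perturbation is enough to push the predicted probability for the true class below 0.5, which is exactly the sensitivity such benchmarks are meant to expose.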

Weekly Article News #31

The recommended articles the author has read this week.
This letter is posted every Monday.

On writing clean Jupyter notebooks

Jupyter Notebook is an excellent tool for developers. However, working in a notebook differs from writing scripts in several ways, so there are things you should know. This article introduces useful practices.

Introducing Snapshot Testing for Jupyter Notebooks

The wonderful OSS, nbsnapshot, is a tool for testing a Jupyter Notebook. With scripts, we can write test code, but that is difficult with notebooks. This OSS makes it possible and easy!
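The general idea of snapshot testing can be sketched without any notebook tooling: record an output on each run, then flag a new run whose value drifts too far from the recorded history. The sketch below is only an illustration of that idea. The function name, the JSON file layout, and the three-standard-deviation rule are assumptions, not nbsnapshot's API.

```python
import json
import statistics
import tempfile
from pathlib import Path


def check_snapshot(path, key, value, max_sigma=3.0, min_history=3):
    """Compare value with previously recorded values for key, then record it."""
    path = Path(path)
    history = json.loads(path.read_text()) if path.exists() else {}
    past = history.get(key, [])
    if len(past) >= min_history:
        mean = statistics.mean(past)
        spread = statistics.stdev(past)
        if abs(value - mean) > max_sigma * spread:
            raise AssertionError(
                f"{key}={value} deviates from history "
                f"(mean={mean:.3f}, stdev={spread:.3f})"
            )
    # Only values that pass the check are added to the history.
    history.setdefault(key, []).append(value)
    path.write_text(json.dumps(history))
    return history[key]


with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "snapshot.json"
    # Hypothetical accuracy values from four earlier notebook runs.
    for accuracy in [0.90, 0.91, 0.89, 0.90]:
        recorded = check_snapshot(snapshot, "accuracy", accuracy)
    try:
        check_snapshot(snapshot, "accuracy", 0.50)  # a sudden drop
        regression_detected = False
    except AssertionError:
        regression_detected = True

print(regression_detected)
```

The same check could sit at the end of a notebook cell that computes a metric, so that a silent regression turns into a visible failure.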

Weekly Article News #30

The recommended articles the author has read this week.
This letter is posted every Monday.

DeepSpeed

An excellent library, developed by Microsoft, for optimizing deep learning training and inference.

Top Explainable AI (XAI) Python Frameworks in 2022

XAI, Explainable AI, is one of the recent hot topics. This article introduces six popular OSS: SHAP, LIME, Shapash, ELI5, InterpretML, and OmniXAI.

Weekly Article News #29

The recommended articles the author has read this week.
This letter is posted every Monday.

This week, the articles are about machine learning project environments, e.g., MLflow and Jupyter Notebook.

In a machine learning project, it is very important to prepare not only a data analysis environment such as Jupyter Notebook but also an MLflow environment for managing experiment records. Managing experiment records improves reproducibility and helps the project move forward efficiently.

Containerize your whole Data Science Environment (or anything you want) with Docker-Compose

To build the environment for a machine learning project, docker-compose is a powerful tool. With docker-compose, we can build the data-analysis and experiment-management environments separately. This article explains how to build such an environment.

Manage your machine learning life cycle with MLflow in Python

This article shows one example of how to use the MLflow tracking server, which is a tool for managing experiment records. There are several ways to set up the MLflow tracking server, and this article suggests one helpful approach.