Automated machine learning or AutoML explained

Posted on 21-08-2019 by admin

The two biggest barriers to the use of machine learning (both classical machine learning and deep learning) are skills and computing resources. You can solve the second problem by throwing money at it, either by buying accelerated hardware (such as computers with high-end GPUs) or by renting compute resources in the cloud (such as instances with attached GPUs, TPUs, and FPGAs).

On the other hand, solving the skills problem is harder. Data scientists often command hefty salaries and may still be hard to recruit. Google was able to train many of its employees on its own TensorFlow framework, but most companies barely have people skilled enough to build machine learning and deep learning models themselves, much less teach others how.

What is AutoML?

Automated machine learning, or AutoML, aims to reduce or eliminate the need for skilled data scientists to build machine learning and deep learning models. Instead, an AutoML system allows you to provide the labeled training data as input and receive an optimized model as output.
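To make that "labeled data in, optimized model out" workflow concrete, here is a minimal sketch using the open-source TPOT library, which exposes a scikit-learn-style interface. The dataset, split, and search settings are illustrative assumptions, not a recommendation of any particular tool.

```python
# A minimal sketch of the AutoML workflow with the open-source TPOT library.
# Dataset, split, and search settings here are illustrative assumptions.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_digits(return_X_y=True)                     # labeled training data in
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

automl = TPOTClassifier(generations=5, population_size=20,
                        random_state=42, verbosity=2)
automl.fit(X_train, y_train)                            # searches models and pipelines
print(automl.score(X_test, y_test))                     # optimized model out
automl.export("best_pipeline.py")                       # winning pipeline as plain code
```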

There are several ways of going about this. One approach is for the software to simply train every kind of model on the data and pick the one that works best. A refinement of this would be for it to build one or more ensemble models that combine the other models, which sometimes (but not always) gives better results.
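In scikit-learn terms, that simple strategy might look like the following sketch; the candidate models, dataset, and scoring choices are assumptions for illustration.

```python
# Sketch: try several model families, keep the one that cross-validates best,
# then combine them all into a voting ensemble. Model choices are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "logreg": LogisticRegression(max_iter=5000),
    "forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": SVC(probability=True, random_state=0),
}

# Train every kind of model and pick the one that works best.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in candidates.items()}
best = max(scores, key=scores.get)
print("Best single model:", best, scores[best])

# Refinement: an ensemble that combines the individual models.
ensemble = VotingClassifier(list(candidates.items()), voting="soft")
print("Ensemble score:", cross_val_score(ensemble, X, y, cv=5).mean())
```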

A second technique is to optimize the hyperparameters (explained below) of the best model or models to train an even better model. Feature engineering (also explained below) is a valuable addition to any model training. One way of de-skilling deep learning is to use transfer learning, essentially customizing a well-trained general model for specific data.
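As a rough illustration of the transfer-learning idea, the sketch below uses Keras to reuse a network pre-trained on ImageNet, freeze its general-purpose layers, and train only a small new head on task-specific data. The input size and number of classes are assumptions.

```python
# Sketch of transfer learning with Keras: reuse a pre-trained general model,
# freeze it, and train only a small task-specific head. The input shape and
# class count are assumptions for illustration.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False                                   # keep the general features fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),      # 5 hypothetical classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)        # your labeled data here
```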

What is hyperparameter optimization?

All machine learning models have parameters, meaning the weights for each variable or feature in the model. These are learned during training; in neural networks, for example, they are typically set by back-propagating the errors and iterating under the control of an optimizer such as stochastic gradient descent.
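For a one-feature linear model, that training loop amounts to something like the sketch below, which uses plain gradient descent; the learning rate and iteration count are arbitrary illustrative choices.

```python
# Sketch: learning the parameters (weight and bias) of a one-feature linear
# model by gradient descent. Learning rate and epochs are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 100)
y = 3.0 * X + 2.0 + rng.normal(0, 1, 100)    # true weight 3, bias 2, plus noise

w, b = 0.0, 0.0                              # parameters to be learned
lr = 0.01                                    # learning rate (itself a hyperparameter)
for _ in range(2000):
    pred = w * X + b
    error = pred - y
    w -= lr * 2 * (error * X).mean()         # gradient of mean squared error w.r.t. w
    b -= lr * 2 * error.mean()               # gradient w.r.t. b
print(w, b)                                  # should approach 3 and 2
```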