Data in, intelligence out: Machine learning pipelines demystified

Posted on 08-08-2018, by: admin

It’s tempting to think of machine learning as a magic black box. In goes the data; out come predictions. But there’s no magic in there—just data and algorithms, and models created by processing the data through the algorithms.

If you’re in the business of deriving actionable insights from data through machine learning, it helps if the process is not a black box. The more you understand about what’s inside the box, the better you’ll see how each step transforms data into predictions, and the more powerful your predictions can be.

DevOps people speak of “build pipelines” to describe how software is taken from source code to deployment. Just as developers have a pipeline for code, data scientists have a pipeline for data as it flows through their machine learning solutions. Mastering how that pipeline comes together is a powerful way to understand machine learning itself from the inside out.

Data sources and ingestion for machine learning