Automated Machine Learning

The mitigation of manual labor through automation has always been a goal, especially since the dawn of machines with the industrial revolution. While the term automation was coined in the 1940s as it related to motor vehicle assembly, today the term has another meaning: automation of data science, predictive modeling, and machine learning is set to usher in a new era.

What is AutoML?

A machine learning (ML) project can be thought of as a sequence of decisions based on a finite set of considerations: collecting and formatting data, imputing missing values, deciding on independent and dependent variables, etc. Most ML projects follow a similar execution blueprint, which in turn makes the whole process feasible for automation.

Recognizing this, several companies including machine learning giants like Google and Microsoft have entered the fray, each with their own automated machine learning offerings.

My Perspective on AutoML

I have always thought of AutoML as an inevitability. I believed in the value of this approach so much that I built my own such application many years ago. It would search through different methods and rank models by selected performance metrics. I should add that my effort was not nearly as smooth or pleasant to use as today's polished AutoML offerings.
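To make the search-and-rank idea concrete, here is a minimal Python sketch. It is not the application described above: the three toy "methods", the `rank_models` helper, and the data are all illustrative assumptions, and mean absolute error stands in for whatever metric you would select.

```python
from statistics import mean

# Three hypothetical candidate "methods". Each fit function returns a
# predictor for a single numeric input.
def fit_mean(xs, ys):
    m = mean(ys)
    return lambda x: m  # always predict the training average

def fit_line(xs, ys):
    # Ordinary least squares for one feature (assumes xs are not all equal).
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    pairs = list(zip(xs, ys))
    # Predict the target of the closest training point.
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mae(model, xs, ys):
    """Mean absolute error; lower is better."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))

def rank_models(candidates, train, test, metric=mae):
    """Fit every candidate on train, score on test, return best-first."""
    (x_tr, y_tr), (x_te, y_te) = train, test
    scores = [(name, metric(fit(x_tr, y_tr), x_te, y_te))
              for name, fit in candidates.items()]
    return sorted(scores, key=lambda s: s[1])

CANDIDATES = {
    "mean_baseline": fit_mean,
    "linear": fit_line,
    "nearest_neighbor": fit_nearest,
}

# Toy data following y = 2x + 1, with the last two points held out.
train = ([0, 1, 2, 3, 4, 5], [1, 3, 5, 7, 9, 11])
test = ([6, 7], [13, 15])

leaderboard = rank_models(CANDIDATES, train, test)
for name, score in leaderboard:
    print(f"{name}: MAE={score}")
```

On this toy data the linear fit ranks first, since the data is exactly linear. A real tool would swap in many more methods and a proper validation scheme, but the loop has the same shape.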

A high-quality predictive model can be constructed by careful orchestration of if-then-else statements that take into account:
- Data completeness
- The metric you want to optimize
- Resources you want to expend in the solution space search
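As a rough illustration of that kind of orchestration, the sketch below turns the three considerations into plain if-then-else rules. Everything in it is an assumption made for illustration: the `plan_search` function, the thresholds, and the method names do not come from any particular AutoML product.

```python
from dataclasses import dataclass

@dataclass
class Plan:
    imputation: str   # how missing values will be handled
    candidates: list  # which model families to try
    metric: str       # what the search optimizes

def plan_search(missing_fraction, metric, budget_minutes):
    """Pick a search plan from completeness, metric, and compute budget.

    All thresholds here are hypothetical placeholders.
    """
    # Data completeness decides the imputation strategy.
    if missing_fraction == 0:
        imputation = "none"
    elif missing_fraction < 0.2:
        imputation = "median"
    else:
        imputation = "model_based"

    # Resources decide how wide the solution space search goes.
    candidates = ["linear", "decision_tree"]
    if budget_minutes >= 30:
        candidates.append("gradient_boosting")
    if budget_minutes >= 120:
        candidates.append("stacked_ensemble")

    # The metric is carried through so every candidate is ranked the same way.
    return Plan(imputation=imputation, candidates=candidates, metric=metric)

plan = plan_search(missing_fraction=0.05, metric="auc", budget_minutes=45)
print(plan)
```

With 5% missing data and a 45-minute budget, these rules choose median imputation and three model families. The point is only that each decision is simple enough to encode.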

The key for AutoML is figuring out where and how to add human context and domain-specific knowledge into that sequence. For example, in Azure AutoML, users can:
- Decide which variables to use in modeling
- Exclude certain algorithms from testing
- Define data guardrails like training/testing data splits
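The sketch below shows one way such human choices could be captured and applied before any automated search runs. It deliberately does not use the Azure SDK: `HumanChoices`, `apply_choices`, the column names, and the algorithm names are hypothetical stand-ins for the three options listed above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HumanChoices:
    features: tuple                            # variables to use in modeling
    target: str                                # the dependent variable
    blocked_algorithms: frozenset = frozenset()  # algorithms to exclude
    test_fraction: float = 0.2                 # share of rows held out for testing

def apply_choices(rows, algorithms, choices):
    """Return (train_rows, test_rows, allowed_algorithms)."""
    if not 0 < choices.test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")

    # Keep only the columns the user selected, plus the target.
    keep = choices.features + (choices.target,)
    selected = [{k: row[k] for k in keep} for row in rows]

    # Drop any algorithm the user excluded.
    allowed = [a for a in algorithms if a not in choices.blocked_algorithms]

    # Simple holdout split without shuffling; the last rows become the test set.
    n_test = max(1, round(len(selected) * choices.test_fraction))
    return selected[:-n_test], selected[-n_test:], allowed

rows = [{"age": 20 + i, "income": 1000 * i, "zip": "00000", "churned": i % 2}
        for i in range(10)]
choices = HumanChoices(
    features=("age", "income"),
    target="churned",
    blocked_algorithms=frozenset({"neural_net"}),
)
train_rows, test_rows, allowed = apply_choices(
    rows, ["linear", "decision_tree", "neural_net"], choices
)
print(len(train_rows), len(test_rows), allowed)
```

The automated part of the search then only ever sees the columns, algorithms, and split that a person signed off on.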

While imperfect, I'm impressed with where AutoML is today. It provides a guided, no-code way to automatically try different methods to optimize selected performance metrics, and includes useful explanatory tools to interact with developed models.

The Exciting Future

I can see AutoML rapidly developing to encompass more models than a data scientist could feasibly run on any single project. New methods and techniques come out every month, and it is impossible for a single data scientist or team to keep up. But an AutoML tool's library of available methods can always carry the latest updates.

These models will execute faster than any data scientist could run them, thanks to highly parallel computing on clusters of servers. The biggest advance I see coming is an ever-improving ability to explain model output to non-expert audiences through visual aids, graphs, and explanations.

The ability to encode the collective expertise of mathematicians, algorithm experts, UX designers, and storytellers into a few clicks will yield an incredibly powerful tool for turning data into insight and prediction.

The Disconcerting Part

The question becomes: where does this leave the army of data scientists, predictive modelers, and machine learning enablers created over the last decade?

I won't predict that AI will take the jobs of its creators. But there will be an impact, and I think a good one. Historically, AI training has put far too much emphasis on technique and coding. Machine learning is, like everything before it, all about telling a compelling story. AutoML can now include the talents and creativity of non-coders in telling that story.

The Reality Check

Thinking that some digital robot will be running these models in the future is a mistake. Companies are unlikely to:
- Readily outsource their data in today's digital security environment
- Rely on a black box creating more black boxes

We will need trained and talented humans running AutoML, making sense of the output, and—most importantly—retelling the story they learn to the relevant human audience and making connections in ways only a human can.

AutoML has the potential to take the drudgery out of production while being more inclusive of different talents—just like the machines of the industrial revolution.


Originally posted on LinkedIn on November 14, 2020.
