Introducing Amazon SageMaker Autopilot

12.03.2019

Amazon SageMaker Autopilot is now generally available. Given your tabular data and the target column you specify, Amazon SageMaker automatically trains and tunes a model while giving you full visibility into the process. As the name suggests, you can run it on autopilot and deploy the most accurate model with one click in Amazon SageMaker Studio, or use it as a guide to decision making, trading accuracy off against constraints such as latency or model size.
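As a rough illustration of the "tabular data plus target column" input described above, the sketch below assembles a request for the `create_auto_ml_job` operation of the boto3 SageMaker client. The bucket paths, job name, role ARN, and helper function names are hypothetical placeholders, not values from this announcement, and the optional settings shown are only one plausible configuration.

```python
from typing import Any, Dict, Optional


def build_automl_job_request(
    job_name: str,
    train_s3_uri: str,
    target_column: str,
    output_s3_uri: str,
    role_arn: str,
    problem_type: Optional[str] = None,
    objective_metric: Optional[str] = None,
    max_candidates: Optional[int] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the SageMaker create_auto_ml_job call."""
    if (problem_type is None) != (objective_metric is None):
        # The API expects the problem type and the objective metric together.
        raise ValueError("Set problem_type and objective_metric together, or neither.")

    request: Dict[str, Any] = {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                # Tabular training data stored under an S3 prefix.
                "DataSource": {
                    "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": train_s3_uri}
                },
                # The column the model should learn to predict.
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "RoleArn": role_arn,
    }
    if problem_type is not None:
        # For example "BinaryClassification" with "F1", or "Regression" with "MSE".
        request["ProblemType"] = problem_type
        request["AutoMLJobObjective"] = {"MetricName": objective_metric}
    if max_candidates is not None:
        # Cap how many candidate models the job explores.
        request["AutoMLJobConfig"] = {
            "CompletionCriteria": {"MaxCandidates": max_candidates}
        }
    return request


def start_automl_job(request: Dict[str, Any], region_name: str = "us-east-1") -> str:
    """Submit the request; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so building a request does not require boto3

    client = boto3.client("sagemaker", region_name=region_name)
    return client.create_auto_ml_job(**request)["AutoMLJobArn"]
```

Keeping request construction separate from submission makes it possible to inspect or log the configuration before any job is started. If the problem type and objective are left unset, the request simply omits them.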

It is often difficult to determine which ML algorithm will work best for a given dataset, and harder still to find appropriate algorithm parameters. You also need to clean or pre-process the data in order to build good ML models. This work is time-consuming and sometimes requires advanced machine learning skills. As a result, teams take shortcuts, such as using a dataset as is instead of cleaning or pre-processing it, and end up using the algorithm that is easiest to use rather than the right one for the problem at hand. Businesses then struggle to reach the desired model quality. Even data scientists with comprehensive ML knowledge spend a lot of time experimenting with different ML models before finding the best one for a particular problem, especially for applications such as ad serving or IoT that have model size and latency constraints.

Amazon SageMaker Autopilot simplifies this entire process, making machine learning easier, faster, and more transparent. You can now build classification and regression models without deep machine learning knowledge: provide a tabular dataset and select the target column to predict, and SageMaker Autopilot automatically explores different combinations of data preprocessors, algorithms, and algorithm parameter settings to find the most accurate model. Instead of requiring you to decide which algorithm to use, SageMaker Autopilot evaluates a list of high-performing algorithms that it natively supports and selects the one that works best for your data. It also tries different parameter settings on those algorithms to get the best model quality. You do not need to handle data cleaning and preprocessing yourself, since SageMaker Autopilot applies different types of data preprocessors to the data before passing it to the algorithms for training. You can then deploy the best model to production with one click, or evaluate multiple candidates to trade off metrics such as accuracy, latency, and model size.
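The trade-off among candidates can be pictured with a small, self-contained sketch: keep only the candidates that fit a latency and size budget, then take the most accurate of those. The `CandidateStats` record, its fields, and the numbers in the usage note are hypothetical; the sketch assumes you have measured accuracy, latency, and model size for each candidate yourself, and it is not a description of how SageMaker Autopilot ranks models internally.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CandidateStats:
    """Hypothetical per-candidate measurements collected by the user."""
    name: str
    accuracy: float    # higher is better
    latency_ms: float  # measured inference latency
    size_mb: float     # size of the model artifact


def pick_candidate(
    candidates: Iterable[CandidateStats],
    max_latency_ms: float,
    max_size_mb: float,
) -> Optional[CandidateStats]:
    """Return the most accurate candidate within both budgets, or None."""
    feasible = [
        c for c in candidates
        if c.latency_ms <= max_latency_ms and c.size_mb <= max_size_mb
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.accuracy)
```

For instance, with made-up candidates at accuracies 0.95, 0.93, and 0.90, a tight latency budget may rule out the 0.95 model and leave the 0.93 model as the choice, which is the kind of decision a latency-sensitive application such as ad serving would face.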

Amazon SageMaker Autopilot is now available in the US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), EU (Paris), and EU (Stockholm) AWS Regions. Visit the documentation page for more information on SageMaker Autopilot, and read the blog post to learn how to use SageMaker Autopilot for your model creation tasks.

aws.amazon.com