AutoGluon is an automated machine learning (AutoML) framework and multimodal library that automates the end-to-end pipeline from data preprocessing through model training and validation, with an emphasis on high accuracy. It trains models on tabular, image, text, and time series data, including time series forecasting, and supports finetuning foundation models.
The project's distinguishing feature is that it can jointly process and fuse different data types, building multimodal neural networks that combine images, text, and structured tables in a single model. It supports zero-shot inference and forecasting with pretrained foundation models, along with parameter-efficient finetuning techniques that adapt large models to specific tasks.
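As a rough illustration of the multimodal fusion described above, the sketch below trains one predictor on a table that mixes an image-path column, a free-text column, and a numeric column. The column names, file paths, and the `adopted` label are hypothetical placeholders, not taken from the project; the image paths would need to point at real files for the image column to be recognized. Both `pandas` and `autogluon.multimodal` are imported lazily, and the call to `fit` is a minimal usage under default settings rather than a recommended configuration.

```python
from typing import Any


def build_multimodal_rows() -> list[dict[str, Any]]:
    """Return hypothetical toy records mixing an image path, text, and a number."""
    return [
        {"image": "images/pet_01.jpg", "description": "playful tabby kitten",
         "age_months": 4, "adopted": 1},
        {"image": "images/pet_02.jpg", "description": "shy senior terrier",
         "age_months": 110, "adopted": 0},
        {"image": "images/pet_03.jpg", "description": "energetic young collie",
         "age_months": 14, "adopted": 1},
    ]


def train_multimodal(rows: list[dict[str, Any]], label: str = "adopted",
                     time_limit: int = 300):
    """Fit a single predictor on image, text, and numeric columns together."""
    # Imported lazily so the helpers above work without AutoGluon installed.
    import pandas as pd
    from autogluon.multimodal import MultiModalPredictor

    train_df = pd.DataFrame(rows)
    predictor = MultiModalPredictor(label=label)
    # Column types are inferred from the data; time_limit is in seconds.
    predictor.fit(train_data=train_df, time_limit=time_limit)
    return predictor
```

In practice the training table would hold far more than three rows; the toy records only show the expected shape of the input, one row per example with one column per modality.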
Its broader capabilities include automated model selection and ensembling via bagging and stacking, as well as comprehensive computer vision pipelines for object detection and semantic segmentation. The framework also covers probabilistic time series forecasting, named entity recognition for natural language processing, and semantic search based on embedding extraction.
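The bagging and stacking mentioned above can be requested explicitly when fitting a tabular predictor. The sketch below is a minimal example, assuming the `num_bag_folds`, `num_stack_levels`, and `time_limit` arguments of `TabularPredictor.fit`; the helper names, default values, and the validation rule are illustrative choices made here, not part of the library. The training frame is assumed to be a pandas DataFrame supplied by the caller.

```python
from typing import Any


def ensemble_fit_kwargs(num_bag_folds: int = 5, num_stack_levels: int = 1,
                        time_limit: int = 120) -> dict[str, Any]:
    """Collect the fit() arguments that control bagging and stacking."""
    # Assumption: stacked layers are trained on out-of-fold predictions,
    # so stacking only makes sense when bagging uses at least 2 folds.
    if num_stack_levels > 0 and num_bag_folds < 2:
        raise ValueError("stacking requires bagging with at least 2 folds")
    return {
        "num_bag_folds": num_bag_folds,        # k-fold bagging of each model
        "num_stack_levels": num_stack_levels,  # number of stacked layers
        "time_limit": time_limit,              # training budget in seconds
    }


def train_stacked_predictor(train_df, label: str, **overrides: int):
    """Fit a TabularPredictor with bagging and stacking enabled."""
    # Imported lazily so the helper above works without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    predictor = TabularPredictor(label=label)
    predictor.fit(train_df, **ensemble_fit_kwargs(**overrides))
    return predictor
```

Calling `train_stacked_predictor(train_df, label="target")` with a hypothetical `target` column would train bagged base models plus one stacked layer; `predictor.leaderboard()` can then be used to compare the individual and ensembled models.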
The system provides utilities for deploying trained predictors as cloud endpoints or serverless functions, and supports accelerated inference through ONNX and TensorRT.