Data Formulator is an automated data analysis and visualization platform that uses large language models to interpret natural language instructions for data preparation and reporting. It functions as an interactive workbench where users can clean, filter, and aggregate datasets while simultaneously generating visual representations. By combining conversational interfaces with automated transformation tools, the system enables users to explore data patterns and refine schemas without manual coding.
The platform distinguishes itself through an agentic architecture that translates natural language queries into executable data transformation scripts. It maintains a reactive pipeline that links data cleaning operations directly to visualization rendering, so every modification to the underlying data triggers an immediate visual update. The system also supports structured data extraction, using specialized parsing models to convert unstructured inputs such as images, text, and web content into normalized tabular formats.
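The reactive link between cleaning steps and rendering can be illustrated with a minimal TypeScript sketch. Every name in it (Row, Transform, ReactivePipeline, subscribe, addTransform) is hypothetical and chosen for illustration only; none is taken from Data Formulator's actual code. The sketch assumes a plain observer pattern: each added transformation re-runs the chain on the original rows and pushes the result to every subscribed renderer.

```typescript
// Hypothetical types for illustration; not Data Formulator's real API.
type Row = Record<string, string | number | null>;
type Transform = (rows: Row[]) => Row[];
type Listener = (rows: Row[]) => void;

class ReactivePipeline {
  private readonly source: Row[];
  private transforms: Transform[] = [];
  private listeners: Listener[] = [];

  constructor(source: Row[]) {
    this.source = source;
  }

  // Register a renderer (or any observer) and send it the current rows.
  subscribe(listener: Listener): void {
    this.listeners.push(listener);
    listener(this.run());
  }

  // Append a cleaning, filtering, or aggregation step, then notify listeners.
  addTransform(transform: Transform): void {
    this.transforms.push(transform);
    const result = this.run();
    for (const listener of this.listeners) {
      listener(result);
    }
  }

  // Re-apply every step, in order, starting from the original source rows.
  private run(): Row[] {
    return this.transforms.reduce((rows, step) => step(rows), this.source);
  }
}

// Usage: a stand-in "renderer" that records each dataset it is asked to draw.
const pipeline = new ReactivePipeline([
  { region: "north", sales: 120 },
  { region: "south", sales: null },
  { region: "north", sales: 80 },
]);

const renders: Row[][] = [];
pipeline.subscribe((rows) => renders.push(rows));

// Dropping rows with missing sales triggers a second render automatically.
pipeline.addTransform((rows) => rows.filter((r) => r.sales !== null));
```

Recomputing from the source on every change keeps the sketch simple. A real system would more likely cache intermediate results or update incrementally.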
Beyond its core analysis capabilities, the platform provides a sandboxed environment for secure code execution and can serialize session state to persist interaction history. Users can connect to various data sources, including local files and cloud storage, to ingest data for iterative exploration. The project is distributed as a TypeScript-based tool, offering both a conversational interface and command-line automation for managing analysis workflows.
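Session persistence of the kind described above can be sketched as a JSON round trip. The session shape (Session, SessionStep) and the function names here are assumptions made for illustration; the platform's real on-disk format and API are not described in this overview.

```typescript
// Hypothetical session shape; the real format is not documented here.
interface SessionStep {
  instruction: string;   // the natural language request
  generatedCode: string; // the transformation script produced for it
}

interface Session {
  version: number;
  tableNames: string[];
  history: SessionStep[];
}

// Serialize the whole session to a JSON string that can be written to disk.
function serializeSession(session: Session): string {
  return JSON.stringify(session);
}

// Parse a saved session and check its top-level shape before trusting it.
function deserializeSession(text: string): Session {
  const data: unknown = JSON.parse(text);
  if (typeof data !== "object" || data === null) {
    throw new Error("Session must be a JSON object");
  }
  const candidate = data as Partial<Session>;
  if (
    typeof candidate.version !== "number" ||
    !Array.isArray(candidate.tableNames) ||
    !Array.isArray(candidate.history)
  ) {
    throw new Error("Malformed session");
  }
  return candidate as Session;
}

// Usage: save a one-step session and restore it.
const saved = serializeSession({
  version: 1,
  tableNames: ["sales"],
  history: [
    { instruction: "drop rows with missing sales", generatedCode: "/* ... */" },
  ],
});
const restored = deserializeSession(saved);
```

The validation is intentionally shallow: it checks only the top-level fields, so a production loader would also need to verify each history entry.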