Instructor is a schema enforcement and validation library designed to transform language model outputs into structured, type-safe data formats. It functions as a validation layer that uses Pydantic to ensure model responses conform to specific data models, acting as a tool for forcing large language models to return data in predefined schemas.
The project differentiates itself through a recursive error-feedback loop that automatically retries requests when structural errors occur, passing validation failure messages back to the model to guide corrections. It also includes a streaming parser capable of processing partial fragments of structured objects in real time as they are generated.
The library covers broad capabilities for structured data extraction, including the parsing of complex hierarchical information and nested structures into machine-readable formats. It utilizes prompt injection to translate type definitions into schema instructions and provides a type-safe wrapper interface to map raw responses directly into typed objects.