Marker is an LLM-powered document parser and OCR pipeline designed to convert PDFs and unstructured files into structured markdown, JSON, and HTML. It functions as a data preprocessor that transforms complex documents into machine-readable formats while preserving tables, equations, and layout structures.
The system utilizes large language models to refine OCR accuracy, clean mathematical notation, and merge fragmented tables across multiple pages. It employs model-based layout analysis to predict block types and bounding boxes, ensuring a more precise conversion of document elements.
Capabilities include extracting images and structured data based on predefined schemas, as well as chunking documents for retrieval augmented generation pipelines. The project supports high-volume processing by distributing conversion tasks across multiple GPUs.