Layout-parser is a deep learning framework for document layout analysis and image analysis. It provides a toolkit for extracting structural information and layout patterns from scanned documents and digital images, and for converting them into programmatic data structures for automated analysis.
The main features of layout-parser/layout-parser are: Document Layout Analysis, Layout Parsing Toolkits, Document Analysis Models, Document Structure Analysis, Hierarchical Representations, Tabular Grid Mapping, Document Region Detectors, Table Structure Reconstructions.
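To make "programmatic data structures" and "hierarchical representations" concrete, here is a minimal sketch of how detected page regions might be modeled, ordered, and nested. It uses only the Python standard library. The `Region` class and the `reading_order` and `children_of` helpers are hypothetical names chosen for illustration and are not layout-parser's API. The coordinates are made-up sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular region detected on a page (hypothetical type for illustration)."""
    x1: float
    y1: float
    x2: float
    y2: float
    kind: str  # e.g. "Title", "Text", "Table"

    def contains(self, other: "Region") -> bool:
        # True when `other` lies fully inside this region's bounding box
        return (self.x1 <= other.x1 and self.y1 <= other.y1
                and self.x2 >= other.x2 and self.y2 >= other.y2)


def reading_order(regions):
    # Sort top-to-bottom, then left-to-right.
    # Simplification: this ignores multi-column layouts.
    return sorted(regions, key=lambda r: (r.y1, r.x1))


def children_of(parent, regions):
    # Regions nested inside `parent`, which gives a one-level hierarchy
    return [r for r in regions if r is not parent and parent.contains(r)]


# Made-up sample page: one title, one paragraph, one table with one cell
page = [
    Region(50, 300, 550, 500, "Table"),
    Region(50, 40, 550, 80, "Title"),
    Region(60, 310, 200, 340, "Text"),
    Region(50, 100, 550, 280, "Text"),
]

kinds = [r.kind for r in reading_order(page)]
table_cells = children_of(page[0], page)
print(kinds)
print(table_cells)
```

A real detector would also attach confidence scores and recognized text to each region. The point here is only that once regions are plain objects, ordering and nesting become ordinary list operations.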
Open-source alternatives to layout-parser/layout-parser include:
kreuzberg-dev/kreuzberg: Kreuzberg is a document extraction engine that converts PDFs, Office files, images, and over 90 other formats into…
pymupdf/pymupdf: PyMuPDF is a comprehensive PDF manipulation library and document analysis tool. It serves as a text extraction tool,…
grobidorg/grobid: Grobid is a machine learning system designed to transform academic and scientific PDF publications into structured…
oomol-lab/pdf-craft: pdf-craft is an OCR-based document parser and structure extractor designed to convert PDF files into structured data,…
funstory-ai/babeldoc: BabelDOC is a technical document translation system designed to translate PDF files while preserving their original…
bytedance/dolphin: Dolphin is a multimodal layout analyzer and image-to-structure converter that transforms photographed or digital…
Kreuzberg is a document extraction engine that converts PDFs, Office files, images, and over 90 other formats into clean, structured text and metadata. It is built around a compiled Rust core that can be used as a native library, a command-line tool, a REST API server, or a WebAssembly module for browser-based processing. The system is designed to run entirely on self-hosted infrastructure, with no data leaving the user's environment. What distinguishes Kreuzberg is its breadth of integration surfaces and its pipeline architecture. It exposes extraction capabilities through native bindings fo
Grobid is a machine learning system designed to transform academic and scientific PDF publications into structured XML. It functions as a PDF to XML parser and scholarly metadata extractor, identifying and normalizing titles, authors, affiliations, and bibliographic references from research papers. The system utilizes a deep learning document segmenter to divide raw PDFs into functional regions and employs a bibliographic reference resolver to match citations against external registries for metadata enrichment and DOI resolution. It supports a full machine learning model training pipeline, al
BabelDOC is a technical document translation system designed to translate PDF files while preserving their original layout and styling. It functions as a layout-preserving translator that utilizes large language models to convert content into target languages, specifically tailored for scientific and technical documents. The system distinguishes itself through specialized handling of academic content, including the identification and preservation of mathematical formulas and complex layout structures. It ensures technical accuracy by employing glossary-driven terminology enforcement, using so
pdf-craft is an OCR-based document parser and structure extractor designed to convert PDF files into structured data, Markdown, or EPUB ebooks. It utilizes optical character recognition and statistical analysis to identify document hierarchies and extract text and structured content. The system features specialized rendering for mathematical formulas and tables, using heuristic reconstruction to convert tabular data into digital formats. It includes a document structure extractor that builds tables of contents by analyzing font sizes, linguistic patterns, and language model title detection.
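The font-size part of that table-of-contents step can be sketched as follows. This is an illustrative simplification and not pdf-craft's code: it assumes each extracted line arrives as a `(text, font_size)` pair, treats the most frequent size as body text, and ranks larger sizes as heading levels. The `build_toc` function and the sample lines are hypothetical, and the linguistic-pattern and language-model signals mentioned above are left out.

```python
from collections import Counter


def build_toc(lines):
    """Infer heading levels from font sizes.

    lines: list of (text, font_size) pairs in document order.
    Returns a list of (level, text) pairs, where level 1 is the largest font.
    """
    # Assume the most frequent font size is body text
    body_size = Counter(size for _, size in lines).most_common(1)[0][0]
    # Every distinct size larger than body text is a heading size
    heading_sizes = sorted({size for _, size in lines if size > body_size},
                           reverse=True)
    level_of = {size: i + 1 for i, size in enumerate(heading_sizes)}
    return [(level_of[size], text) for text, size in lines if size in level_of]


# Made-up sample input
lines = [
    ("Introduction", 18.0),
    ("Body paragraph one.", 10.0),
    ("Background", 14.0),
    ("Body paragraph two.", 10.0),
    ("Body paragraph three.", 10.0),
    ("Methods", 18.0),
]

toc = build_toc(lines)
for level, text in toc:
    print("  " * (level - 1) + text)
```

Font size alone misclassifies captions, pull quotes, and bold same-size headings, which is presumably why a tool would combine it with other signals.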