3 repos
Tools that transform diverse document formats into unified, machine-readable text or structured data.
Explore 3 awesome GitHub repositories matching data & databases · Document Format Converters. Refine with filters or upvote what's useful.
This project is an AI-powered document processing engine designed to transform diverse file formats into structured Markdown. By leveraging multimodal language models, it performs complex layout analysis and semantic text extraction, allowing for the conversion of both unstructured files and scanned images into machine
This project is a comprehensive educational repository designed to help developers master the core mechanics, runtime behaviors, and browser-native capabilities of the JavaScript language. It provides a structured knowledge base that covers fundamental language features, such as prototype-based inheritance and event-lo
Docling is a modular framework designed for document parsing, layout analysis, and structured data extraction. It transforms unstructured files and web content into a unified, hierarchical data model that preserves the spatial and semantic relationships between text, tables, images, and layout elements. By normalizing