Algorithms that apply geometric heuristics and spatial analysis to reassemble fragmented text blocks into coherent document structures.
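As a rough illustration of the kind of geometric heuristic this category covers, the sketch below groups text blocks into rows by vertical overlap and then orders each row left to right. It is a minimal example under stated assumptions, not the method of any listed repository: the `Block` type, the `reading_order` and `reassemble` functions, and the `y_tolerance` threshold are all hypothetical names chosen for illustration, and the sketch assumes a single-column page with coordinates whose y axis grows downward.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A text fragment with its bounding box (hypothetical type for this sketch)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def reading_order(blocks, y_tolerance=0.5):
    """Group blocks into rows by vertical overlap, then sort each row by x0.

    Two blocks share a row when their vertical overlap covers at least
    `y_tolerance` of the shorter block's height (an assumed threshold).
    """
    rows = []
    for block in sorted(blocks, key=lambda b: (b.y0, b.x0)):
        for row in rows:
            anchor = row[0]  # compare against the first block placed in the row
            overlap = min(anchor.y1, block.y1) - max(anchor.y0, block.y0)
            shorter = min(anchor.y1 - anchor.y0, block.y1 - block.y0)
            if shorter > 0 and overlap / shorter >= y_tolerance:
                row.append(block)
                break
        else:
            # No existing row overlaps enough, so start a new one.
            rows.append([block])
    return [sorted(row, key=lambda b: b.x0) for row in rows]


def reassemble(blocks):
    """Join the ordered rows into plain text, one row per line."""
    return "\n".join(
        " ".join(b.text for b in row) for row in reading_order(blocks)
    )


if __name__ == "__main__":
    fragments = [
        Block("world", 60, 10, 100, 20),
        Block("Hello", 10, 11, 50, 21),
        Block("Second line", 10, 30, 100, 40),
    ]
    print(reassemble(fragments))
```

Multi-column pages, rotated text, and tables need more than this row-grouping heuristic; those cases are where layout models or column detection typically come in.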
MinerU is a document parsing pipeline that transforms unstructured files into machine-readable, structured data. It uses deep learning models for layout analysis, identifying document regions and extracting complex content such as mathematical expressions. By combining these neural network inferences w