Tools that convert source documents into structured intermediate representations.
Distinguishing note: Focuses on the internal tree representation used for document transformation.
Pandoc is a universal document converter that translates content between a wide range of markup and binary formats. It parses each input document into a single intermediate abstract syntax tree (AST), and every output format is generated from that tree, so manipulation and transformation behave consistently across output types. Its modular reader-writer pipeline decouples input parsing from output generation, which allows granular control over document structure. Users can manipulate the intermediate tree programmatically through a filter system, supporting both ex
Provides a unified intermediate tree structure for consistent document manipulation and transformation.
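
To make the idea of filtering an intermediate tree concrete, here is a minimal sketch in Python. It assumes a small hand-written document in the shape of Pandoc's JSON AST serialization (nodes tagged with "t" for type and "c" for contents). The helper names walk and shout are hypothetical, the API version number is only an illustrative value, and the code does not call Pandoc itself.

```python
import json


def walk(node, fn):
    """Rebuild a JSON tree bottom-up, applying fn to every tagged node.

    Hypothetical helper: a node counts as "tagged" when it is a dict
    with a "t" key, which is how the JSON form labels element types.
    """
    if isinstance(node, list):
        return [walk(child, fn) for child in node]
    if isinstance(node, dict):
        rewritten = {key: walk(value, fn) for key, value in node.items()}
        return fn(rewritten) if "t" in rewritten else rewritten
    return node  # strings, numbers, and other leaves pass through unchanged


def shout(node):
    """Example transformation: upper-case the text of every Str node."""
    if node["t"] == "Str":
        return {"t": "Str", "c": node["c"].upper()}
    return node


# Hand-written stand-in for a parsed document: one paragraph
# containing "hello", a space, and an emphasized "world".
doc = {
    "pandoc-api-version": [1, 23],  # illustrative value only
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "hello"},
            {"t": "Space"},
            {"t": "Emph", "c": [{"t": "Str", "c": "world"}]},
        ]}
    ],
}

filtered = walk(doc, shout)
print(json.dumps(filtered["blocks"], indent=2))
```

Because the transformation only touches the tree, the same function would apply no matter which reader produced the document or which writer consumes it. In a real setup, a script like this would typically read the JSON tree from standard input and write the modified tree to standard output so that Pandoc can run it between the reader and the writer.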