30 open-source projects similar to martinblech/xmltodict, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best xmltodict alternative.
This project is a Node.js library for bidirectional conversion between XML strings and JavaScript objects. It functions as an XML parser that transforms XML content into structured data and an XML serializer that generates formatted strings from JavaScript data objects. The toolkit includes a data transformer that applies custom processing functions to tags and attributes during the conversion process. It manages XML namespaces and supports the definition of custom root elements to maintain document structure during generation. The system handles XML data parsing, string generation, and name
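The bidirectional conversion described in this entry can be illustrated with a minimal Python sketch built on the standard library's `xml.etree.ElementTree`. This is not the Node.js library's API. The function names `element_to_dict` and `dict_to_element`, and the `@` and `#text` key conventions, are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an Element into nested dicts (hypothetical helper).

    Attributes are stored under '@name' keys, repeated child tags
    become lists, and a leaf element collapses to its text.
    """
    node = {"@" + key: value for key, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated tag: promote the existing entry to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text
    if text:
        node["#text"] = text
    return node


def dict_to_element(tag, value):
    """Rebuild an Element from the structure produced above (hypothetical helper)."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, item in value.items():
            if key.startswith("@"):
                elem.set(key[1:], item)
            elif key == "#text":
                elem.text = item
            else:
                entries = item if isinstance(item, list) else [item]
                for entry in entries:
                    elem.append(dict_to_element(key, entry))
    else:
        elem.text = str(value)
    return elem
```

Parsing with `element_to_dict(ET.fromstring(xml))` and serializing with `ET.tostring(dict_to_element("note", data), encoding="unicode")` round-trips simple documents. Namespaces, mixed content, and custom tag or attribute processors are deliberately left out of this sketch.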
This repository contains the HTML specification, which defines the core standards for web page structuring, content organization, and document rendering. It establishes the fundamental algorithms for state-machine-based tokenization, tree construction for the document object model, and origin-based security isolation. The specification provides a framework for defining custom elements with independent lifecycles and registries. It also details the requirements for cross-document communication, session history management, and the synchronization of interface properties with content attributes.
htmlparser2 is a collection of tools for high-performance markup parsing, DOM manipulation, and incremental stream processing. It functions as an HTML and XML parser that converts markup strings into structured object trees, alongside a streaming markup parser designed for memory-efficient processing of large documents. The project includes a DOM manipulation library for querying, modifying, and serializing document object model trees. It also provides a web feed parser to extract structured metadata and entries from RSS, RDF, and Atom feeds. The library covers broad capabilities in data par
pugixml is a lightweight C++ XML parser and DOM-based library used for parsing, manipulating, and saving XML documents. It provides a portable toolset for reading XML data from files, strings, or memory buffers and converting them into an in-memory document object model. The library includes a dedicated XPath 1.0 engine for extracting specific nodes and data through path expressions. It distinguishes itself through customizable memory management, allowing heap operations to be redirected to user-defined allocation functions, and the ability to perform in-place buffer parsing to reduce memory
bpmn-js is a browser-based BPMN 2.0 web modeler and rendering engine used for creating, editing, and visualizing business process models. It functions as an XML process modeler that parses BPMN 2.0 XML data into interactive visual diagrams within a web application. The project distinguishes itself as a business process visualizer with capabilities for process flow simulation, which tracks token movement to mimic real-time execution. It also supports diagram version comparison to identify changes between model iterations and provides a layered overlay interface for binding metadata and custom
docx is a JavaScript and TypeScript library for the programmatic generation and manipulation of Word documents. It serves as an OOXML document generator, allowing developers to create formatted office files through code instead of manual editing. The library enables document automation across both Node.js and web browser environments. It supports client-side document export, allowing users to generate and download files directly in the browser without a backend server. Capabilities include the ability to define page layouts, margins, and orientation. Users can programmatically insert documen
Open-XML-SDK is a library for programmatically creating, modifying, and validating Office documents based on the Open XML standard. It functions as an office file generator and XML document parser, enabling the manipulation of word processing, spreadsheet, and presentation files. The library allows for the generation and updating of document content and structure without requiring the native office applications to be installed. It utilizes strongly typed classes and a schema-validated approach to ensure that created files remain compatible and correctly structured. The project provides capab
python-docx is an OOXML document manipulation library used for creating, reading, and updating Microsoft Word files. It functions as a generator for building formatted documents and a parser for extracting text, metadata, and structural elements from existing files. The project provides a comprehensive style management system for defining and applying character, paragraph, and table styles within OpenXML documents. It allows for the programmatic control of document appearance through an object-oriented approach to the underlying XML schema. Capabilities cover a wide range of document generat
SpringSide 4 is an enterprise Java reference architecture and utility library built on the Spring Framework. It provides a pragmatic, best-practice application stack for building RESTful web services, web applications, and data access layers, along with a curated collection of high-performance utility classes for common operations like text, date, collection, reflection, concurrency, and I/O handling. The project distinguishes itself by combining a complete reference application scaffold with production-oriented infrastructure. It includes a JPA-based data access layer that automatically tran
Freeplane is a Java-based mind mapping application and knowledge management system used to create hierarchical visual maps and interconnect ideas. It serves as a visual information organizer that transforms text-based notes into navigable spatial maps to facilitate non-linear thinking processes. The application features a Swing-based visual canvas for rendering interactive concept maps and complex node-based layouts. It utilizes an XML-based document organizer to serialize map structures and node attributes into hierarchical files for persistent storage. The tool covers several core capability
htmlq is a suite of command-line utilities for querying and extracting data from HTML documents using CSS selectors. It functions as a query language tool for HTML structures and attributes, providing a way to retrieve specific information from documents via the terminal. The tool provides capabilities for extracting text content, specific HTML attributes, and document fragments. It includes an HTML document formatter for cleaning and reformatting output with consistent indentation, as well as utilities for stripping tags to isolate plain text. The software handles structural HTML processing
This library is a PHP source code tokenizer and static analysis tool that converts raw PHP code into discrete tokens and structured XML representations. It functions as a serializer that transforms token streams into a machine-readable format for programmatic analysis and source tree manipulation. The project uses stream-based XML serialization and fragment-based buffer writing to maintain low memory overhead when processing large files. It allows for custom XML namespace configuration to ensure schema compatibility and avoid naming collisions during the transformation process. The toolkit c
Nokogiri is an XML and HTML parsing library that builds navigable document trees from strings, files, or URLs using native C parsers for speed and standards compliance. It provides a CSS selector engine that translates CSS3 selectors into XPath expressions for querying nodes, an XPath query interface with namespace support, a document manipulation toolkit for modifying parsed documents, XSD schema validation, and XSLT transformation capabilities. The library wraps libxml2 and libxslt C libraries with Ruby bindings for high-performance parsing, and integrates Google's Gumbo parser for standard
xlwings - Make Excel fly with Python!
Tablib is a Python library designed for importing, exporting, and manipulating tabular datasets. It functions as a multi-format data converter and manager, allowing users to move information between different file standards. The library supports data transformation across CSV, JSON, YAML, and Excel formats. It provides a programmatic interface to manage these datasets by adding rows, filtering columns, and segregating records. The system uses a common internal representation and adapter-based mapping to normalize diverse input sources. This allows for consistent reading and writing routines
A Python library to extract tabular data from PDFs
The OWASP Cheat Sheet Series is a comprehensive, community-driven repository of concise security best practices and defensive coding patterns. It serves as a centralized knowledge base for developers and security professionals, providing actionable guidance to secure applications across the entire software development lifecycle. The project covers a vast array of security domains, ranging from fundamental web application hardening and authentication protocols to specialized controls for modern infrastructure and artificial intelligence systems. What distinguishes this project is its decentral
This project is a PHP archive distribution system designed for hosting and deploying versioned releases. It functions as an installation tool and distribution manager that handles the downloading of archives from remote repositories or URLs into local project directories. The system ensures installation security by acting as an integrity verifier, validating archives using hash checksums and digital signatures. It includes a version resolver that matches requested constraints against available releases to ensure environment consistency. The project manages resource organization through XML m
Spout is a spreadsheet file processing library and multi-format generator designed for reading and writing CSV, XLSX, and ODS files. It functions as a stream-based parser that processes large spreadsheet files incrementally to avoid loading entire documents into memory. The library provides capabilities for programmatic spreadsheet generation and data extraction. It supports custom content styling, allowing for the application of fonts, backgrounds, borders, and number formats to individual cells or rows. Beyond basic file input and output, the project covers workbook manipulation through se
This is a Go library for reading and writing XLSX files, providing a toolkit for spreadsheet generation and data extraction. It functions as an Office Open XML parser and generator, enabling the creation of workbooks with support for styles, formulas, and metadata. The project features a data mapper that uses Go struct tags and reflection to automatically align spreadsheet rows with structured data. It also includes a validation engine for defining input constraints, such as dropdown lists and error alerts, to control user data entry. The library covers a broad range of capabilities, includi
CherryTree is a hierarchical note-taking application and rich text document editor designed for organizing information within a nested tree of nodes. It functions as a code-integrated documentation tool and an encrypted knowledge base, utilizing SQLite or XML for local data storage. The project distinguishes itself by integrating developer-centric capabilities, such as syntax highlighting and the ability to execute code blocks directly via an external terminal. It also provides password-protected encryption to secure stored data and prevent unauthorized access to the information tree. The so
drawio is a web-based diagramming tool and cross-platform visual designer used for creating flowcharts, network maps, and technical schemas. It functions as a vector graphics editor and an XML-based diagramming engine that allows users to design and export scalable graphics. The software supports a wide range of technical design tasks, including infrastructure mapping for server layouts and the creation of visual aids for technical documentation. It enables the import of diagram files from other tools to facilitate cross-tool migration.
SVGKit is a graphics framework for the iOS and macOS ecosystems designed for rendering high-performance scalable vector graphics. It functions as a library that utilizes native hardware acceleration to display and interact with vector graphics on Apple platforms. The project provides a programmatic interface for editing vector elements and writing updated files back to disk. It also includes tools to convert vector graphics into rasterized bitmap image formats for use in standard image views. The framework handles the translation of XML-based documents into a hierarchy of hardware-accelerate
This project is a PHP implementation of a CSS selector engine that transforms CSS selector strings into compatible XPath expressions for locating elements within documents. It serves as a converter and expression generator that maps CSS selection logic to the XPath query language. The library processes selectors through a pipeline involving lexer-based tokenization and recursive descent parsing to create an abstract syntax tree. It utilizes pattern-matching logic to handle child and sibling relationships, translating CSS pseudo-classes and selectors into functional XPath logic. These capabil
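The idea of mapping CSS selectors onto XPath can be sketched in Python for a very small selector subset. This is a simplified, regex-based stand-in, not the PHP library's lexer, parser, or syntax tree. It assumes only type, `#id`, and `.class` selectors joined by descendant (space) or child (`>`) combinators. The names `compound_to_xpath` and `css_to_xpath` are hypothetical.

```python
import re


def compound_to_xpath(compound):
    """Translate one compound selector such as 'p.note#x' (hypothetical helper)."""
    match = re.fullmatch(r"([A-Za-z][\w-]*|\*)?((?:[#.][\w-]+)*)", compound)
    if match is None or not compound:
        raise ValueError(f"unsupported selector part: {compound!r}")
    tag = match.group(1) or "*"
    predicates = []
    for kind, name in re.findall(r"([#.])([\w-]+)", match.group(2)):
        if kind == "#":
            predicates.append(f"[@id='{name}']")
        else:
            # Match a whole class token inside a space-separated class attribute.
            predicates.append(
                f"[contains(concat(' ', normalize-space(@class), ' '), ' {name} ')]"
            )
    return tag + "".join(predicates)


def css_to_xpath(selector):
    """Translate a selector made of compounds joined by ' ' or '>'."""
    tokens = re.findall(r"\s*>\s*|\s+|[^\s>]+", selector.strip())
    xpath = "//"
    for token in tokens:
        if token.strip() == ">":
            xpath += "/"      # child combinator
        elif token.strip() == "":
            xpath += "//"     # descendant combinator
        else:
            xpath += compound_to_xpath(token)
    return xpath
```

For example, `css_to_xpath("ul li#x")` yields `//ul//li[@id='x']`. Sibling combinators, attribute selectors, and pseudo-classes would need a real tokenizer and parser rather than this regex shortcut.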
Wikiextractor is a Wikipedia dump parser and dataset preprocessor designed to extract plain text and metadata from MediaWiki database dumps. It functions as a converter that transforms these archives into structured document files or line-delimited JSON objects for use in text corpora and machine learning datasets. The utility includes a MediaWiki template expander that resolves complex template placeholders into their full text representation. It also supports the isolation and extraction of specific individual pages from a full archive without requiring the processing of the entire dataset.
XMPPFramework is an Objective-C communication framework and networking library used to build instant messaging applications for Mac and iOS. It provides a programmatic foundation for implementing XMPP clients, managing real-time message exchange, and processing structured XML streams between network endpoints. The framework features a modular plugin architecture and extension system that allows for the integration of custom communication capabilities and protocol extensions into the data stream. It distinguishes itself with specialized support for OMEMO end-to-end encryption and a state-based
GnuCash is double-entry accounting software designed for personal and small-business financial management. It tracks assets, liabilities, income, and expenses using a bookkeeping system that ensures financial accuracy. The platform functions as a multi-currency bookkeeping system and a SQL-based financial ledger, persisting accounting data in relational databases or XML files. The system is distinguished by its extensibility as a Python-scriptable accounting tool, providing Python bindings and a REPL for automating tasks and creating custom reports. It also serves as an investment portfolio
CsvHelper is a library for reading and writing comma-separated value files by mapping data to custom class objects. It functions as a parsing library and data mapper that converts flat-file text into structured data objects and serializes internal data sets back into standard CSV files. The project emphasizes memory efficiency through a parser that optimizes resource consumption. It utilizes field value caching and an interned string cache to store repetitive values, which reduces memory overhead when processing large datasets. The library provides a configuration-driven parsing engine that
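The interned string cache mentioned in this entry can be illustrated with a short Python sketch. It shows the general idea only and is not CsvHelper's implementation. The function name `read_rows_interned` is hypothetical.

```python
import csv
import io


def read_rows_interned(text):
    """Parse CSV text, reusing one string object per distinct field value."""
    cache = {}
    rows = []
    for row in csv.reader(io.StringIO(text)):
        # setdefault returns the first string stored for this value, so
        # repeated values across rows share a single object in memory.
        rows.append([cache.setdefault(field, field) for field in row])
    return rows
```

When a column holds a small set of repeated values, such as city names, every row ends up referencing the same string object instead of a fresh copy. The cost is a dictionary that grows with the number of distinct values.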
EasyExcel is a Java processing library designed for reading and writing XLS, XLSX, and CSV files. It functions as a memory-efficient spreadsheet parser, an object-relational mapper that binds spreadsheet columns to Java class fields, and a stream-based exporter for handling high-volume data. The library distinguishes itself through a streaming model that processes large files row-by-row via listeners to prevent heap memory overflow. It also operates as a template engine, allowing the population of predefined spreadsheet files with dynamic data while preserving original layouts and styles. Br
Tika is a content analysis toolkit and Java library designed for detecting and extracting metadata and text from thousands of different file types. It functions as a universal document text extractor and metadata extraction engine, converting complex files into plain text or XHTML. The system employs a specialized MIME type detector that identifies document formats using magic bytes and metadata to determine the correct parser. It serves as an OCR integration gateway, connecting to external text recognition tools to extract content from image files. The project covers a broad range of extrac
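Detection by magic bytes, as described in this entry, can be sketched in Python with a small signature table. This is an illustration of the technique, not Tika's detector. The table `MAGIC_SIGNATURES` and the function `detect_mime` are hypothetical, and only a handful of well-known signatures are included.

```python
# Leading byte signatures for a few common formats (illustrative subset).
MAGIC_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"PK\x03\x04", "application/zip"),
]


def detect_mime(data, default="application/octet-stream"):
    """Return a MIME type based on the first bytes of the content."""
    for signature, mime in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return mime
    return default
```

A ZIP signature alone cannot distinguish a plain archive from container formats built on ZIP, such as XLSX or DOCX, which is one reason a detector would also consult metadata as the entry notes.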