unioffice is a document processing suite comprising a PDF document processor, an Open XML document library, a document security toolkit, and a document content extractor. It programmatically creates, reads, and modifies Word, Excel, and PowerPoint files, and generates and edits PDF documents.
The project implements the Open XML standard in its own language, with no native binary dependencies, which simplifies container deployments. Its document security features include hardware-based PDF signing, content encryption, and redaction of sensitive information matched by regular expressions.
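To illustrate the idea behind regular-expression redaction, here is a minimal sketch in Go that uses only the standard library. It does not use the unioffice API. The `RedactText` function, the `redactionPatterns` list, and the two patterns are hypothetical and chosen for illustration. It operates on plain text that has already been extracted from a document.

```go
package main

import (
	"fmt"
	"regexp"
)

// redactionPatterns is a hypothetical, illustrative pattern list;
// a real deployment would tune these to its own data.
var redactionPatterns = []*regexp.Regexp{
	// Digits shaped like a US Social Security number (123-45-6789).
	regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`),
	// A simple email-like token; deliberately not a full RFC 5322 matcher.
	regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`),
}

// RedactText replaces every match of every pattern with a fixed marker.
// This is an assumed helper, not a function from the library.
func RedactText(text string) string {
	for _, re := range redactionPatterns {
		text = re.ReplaceAllString(text, "[REDACTED]")
	}
	return text
}

func main() {
	fmt.Println(RedactText("Contact jane@example.com, SSN 123-45-6789."))
	// prints "Contact [REDACTED], SSN [REDACTED]."
}
```

Masking extracted text this way only shows the matching step. Redacting a PDF also requires removing the matched content from the file itself, so that it cannot be recovered from beneath an overlay.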
Other capabilities include generating and manipulating spreadsheets with formulas and charts, creating presentations, and editing Word documents. The library also provides tools for PDF form automation, HTML to PDF conversion, PDF/A compliance validation, and AI-powered extraction of structured data from unstructured documents.