Papra is a self-hosted document management system for digital archiving, organization, and retrieval. It is a centralized, security-focused platform for storing files: documents are kept in an archive encrypted with AES-256-GCM, and both documents and their metadata can be managed programmatically through a REST API, an SDK, and command-line tools.
The system distinguishes itself through an automated ingestion engine that imports files via email forwarding, monitored folders, and webhook listeners. It also improves discoverability by running OCR on images and scanned documents, extracting their text so that full-text search covers all archived content.
Other capabilities include identity management via OAuth2, role-based organizations that partition content into collaborative spaces, and content-based deduplication. The platform also supports multiple storage backends and provides tools for encryption key rotation and metadata filtering.
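To make content-based deduplication concrete, here is a minimal sketch of the general technique, not Papra's actual implementation. It assumes documents are identified by a SHA-256 digest of their bytes; the hash function, the `DedupStore` class, and the file names are all hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass, field


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's bytes (assumed hash)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class DedupStore:
    """Hypothetical in-memory archive that indexes documents by content hash."""

    _by_hash: dict[str, str] = field(default_factory=dict)

    def add(self, name: str, data: bytes) -> tuple[str, bool]:
        """Store a document and return (stored_name, is_new).

        If identical bytes were stored before, the earlier name is returned
        and nothing new is recorded.
        """
        digest = content_hash(data)
        existing = self._by_hash.get(digest)
        if existing is not None:
            return existing, False
        self._by_hash[digest] = name
        return name, True


store = DedupStore()
first = store.add("invoice.pdf", b"%PDF-1.7 example bytes")
second = store.add("invoice-copy.pdf", b"%PDF-1.7 example bytes")  # same bytes
third = store.add("receipt.pdf", b"different bytes")
```

Because the key is derived from the bytes rather than the file name, a renamed copy of the same file resolves to the original entry, while any change to the content produces a new one.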
The software is delivered as a containerized deployment and can be installed and orchestrated with Docker.