30 open-source projects similar to artefactual/archivematica, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Archivematica alternative.
ArchiveBox is a self-hosted web archiving system designed to capture and preserve permanent static copies of webpages, media, and PDFs on personal infrastructure. It functions as a digital content curator and personal web archive manager, allowing users to import URLs from bookmarks, RSS feeds, and browser history to create a centralized, searchable knowledge base. The project is distinguished by its ability to archive private, paywalled, or login-protected content using browser cookies and authenticated session persistence. It ensures long-term availability by saving pages in multiple concur
NAPS2 is a suite of document scanning software consisting of a desktop application, a command-line interface tool, and a networked scanner server. It serves as an interface for capturing images from scanners via TWAIN and WIA drivers, organizing those captures into digital documents, and exporting them to various file formats. The project distinguishes itself by providing a networked scanner server that shares local hardware across a network for remote image capture. It also includes a command-line tool for automating document capture and image processing workflows through scripts and termina
Lidarr is a music library automation manager and client controller that monitors artists for new releases and automates the acquisition of music via BitTorrent and Usenet. It serves as a metadata organizer and integrator, connecting download clients with media servers to maintain a complete and up-to-date digital music collection. The system differentiates itself through automated library maintenance, such as scanning for missing tracks to fill gaps and monitoring for higher-quality versions of existing files to perform automatic quality upgrades. It uses configurable naming patterns to renam
ArchivesSpace, the archives management tool
Open-source web application for archival description and public access.
Beancount is a plain-text double-entry accounting system. It enforces zero-sum transactions, organizes accounts into a hierarchical five-type tree, and verifies balances at specific dates using precision-derived tolerances. Transactions are recorded in plain-text files with a strict syntax that supports currency-specific rounding, automatic interpolation of missing amounts, and comprehensive metadata including tags, links, and payee annotations. Beyond core bookkeeping, Beancount offers investment portfolio tracking with lot-based cost basis management, configurable booking strategies (FIFO,
Fava is a web-based dashboard and query tool for visualizing and analyzing financial records stored in Beancount plain-text ledger files. It serves as a double-entry bookkeeping viewer and plain-text accounting dashboard that renders ledger files as interactive reports, searchable financial tables, and visual tools for exploring balance sheets and income statements. The project distinguishes itself through a specialized BQL query interface that executes SQL-like queries against postings to extract specific financial data and trends. It includes a financial data visualization system for genera
Social reading and reviewing, decentralized with ActivityPub
Baserow is a no-code relational database and application builder that allows users to create structured data tables and business tools through a visual interface. It functions as a headless REST API data backend and a self-hosted data workspace, providing a platform for managing collaborative databases while maintaining full control over data residency. The platform integrates large language models to serve as an LLM-powered data platform, capable of generating database structures, record content, and technical workflows from natural language. It also acts as a Model Context Protocol server,
Aggregates RSS and web content (Calibre recipe), sends it to Kindle, and includes an e-ink-optimized online reader.
CKAN is an open-source data management platform that provides the foundation for building data portals. It supports the full lifecycle of datasets—from creation and organization to publishing, cataloging with faceted search, and interactive data visualization—all through a web interface. The platform is built on a modular architecture that includes a plugin-based extensibility system, a harvesting framework for importing metadata from external sources, and a standardized RESTful JSON API for programmatic access to datasets and metadata. The web interface is rendered using the Jinja2 templatin
Cataloguing and data/media management application
Deluge BitTorrent client - Git mirror, PRs only
Redash is a self-hosted analytics platform and SQL data visualization tool. It provides a web-based SQL query editor for writing, executing, and scheduling database queries, and functions as a business intelligence dashboard for monitoring metrics via visual widgets. The platform distinguishes itself through its data source connectors, which integrate with various SQL, NoSQL, and API-based stores to retrieve information for analysis. It enables self-service analytics by allowing users to run queries with dynamic parameters and supports shared data reporting via public links or embedded dashbo
ZeroNet is a decentralized web browser and server that enables the hosting and accessing of websites without central servers. It functions as a peer-to-peer content distribution network that utilizes BitTorrent and Bitcoin cryptography to replicate and share site data across a distributed network of users. The system emphasizes censorship-resistant publishing and privacy through the integration of hidden services to anonymize network traffic. Site identity and content updates are managed via a cryptographic system using public-key pairs instead of centralized account passwords. The platform
DVC is a data versioning tool and pipeline orchestrator designed to track large datasets and machine learning models. It functions as a system for managing large data artifacts by storing lightweight metadata in version control while keeping the actual binaries in a separate cache. The project serves as an experiment tracker and remote storage synchronizer, enabling the execution and comparison of machine learning iterations based on hyperparameters and performance metrics. It provides a bridge for pushing and pulling these large data artifacts between local environments and cloud or on-premi
Buku is a personal bookmark manager that provides a command line interface, a portable bookmark database, and a self-hosted server for organizing web links. It functions as a command line knowledge base for saving, tagging, and searching web resources. The system features a portable, mergeable database that supports AES-256 encryption and is designed for cross-device data synchronization. It includes a RESTful API and a self-hosted web interface, allowing users to manage their collection via a browser or programmatic requests. Capabilities include automatic metadata fetching to populate page
Mathesar is a no-code database manager and PostgreSQL GUI that provides a visual interface for managing relational database structures and records. It functions as a low-code data platform for administering schemas, tables, and relationships without the need to write manual SQL commands. The platform allows for the creation of shareable forms to collect data and the management of file attachments linked directly to database records. It includes a PostgreSQL administration tool for controlling database roles, user permissions, and data validation rules. The system covers relational data model
Mealie is a self-hosted recipe management platform designed for personal data ownership and household meal planning. It functions as a digital kitchen assistant that allows users to import, organize, and digitize culinary content from websites, images, and videos into a structured, searchable database. The application supports multi-user collaboration through household management, enabling shared access to recipes and meal plans while maintaining distinct permissions. The platform distinguishes itself through extensive automation and integration capabilities. It features a programmatic interf
BitTorrent software for cats
OCRmyPDF is a command-line tool designed to transform scanned documents into searchable, selectable PDF files. It functions as a document processing pipeline that adds a hidden text layer to image-based files while simultaneously optimizing the document's file size and image quality. By preserving the original visual fidelity of the input, it ensures that digitized documents remain accessible to screen readers and search engines. The project distinguishes itself through a modular architecture that supports custom plugins and the integration of external recognition engines, allowing users to t
Omeka S is a web publication system for universities, galleries, libraries, archives, and museums. It consists of a local network of independently curated exhibits sharing a collaboratively built pool of items, media, and their metadata.
Pi-hole is a self-hosted network utility that functions as a DNS sinkhole server to provide network-wide ad blocking. By acting as a dedicated network gateway, it intercepts and discards requests for known advertising, tracking, and malicious domains across an entire local network, preventing unwanted content from loading on any connected device. The software operates through a lightweight background daemon that handles high volumes of concurrent DNS queries with minimal resource overhead. It utilizes a host-file injection mechanism to redirect traffic toward its local filtering engine and ap
The free and open-source Download Manager written in pure Python