# haujetzhao/capswriter-offline

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/haujetzhao-capswriter-offline).**

4,770 stars · 436 forks · Python

## Links

- GitHub: https://github.com/HaujetZhao/CapsWriter-Offline
- awesome-repositories: https://awesome-repositories.com/repository/haujetzhao-capswriter-offline.md

## Description

CapsWriter-Offline is a suite of desktop tools that runs without an internet connection. It combines local media browsing, voice dictation, audio and video transcription, and 360-degree media viewing, with offline operation as the common thread across its media handling and speech-to-text workflows.

Its voice dictation is backed by a persistent local storage layer that saves every audio recording and keeps daily transcript logs. A rule-based text normalization engine converts spoken number phrases into numerals and applies user-defined substitutions through phonetic matching and regex. Recognized speech can also be routed to a language model for polishing or role-specific processing based on predefined role names. For media, the tool offers a transactional file operation manager for moving, renaming, and deleting files with undo support, and a panoramic rendering engine that displays equirectangular 360-degree video and images with a draggable viewport and device tilt interaction.
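
The rule-based normalization described above can be illustrated with a small sketch. It is not the project's implementation: the `RULES` list, the function names, and the English digit-word table are assumptions made for this example, and phonetic matching is left out entirely.

```python
import re

# Hypothetical user-defined substitutions: (regex pattern, replacement),
# applied in order after numeral conversion.
RULES = [
    (r"\bcaps writer\b", "CapsWriter"),
    (r"\bpie thon\b", "Python"),
]

# Assumed digit vocabulary for the example (English only).
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

_WORDS = "|".join(DIGIT_WORDS)
# Matches a run of digit words separated by whitespace, e.g. "one two three".
_DIGIT_RUN = re.compile(rf"\b(?:{_WORDS})(?:\s+(?:{_WORDS}))*\b", re.IGNORECASE)


def spoken_digits_to_numerals(text: str) -> str:
    """Replace each run of spoken digit words with the matching digit string."""
    def repl(match: re.Match) -> str:
        return "".join(DIGIT_WORDS[word.lower()] for word in match.group(0).split())
    return _DIGIT_RUN.sub(repl, text)


def normalize(text: str, rules=RULES) -> str:
    """Convert spoken digits first, then apply each substitution rule in order."""
    text = spoken_digits_to_numerals(text)
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text
```

With these example rules, `normalize("call one two three with caps writer")` returns `"call 123 with CapsWriter"`. A lone digit word is also converted ("one of them" becomes "1 of them"), which a production rule set would likely need to guard against.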

Additional capabilities include cached thumbnail generation with manual refresh and purge, a customizable grid display for browsing images with adjustable sorting and column count, and EXIF metadata display. For audio and video files, the tool extracts speech to produce subtitles, plain text, and timestamps for offline analysis.
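
To show how timestamps and text relate to subtitle output, here is a minimal sketch that turns timestamped segments into SubRip (SRT) text. The `(start, end, text)` segment shape and both function names are assumptions for illustration. The source does not state which subtitle format or internal data layout the project uses.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from an iterable of (start_sec, end_sec, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

The plain-text output mentioned above would correspond to joining only the `text` field of each segment, while the timestamps remain available for aligning text back to the media.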

## Tags

### Artificial Intelligence & ML

- [Audio and Video File Transcription](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-and-video-file-transcription.md) — Extracts speech from audio and video files to produce subtitles, plain text, and timestamps for offline analysis.
- [Audio Transcription](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription.md) — Extracts speech from audio and video files to produce subtitles, plain text, and timestamps. ([source](https://cdn.jsdelivr.net/gh/haujetzhao/capswriter-offline@master/README.md))
- [Hold-to-Dictate Mechanisms](https://awesome-repositories.com/f/artificial-intelligence-ml/hold-to-dictate-mechanisms.md) — Captures speech while a key is held and inserts the transcription into the active application, saving all audio locally.
- [Offline Media Transcribers](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-transcription/automated-video-transcribers/offline-media-transcribers.md) — Extracts speech from media files and generates subtitles, plain text, and timestamp data offline.
- [Role-Based Text Polishing](https://awesome-repositories.com/f/artificial-intelligence-ml/llm-based-text-processing/llm-text-humanizers/role-based-text-polishing.md) — Routes recognized speech to a language model for polishing or role-specific processing based on predefined names.
- [Role-Based Delegations](https://awesome-repositories.com/f/artificial-intelligence-ml/llm-based-text-processing/role-based-delegations.md) — Routes recognized speech to a language model for role-specific polishing based on predefined names. ([source](https://cdn.jsdelivr.net/gh/haujetzhao/capswriter-offline@master/README.md))
- [Numeral Normalizations](https://awesome-repositories.com/f/artificial-intelligence-ml/numeral-normalizations.md) — Converts spoken number phrases into numeric equivalents using pattern matching rules. ([source](https://cdn.jsdelivr.net/gh/haujetzhao/capswriter-offline@master/README.md))

### Part of an Awesome List

- [Audio and Transcript Logs](https://awesome-repositories.com/f/awesome-lists/ai/voice-dictation/audio-and-transcript-logs.md) — Saves every voice recording as an audio file and maintains daily transcript logs for offline reference. ([source](https://cdn.jsdelivr.net/gh/haujetzhao/capswriter-offline@master/README.md))
- [Key-Held Activations](https://awesome-repositories.com/f/awesome-lists/ai/voice-dictation/key-held-activations.md) — Captures speech while a key is held and inserts the transcription into the active application. ([source](https://cdn.jsdelivr.net/gh/haujetzhao/capswriter-offline@master/README.md))
- [360 Media](https://awesome-repositories.com/f/awesome-lists/media/360-media.md) — Loads and displays 360-degree videos and images with draggable viewport and tilt interaction. ([source](https://haujetzhao.github.io/Panorama-Viewer-HTML/))
- [Comprehensive Media File Organizers](https://awesome-repositories.com/f/awesome-lists/devtools/file-and-directory-management/file-and-directory-moves/comprehensive-media-file-organizers.md) — Moves, renames, and deletes media files with undo support and thumbnail caching to speed up repeated access.
- [Grid Display Customizations](https://awesome-repositories.com/f/awesome-lists/devtools/table-view-management/grid-view-browsings/grid-display-customizations.md) — Ships a configurable grid display with adjustable sorting, column count, and thumbnail quality. ([source](https://haujetzhao.github.io/Gallery-Viewer-HTML/))

### Data & Databases

- [Voice and Transcript Log Persistence](https://awesome-repositories.com/f/data-databases/local-persistence-layers/voice-and-transcript-log-persistence.md) — Records every voice input as an audio file and maintains daily transcript logs for offline reference.
- [Thumbnail Caches](https://awesome-repositories.com/f/data-databases/data-engineering-infrastructure/caching-performance/caching-strategies/thumbnail-caches.md) — Generates and caches smaller image previews to accelerate repeated browsing access. ([source](https://haujetzhao.github.io/Gallery-Viewer-HTML/))
- [Cache Management Operations](https://awesome-repositories.com/f/data-databases/data-engineering-infrastructure/caching-performance/caching-strategies/thumbnail-caches/cache-management-operations.md) — Provides manual thumbnail cache management with reload, redraw, and purge operations. ([source](https://haujetzhao.github.io/Gallery-Viewer-HTML/))
- [Phonetic and Regex Replacements](https://awesome-repositories.com/f/data-databases/text-pattern-matching/text-search-and-replace/interactive-text-replacers/phonetic-and-regex-replacements.md) — Applies user-defined phonetic and regex replacement rules to normalize recognized speech text. ([source](https://cdn.jsdelivr.net/gh/haujetzhao/capswriter-offline@master/README.md))
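
The daily transcript log listed in this section can be pictured with a short sketch. The one-file-per-day naming (`YYYY-MM-DD.md`), the line format, and the `append_transcript` helper are assumptions for illustration, not the project's actual layout, and saving the audio file itself is omitted.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def append_transcript(log_dir, text, now=None):
    """Append one recognized utterance to the log file for the current day."""
    now = now or datetime.now()
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming scheme: one log file per calendar day.
    log_path = log_dir / f"{now:%Y-%m-%d}.md"
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(f"[{now:%H:%M:%S}] {text}\n")
    return log_path


# Usage: write one entry into a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = append_transcript(tmp, "hello world", now=datetime(2024, 5, 1, 9, 30, 0))
    log_name = path.name
    log_text = path.read_text(encoding="utf-8")
```

Opening the file in append mode means repeated dictations on the same day accumulate in one file, and a new file starts automatically when the date changes.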

### Development Tools & Productivity

- [Offline Media Browsers with Dictation](https://awesome-repositories.com/f/development-tools-productivity/offline-media-browsers-with-dictation.md) — Combines offline media browsing with voice dictation for captioning and transcripts.
- [Transactional File Operation Managers](https://awesome-repositories.com/f/development-tools-productivity/transactional-file-operation-managers.md) — Executes file moves, renames, and deletions as atomic transactions with an undo log to revert accidental changes.
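
One way to picture file operations with an undo log is the sketch below. It is a simplified illustration under stated assumptions: the `UndoableFileOps` class, its trash folder, and the soft-delete approach are hypothetical, and the sketch keeps a plain in-memory undo stack rather than guaranteeing atomic transactions.

```python
import shutil
import tempfile
from pathlib import Path


class UndoableFileOps:
    """Move, rename, and delete files while recording how to reverse each step."""

    def __init__(self, trash_dir):
        self.trash_dir = Path(trash_dir)
        self.trash_dir.mkdir(parents=True, exist_ok=True)
        self._log = []  # stack of (current_path, original_path)

    def move(self, src, dst):
        src, dst = Path(src), Path(dst)
        if dst.exists():
            raise FileExistsError(dst)  # refuse to overwrite silently
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        self._log.append((dst, src))

    def rename(self, src, new_name):
        src = Path(src)
        self.move(src, src.with_name(new_name))

    def delete(self, src):
        # Soft delete: park the file in the trash folder so it can be restored.
        src = Path(src)
        self.move(src, self.trash_dir / f"{len(self._log)}_{src.name}")

    def undo(self):
        """Reverse the most recent operation; return False if nothing is left."""
        if not self._log:
            return False
        current, original = self._log.pop()
        original.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(current), str(original))
        return True


# Usage: rename, delete, then undo both steps in reverse order.
with tempfile.TemporaryDirectory() as root:
    photo = Path(root) / "a.txt"
    photo.write_text("demo")
    ops = UndoableFileOps(Path(root) / ".trash")
    ops.rename(photo, "b.txt")
    ops.delete(Path(root) / "b.txt")
    ops.undo()  # restores b.txt from the trash folder
    ops.undo()  # renames b.txt back to a.txt
    restored = photo.read_text()
```

Treating deletion as a move into a trash folder keeps every operation reversible with the same mechanism, at the cost of holding deleted files on disk until the trash is emptied.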

### Graphics & Multimedia

- [Audio Persistence Speech Pipelines](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/audio-processing-systems/audio-processing/speech-to-text-pipelines/audio-persistence-speech-pipelines.md) — Captures microphone input, passes it to a speech-to-text engine, saves raw audio files, and writes daily transcription logs to local storage.
- [Panoramic Media and Image Browsing](https://awesome-repositories.com/f/graphics-multimedia/panoramic-media-and-image-browsing.md) — Browses and views 360-degree photos and videos on a local filesystem with EXIF metadata and customizable grid display.
- [Caching Thumbnail Generation Layers](https://awesome-repositories.com/f/graphics-multimedia/image-editing-processing/image-editors/image-cropping-tools/on-demand-thumbnail-generation/caching-thumbnail-generation-layers.md) — Generates smaller image previews on demand, stores them in a cache with expiry, and supports manual refresh or purge.
- [Panoramic Media Viewers](https://awesome-repositories.com/f/graphics-multimedia/local-media-viewers/panoramic-media-viewers.md) — Provides an immersive viewer that loads and navigates 360-degree videos and images with drag and tilt controls.
- [Folder Browsers](https://awesome-repositories.com/f/graphics-multimedia/media-category-browsing/folder-browsers.md) — Loads and displays images from a chosen local directory for immediate browsing. ([source](https://haujetzhao.github.io/Gallery-Viewer-HTML/))
- [Panoramic Media Rendering Engines](https://awesome-repositories.com/f/graphics-multimedia/panoramic-media-rendering-engines.md) — Renders equirectangular 360-degree video and images with draggable viewport and device tilt interaction using WebGL.

### Software Engineering & Architecture

- [LLM Delegation Mechanisms](https://awesome-repositories.com/f/software-engineering-architecture/llm-delegation-mechanisms.md) — Ships a mechanism that routes recognized speech to an LLM for polishing based on predefined role names.
- [Custom Text Normalizers](https://awesome-repositories.com/f/software-engineering-architecture/string-validation-and-normalization/speech-to-text-normalizers/custom-text-normalizers.md) — Applies user-defined rules with phonetic matching or regex to convert spoken text, including numeral normalization.
- [Phonetic and Regex Text Normalizers](https://awesome-repositories.com/f/software-engineering-architecture/string-validation-and-normalization/speech-to-text-normalizers/phonetic-and-regex-text-normalizers.md) — Applies phonetic fuzzy-matching and regex to convert spoken number phrases and user-defined substitutions into normalized text.

### Content Management & Publishing

- [Directory Browsing](https://awesome-repositories.com/f/content-management-publishing/directory-browsing.md) — Provides a file tree browser for navigating local directories and selecting media files. ([source](https://haujetzhao.github.io/Gallery-Viewer-HTML/))
- [Manual File Organizers with Undo](https://awesome-repositories.com/f/content-management-publishing/media-management/file-management-systems/manual-file-organizers-with-undo.md) — Moves, renames, and deletes files safely with undo support for accidental actions. ([source](https://haujetzhao.github.io/Gallery-Viewer-HTML/))

### User Interface & Experience

- [Customizable File Grids](https://awesome-repositories.com/f/user-interface-experience/customizable-file-grids.md) — Ships a configurable file grid that supports adjustable sorting, column count, thumbnail quality, and scroll sensitivity.
