BiliNote converts video URLs into structured notes. It extracts video content and metadata from major platforms, transcribes the audio on-device with a local speech recognition model, and then summarizes the transcript with a language model. The resulting notes can include screenshots and timestamp links.
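The three stages described above can be sketched as a small pipeline of interchangeable callables. This is a minimal illustration under stated assumptions, not BiliNote's actual code: `Segment`, `Note`, `build_note`, and the three stage signatures are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    text: str


@dataclass
class Note:
    title: str
    summary: str
    timestamps: List[float] = field(default_factory=list)


# Each stage is a plain callable so a backend can be swapped without
# touching the pipeline itself. All three signatures are assumptions.
Fetcher = Callable[[str], dict]               # url -> {"title": ..., "audio_path": ...}
Transcriber = Callable[[str], List[Segment]]  # audio path -> timed segments
Summarizer = Callable[[str], str]             # transcript text -> summary text


def build_note(url: str, fetch: Fetcher, transcribe: Transcriber,
               summarize: Summarizer) -> Note:
    """Run fetch -> transcribe -> summarize and assemble a Note."""
    meta = fetch(url)
    segments = transcribe(meta["audio_path"])
    # Prefix each line with its start time so the summarizer can cite it.
    transcript = "\n".join(f"[{s.start:.0f}s] {s.text}" for s in segments)
    summary = summarize(transcript)
    return Note(title=meta["title"], summary=summary,
                timestamps=[s.start for s in segments])
```

Passing the stages in as arguments is one way to keep the language model provider swappable: a different summarizer can be supplied without changing the fetch or transcription steps.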
BiliNote's AI backend is configurable, so you can choose and switch between language model providers for generating summaries. Transcription runs locally and offline, which preserves privacy and means the transcription step itself needs no internet connection. The tool also keeps a versioned history of every generated note, so you can review, compare, or restore earlier versions. Generated notes are indexed as vector embeddings, so you can ask natural-language questions and get answers grounded in the relevant passages through retrieval-augmented generation (RAG).
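The retrieval half of that question-answering flow can be sketched as follows. This is an illustration only: the `embed` function here is a toy word-count vector standing in for a real embedding model, and `retrieve` and `build_prompt` are hypothetical helpers, not part of BiliNote.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question: str, passages: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k passages most similar to the question, best first."""
    q = embed(question)
    scored = [(cosine(q, embed(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, passages: List[str], k: int = 2) -> str:
    """Assemble the retrieved passages into a prompt for a language model."""
    context = "\n".join(f"- {p}" for _, p in retrieve(question, passages, k))
    return f"Answer using only these note passages:\n{context}\n\nQuestion: {question}"
```

In a real setup the embeddings would come from a trained model and be stored in a vector index, and the prompt would be sent to whichever language model provider is configured.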
Beyond these core workflows, BiliNote supports content capture through a browser extension. It also provides a unified pipeline that ingests video from multiple platforms, extracts metadata, and coordinates transcription and summarization as a single automated process.
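One way a unified pipeline can accept URLs from several platforms is to map each URL's host to a platform name before choosing a downloader. The sketch below is hypothetical: `PLATFORM_HOSTS` and `detect_platform` are invented for illustration, and the hosts listed are example assumptions, not a statement of which platforms BiliNote supports.

```python
from typing import Optional
from urllib.parse import urlparse

# Illustrative host suffixes only; the real set of platforms may differ.
PLATFORM_HOSTS = {
    "bilibili.com": "bilibili",
    "youtube.com": "youtube",
    "youtu.be": "youtube",
}


def detect_platform(url: str) -> Optional[str]:
    """Return a platform name for the URL, or None if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, platform in PLATFORM_HOSTS.items():
        # Match the bare domain or any subdomain, but not lookalike hosts.
        if host == suffix or host.endswith("." + suffix):
            return platform
    return None
```

Matching on the parsed hostname rather than searching the raw string avoids false positives from lookalike domains or from platform names that appear in a URL's path or query.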