node-DeepResearch is an autonomous web research engine that uses large language models to iteratively search, read, and reason over web content to answer complex questions. It provides a chat-based interface that displays real-time reasoning steps and final answers, and can be configured to focus exclusively on academic papers by limiting searches to academic repositories.
The research engine runs an agentic loop that searches, reads, and reasons repeatedly until a stopping condition is satisfied. It enforces a token budget and a cap on failed attempts, and produces a final answer once either limit is reached, so a run always ends with an answer. It ranks discovered URLs by combining relevance, authority, recency, and diversity, and uses chunked passage selection to pick the most relevant text segments from each document. For structured long-form reports, it iteratively builds an outline, searches per section, and refines the draft for narrative coherence.
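The loop and its limits can be illustrated with a minimal sketch. This is not the project's actual implementation: every name here (`StepResult`, `LoopOptions`, `runResearchLoop`, `forceAnswer`) is hypothetical, the step function is synchronous for brevity where real search and model calls would be asynchronous, and the rule that a rejected answer counts as one failed attempt is an assumption.

```typescript
// Hypothetical types and names; none are taken from the node-DeepResearch codebase.
type Action = "search" | "read" | "answer";

interface StepResult {
  action: Action;
  tokensUsed: number; // tokens consumed by this step
  answer?: string;    // present when action is "answer"
  accepted?: boolean; // whether the answer passed evaluation (assumed check)
}

interface LoopOptions {
  tokenBudget: number;       // cap on total tokens across all steps
  maxFailedAttempts: number; // cap on rejected answers
}

interface LoopOutcome {
  answer: string;
  tokensUsed: number;
  failedAttempts: number;
  forced: boolean; // true when a limit was hit and the answer was forced
}

function runResearchLoop(
  step: (stepIndex: number) => StepResult,
  forceAnswer: () => string,
  opts: LoopOptions,
): LoopOutcome {
  let tokensUsed = 0;
  let failedAttempts = 0;
  let stepIndex = 0;

  // Keep stepping while both limits still have headroom.
  while (tokensUsed < opts.tokenBudget && failedAttempts < opts.maxFailedAttempts) {
    const result = step(stepIndex++);
    tokensUsed += result.tokensUsed;

    if (result.action === "answer") {
      if (result.accepted && result.answer !== undefined) {
        return { answer: result.answer, tokensUsed, failedAttempts, forced: false };
      }
      failedAttempts++; // a rejected answer counts against the attempt cap
    }
  }

  // A limit was reached: produce a best-effort answer instead of stopping empty-handed.
  return { answer: forceAnswer(), tokensUsed, failedAttempts, forced: true };
}

// Usage with a scripted sequence of steps standing in for real search/read/answer calls.
const script: StepResult[] = [
  { action: "search", tokensUsed: 100 },
  { action: "read", tokensUsed: 300 },
  { action: "answer", tokensUsed: 200, answer: "draft", accepted: false },
  { action: "answer", tokensUsed: 200, answer: "final", accepted: true },
];

const outcome = runResearchLoop(
  (i) => script[Math.min(i, script.length - 1)],
  () => "best-effort answer",
  { tokenBudget: 1000, maxFailedAttempts: 2 },
);
console.log(outcome);
```

In this sketch the limits are checked before each step, so the final step may overshoot the budget slightly. A stricter variant could reserve part of the budget for the forced answer.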
Beyond web text extraction, the engine extracts content from PDFs, generates textual captions for images on web pages using a vision language model, and formats search results for language model consumption. It can route reasoning steps to a local language model for structured output. The application is packaged as a Docker container with environment-variable configuration for portable deployment and also exposes an OpenAI-compatible API endpoint for programmatic client access.
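As an illustration of programmatic access, a client might call the OpenAI-compatible endpoint roughly as follows. This is a sketch under stated assumptions: the `/v1/chat/completions` path follows the OpenAI chat completions convention, while the base URL, the model name `deep-research`, the optional bearer token, and the helper names `buildChatRequest` and `ask` are placeholders chosen for this example. It relies on the global `fetch` available in Node 18 and later.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the HTTP request without sending it, so it can be inspected or tested offline.
function buildChatRequest(baseUrl: string, question: string, apiKey?: string): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`; // assumed auth scheme

  const messages: ChatMessage[] = [{ role: "user", content: question }];
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers,
      // "deep-research" is a placeholder model name, not a documented value.
      body: JSON.stringify({ model: "deep-research", messages, stream: false }),
    },
  };
}

// Send the request and return the text of the first choice.
async function ask(baseUrl: string, question: string, apiKey?: string): Promise<string> {
  const { url, init } = buildChatRequest(baseUrl, question, apiKey);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}

// Example call (requires a running server; the URL is an assumption):
// ask("http://localhost:3000", "What changed in HTTP/3?").then(console.log);
```

Because the endpoint follows the OpenAI request shape, an existing OpenAI client library pointed at the container's base URL would be another reasonable option.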