# MODSetter/SurfSense

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/modsetter-surfsense).**

14,816 stars · 1,410 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/MODSetter/SurfSense
- Homepage: https://www.surfsense.com
- awesome-repositories: https://awesome-repositories.com/repository/modsetter-surfsense.md

## Topics

`aceternity-ui` `agent` `agents` `ai` `chrome-extension` `extension` `fastapi` `langchain` `langgraph` `nextjs` `notebooklm` `notion` `ollama` `perplexity` `python` `rag` `slack` `typescript`

## Description

SurfSense is a self-hosted platform for building retrieval-augmented generation (RAG) pipelines and managing private knowledge bases. It runs as a containerized research stack: users index diverse data sources and query them with language models, and answers are grounded in citations to the specific sources retrieved.
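As a rough illustration of citation-grounded retrieval, the sketch below ranks stored chunks by cosine similarity to a query embedding and builds a prompt that numbers each source. It is not SurfSense's implementation: `Chunk`, `retrieve`, `build_prompt`, and the toy two-dimensional embeddings are all hypothetical names and values chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str             # hypothetical citation label, e.g. a file name or URL
    text: str
    embedding: list[float]  # toy vector; a real system would use an embedding model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding: list[float], chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Return the k chunks most similar to the query embedding."""
    ranked = sorted(chunks, key=lambda c: cosine(query_embedding, c.embedding), reverse=True)
    return ranked[:k]


def build_prompt(question: str, hits: list[Chunk]) -> str:
    """Number each retrieved chunk so the model can cite it as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(hits, start=1)
    )
    return (
        "Answer using only the numbered sources and cite them.\n"
        f"{context}\nQuestion: {question}"
    )
```

Numbering the retrieved chunks in the prompt is one common way to let a model's answer point back at specific sources; the source text does not say which citation scheme the project uses.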

The architecture is modular: a unified abstraction layer lets it work with different language models, and custom tools can be plugged in. Role-based access control on shared knowledge bases supports secure, collaborative research, and built-in text-to-speech converts chat logs and documents into audio.
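A role-based check on a shared knowledge base could look roughly like the sketch below. The role names, the `Permission` values, and the `is_allowed` helper are assumptions made for illustration; the source does not describe the project's actual roles or permission model.

```python
from enum import Enum


class Permission(Enum):
    READ = "read"
    WRITE = "write"
    MANAGE = "manage"


# Hypothetical role names and permission sets, chosen only for this example.
ROLE_PERMISSIONS: dict[str, set[Permission]] = {
    "viewer": {Permission.READ},
    "editor": {Permission.READ, Permission.WRITE},
    "owner": {Permission.READ, Permission.WRITE, Permission.MANAGE},
}


def is_allowed(memberships: dict[str, str], knowledge_base: str, needed: Permission) -> bool:
    """Check whether a user's role on a knowledge base grants a permission.

    `memberships` maps knowledge-base IDs to the user's role on each one.
    A user with no role on the knowledge base is denied by default.
    """
    role = memberships.get(knowledge_base)
    return needed in ROLE_PERMISSIONS.get(role, set())
```

For example, `is_allowed({"kb-1": "viewer"}, "kb-1", Permission.WRITE)` evaluates to `False`, because the hypothetical viewer role only carries read access.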

Beyond core retrieval, the system ingests data from a range of file formats and web sources. Indexing is backed by a vector database, which enables semantic similarity search over the indexed content, and resource-intensive tasks such as media transcoding and document indexing run as asynchronous background jobs so the system stays responsive.
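The background-processing idea can be sketched with a small `asyncio` worker pool: callers enqueue work and return immediately, while workers process items concurrently. This is a stand-in only. The source does not say which task queue or framework the project uses, and `index_document`, `worker`, and `main` are hypothetical names.

```python
import asyncio


async def index_document(name: str) -> str:
    """Stand-in for slow work such as parsing, chunking, and embedding."""
    await asyncio.sleep(0.01)
    return f"indexed:{name}"


async def worker(queue: asyncio.Queue, results: list[str]) -> None:
    """Pull document names off the queue until cancelled."""
    while True:
        name = await queue.get()
        try:
            results.append(await index_document(name))
        finally:
            queue.task_done()  # lets queue.join() know this item is finished


async def main(names: list[str]) -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    results: list[str] = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(2)]
    for name in names:
        queue.put_nowait(name)  # enqueueing does not wait for the work to run
    await queue.join()          # wait until every queued item is processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Running `asyncio.run(main(["a.pdf", "b.html"]))` processes both items on the two workers; the order of the returned list depends on scheduling.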

## Tags

### Artificial Intelligence & ML

- [Retrieval Augmented Generation Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/retrieval-augmented-generation-pipelines.md) — Builds retrieval-augmented generation pipelines that combine private document repositories with language models for accurate, cited answers.
- [Retrieval-Augmented Generation Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/retrieval-augmented-generation-frameworks.md) — Builds retrieval-augmented generation pipelines that process diverse data sources for accurate, grounded information retrieval.
- [Natural Language Querying](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/language-tools/natural-language-querying.md) — Retrieves accurate answers from stored data using natural language questions with direct source citations. ([source](https://cdn.jsdelivr.net/gh/MODSetter/SurfSense@main/README.md))
- [Model Abstractions](https://awesome-repositories.com/f/artificial-intelligence-ml/model-abstractions.md) — Provides a unified interface for interacting with diverse local and cloud-based language models through a common protocol.
- [AI Tool Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-tool-integrations.md) — Extends automated research capabilities by defining unique functions that allow language models to interact with external services.
- [Model Configurations](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/generative-ai/llm-model-integrations/model-configurations.md) — Configures connections to local or cloud-based language models and embedding services for improved retrieval accuracy. ([source](https://cdn.jsdelivr.net/gh/MODSetter/SurfSense@main/README.md))
- [Model Capability Extensions](https://awesome-repositories.com/f/artificial-intelligence-ml/model-capability-extensions.md) — Expands automated research abilities by defining custom functions that allow language models to interact with external data sources. ([source](https://cdn.jsdelivr.net/gh/MODSetter/SurfSense@main/README.md))
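
The "unified interface" idea in the tags above can be illustrated with a structural type that both local and cloud backends satisfy. This is a minimal sketch under stated assumptions, not the project's abstraction layer: `ChatModel`, `LocalEchoModel`, `CloudStubModel`, and `answer` are hypothetical, and both backends return canned strings instead of calling a real model.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Anything with a `complete` method can be used as a model backend."""

    def complete(self, prompt: str) -> str: ...


class LocalEchoModel:
    """Hypothetical local backend; echoes the prompt instead of running a model."""

    def complete(self, prompt: str) -> str:
        return f"local: {prompt}"


class CloudStubModel:
    """Hypothetical cloud backend; stores a key but makes no network call."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def complete(self, prompt: str) -> str:
        return f"cloud: {prompt}"


def answer(model: ChatModel, question: str) -> str:
    """Calling code depends only on the shared interface, not the backend."""
    return model.complete(question)
```

Swapping `LocalEchoModel()` for `CloudStubModel(api_key="...")` changes the backend without touching `answer`, which is the property a common model interface is meant to provide.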

### Data & Databases

- [Knowledge Management](https://awesome-repositories.com/f/data-databases/knowledge-management.md) — Creates searchable repositories from diverse data sources to enable efficient information retrieval for professional research teams.
- [Vector Databases](https://awesome-repositories.com/f/data-databases/vector-databases.md) — Utilizes vector-database-backed indexing to enable semantic similarity searches and precise source citation during query execution.
- [Vector Document Indexing](https://awesome-repositories.com/f/data-databases/database-management-systems/database-engines/vector-databases/vector-document-indexing.md) — Indexes private knowledge bases in a vector database so their documents can be searched by semantic similarity.
- [Data Ingestion](https://awesome-repositories.com/f/data-databases/data-ingestion.md) — Processes and indexes diverse file formats and web sources to build a searchable repository of information. ([source](https://cdn.jsdelivr.net/gh/MODSetter/SurfSense@main/README.md))

### DevOps & Infrastructure

- [Self-Hosted AI Environments](https://awesome-repositories.com/f/devops-infrastructure/deployment-management-strategies/execution-platforms-and-targets/deployment-environments/self-hosted-ai-environments.md) — Deploys a containerized, self-hosted research stack for private language model execution and data processing.
- [Self-Hosted Deployment Platforms](https://awesome-repositories.com/f/devops-infrastructure/self-hosted-deployment-platforms.md) — Deploys the entire research and chat stack within a containerized environment to maintain full control over data privacy. ([source](https://cdn.jsdelivr.net/gh/MODSetter/SurfSense@main/README.md))
- [Self-Hosted AI Infrastructure](https://awesome-repositories.com/f/devops-infrastructure/self-hosted-ai-infrastructure.md) — Deploys self-hosted research and chat stacks within isolated environments to maintain data sovereignty.

### Scientific & Mathematical Computing

- [LLM-Powered Research Interfaces](https://awesome-repositories.com/f/scientific-mathematical-computing/research-analysis-workflows/research-and-data-analysis-tools/research-and-analysis-tools/llm-powered-research-interfaces.md) — Integrates language models with document indexing and custom tools to provide a searchable, citation-backed research interface.

### Security & Cryptography

- [Role-Based Access Control](https://awesome-repositories.com/f/security-cryptography/role-based-access-control.md) — Enforces granular permissions on shared knowledge bases and system settings to facilitate secure collaborative research.

### Software Engineering & Architecture

- [Plugin Execution Engines](https://awesome-repositories.com/f/software-engineering-architecture/plugin-execution-engines.md) — Invokes external functions through a standardized interface to extend the reasoning and data-gathering capabilities of language models.
- [Team Collaboration Tools](https://awesome-repositories.com/f/software-engineering-architecture/team-collaboration-tools.md) — Facilitates secure team collaboration by managing access to shared knowledge bases through role-based permissions.
- [Microservice Orchestration](https://awesome-repositories.com/f/software-engineering-architecture/microservice-orchestration.md) — Deploys modular system components within isolated container environments to ensure consistent execution and secure data handling.
