
zylon-ai/private-gpt

57,116 stars · 7,611 forks · Python · apache-2.0 · 1 view
privategpt.dev

PrivateGPT

Features

  • Retrieval Augmented Generation Engines: A backend service that processes local documents to provide context-aware conversational responses using both local and cloud-based language models.
  • Local Model Runtimes: Run language models locally using specific configuration profiles to manage model parameters, context window sizes, and hardware-specific settings for private, offline processing.
  • Privacy-First AI Backends: A modular service architecture designed to run language models and document processing entirely on local infrastructure for data security.
  • Text Generation Services: Generate text completions from a prompt by incorporating ingested document context and system instructions to provide relevant and accurate output for the user in real time.
  • Retrieval-Augmented Generation Pipelines: Processes documents into vector embeddings and stores them to provide relevant context for language model completion requests.
  • Context-Aware Chat Interfaces: Generate conversational responses by automatically retrieving relevant document context and applying prompt engineering to execute completion models for high-level interactions.
  • Document Retrieval: Retrieve relevant text segments from stored documents based on a search query while optionally filtering by document identifier and including surrounding context for each search result.
  • Private Retrieval Augmented Generation: Building secure, local-first applications that answer questions based on private document collections without sending sensitive data to external cloud providers.
  • Document Ingestion Pipelines: Manage document ingestion by automatically parsing, splitting, extracting metadata, and generating embeddings for storage within a retrieval augmented generation pipeline.
  • Reranking Strategies: Improve retrieval accuracy by pre-selecting the most relevant documents from an initial set before passing them to the generation process for final answer construction.
  • Document Intelligence Pipelines: Automating the ingestion, parsing, and vectorization of diverse file formats to enable semantic search and intelligent analysis across internal knowledge bases.
  • Document Ingestion Pipelines: A data processing workflow that extracts text from diverse file formats and converts them into searchable vector representations for retrieval.
  • Vector Database Orchestrators: A management layer that handles document ingestion, text chunking, and vector embedding storage across various database providers for semantic search.
  • Local Language Model Hosting: Running large language models on private hardware to maintain full control over data privacy, security, and infrastructure costs.
  • Contextual Retrieval Services: Retrieve relevant text chunks from ingested documents based on a specific query to facilitate custom retrieval and generation logic for specialized tasks.
  • File Ingestion Services: Extract text chunks and metadata from files and store them to provide searchable context for subsequent chat and completion requests within the system.
  • Local Document Ingestion: Ingest local folders of documents into the system for querying with options to watch for file changes and log processing results.
  • Supported File Formats: Process a wide range of document types including text, office documents, images, and code with automatic fallback to plain text for unsupported formats.
  • Document Ingestion Pipelines: Parses raw files into structured text chunks and metadata to enable efficient semantic search and retrieval during query execution.
  • Vector Database Abstractions: Uses a modular interface layer to support multiple storage backends like local disk, PostgreSQL, or specialized vector databases.
  • Application Configuration Managers: Configure the application by selecting language model, embedding, and vector store providers, and manage dependencies using environment-specific profiles.
  • Text Ingestion Services: Convert raw text into a searchable document by processing its chunks, and retrieve a unique identifier for filtering future completion requests.
  • External Model Integrations: Configure the application to use external cloud-based language models by defining specific profiles with API keys, base URLs, and model identifiers.
  • Local Infrastructure Setups: Run the application entirely on local infrastructure by selecting local language model, embedding, and vector store providers and downloading necessary model files.
  • Text Embedding Generators: Generate vector representations of text strings for use by machine learning models and analytical algorithms in advanced search or classification tasks.
  • Streaming Response Architectures: Streams generated text tokens from the language model to the user interface in real time to reduce perceived latency.
  • Enterprise Vector Database Integrations: Connecting language models to scalable, production-grade vector storage backends to manage large-scale document retrieval and contextual information processing.
  • Vector Database Integrations: Configure a vector store by specifying connection details like host, port, and authentication keys within the application settings file.
  • Chroma Integrations: Configure a vector store by installing the required dependencies and enabling the database in the application settings file for local disk-based storage.
  • PostgreSQL Vector Stores: Configure a PostgreSQL vector store by installing the required dependencies and providing database connection credentials and schema details in the application settings file.
  • Text Embedding Generators: Generate vector embeddings for arbitrary text input to support custom pipeline implementations and advanced search or analysis workflows across different data sources.
  • Vector Databases: Configure a vector store by installing the required dependencies and providing server connection details and security settings in the application settings file.
  • Chat Completion Services: Generate conversational text by processing message history and document context while streaming the output to the user in real time to ensure immediate and relevant feedback.
  • Multi-Provider Model Integrations: Connects to both local and cloud-based language models through a unified interface to balance privacy and computational performance.
  • Hardware Profile Deployments: Deploy the service using various hardware profiles including CPU-only or GPU-accelerated configurations to match specific system capabilities and performance requirements.
  • Execution Modes: Select between query, search, and chat modes to control how the system uses ingested documents and conversation history to generate responses.
  • Reranking Retrieval Logics: Refines the initial set of retrieved document chunks using a secondary scoring pass to improve the accuracy of generated answers.
  • Execution Profiles: Execute the application using environment-specific profiles to manage local or cloud-based model inference, including support for various GPU-accelerated configurations.
  • Document Deletion Operations: Remove a previously stored document from the system by providing its unique identifier to the deletion endpoint for permanent removal from the search index.
  • Document Retrieval Interfaces: Retrieve a list of all stored documents including their unique identifiers and metadata to enable precise filtering of context for chat or completion requests.
  • Document Deletion APIs: Delete specific documents from storage by sending a request to the ingestion API endpoint designed for document removal.
  • Text Summarization Services: Summarize provided text using a language model with options to include ingested document context, custom instructions, and streaming responses for real-time output.
  • System Prompt Configurations: Configure the system prompt for the language model to define specific roles, expertise, or response criteria for chat interactions.
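
Several of the features above describe the same flow: ingest text into chunks, embed the chunks, retrieve the closest ones for a query (optionally filtered by document identifier), rerank them with a second scoring pass, and delete a document by its identifier. The sketch below illustrates that flow in plain Python. It is not PrivateGPT's code or API: every name in it (ToyIndex, embed, rerank, and so on) is hypothetical, and a term-count vector stands in for a real embedding model.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): a sparse term-count vector."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyIndex:
    """Hypothetical in-memory store of (doc_id, chunk_text, vector) entries."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str, Counter]] = []

    def ingest(self, doc_id: str, text: str) -> None:
        """Chunk the text, embed each chunk, and store it under doc_id."""
        for piece in chunk(text):
            self.entries.append((doc_id, piece, embed(piece)))

    def delete(self, doc_id: str) -> None:
        """Remove every chunk that belongs to doc_id."""
        self.entries = [e for e in self.entries if e[0] != doc_id]

    def retrieve(self, query, top_k=3, doc_ids=None):
        """Return the top_k (doc_id, chunk) pairs by cosine similarity,
        optionally restricted to the given document identifiers."""
        q = embed(query)
        pool = [e for e in self.entries if doc_ids is None or e[0] in doc_ids]
        ranked = sorted(pool, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(doc_id, text) for doc_id, text, _ in ranked[:top_k]]


def rerank(query, hits, keep=1):
    """Second scoring pass: order hits by the fraction of distinct query
    terms each chunk contains, then keep the best `keep` hits."""
    terms = set(tokenize(query))

    def coverage(hit):
        return len(terms & set(tokenize(hit[1]))) / len(terms) if terms else 0.0

    return sorted(hits, key=coverage, reverse=True)[:keep]


# Small demonstration with made-up documents.
index = ToyIndex()
index.ingest("doc-a", "Invoices are archived on local disk every night")
index.ingest("doc-b", "The assistant answers questions about ingested documents")
candidates = index.retrieve("where are invoices archived", top_k=2)
context = rerank("where are invoices archived", candidates, keep=1)
print(context)
```

In a real deployment the embedding function, the vector store, and the reranker would each be a configured provider. The sketch only shows how the pieces hand data to one another.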