# pathwaycom/llm-app

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/pathwaycom-llm-app).**

59,341 stars · 1,431 forks · Jupyter Notebook · MIT

## Links

- GitHub: https://github.com/pathwaycom/llm-app
- Homepage: https://pathway.com/developers/templates/
- awesome-repositories: https://awesome-repositories.com/repository/pathwaycom-llm-app.md

## Topics

`chatbot` `hugging-face` `llm` `llm-local` `llm-prompting` `llm-security` `llmops` `machine-learning` `open-ai` `pathway` `rag` `real-time` `retrieval-augmented-generation` `vector-database` `vector-index`

## Description

This project is a data processing engine and AI application platform for building production-grade machine learning workflows. A single programming model covers both historical batch data and live streams, so the same code can drive real-time ETL pipelines and scalable data transformation workflows.

The framework distinguishes itself through differential dataflow execution, which propagates only changes through a pipeline rather than recomputing entire datasets. It supports distributed state management across worker nodes and uses incremental stream processing, so computations run only when source data changes. These capabilities are paired with a specialized vector search framework that keeps access to evolving knowledge bases low-latency for retrieval-augmented generation.
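The idea of propagating only changes can be illustrated with a minimal sketch in plain Python. This is not the project's API: the `IncrementalSum` class and the `(key, value, diff)` change format are assumptions made for illustration, where `diff` is `+1` for an inserted row and `-1` for a retracted one.

```python
from collections import defaultdict


class IncrementalSum:
    """Hypothetical per-key sum that is updated from changes, not recomputed."""

    def __init__(self):
        self.totals = defaultdict(int)

    def apply(self, changes):
        """Apply a batch of (key, value, diff) changes.

        Returns only the keys whose total was touched by this batch,
        which is what a downstream step would need to react to.
        """
        touched = {}
        for key, value, diff in changes:
            # diff = +1 adds the row's value, diff = -1 retracts it
            self.totals[key] += value * diff
            touched[key] = self.totals[key]
        return touched


agg = IncrementalSum()

# Initial load: three rows arrive.
first = agg.apply([("a", 5, +1), ("b", 3, +1), ("a", 2, +1)])

# Later, the row ("b", 3) is replaced by ("b", 10):
# one retraction plus one insertion, and key "a" is never revisited.
second = agg.apply([("b", 3, -1), ("b", 10, +1)])

print(first)   # {'a': 7, 'b': 3}
print(second)  # {'b': 10}
```

The second batch touches only key `b`, so the work done is proportional to the size of the change rather than the size of the dataset.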

The platform supports enterprise AI integration by connecting large language models to private data sources. It ships pre-built application templates for deploying high-accuracy retrieval systems and scalable data pipelines.
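As a rough illustration of retrieval over an evolving knowledge base, the sketch below keeps a tiny in-memory index that accepts updates and then builds a prompt from the best match. Everything here is hypothetical and is not taken from the project's templates: `embed` is a toy hashed bag-of-words stand-in for a real embedding model, and `LiveIndex` and `build_prompt` are illustrative names.

```python
import math


def embed(text, dims=16):
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * dims
    for word in text.lower().split():
        # Deterministic bucket per word (sum of character codes modulo dims).
        vec[sum(ord(c) for c in word) % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class LiveIndex:
    """Hypothetical index whose entries can be replaced or removed at any time."""

    def __init__(self):
        self.docs = {}  # doc_id -> (text, vector)

    def upsert(self, doc_id, text):
        # Re-embedding on upsert keeps the index in step with the source.
        self.docs[doc_id] = (text, embed(text))

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)

    def search(self, query, k=1):
        """Return the k stored texts with the highest cosine similarity."""
        q = embed(query)
        ranked = sorted(
            self.docs.values(),
            key=lambda item: sum(a * b for a, b in zip(q, item[1])),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


def build_prompt(question, context):
    """Assemble the text that would be sent to a language model."""
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question


index = LiveIndex()
index.upsert("refunds", "refunds are accepted within 30 days")
index.upsert("support", "support is open monday to friday")

# The source document changes; the index entry is replaced in place.
index.upsert("refunds", "refunds are accepted within 14 days")

top = index.search("are refunds accepted", k=1)
prompt = build_prompt("are refunds accepted", top)
print(prompt)
```

Because the updated document overwrites the old entry, a query issued after the change can only retrieve the current text. A production system would swap `embed` for a real model and the linear scan for an approximate nearest-neighbour index.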

## Tags

### Data & Databases

- [Data Processing Frameworks](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-processing-frameworks.md) — Delivers a high-performance environment designed for large-scale data ingestion and complex transformation tasks.
- [Differential Dataflow Engines](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-processing-frameworks/differential-dataflow-engines.md) — Propagates incremental updates through directed graphs to avoid full dataset recomputation during query processing.
- [Unified Batch and Stream Processing Engines](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-processing-frameworks/unified-batch-and-stream-processing-engines.md) — Merges historical batch records and live data streams into a single programming model for consistent processing logic.
- [ETL Workflows](https://awesome-repositories.com/f/data-databases/data-pipeline-orchestration/etl-workflows.md) — Automates the extraction, transformation, and loading of large data volumes to prepare information for downstream analysis.
- [Real-Time Data Processors](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-processing/distributed-processing-frameworks/real-time-data-processors.md) — Ingests and processes information from diverse sources in real-time to ensure continuous visibility into changing data.
- [Vector Search Frameworks](https://awesome-repositories.com/f/data-databases/database-management-systems/database-engines/vector-databases/vector-search-frameworks.md) — Supports low-latency retrieval of evolving knowledge bases for retrieval-augmented generation applications.
- [Event-Driven Data Pipelines](https://awesome-repositories.com/f/data-databases/data-integration-synchronization/event-driven-data-pipelines.md) — Triggers automated data movement and reconciliation based on incoming events to maintain up-to-date information pipelines.
- [Vector Semantic Indices](https://awesome-repositories.com/f/data-databases/search-indexing-technologies/search-indexing/data-indexing-strategies/vector-semantic-indices.md) — Transforms unstructured data into high-dimensional vector spaces to enable rapid similarity searching.

### Networking & Communication

- [Distributed State Management](https://awesome-repositories.com/f/networking-communication/distributed-systems-p2p/distributed-computing/data-synchronization-consistency/distributed-state-management.md) — Coordinates consistent application state across multiple worker nodes to facilitate horizontal scaling for complex transformations.

### DevOps & Infrastructure

- [AI Application Platforms](https://awesome-repositories.com/f/devops-infrastructure/infrastructure/application-compute-platforms/ai-application-platforms.md) — Hosts production-grade workflows that seamlessly integrate live data streams with machine learning model inference.

### Part of an Awesome List

- [Development Platforms](https://awesome-repositories.com/f/awesome-lists/ai/development-platforms.md) — Library for building real-time AI-powered data pipelines.
- [LLM Development Frameworks](https://awesome-repositories.com/f/awesome-lists/ai/llm-development-frameworks.md) — Library for building real-time LLM-enabled data pipelines.
- [RAG Frameworks](https://awesome-repositories.com/f/awesome-lists/ai/rag-frameworks.md) — Production-ready framework for real-time indexing and retrieval.
- [Retrieval Augmented Generation](https://awesome-repositories.com/f/awesome-lists/ai/retrieval-augmented-generation.md) — Templates for building enterprise search and data pipelines.

### Artificial Intelligence & ML

- [Enterprise AI Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/artificial-intelligence-research/enterprise-ai-integrations.md) — Connects large language models to private business data sources to facilitate secure and scalable automated insights.
- [Retrieval Augmented Generation Systems](https://awesome-repositories.com/f/artificial-intelligence-ml/artificial-intelligence-research/retrieval-augmented-generation-systems.md) — Combines real-time information retrieval with generative models to produce context-aware and accurate responses.

### Software Engineering & Architecture

- [Reactive & Event-Driven Systems](https://awesome-repositories.com/f/software-engineering-architecture/software-architecture/architectural-patterns/reactive-messaging/reactive-event-driven-systems.md) — Decouples data ingestion from processing logic using non-blocking mechanisms to handle high-throughput event streams.
