This project is a Python framework for building autonomous, event-driven agent systems. It provides a unified runtime for orchestrating multi-agent workflows, managing persistent conversation state, and executing code within secure, isolated sandbox environments. The framework is designed to handle complex task delegation, allowing agents to invoke other agents as tools while maintaining context across multi-turn interactions. The framework distinguishes itself through its deep integration with the Model Context Protocol, enabling agents to connect to external data sources and remote services
Pipecat is a framework and software development kit for building real-time multimodal AI agents and speech-to-speech systems. It utilizes a frame-based data pipeline to route audio, video, and text through a modular sequence of processors, enabling the orchestration of low-latency conversational AI. The project is distinguished by its ability to coordinate complex multimodal services, including speech-to-text, language models, and text-to-speech, within a single pipeline. It features semantic voice activity detection for natural turn-taking, state-machine conversation flows for dialogue manag
PyChatGPT is a Python library for sending prompts to language model APIs and receiving generated text responses programmatically. It serves as an API client that integrates language model capabilities into Python applications. The project includes an automated authenticator to retrieve and refresh access tokens without requiring a web browser. It also features a proxy-enabled network client that routes API requests through proxy servers to bypass network restrictions and manage rate limits. To maintain continuity, the library provides conversation state management that tracks chat history an
GLM-4 is a large language model and fine-tuning framework designed for human-like text production, complex reasoning, and multilingual conversation. It functions as a multimodal system capable of processing high-resolution visual content and as a long-context model designed to analyze documents with a context window of up to one million tokens. The project differentiates itself through a function calling interface that enables AI agent development by connecting the model to external APIs and real-time web browsing. It includes specialized capabilities for generating functional programming cod
Bard-API is an asynchronous Python wrapper and client for interacting with Google Gemini. It functions as a stateful conversation manager and multimodal interface, allowing users to send text and image prompts to a language model and retrieve responses.
The main features of dsdanielpark/bard-api are: Gemini Integrations, Asynchronous Clients, Multimodal Input Processing, Conversation State Management, Conversation State Persistence, Session-Cookie Wrappers, Multimodal Analysis Tools, Visual Content Analysis.
Open-source alternatives to dsdanielpark/bard-api include:
openai/openai-agents-python: This project is a Python framework for building autonomous, event-driven agent systems. It provides a unified runtime…
pipecat-ai/pipecat: Pipecat is a framework and software development kit for building real-time multimodal AI agents and speech-to-speech…
rawandahmad698/pychatgpt: PyChatGPT is a Python library for sending prompts to language model APIs and receiving generated text responses…
zai-org/glm-4: GLM-4 is a large language model and fine-tuning framework designed for human-like text production, complex reasoning,…
howdyai/botkit: Botkit is a multi-platform chatbot framework designed to build conversational bots that operate across different…
mayooear/gpt4-pdf-chatbot-langchain: This project is a framework for building custom AI chatbots capable of PDF document analysis. It implements Retrieval…