30 open-source projects similar to dsdanielpark/bard-api, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Bard API alternative.
This project is a Python framework for building autonomous, event-driven agent systems. It provides a unified runtime for orchestrating multi-agent workflows, managing persistent conversation state, and executing code within secure, isolated sandbox environments. It handles complex task delegation, allowing agents to invoke other agents as tools while maintaining context across multi-turn interactions. The framework distinguishes itself through its deep integration with the Model Context Protocol, enabling agents to connect to external data sources and remote services
Pipecat is a framework and software development kit for building real-time multimodal AI agents and speech-to-speech systems. It utilizes a frame-based data pipeline to route audio, video, and text through a modular sequence of processors, enabling the orchestration of low-latency conversational AI. The project is distinguished by its ability to coordinate complex multimodal services, including speech-to-text, language models, and text-to-speech, within a single pipeline. It features semantic voice activity detection for natural turn-taking, state-machine conversation flows for dialogue manag
PyChatGPT is a Python library for sending prompts to language model APIs and receiving generated text responses programmatically. It serves as an API client that integrates language model capabilities into Python applications. The project includes an automated authenticator to retrieve and refresh access tokens without requiring a web browser. It also features a proxy-enabled network client that routes API requests through proxy servers to bypass network restrictions and manage rate limits. To maintain continuity, the library provides conversation state management that tracks chat history an
GLM-4 is a large language model and fine-tuning framework designed for human-like text production, complex reasoning, and multilingual conversation. It functions as a multimodal system capable of processing high-resolution visual content and as a long-context model designed to analyze documents with a context window of up to one million tokens. The project differentiates itself through a function calling interface that enables AI agent development by connecting the model to external APIs and real-time web browsing. It includes specialized capabilities for generating functional programming cod
This project is a framework for building custom AI chatbots capable of PDF document analysis. It implements Retrieval Augmented Generation to connect a large language model to private document data. The system utilizes graph-based agent orchestration to control conversation flow and decision logic. It maintains context across interactions through thread-based state management and delivers AI responses to the user interface via real-time streaming. The project covers PDF document ingestion through chunk-based processing and vector-store retrieval. It includes mechanisms for query-based data r
langchaingo is an LLM application framework for Go designed for building language model-powered applications and autonomous agents. It serves as an orchestration library and tool integration framework that allows developers to link prompt sequences and model calls into complex, multi-step workflows. The project provides a toolkit for implementing retrieval-augmented generation pipelines by processing unstructured documents and retrieving relevant context via vector search. It includes a dedicated integration layer for indexing high-dimensional embeddings and performing similarity searches acr
LLaVA-NeXT is a multimodal large language model framework and training toolkit designed to process interleaved images and video sequences to generate text. It functions as a visual language model that combines vision encoders with language models to perform complex reasoning, question answering, and video understanding. The system is capable of analyzing high-resolution images and temporal video frames to describe events, summarize actions, and reason across multiple visual inputs. It supports the interpretation of documents and charts, spatial environment analysis, and the generation of desc
Botkit is a multi-platform chatbot framework designed to build conversational bots that operate across different messaging services using a unified interface. It provides a core system for multi-platform development, utilizing a platform adaptation layer to translate service-specific API payloads into a standardized internal format. The framework features a conversational dialog manager that coordinates multi-turn interactions through state-tracking, branching logic, and scripted flows. It employs a message processing middleware pipeline to intercept, normalize, and enrich incoming and outgoi
Agency Swarm is a multi-agent orchestration framework and development kit designed to coordinate specialized AI agents through defined communication patterns and handoffs. It functions as a system for managing agent swarms, providing an API gateway to expose these coordinated collectives as production-ready HTTP endpoints. The project distinguishes itself through its Model Context Protocol integration layer, which connects agents to external data sources and capabilities. It implements specialized orchestration patterns, such as the orchestrator-worker model and role-based delegation, to tran
This repository is a sample library and development kit for building conversational bots using the Bot Framework SDK. It provides a collection of task-focused code examples, templates, and implementation guides to help developers create interactive chat interfaces and dialogue flows. The project focuses on integration patterns for the Bot Framework, offering specific examples for implementing custom middleware, identity authentication, and the connection of external bot skills. It includes reference implementations for multi-channel chatbot templates that allow a single agent to operate acros
OpenHands is an autonomous agent framework designed for software engineering workflows. It provides a modular platform for orchestrating AI agents that reason, plan, and execute tasks within isolated, containerized development environments. By integrating with standard version control and development tools, the system enables agents to autonomously navigate codebases, implement features, and resolve issues through iterative reasoning and tool execution. The platform distinguishes itself through a model-agnostic orchestrator that connects diverse language models to a unified tool registry. It
This project is a comprehensive framework for building AI-powered applications, providing a unified toolkit for orchestrating language models, autonomous agents, and interactive user interfaces. It serves as a central library for managing the entire lifecycle of AI interactions, from initial prompt generation and model provider abstraction to complex, multi-step reasoning and tool execution. The framework distinguishes itself through its deep integration with frontend development, specifically by enabling generative user interfaces that render dynamic components directly from model outputs. I
This project is a cross-platform chatbot framework designed to integrate generative artificial intelligence models into messaging services. It provides a unified architecture for building and deploying automated bots that maintain consistent conversation state, user identity, and interaction logic across multiple messaging platforms from a single codebase. The framework distinguishes itself through a modular adapter system that normalizes platform-specific webhooks and events into a standardized internal schema. It includes a comprehensive toolkit for constructing rich, interactive user inter
AI0x0.com is a multimodal AI desktop assistant and cross-application wrapper. It provides a floating interface overlay that integrates large language models into any active software application to facilitate global querying and text automation. The system distinguishes itself through the ability to process real-time screen captures for visual analysis and utilize a voice pipeline for hands-free speech-to-text and text-to-speech interaction. It further enables direct AI content injection by simulating keyboard input to insert generated responses into active software fields. The project includ
Narrator is an artificial intelligence system that converts real-time video feeds into natural language audio descriptions. It functions as a multimodal vision narrator and scene descriptor, using computer vision to transform environmental data from a camera into synthetic speech. The tool operates as a pipeline that captures periodic images from a feed and uses a multimodal large language model to analyze visual events. These analyses are then converted via text-to-speech synthesis into a voiceover that describes real-world activities and surroundings. The system supports automated environm
CodexBar is a macOS menu bar application that monitors AI provider usage limits, credit balances, and reset schedules. It retrieves coding plan usage data from Alibaba Cloud services using either API keys or browser session cookies as fallback authentication, and displays the information through visual credit gauges, CLI commands, and desktop status bar integrations. The application distinguishes itself by supporting multiple authentication methods, including automatic cookie import from Safari, Chrome, or Firefox, as well as API key and access token authentication. It provides real-time trac
ERNIE is a development toolkit for training, fine-tuning, and deploying large language models built on the PaddlePaddle deep learning platform. It provides a comprehensive suite of core components, including an inference server for vision and language models, a training and fine-tuning toolkit, and a framework for building retrieval-augmented generation systems using private knowledge bases. The project features multimodal AI models capable of reasoning across text, images, and video to perform complex visual understanding and information extraction. It distinguishes itself through specialize
Langroid is a multi-agent orchestration framework and tool integration suite designed for building complex AI applications. It serves as a multi-modal integration layer that connects diverse local and remote language models with an agentic retrieval-augmented generation system. The project distinguishes itself through a collaborative message-exchange paradigm, allowing specialized agents to delegate tasks hierarchically and coordinate via structured communication. It features an advanced state management system for conversational AI, including the ability to rewind and prune conversation hist
gptme is a multi-agent orchestration platform designed for autonomous software engineering, terminal-based AI integration, and RAG-enhanced code navigation. It enables the deployment of persistent agents and specialized subagents to decompose complex tasks and execute parallel technical workflows. The system distinguishes itself through a combination of vision-based GUI automation for controlling desktop applications and surgical patching mechanisms for targeted source code modifications. It utilizes git-based memory management to maintain a versioned history of agent identities, lessons, and
Learn_Prompting is an educational project focused on prompt engineering, providing the principles and techniques required to craft effective inputs and improve the quality of generative AI outputs. The project covers advanced prompting strategies to enhance reasoning, reliability, and output quality. This includes techniques for task decomposition, chain-of-thought reasoning, and the use of few-shot and zero-shot guidance. It also addresses model security through the study of prompt hacking, vulnerability analysis, and privacy auditing to prevent sensitive data leaks. The scope extends to th
ollama-python is a Python client for interacting with large language models. It provides an interface for sending prompts to receive text and chat completions, as well as a dedicated client for generating numerical vector embeddings from text. The project includes a wrapper that emulates the OpenAI API, allowing applications built for that standard to interact with local models. It also provides a non-blocking asynchronous client for executing concurrent requests. The library covers the full model lifecycle, including the ability to pull, create, list, and delete models within a local enviro
Kilocode is an autonomous engineering platform designed to orchestrate AI agents for complex software development tasks. It functions as a comprehensive system for automating coding, testing, and repository management by integrating directly with the developer's codebase and terminal. The platform provides a unified gateway for model orchestration, allowing for the management of agentic workflows, event-driven automation, and persistent session state across distributed development environments. The platform distinguishes itself through its federated task management and policy-based access control, which
Rikkahub is an AI model aggregator and frontend interface that provides a unified platform for interacting with multiple large language model providers. It serves as a retrieval-augmented generation chat client with a provider-agnostic gateway, allowing users to switch between different models and endpoints. The platform features a character persona manager for importing structured character cards and behavior settings to define specific interaction styles. It includes a sandboxed code execution environment with a portable Linux agent for running technical scripts and commands within the chat
Feroxbuster is an HTTP directory brute forcer and web resource enumerator designed to discover hidden files and directories on web servers. It functions as a recursive URL scanner that identifies unlinked endpoints and API resources by combining wordlist-based scanning with automated crawling. The tool operates as a proxy-aware fuzzer, allowing network requests to be routed through HTTP or SOCKS proxies for traffic interception or anonymity. It utilizes recursive directory crawling to automatically queue discovered paths and find nested content. The system includes capabilities for discovery
LARK is a development toolkit for training, fine-tuning, and deploying large language models and multimodal models based on PaddlePaddle. It functions as a comprehensive framework that includes an LLM training orchestrator, an inference server, and a multimodal model framework for processing text, image, and video inputs. The project features a retrieval-augmented generation system for building conversational applications that integrate web search and private knowledge bases. It provides specific capabilities for multimodal reasoning and complex logic, enabling the extraction of structured da
This project is a PDF data extraction tool and document preprocessor designed to convert PDF files into structured formats such as Markdown, JSON, and HTML. It functions as an OCR document parser for scanned files, an accessibility automator for generating PDF/UA compliant metadata, and a loader for AI orchestration frameworks like LangChain. The software distinguishes itself through specialized handling of complex document elements, including the conversion of mathematical formulas into LaTeX and the generation of natural-language descriptions for charts and images. It utilizes recursive seg
cpr is a C++ networking library that provides a high-level HTTP request client. It functions as a wrapper around libcurl to simplify the process of sending and receiving data from web servers, specifically managing GET and POST calls and multipart form uploads. The library provides both synchronous and asynchronous execution models, allowing network requests to run on background threads to prevent application freezing. It integrates with the C++ Standard Library to map low-level pointers to standard strings and containers, utilizing RAII for automatic resource management. The project covers
This project is a high-level C++ HTTP client library and wrapper for libcurl. It provides an interface for making network requests, managing network sessions, and performing data transfers. The library distinguishes itself by offering an asynchronous HTTP client capable of executing non-blocking requests via callback interfaces. It also functions as a multipart form uploader for transmitting files and structured data, as well as an SSE stream handler for processing real-time server-sent events over persistent connections. Its broader capabilities cover secure web communication through SSL encryption
This project is a collection of structured study notes and conceptual breakdowns designed for the AWS Certified Cloud Practitioner exam. It serves as a technical reference and study guide, organizing cloud service details and architectural principles to assist in certification preparation. The knowledge base is built using markdown files and includes curated cheat sheets and interactive mind-map visualizations. These tools map complex certification topics into visual hierarchies to enable drill-down study paths and rapid revision. The materials cover a wide range of cloud capabilities, inclu
The Gemini Cookbook is a comprehensive collection of implementation patterns, code samples, and development guides designed for building applications with Google Gemini models. It serves as a central resource for developers to integrate multimodal generative artificial intelligence into their software, providing the necessary frameworks to manage model interactions, stateful workflows, and structured data extraction. The repository distinguishes itself by offering specialized toolkits for autonomous agent orchestration, enabling the construction of agents that can execute code, browse the web