35 repositories
Systems that perform multi-step tasks and maintain state without manual intervention.
Distinguishing note: Focuses on the execution of multi-step plans rather than the planning logic itself.
Explore 35 GitHub repositories matching Artificial Intelligence & ML · Autonomous Task Execution.
Hermes-agent is an autonomous AI agent framework and runtime designed to execute complex tasks and synthesize new skills from execution traces. It includes a provider-agnostic gateway for routing requests across multiple model backends and a serverless runtime that suspends idle agent instances and resumes them on demand across containers and virtual machines. The project provides a desktop automation toolset that controls native GUI workflows on Linux by querying accessibility APIs and injecting input events. It further distinguishes itself with the ability to generate procedural skills from
Provides a dedicated runtime for independent AI agents to complete complex multi-step jobs without manual intervention.
Auto-GPT is an autonomous agent framework designed for creating and deploying AI agents that use large language models to plan and execute complex goals independently. The system provides a comprehensive environment for managing the entire agent lifecycle, from initial design and testing to live production deployment. The project features a low-code workflow designer that allows users to define agent behaviors by connecting functional blocks in a visual interface. It includes an agent marketplace for discovering and deploying pre-configured agent templates and a standardized evaluation tool t
Enables agents to independently plan and carry out multi-step actions to achieve specified user goals.
Auto-GPT is an autonomous agent framework that uses large language models to decompose complex goals and execute multi-step tasks without human intervention. It functions as a workflow automation tool that chains language model tasks and manages memory to achieve specific objectives. The project features a visual agent designer that allows users to define behaviors and goals by connecting functional blocks through a graphical interface. It employs a vector database memory system to recall information across different sessions and a sliding-window buffer for immediate short-term context. The
Chains language model outputs to autonomously execute multi-step goals without human intervention.
This project is an autonomous agent framework designed to integrate large language models with popular messaging platforms. It functions as a middleware platform that enables automated, multimodal interactions by decomposing complex user goals into sequential plans, executing them through external tools, and maintaining persistent context across sessions. The framework distinguishes itself through a modular skill architecture and a hybrid memory system. Users can extend system capabilities by installing custom logic modules from community hubs or generating them through natural language. The
Agent framework decomposes user intent into multi-step plans, invokes system tools, and maintains long-term memory across sessions to complete complex objectives.
Langchain-Chatchat is a system for building retrieval-augmented generation applications and autonomous AI agents. It integrates a knowledge base management system and an agent framework to enable language models to interact with private documents and execute multi-step tasks through external tools. The platform supports local deployment of language models on private infrastructure to operate without an internet connection. It includes a multimodal AI platform that combines vision models for image analysis with text-to-image generation capabilities. The system provides a web-based conversatio
Enables language models to independently select and execute tools to resolve complex, multi-step queries.
Awesome Copilot is a comprehensive framework for autonomous software development, providing the infrastructure to orchestrate multi-agent teams and automate complex coding workflows. It functions as a centralized platform for managing AI-driven development, enabling developers to deploy specialized agents that interact with local files, terminal commands, and external APIs to execute end-to-end software delivery tasks. The project distinguishes itself through its focus on governance and extensibility, offering a suite of security controls, policy-based execution guardrails, and audit trails t
Assigns well-defined issues to an agent that operates in an isolated environment to implement solutions and generate pull requests.
Letta is a framework for building, deploying, and managing autonomous AI agents that maintain persistent state across long-term interactions. It provides a comprehensive suite of primitives for defining agents with configurable personas, modular memory blocks, and tool-use capabilities, enabling them to retain user preferences and conversation history over extended sessions. The platform distinguishes itself through its advanced memory management and orchestration capabilities. It allows agents to autonomously update their own memory, perform retrieval-augmented generation, and coordinate com
Maintains persistent state to automatically continue work toward defined objectives without manual intervention.
Claude Code is a command-line interface and multi-agent orchestration framework designed for autonomous software engineering. It enables AI agents to perform codebase modifications, debugging, and Git workflow management while coordinating multiple specialized agents to decompose and execute complex engineering tasks in parallel. The system distinguishes itself through a high degree of isolation and safety, utilizing Git worktrees to create independent working directories for concurrent agents and implementing a tiered permission system that combines user rules, project policies, and OS-level
Performs multi-step tasks and maintains state autonomously in the background without requiring direct user input.
gpt-oss is an open-weight large language model and reasoning engine designed for complex reasoning and agentic workflows. It functions as an AI agent framework and model serving API, allowing for local deployment and the hosting of standardized interfaces to expose model completions and internal reasoning processes. The project distinguishes itself as a quantized inference engine, utilizing tensor parallelism and weight quantization to run high-parameter models on limited hardware. It features a reasoning model that employs chain-of-thought processing to solve multi-step logical tasks. The s
Performs autonomous multi-step tasks such as function calling, web browsing, and Python execution.
GitHub Copilot is an AI-powered development platform designed to integrate large language models directly into coding environments. It functions as an interactive assistant and an agentic workflow orchestrator, enabling developers to automate code generation, perform automated code reviews, and execute complex, multi-step development tasks through natural language prompts. The platform distinguishes itself through its autonomous agent capabilities, which allow for repository-level research, implementation planning, and code modifications across multiple files. It supports a modular architectu
Executes multi-step tasks and maintains state without manual intervention to complete complex operations.
DeepResearch is an autonomous research agent framework designed to orchestrate multi-step information gathering and complex reasoning tasks. The platform functions as an agent orchestration system that manages the entire lifecycle of autonomous research, from initial planning and web navigation to the synthesis of evidence-backed reports. The framework distinguishes itself through a specialized training pipeline that supports the development and fine-tuning of autonomous models using reinforcement learning and structured knowledge graph synthesis. By employing parallel agent coordination, the
Executes complex information gathering tasks autonomously through iterative reasoning and observation.
Agent Zero is an autonomous AI agent framework designed to execute complex, multi-step workflows by managing its own environment, persistent memory, and external tool interactions. It functions as a Python-based automation library that enables agents to write code, execute terminal commands, and perform system-level tasks independently. The system is built to handle large-scale operations through hierarchical agent delegation, allowing for the coordination of subordinate agents to maintain focus and context. The platform distinguishes itself through a focus on secure, isolated execution and s
Operates as an independent agent that gathers information, writes code, and runs terminal commands to complete complex objectives.
Onyx is an enterprise-grade AI platform designed for knowledge management, search, and autonomous agent orchestration. It functions as a centralized system that aggregates unstructured organizational data, enabling secure, context-aware retrieval and interaction across internal documents and communication history. By integrating retrieval-augmented generation with multi-model orchestration, the platform provides a unified interface for teams to query internal knowledge bases and execute complex, multi-step business processes. The platform distinguishes itself through a focus on private infras
Enables autonomous agents to reason through complex objectives and perform sequential actions to complete business processes.
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva
Deploys autonomous frameworks that plan and execute action sequences with minimal human intervention.
Kilocode is an autonomous engineering platform designed to orchestrate AI agents for complex software development tasks. It functions as a comprehensive system for automating coding, testing, and repository management by integrating directly with your codebase and terminal. The platform provides a unified gateway for model orchestration, allowing for the management of agentic workflows, event-driven automation, and persistent session state across distributed development environments. The platform distinguishes itself through its federated task management and policy-based access control, which
Performs multi-step tasks and maintains state without manual intervention in non-interactive environments.
AutoResearchClaw is an agentic system designed to automate the scientific research process. It functions as an autonomous research agent and workflow automator that manages the entire lifecycle of a project, from initial hypothesis generation and literature review to experimental execution and the production of LaTeX-formatted academic papers. The system distinguishes itself through a multi-agent research pipeline that utilizes structured debates for hypothesis refinement and peer review. It employs a branch-and-merge architecture to explore parallel research directions and integrates human-i
Runs domain-specific research code in sandboxes with self-healing capabilities to produce quantitative physics models and data.
GenericAgent is an LLM agent framework and autonomous system controller designed to manage local systems, web browsers, and hardware interfaces through action and observation loops. It functions as a tool orchestrator that routes model calls to local executors, enabling the automation of complex tasks on a host machine. The project is distinguished by its self-evolving AI agent capabilities, which convert successful execution paths into reusable procedural scripts and skill trees to reduce future reasoning overhead. It employs a context optimization engine that utilizes layered memory hierarc
Processes a queue of natural language tasks sequentially during user inactivity and generates reports.
h2oGPT is a self-hosted platform designed for running large language models and executing retrieval-augmented generation workflows locally. It provides a comprehensive web interface that allows users to index private document collections into searchable databases, enabling context-aware question answering and summarization without exposing sensitive data to external services. The platform distinguishes itself by offering a modular architecture that supports both local model execution and connections to external inference servers. It facilitates the development of autonomous agents capable of
Performs multi-step tasks by delegating actions to external models and tools within an experimental environment to test complex logic.
PentestGPT is an autonomous security testing framework that leverages large language models to plan, execute, and coordinate end-to-end penetration testing engagements. By functioning as an autonomous agent, the system automates the entire testing lifecycle, from initial reconnaissance and vulnerability analysis to the generation of custom exploits and the execution of post-exploitation tasks. The platform distinguishes itself through a multi-agent orchestration system that coordinates specialized AI agents to collaborate on complex, multi-stage attack chains. It integrates multimodal context
Executes autonomous penetration tests by planning and adapting to target responses without manual step-by-step guidance.
Bytebot is an LLM desktop automation framework and virtual Linux desktop environment. It enables AI agents to plan and execute mouse and keyboard actions on a virtual computer using natural language, allowing for autonomous desktop automation and the integration of legacy systems that lack native APIs. The system operates as an LLM API gateway and a Model Context Protocol server, routing requests across multiple language model providers with integrated load balancing and rate limiting. It provides isolated, containerized environments where agents use visual reasoning to interpret screenshots
Executes autonomous, multi-step intelligence tasks programmatically and tracks progress via real-time updates.